Error during online installation with tiup: executor.ssh.execute_failed: Failed to execute command over SSH

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiup 在线安装报错Error: executor.ssh.execute_failed: Failed to execute command over SSH

| username: Hacker_5YEF49J7

tiup cluster deploy tidb-join v6.5.0 ./topology.yaml --user root -p

topology.yaml (12.8 KB)
tiup-cluster-debug-2023-01-06-10-59-19.log (513.3 KB)

| username: caiyfc | Original post link

Both 35 and 37 are reporting errors. It seems that SSH mutual trust is not properly established for these two IPs.

| username: 裤衩儿飞上天 | Original post link

192.168.0.35 and 192.168.0.37 haven’t set up passwordless SSH login, right?

| username: xingzhenxiang | Original post link

Is the SSH mutual trust or root password between 192.168.0.35 and 192.168.0.37 different from other machines?

| username: Hacker_5YEF49J7 | Original post link

I don’t understand why 35 and 37 need to go to /tidb. Shouldn’t they be under the /tidb/app/ directory? 33 is normal.

| username: Hacker_5YEF49J7 | Original post link

I didn’t set up passwordless authentication.

| username: Hacker_5YEF49J7 | Original post link

The root password is the same.

| username: xingzhenxiang | Original post link

Have you checked by logging in manually? Try it and see whether it works.

| username: ffeenn | Original post link

Clear the known_hosts file and check that the authorized_keys file is intact. Then redo the passwordless setup and try again.
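
The cleanup suggested here could be sketched in shell roughly as follows. This is a sketch under assumptions, not a confirmed fix for this cluster: the function names `forget_host` and `check_ssh_perms` are made up for illustration, `/home/tidb` is a guessed home directory, the 700/600 modes are the conventional ones rather than values taken from this thread, and `stat -c` assumes GNU coreutils on Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and default paths are illustrative assumptions.

# Drop any cached host key for $1 from a known_hosts file.
# ssh-keygen -R rewrites the file and keeps a backup next to it as <file>.old.
forget_host() {
    local host="$1" file="${2:-$HOME/.ssh/known_hosts}"
    [ -f "$file" ] || return 0          # no file, so nothing is cached
    ssh-keygen -R "$host" -f "$file" >/dev/null 2>&1
}

# Check that <home>/.ssh is mode 700 and <home>/.ssh/authorized_keys is mode 600.
# Prints one line per problem and returns non-zero if any problem is found.
check_ssh_perms() {
    local home_dir="$1" status=0 mode
    if [ ! -d "$home_dir/.ssh" ]; then
        echo "missing directory: $home_dir/.ssh"
        return 1
    fi
    mode=$(stat -c '%a' "$home_dir/.ssh")
    if [ "$mode" != "700" ]; then
        echo "$home_dir/.ssh has mode $mode, expected 700"
        status=1
    fi
    if [ ! -f "$home_dir/.ssh/authorized_keys" ]; then
        echo "missing file: $home_dir/.ssh/authorized_keys"
        status=1
    else
        mode=$(stat -c '%a' "$home_dir/.ssh/authorized_keys")
        if [ "$mode" != "600" ]; then
            echo "$home_dir/.ssh/authorized_keys has mode $mode, expected 600"
            status=1
        fi
    fi
    return "$status"
}

# Example (not run here). On the control machine:
#   forget_host 192.168.0.35; forget_host 192.168.0.37
# On each target host, as root (home directory path is an assumption):
#   check_ssh_perms /home/tidb
```

Removing only the entries for the two failing hosts is a narrower alternative to emptying the whole known_hosts file.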

| username: Hacker_5YEF49J7 | Original post link

I checked manually and it works fine.

| username: Hacker_5YEF49J7 | Original post link

Should passwordless login be set up for root, or for the tidb user?

| username: ffeenn | Original post link

(This post contained only an image, which is not available in the translated text.)

| username: xingzhenxiang | Original post link

I have always installed using a password. As long as the root password is the same on every machine, tiup automatically creates the tidb user and sets up passwordless access for it.

| username: Hacker_5YEF49J7 | Original post link

I also use root. I just don't understand why, on 31, /tidb/app/deploy is created normally, while on 35 the /tidb directory is created instead.

| username: ffeenn | Original post link

If the sshd configuration on the server was changed earlier, the automatic trust setup may fail. Try configuring passwordless login manually again. This kind of failure is generally related to the known_hosts file.

| username: Billmay表妹 | Original post link

Refer to: TiDB Environment and System Configuration Check | PingCAP Documentation Center

| username: tidb菜鸟一只 | Original post link

# Global variables are applied to all deployments and used as the default value of
# the deployments if a specific deployment value is missing.
global:
  # The user who runs the TiDB cluster.
  user: "tidb"

| username: tidb菜鸟一只 | Original post link

I see that the topology is configured to run the cluster as the tidb user. Does the tidb user exist on 35 and 37?
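
A quick way to answer that question on each machine is sketched below. The helper name `user_exists` is hypothetical, and the remote variant assumes root can log in to the target over SSH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a local account with the given name exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Example (not run here), executed on the host being checked:
#   user_exists tidb && echo "tidb exists" || echo "tidb is missing"
# Or remotely from the control machine, assuming root SSH access:
#   ssh root@192.168.0.35 'id tidb'
```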

| username: 孤君888 | Original post link

Passwordless login must be set up for the tidb user.
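
Setting that up by hand could look roughly like the sketch below. It assumes the tidb account already exists on every target, that its password is known (ssh-copy-id prompts for it), and that the key lives at `~/.ssh/id_rsa` on the control machine. The function name `setup_passwordless`, the key type, and the key path are illustrative choices, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper; key type, key path and user name are assumptions.

# Make sure a local key pair exists, then install its public key for
# <user>@<host> on every host given. ssh-copy-id asks for the password.
setup_passwordless() {
    local user="$1"; shift
    local key="$HOME/.ssh/id_rsa" host
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    if [ ! -f "$key" ]; then
        # Generate a key without a passphrase so logins stay non-interactive.
        ssh-keygen -t rsa -b 4096 -N "" -f "$key" -q || return 1
    fi
    for host in "$@"; do
        ssh-copy-id -i "$key.pub" "$user@$host" || return 1
    done
}

# Example (not run here), on the control machine:
#   setup_passwordless tidb 192.168.0.35 192.168.0.37
```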

| username: Hacker_5YEF49J7 | Original post link

31, 35, and 37 all have the tidb user. I manually set up passwordless login successfully on 31, but it didn't work on 35 and 37. I used the same method, and the sshd_config configuration is also the same. The permissions for the .ssh directory and the files within it are also the same. I don't know why it's not working.
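
One way to compare the hosts side by side is a non-interactive login test, sketched below. The function name `check_passwordless` is hypothetical, and the sketch only reports which hosts accept key-based login; it does not identify the cause.

```shell
#!/usr/bin/env bash
# Hypothetical helper: test key-based login for <user> on each host.
# BatchMode=yes makes ssh fail instead of falling back to a password prompt.
check_passwordless() {
    local user="$1"; shift
    local host failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example (not run here):
#   check_passwordless tidb 192.168.0.35 192.168.0.37
# For a failing host, verbose client output shows which keys were offered:
#   ssh -vvv -o BatchMode=yes tidb@192.168.0.35 true
```

If the .ssh permissions really are identical, it may also be worth comparing the ownership and permissions of the tidb home directory itself, since sshd with StrictModes enabled checks that as well.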