Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tiup cluster check 失败
[TiDB Usage Environment] / Test / Poc
[TiDB Version] 7.1
tiup cluster check failed
[root@localhost ssh]# tiup cluster check mytidb7 --cluster
tiup is checking updates for component cluster …
Starting component cluster
: /root/.tiup/components/cluster/v1.14.0/tiup-cluster check mytidb7 --cluster
- Download necessary tools
- Downloading check tools for linux/amd64 … Done
- Collect basic system information
- Getting system info of 192.168.0.100:22 … Error
Error: executor.ssh.execute_failed: Failed to execute command over SSH for 'tidb@192.168.0.100:22' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG
sr/sbin /usr/bin/sudo -H bash -c "test -d /tmp || (mkdir -p /tmp && chown tidb:$(id -g -n tidb) /tmp)"}, cause: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
Verbose debug logs have been written to /root/.tiup/logs/tiup-cluster-debug-2023-12-11-22-55-41.log.
Is SSH mutual trust (passwordless login) set up and working?
I am in a single-machine test environment, and I remember changing the hostname.
I installed it as root, and testing SSH as root showed no issues.
However, the log shows that it connects as tidb@192.168.0.100. I'm not sure how the trust mechanism works here.
Can tidb@192.168.0.100 be connected via SSH?
Can you check what your global.user configuration is?
This looks like a problem with SSH being unable to connect.
Have you configured passwordless access? Try using telnet to check if port 22 is accessible.
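One way to test passwordless access is to force key-only, non-interactive login, so that a password prompt counts as a failure. The sketch below is only an illustration: `check_key_login` is a hypothetical helper name, and the flags are standard OpenSSH client options.

```shell
# Hypothetical helper: try a key-only, non-interactive SSH login.
# $1 = remote user, $2 = host, $3 = path to the private key to test
check_key_login() {
  # BatchMode=yes makes ssh fail instead of prompting for a password.
  if ssh -i "$3" \
         -o BatchMode=yes \
         -o PreferredAuthentications=publickey \
         -o ConnectTimeout=5 \
         "$1@$2" true 2>/dev/null; then
    echo "key login OK"
  else
    echo "key login FAILED"
    return 1
  fi
}
```

For example: `check_key_login tidb 192.168.0.100 ~/.tiup/storage/cluster/clusters/mytidb7/ssh/id_rsa`. The key path is an assumption about where tiup stores the cluster's key, so adjust it to your installation.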
global.user is the default, tidb. SSH to tidb@192.168.0.100 connects, but it asks for a password.
I have encountered the same problem. After upgrading to 5.0.1, the problem was resolved.
I ran strace and will study the output tomorrow. It's a personal test environment, so there's no rush, and I haven't found any other anomalies.
This error occurs when tiup executes the SSH command to connect to the target machine on port 22 and encounters an access issue. You need to first check the network connectivity and port status.
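Before checking the network, it may help to look at the `cause:` text of the error, because an authentication failure and an unreachable port call for different fixes. The following is a rough sketch; `classify_ssh_error` is a hypothetical helper, and the matched strings are only a few common messages, not a complete list.

```shell
# Hypothetical helper: roughly classify the "cause:" text of an SSH error.
# $1 = the cause string copied from the tiup error output
classify_ssh_error() {
  case "$1" in
    *"unable to authenticate"*|*"Permission denied"*)
      # The port answered, but no offered credential was accepted.
      echo "auth" ;;
    *"connection refused"*|*"no route to host"*|*"i/o timeout"*)
      # The TCP connection to the SSH port could not be established.
      echo "network" ;;
    *)
      echo "unknown" ;;
  esac
}
```

Fed the cause from the original post (`ssh: handshake failed: ssh: unable to authenticate, ...`), this reports `auth`, which suggests the port itself was reachable.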
Add the -p option and enter the password manually.
Try manually SSHing into root@192.168.0.100 to see if you can access it.
Try using the following command: tiup cluster check ./topo.yaml --apply --user root -p
That is probably the issue. tiup cluster also connects over SSH as the tidb user, so if passwordless login is not set up, you can try -p to supply a password.
I tried it, but it still doesn't work.
Thank you, everyone; I won't reply to each of you individually. I found the cause: SELinux was not disabled.
Here is the reproduction process:
Found concrete evidence in the audit log
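For anyone checking the same thing, a minimal sketch of a first look at SELinux follows. `selinux_mode_hint` is a hypothetical helper name; `getenforce` is the standard SELinux utility, and the hints are general guidance rather than the exact steps used in this thread.

```shell
# Hypothetical helper: report the SELinux mode and what it implies here.
selinux_mode_hint() {
  # getenforce prints Enforcing, Permissive, or Disabled.
  mode=$(getenforce 2>/dev/null || echo "Unknown")
  case "$mode" in
    Enforcing)  echo "enforcing: inspect the audit log for sshd AVC denials" ;;
    Permissive) echo "permissive: denials are logged but not enforced" ;;
    Disabled)   echo "disabled: SELinux is not the cause" ;;
    *)          echo "unknown: getenforce is unavailable on this host" ;;
  esac
}
```

If the mode is Enforcing, `ausearch -m avc -ts recent` (run as root) lists recent denials. When sshd is denied access to the user's `authorized_keys`, relabeling with `restorecon -Rv /home/tidb/.ssh` is a common fix; the home directory path is an assumption.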
This issue is indeed difficult to troubleshoot, but the initial error message also indicated that the network link was down.