Failed to shut down TiDB cluster

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb集群关闭失败

| username: Jackie492391142

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.3.0
[Operation] Used the command tiup cluster stop tidb-cluster to shut down the cluster
[Encountered Issue] Unable to shut down the cluster
[Attachment: Screenshot/Log/Monitoring]
Running tiup cluster stop tidb-cluster to shut down the cluster fails with an error. I reset passwordless login for the tidb user and ran the command again, but the same error appeared. The error below is from trying to stop prometheus-9090 on its own.

Error Prompt
Error: failed to stop prometheus: failed to stop: 192.168.1.7 prometheus-9090.service, please check the instance's log(/tidb/tidb-deploy/prometheus-9090/log) for more detail.: executor.ssh.execute_failed: Failed to execute command over SSH for 'tidb@192.168.1.7:22' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c "systemctl daemon-reload && systemctl stop prometheus-9090.service"}, cause: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

| username: 我是咖啡哥

First, as the tidb user on the control machine, test whether passwordless SSH login to the target machine works. Also check that the tidb user has working sudo permissions on the target machine.
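A minimal sketch of that check, to be run as the tidb user on the control machine. The helper name check_node is hypothetical, and the default key path is an assumption: it is where TiUP normally keeps the per-cluster private key, so adjust it if your deployment differs. BatchMode=yes makes ssh fail instead of prompting for a password, and sudo -n fails instead of prompting, so any prompt-dependent setup shows up as a failure.

```shell
# Hypothetical helper: verify key-based SSH login and passwordless sudo
# for the tidb user on one target host.
check_node() {
  host="$1"
  # Assumed default: the key TiUP stores for a cluster named "tidb-cluster".
  key="${2:-$HOME/.tiup/storage/cluster/clusters/tidb-cluster/ssh/id_rsa}"
  # BatchMode=yes: never prompt for a password; sudo -n: never prompt either.
  if ssh -i "$key" -o BatchMode=yes -o ConnectTimeout=5 "tidb@$host" 'sudo -n true'; then
    echo "$host: ssh and sudo OK"
  else
    echo "$host: ssh or sudo FAILED"
    return 1
  fi
}
```

For the host in the error message, that would be: check_node 192.168.1.7. Note that this uses the system ssh client, so a success here does not rule out a problem that only affects the SSH client built into tiup.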

| username: Kongdom

Judging by the error message, it seems to be an issue with passwordless authentication. Try resetting the SSH passwordless authentication.

| username: Jackie492391142

Thank you! The cause was an SSH version upgrade.

| username: Raymond

Isn’t tiup supposed to use its own version of SSH?

| username: 会飞的土拨鼠

When tiup cluster stop tidb-cluster fails to shut down the cluster like this, the failure is probably related to SSH passwordless login. A likely cause is that OpenSSH was upgraded to 8.8p1 or later to fix earlier OpenSSH vulnerabilities, and that upgrade changed some of the accepted encryption methods. You need to reconfigure passwordless login between the control machine and the TiDB cluster nodes.
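For background, OpenSSH 8.8 disabled RSA signatures that use SHA-1 (the ssh-rsa algorithm) by default, which can cause older clients with RSA keys to be rejected with exactly this kind of "unable to authenticate" error. The sketch below only checks whether a version banner is 8.8 or newer; the function name openssh_at_least_8_8 is hypothetical, and it assumes the banner format printed by ssh -V. Whether this is the actual cause on a given node still has to be confirmed from that node's sshd log.

```shell
# Hypothetical helper: succeed (exit 0) if an OpenSSH version banner such as
# "OpenSSH_8.8p1, OpenSSL 1.1.1l  24 Aug 2021" is version 8.8 or newer.
# Returns 1 for older versions and 2 if the banner cannot be parsed.
openssh_at_least_8_8() {
  # Extract "major minor" from the banner.
  ver=$(printf '%s\n' "$1" | sed -n 's/^OpenSSH_\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
  [ -n "$ver" ] || return 2
  # Split into positional parameters: $1 = major, $2 = minor.
  set -- $ver
  [ "$1" -gt 8 ] || { [ "$1" -eq 8 ] && [ "$2" -ge 8 ]; }
}

# Example, run on a target node (ssh -V writes its banner to stderr):
#   openssh_at_least_8_8 "$(ssh -V 2>&1)" && echo "ssh-rsa (SHA-1) is disabled by default here"
```

If the check succeeds on a node, the usual options are to set up passwordless login again with a key type that is still accepted, or to re-allow ssh-rsa on that node through the PubkeyAcceptedAlgorithms setting in sshd_config. The second option weakens the server's defaults, so treat it as a stopgap.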

| username: 会飞的土拨鼠

Security scans may flag high-risk OpenSSH vulnerabilities. If OpenSSH was then manually upgraded to a newer version, that upgrade may be what left the control machine unable to manage the TiDB nodes over passwordless SSH.
