After adding a service to the TiDB cluster, executing the command: tiup cluster reload tidb-test throws an exception

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb集群添加服务后,执行命令:tiup cluster reload tidb-test 抛异常

| username: johnnnyli

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 5.4
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact] error: init config failed: mt1010:2379: transfer from /root/.tiup/storage/cluster/clusters/tidb-test/config-cache/pd-mt1010-2379.service to /tmp/pd_335adcbd-631d-45fb-acd5-e296898abb06.service failed: executor.ssh.execute_failed: Failed to transfer file over SCP for ‘tidb@mt1010:9922’ {ssh_stderr: Authentication failed. lost connection, ssh_stdout: , ssh_command: scp -r -o StrictHostKeyChecking=no -P 9922 -o ConnectTimeout=5 -i /root/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa /root/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa /root/.tiup/storage/cluster/clusters/tidb-test/config-cache/pd-gwmidc1010-2379.service tidb@mt1010:/tmp/pd_335adcbd-631d-45fb-acd5-e296898abb06.service}, cause: exit status 1

| username: hey-hoho

The error message indicates that scp failed. Check whether SSH mutual trust (passwordless login) between the nodes is working.
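One way to test this is to attempt a non-interactive login with the same key, user, host, and port that tiup used in the failing scp. This is a minimal sketch: the values are copied from the error message above, and `check_tiup_ssh` is a hypothetical helper name, not part of tiup.

```shell
# Values copied from the ssh_command in the error message.
KEY=/root/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa
TARGET=tidb@mt1010
PORT=9922

# Hypothetical helper: try to log in with tiup's key without any prompt.
check_tiup_ssh() {
    # BatchMode=yes makes ssh fail instead of asking for a password,
    # so a key that the target does not accept shows up as an error.
    ssh -i "$KEY" -p "$PORT" \
        -o BatchMode=yes -o ConnectTimeout=5 -o StrictHostKeyChecking=no \
        "$TARGET" 'echo ok'
}
```

Running `check_tiup_ssh` on the control node should print `ok` if the target accepts the key. A "Permission denied" result would suggest that the public key matching this private key is missing from the tidb user's authorized_keys on the target.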

| username: johnnnyli

There is no issue with SSH mutual trust, and I can SSH from node to node without problems. I have set up passwordless authentication for both the root and tidb users.

| username: tiger-liu

The reason is that the /tmp directory does not have sufficient permissions. You can manually execute the following command on the control node:
scp -r -o StrictHostKeyChecking=no -P 9922 -o ConnectTimeout=5 -i /root/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa /root/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa /root/.tiup/storage/cluster/clusters/tidb-test/config-cache/pd-gwmidc1010-2379.service tidb@mt1010:/tmp/pd_335adcbd-631d-45fb-acd5-e296898abb06.service

| username: johnnnyli

I just tried this scp command. I installed and operate the cluster as the root user, but this scp connects as the tidb user. I don’t quite understand why.

| username: TiDBer_pkQ5q1l0

If no user is specified when running tiup, it defaults to the current user.
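To see which user tiup connects as for an existing cluster, one option is to read the `user` field that tiup-cluster records in the cluster's meta.yaml. This is a minimal sketch, assuming the file sits at /root/.tiup/storage/cluster/clusters/tidb-test/meta.yaml (inferred from the storage path in the error message). `deploy_user` is a hypothetical helper name.

```shell
# Assumed location of the cluster metadata on the control node.
META=/root/.tiup/storage/cluster/clusters/tidb-test/meta.yaml

# Hypothetical helper: print the value of the top-level "user:" key
# from the metadata file passed as the first argument.
deploy_user() {
    sed -n 's/^user:[[:space:]]*//p' "$1"
}
```

Calling `deploy_user "$META"` prints the recorded user. If it prints `tidb`, that would explain why the scp in the error runs as tidb even though tiup itself is run as root.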

| username: johnnnyli

However, from this error, it seems that the file distribution is using the tidb user. Is it necessary to specify a particular user when scaling up or down?

| username: johnnnyli

I’ll give it a try.

| username: xingzhenxiang

For passwordless login as the tidb user, it’s best to use the key pair generated by tiup.

| username: 孤君888

It looks like an SSH connection issue based on the error message.

| username: johnnnyli

I resolved the issue by setting up passwordless authentication between the tidb and root users, placing each user’s public key in the other’s authorized_keys file.

| username: johnnnyli

Copy the root user’s private and public keys into the cluster’s ssh directory under tiup, then re-run tiup cluster reload tidb-pro:

cp /root/.ssh/id_rsa /home/tidb/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa
cp /root/.ssh/authorized_keys /home/tidb/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa.pub

This method has resolved the issue.