TiUP Unable to Shut Down Instance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiup 无法关闭实例

| username: 巨化斑鸠

[TiDB Usage Environment] Production Environment
[TiDB Version] 4.0.2
[Reproduction Path] An error occurs when stopping TiDB
[Encountered Problem: Problem Phenomenon and Impact]
tiup cluster stop xxx -N xxxxxx
Stopping component tidb
Stopping instance XXXXX
Failed to execute operation: Failed to activate service 'org.freedesktop.systemd1': timed out

Failed to execute operation: Failed to activate service 'org.freedesktop.systemd1': timed out

Error: failed to stop tidb: failed to stop: XXXXXX tidb-4000.service, please check the instance's log(/home/tidb/deploy/log) for more detail.: executor.ssh.execute_failed: Failed to execute command over SSH for 'TIDB@XXXXX' {ssh_stderr: Failed to execute operation: Failed to activate service 'org.freedesktop.systemd1': timed out , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c "systemctl daemon-reload && systemctl stop tidb-4000.service"}, cause: Process exited with status 1

The systemd service times out and becomes unavailable, so the TiDB service cannot be stopped.

If systemctl is unavailable and tiup therefore cannot be used, how can TiDB, TiKV, and PD be shut down safely on the server?


| username: cassblanca | Original post link

It seems there is an issue with SSH mutual trust. Please check it first.

| username: forever | Original post link

Can you confirm that SSH can connect? Check the status of tidb-4000.service on the target node to see whether it is running.
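The status check suggested here could look roughly like the sketch below, run on the target node. It assumes the unit name from the error message (tidb-4000.service). The `unit_state` helper is hypothetical, and the idea that an `org.freedesktop.systemd1` error means systemctl cannot reach systemd over D-Bus is my reading of the message, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: report a unit's state, and flag the case where systemctl
# itself cannot reach systemd (the error shown in the original post).
# unit_state is a hypothetical helper, not part of tiup or systemd.
unit_state() {
    unit="$1"
    # is-active prints the state (active, inactive, failed, ...) and
    # exits non-zero unless the unit is active, so ignore the status here.
    out=$(systemctl is-active "$unit" 2>&1) || true
    case "$out" in
        *org.freedesktop.systemd1*)
            # Assumption: this message means systemd is unreachable.
            echo "$unit: systemd unreachable"
            return 2
            ;;
        active)
            echo "$unit: active"
            return 0
            ;;
        *)
            echo "$unit: $out"
            return 1
            ;;
    esac
}

# Example (run on the target node):
# unit_state tidb-4000.service
```

If the helper reports that systemd is unreachable, the problem is on the node itself rather than in the SSH connection from the tiup machine.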

| username: zhaokede | Original post link

Check what has changed recently at the operating-system level on the server.

| username: 路在何chu | Original post link

Is the network down? If so, tiup definitely can't manage the node.

| username: ffeenn | Original post link

1. High server load causing remote execution to time out.
2. Server network failure.

First, log in to the TiDB node and check these two items.

| username: 这里介绍不了我 | Original post link

Check whether you can SSH from the control machine to the node.

| username: tidb菜鸟一只 | Original post link

Is systemctl no longer available when you log into the corresponding machine?

| username: xingzhenxiang | Original post link

Log in to the affected server and check whether the command can be executed manually.

| username: 不想干活 | Original post link

Is it a network issue, or are the SSH credentials missing? Try connecting over SSH manually.

| username: Kongdom | Original post link

This segment should be the command being executed, right?

| username: dba远航 | Original post link

First, verify if SSH is functioning properly.

| username: zhanggame1 | Original post link

The TiDB component can be killed.
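A minimal sketch of killing the process by hand, assuming you are logged in to the node as a user allowed to signal it. The `stop_pid` helper, the 30-second default, and the `pgrep -f tidb-server` lookup are all hypothetical. How each component reacts to SIGTERM should be verified for your version, and if the unit is configured to restart automatically, systemd may start the process again.

```shell
#!/bin/sh
# Sketch: ask a process to exit with SIGTERM and wait for it to go away.
# stop_pid is a hypothetical helper, not part of tiup.
stop_pid() {
    pid="$1"
    timeout="${2:-30}"   # seconds to wait; 30 is an arbitrary default
    if ! kill -TERM "$pid" 2>/dev/null; then
        echo "cannot signal pid $pid (already exited, or no permission)" >&2
        return 1
    fi
    waited=0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "pid $pid still running after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (hypothetical PID lookup; adjust the pattern to the deployment):
# for pid in $(pgrep -f tidb-server); do stop_pid "$pid" 60; done
```

The helper deliberately stops at SIGTERM and reports a failure instead of escalating to SIGKILL, so the decision to force-kill stays with the operator.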

| username: zhaokede | Original post link

The original poster’s issue might persist; even after killing the process, it may not be possible to restart and manage the cluster through tiup.

| username: andone | Original post link

Check whether the cluster nodes are accessible over SSH.

| username: 这里介绍不了我 | Original post link

Isn’t it a bit too aggressive?

| username: 随便改个用户名 | Original post link

It is probably an SSH issue. An earlier SSH failure produced a similar error.

| username: 哈喽沃德 | Original post link

First check SSH mutual trust; if that doesn't work, kill the process.

| username: zhanggame1 | Original post link

From the error message, it looks like an issue with mutual trust or with the permissions of the tidb user. From the tiup machine, try to SSH to the node as the tidb user and see whether passwordless login works. Once logged in, check whether that user has permission to run the systemctl command.
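The two checks described above can be sketched as follows, run from the tiup machine. The `check_node` helper and the `tidb@node1` target are hypothetical. `BatchMode=yes` makes ssh fail instead of prompting for a password, and `sudo -n` does the same for sudo, which matches the `sudo ... systemctl` call shown in the error output.

```shell
#!/bin/sh
# Sketch: verify key-based SSH login and passwordless sudo on one node.
# check_node is a hypothetical helper, not part of tiup.
check_node() {
    target="$1"   # e.g. tidb@node1 (hypothetical user@host)
    # BatchMode=yes: never prompt for a password, fail instead.
    if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" true; then
        echo "ssh: passwordless login to $target failed"
        return 1
    fi
    # sudo -n: non-interactive, fails if a password would be required.
    if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" 'sudo -n true'; then
        echo "sudo: $target cannot run sudo without a password"
        return 2
    fi
    echo "ok: $target accepts key login and passwordless sudo"
}

# Example:
# check_node tidb@node1
```

If both checks pass and `systemctl` still fails on the node, mutual trust and permissions are unlikely to be the cause.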

| username: 哈喽沃德 | Original post link

Possibly the passwords were the same at deployment time, but the password on one machine was changed later and mutual trust was never set up.