What are the reasons for TiFlash startup failure?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiflash启动失败是什么原因啊

| username: TiDBer_8AdJLfr2

[TiDB Usage Environment] Test
[TiDB Version] 6.5
[Reproduction Path] What operations were performed that caused the problem
[Encountered Problem: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

2023-02-14T15:18:48.030+0800 INFO SSHCommand {"host": "11.11.4.28", "port": "22", "cmd": "export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c \"systemctl daemon-reload && systemctl start tiflash-9000.service\"", "stdout": "", "stderr": ""}
2023-02-14T15:18:48.030+0800 INFO CheckPoint {"host": "11.11.4.28", "port": 22, "user": "tidb", "sudo": true, "cmd": "systemctl daemon-reload && systemctl start tiflash-9000.service", "stdout": "", "stderr": "", "hash": "4c2debca67421ec7b87f91c02032b8fbffa5c9e5", "func": "github.com/pingcap/tiup/pkg/cluster/executor.(*CheckPointExecutor).Execute", "hit": false}
2023-02-14T15:18:48.148+0800 INFO SSHCommand {"host": "11.11.4.28", "port": "22", "cmd": "export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin ss -ltn", "stdout": "State Recv-Q Send-Q Local Address:Port Peer Address:Port \nLISTEN 0 128 127.0.0.1:6010 *:* \nLISTEN 0 128 127.0.0.1:46144 *:* \nLISTEN 0 128 127.0.0.1:34021 *:* \nLISTEN 0 128 *:20180 *:* \nLISTEN 0 128 *:22 *:* \nLISTEN 0 128 [::1]:6010 [::]:* \nLISTEN 0 128 [::]:10080 [::]:* \nLISTEN 0 128 [::]:4000 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:2379 [::]:* \nLISTEN 0 128 [::]:2380 [::]:* \nLISTEN 0 128 [::]:22 [::]:* \n", "stderr": ""}
2023-02-14T15:18:48.148+0800 INFO CheckPoint {"host": "11.11.4.28", "port": 22, "user": "tidb", "sudo": false, "cmd": "ss -ltn", "stdout": "State Recv-Q Send-Q Local Address:Port Peer Address:Port \nLISTEN 0 128 127.0.0.1:6010 *:* \nLISTEN 0 128 127.0.0.1:46144 *:* \nLISTEN 0 128 127.0.0.1:34021 *:* \nLISTEN 0 128 *:20180 *:* \nLISTEN 0 128 *:22 *:* \nLISTEN 0 128 [::1]:6010 [::]:* \nLISTEN 0 128 [::]:10080 [::]:* \nLISTEN 0 128 [::]:4000 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:2379 [::]:* \nLISTEN 0 128 [::]:2380 [::]:* \nLISTEN 0 128 [::]:22 [::]:* \n", "stderr": "", "hash": "4c2debca67421ec7b87f91c02032b8fbffa5c9e5", "func": "github.com/pingcap/tiup/pkg/cluster/executor.(*CheckPointExecutor).Execute", "hit": false}
2023-02-14T15:18:48.249+0800 INFO SSHCommand {"host": "11.11.4.29", "port": "22", "cmd": "export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c \"systemctl daemon-reload && systemctl start tiflash-9000.service\"", "stdout": "", "stderr": ""}
2023-02-14T15:18:48.250+0800 INFO CheckPoint {"host": "11.11.4.29", "port": 22, "user": "tidb", "sudo": true, "cmd": "systemctl daemon-reload && systemctl start tiflash-9000.service", "stdout": "", "stderr": "", "hash": "4c2debca67421ec7b87f91c02032b8fbffa5c9e5", "func": "github.com/pingcap/tiup/pkg/cluster/executor.(*CheckPointExecutor).Execute", "hit": false}
2023-02-14T15:18:48.364+0800 INFO SSHCommand {"host": "11.11.4.29", "port": "22", "cmd": "export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin ss -ltn", "stdout": "State Recv-Q Send-Q Local Address:Port Peer Address:Port \nLISTEN 0 128 127.0.0.1:6010 *:* \nLISTEN 0 128 *:20180 *:* \nLISTEN 0 128 *:22 *:* \nLISTEN 0 128 [::1]:6010 [::]:* \nLISTEN 0 128 [::]:10080 [::]:* \nLISTEN 0 128 [::]:4000 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:2379 [::]:* \nLISTEN 0 128 [::]:2380 [::]:* \nLISTEN 0 128 [::]:22 [::]:* \n", "stderr": ""}
2023-02-14T15:18:48.364+0800 INFO CheckPoint {"host": "11.11.4.29", "port": 22, "user": "tidb", "sudo": false, "cmd": "ss -ltn", "stdout": "State Recv-Q Send-Q Local Address:Port Peer Address:Port \nLISTEN 0 128 127.0.0.1:6010 *:* \nLISTEN 0 128 *:20180 *:* \nLISTEN 0 128 *:22 *:* \nLISTEN 0 128 [::1]:6010 [::]:* \nLISTEN 0 128 [::]:10080 [::]:* \nLISTEN 0 128 [::]:4000 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:20160 [::]:* \nLISTEN 0 128 [::]:2379 [::]:* \nLISTEN 0 128 [::]:2380 [::]:* \nLISTEN 0 128 [::]:22 [::]:* \n", "stderr": "", "hash": "4c2debca67421ec7b87f91c02032b8fbffa5c9e5", "func": "github.com/pingcap/tiup/pkg/cluster/executor.(*CheckPointExecutor).Execute", "hit": false}

| username: TiDBer_8AdJLfr2 | Original post link

The output above is taken from /home/tidb/.tiup/logs/tiup-cluster-debug-2023-02-14-15-37-10.log. The log file under /data1/tidb-deploy/tiflash-9000/log is empty.

| username: 裤衩儿飞上天 | Original post link

Is the network reachable? Is port 9000 already in use? Is passwordless SSH authentication set up for this node (or is the password correct)? Is a firewall blocking access?
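
The port question can be checked against the same kind of `ss -ltn` output that tiup records in its debug log. In the output posted above, no line has 9000 as its local port on either host. The helper below is only a sketch: the name `port_in_listing` is made up for illustration, and it assumes the local address is the fourth whitespace-separated column, as it is in the log above.

```shell
# port_in_listing PORT
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds (exit 0)
# if some LISTEN line has PORT as its local port, otherwise exits 1.
port_in_listing() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # Column 4 is the local address, e.g. 127.0.0.1:6010 or [::]:4000.
            # The port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

On the TiFlash node it could be used as `ss -ltn | port_in_listing 9000 && echo "something is listening on 9000"`. If nothing is listening there before the start attempt and nothing is listening afterwards, the port is free and the TiFlash process most likely exited before it could bind.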

| username: TiDBer_8AdJLfr2 | Original post link

The network is reachable, and passwordless SSH login is set up on all three TiDB servers; testing confirms that each server can SSH into the others. The servers are freshly installed, and the ports are not in use.

| username: WalterWj | Original post link

Log in to the TiFlash node manually, go to the script directory under the deploy directory, and run the run_tiflash script by hand. Check whether it starts and whether it prints any errors.
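
A rough sketch of that step follows. It assumes the deploy directory is /data1/tidb-deploy/tiflash-9000 (inferred from the log path mentioned earlier in the thread) and that the start script sits in a `scripts` subdirectory with a name beginning with `run_tiflash`. Both are assumptions to verify on the node, and `find_run_script` is a hypothetical helper name.

```shell
# find_run_script DEPLOY_DIR
# Hypothetical helper: prints the path of the first file matching
# DEPLOY_DIR/scripts/run_tiflash* and succeeds; exits 1 if none exists.
find_run_script() {
    deploy_dir="$1"
    for candidate in "$deploy_dir"/scripts/run_tiflash*; do
        # When the glob matches nothing, the literal pattern is tested and fails -f.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example usage on the TiFlash node (not executed here; the path is an assumption):
#   script=$(find_run_script /data1/tidb-deploy/tiflash-9000) || echo "no start script found"
#   bash "$script"    # run as the deploy user and watch the terminal for errors
```

Running the script in the foreground shows any error directly in the terminal, which helps when the TiFlash log file is still empty.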

| username: tracy0984 | Original post link

Has TiFlash successfully started before?

| username: 考试没答案 | Original post link

Please post the command you ran, with any sensitive information redacted. I often see this error, but I'm not sure whether it has the same cause.

| username: 考试没答案 | Original post link

Also, is your SSH port 22? Ours is 11820; the port can differ from company to company.