Timed out waiting during TiFlash node restart in TiUP cluster upgrade

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiup升级集群过程中重启 tiflash 节点 timed out waiting

| username: magdb

[Test Environment] Testing environment
[TiDB Version] v7.1.1
[Reproduction Path]
I planned to use tiup to upgrade the TiDB cluster in the testing environment from v7.1.1 to v7.1.3. While TiFlash was being upgraded and restarted, the error “failed to start: failed to start tiflash” occurred and the upgrade was aborted. Checking the cluster status afterwards with display shows the TiFlash status as Up.
[Encountered Problem: Symptoms and Impact]
The upgrade process terminates at TiFlash and cannot proceed to the next step. Restarting TiFlash results in the same issue. Checking the TiFlash logs shows the following error:


No TCP and HTTP servers are created
[Resource Configuration]

[Attachments: Screenshots/Logs/Monitoring]

| username: Jellybean | Original post link

This log message indicates that the deprecated path configuration parameter is in use. That is unlikely to be what prevents TiFlash from starting, but you can try removing the item from the configuration file to see whether it helps.

| username: wangccsy | Original post link

You should still check the network.

| username: magdb | Original post link

The network should be fine. I can stop this TiFlash node normally from the control machine, but restarting it fails with a timeout error. :joy:

| username: magdb | Original post link

After I removed the path parameter and restarted the TiFlash node, a different error appeared: “The configuration storage.main section is not defined. Please check your configuration file.” This configuration file was generated automatically during the upgrade.

| username: Jellybean | Original post link

For the TiFlash storage.main configuration parameters, refer to the official documentation.
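
As a quick way to confirm what the error above is complaining about, here is a minimal sketch of a check for whether a TiFlash configuration file defines storage.main at all. As far as I know, storage.main.dir is the setting that replaces the deprecated path parameter. The function name has_storage_main and the file path in the usage comment are hypothetical; the real path depends on your deploy directory.

```shell
# Returns 0 if the given TiFlash config file defines storage.main,
# either as a [storage.main] table header or as a dotted storage.main.dir key.
has_storage_main() {
  grep -Eq '^[[:space:]]*(\[storage\.main\]|storage\.main\.dir[[:space:]]*=)' "$1"
}

# Hypothetical usage (adjust the path to your own deploy_dir):
# has_storage_main /tidb-deploy/tiflash-9000/conf/tiflash.toml && echo "storage.main is defined"
```

If the check fails after the path item has been removed, the file has neither the old nor the new storage setting, which would match the error message reported above.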

| username: 有猫万事足 | Original post link

I ran into the same issue when upgrading to 7.5.

Increasing the tiup wait time resolved it.

I set it to 10 minutes. The restart didn’t actually take that long, but a longer setting should work.
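
A rough sketch of what raising the wait time can look like: tiup cluster accepts a --wait-timeout option given in seconds, so 10 minutes is 600. The cluster name tidb-test is a placeholder, and the target version is the one from the original post. The snippet only builds and prints the commands so they can be reviewed before being run by hand.

```shell
# Placeholder values; replace the cluster name with your own.
CLUSTER_NAME="tidb-test"
TARGET_VERSION="v7.1.3"
WAIT_TIMEOUT=600   # seconds, i.e. 10 minutes

# Upgrade with a longer wait for each component to come up.
UPGRADE_CMD="tiup cluster upgrade ${CLUSTER_NAME} ${TARGET_VERSION} --wait-timeout ${WAIT_TIMEOUT}"

# Restart only the TiFlash role with the same longer wait.
RESTART_CMD="tiup cluster restart ${CLUSTER_NAME} -R tiflash --wait-timeout ${WAIT_TIMEOUT}"

echo "${UPGRADE_CMD}"
echo "${RESTART_CMD}"
```

If TiFlash really does come up but only after the default wait has expired, that would also explain why display reports it as Up even though the upgrade step failed.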

| username: dba远航 | Original post link

There may be incompatible parameters.

| username: 烂番薯0 | Original post link

Can the nodes ping each other and connect over SSH?

| username: tidb菜鸟一只 | Original post link

Could you share the configuration parameters so we can see how they are set?