[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v5.3.0
[Reproduction Path] No operations
[Encountered Problem: Problem Phenomenon and Impact]
One of the three TiFlash nodes keeps restarting
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

The restarts may be caused by a configuration or hardware issue on that TiFlash node. You can troubleshoot the problem by following these steps:

  1. Check the TiFlash node logs to see if there are any errors or abnormal information. You can use the following command to view the TiFlash node logs:
tail -f /path/to/tiflash/log/tiflash.log
  2. Check the TiFlash node configuration file to ensure that the parameters in the configuration file are set correctly. You can refer to the TiFlash command line parameters documentation to understand the TiFlash configuration parameters.
  3. Check the hardware resources of the TiFlash node, such as CPU, memory, disk, etc., to see if they are sufficient. You can use the following commands to view the hardware resource usage of the TiFlash node:
df -h
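
As a rough sketch of steps 1 and 3, the two helper functions below pull the most recent error-like lines out of a log file and report how full the filesystem under a given path is. The function names, the search pattern, and the paths in the usage note are assumptions for illustration, not part of TiFlash; the pattern is deliberately loose because the log format can differ between versions.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a first look at a restarting TiFlash node.
# Nothing here is TiFlash-specific; adjust the pattern and paths to your deployment.

# Print the last N lines (default 20) of a log file that mention an error,
# a fatal condition, or an exception. Matching is case-insensitive.
recent_errors() {
  local file="$1" limit="${2:-20}"
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  grep -i -E 'error|fatal|exception' "$file" | tail -n "$limit"
}

# Print the usage percentage (digits only) of the filesystem that holds a path.
# df -P keeps each filesystem on one line, so the fifth column is the capacity.
disk_use_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, recent_errors /path/to/tiflash/log/tiflash.log 50 shows the last 50 matching lines, and disk_use_percent /path/to/tiflash/data prints a number such as 73. If the same log directory also contains an error log or a stderr log, it may be worth running recent_errors against those files too, since the lines written just before each restart are usually the most informative.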

tiflash.log log

TiFlash has been running for a while, with a total of 3 nodes. One node suddenly started restarting repeatedly last night, and no configuration changes were made before the restarts began.
I checked the operating system resources: CPU, memory, and disk space are all normal at the moment. The resource configuration of the failing node is the same as that of the other TiFlash nodes, which are running normally.
Please help to continue analyzing the cause of the error, thank you~

Take a look at this table: SELECT * FROM INFORMATION_SCHEMA.TIFLASH_TABLES a WHERE a.DATABASE='db_23463' AND a.TABLE='t_23465';

I found the data rows on the other two nodes:

Is this normal?

I scaled in the problematic node and then scaled it out again. The TiFlash process is now running normally and is currently synchronizing data. I followed this document for scaling TiFlash:
Using TiUP to Scale TiDB Cluster | PingCAP Documentation Center
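
For anyone repeating this, the sequence can be sketched as a dry run that only prints the TiUP commands instead of executing them. The function name, cluster name, node address, and topology file below are placeholders, not values from this cluster. Before scaling in a TiFlash node, the documentation also asks that the TiFlash replica count of each table be no greater than the number of TiFlash nodes that will remain.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the TiUP commands for scaling a TiFlash
# node in and back out. It does not run them; review and execute them by hand.

print_replace_plan() {
  local cluster="$1"   # cluster name as shown by "tiup cluster list"
  local node="$2"      # host:port of the TiFlash node to remove
  local topology="$3"  # topology file describing the node to add back

  # Check the current status of every node first.
  echo "tiup cluster display ${cluster}"
  # Take the faulty TiFlash node out of the cluster.
  echo "tiup cluster scale-in ${cluster} --node ${node}"
  # Confirm that the node has finished going offline before adding it back.
  echo "tiup cluster display ${cluster}"
  # Add the node back using the topology file.
  echo "tiup cluster scale-out ${cluster} ${topology}"
}
```

Calling print_replace_plan my-cluster 10.0.1.4:9000 scale-out.yaml prints the four commands in order, with those placeholder values filled in, so they can be checked against the linked document before anything is executed.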

