TiFlash Node Keeps Restarting

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiFlash节点不断重启

| username: tracy0984

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v5.3.0
[Reproduction Path] No operations
[Encountered Problem: Problem Phenomenon and Impact]
One of the three TiFlash nodes keeps restarting
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

| username: Billmay表妹

This may be caused by a configuration or hardware issue on the TiFlash node. You can troubleshoot it with the following steps:

  1. Check the TiFlash node logs for errors or abnormal messages. You can follow the log with:
     tail -f /path/to/tiflash/log/tiflash.log
  2. Check the TiFlash node configuration file to make sure its parameters are set correctly. The TiFlash command-line parameters documentation explains what each parameter means.
  3. Check whether the node's hardware resources (CPU, memory, disk) are sufficient. For example, to view disk usage:
     df -h
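
Following a log with tail only shows new lines, so it can help to summarize the error-level lines already written before the restart. Here is a minimal sketch, assuming error-level lines carry a level tag such as [ERROR] or <Error> (the exact format may differ between TiFlash versions); the function names are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: helpers for scanning a TiFlash log for error-level lines.
# Assumption: the level tag looks like [ERROR], [FATAL], <Error> or <Fatal>.
# Adjust the pattern if your log format is different.

LEVEL_PATTERN='(<|\[)(error|fatal)(>|\])'

# Print how many lines in the log file ($1) carry an error-level tag.
count_error_lines() {
  # grep -c exits non-zero when it finds nothing; "|| true" keeps that quiet.
  grep -ciE "$LEVEL_PATTERN" "$1" || true
}

# Print the last N (default 20) error-level lines of the log file ($1).
last_error_lines() {
  grep -iE "$LEVEL_PATTERN" "$1" | tail -n "${2:-20}"
}
```

For example, `last_error_lines /path/to/tiflash/log/tiflash.log 20` prints the most recent error-level lines. For memory, `free -h` shows current usage, and `dmesg -T | grep -i 'out of memory'` shows whether the kernel OOM killer stopped a process.
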
| username: tracy0984

tiflash.log log

TiFlash has been running for a while, with 3 nodes in total. One node suddenly started restarting repeatedly last night, and no configuration changes were made before that.
I checked the operating system resources: CPU, memory, and disk space are all normal at the moment. The resource configuration of the failing node is the same as that of the other TiFlash nodes, which are still running.
Please help me keep analyzing the cause of the error, thank you~

| username: tidb菜鸟一只

Take a look at this table: SELECT * FROM INFORMATION_SCHEMA.TIFLASH_TABLES a WHERE a.DATABASE='db_23463' AND a.TABLE='t_23465';

| username: tracy0984

I found the data rows on the other two nodes:

Is this normal?

| username: tracy0984

I scaled in the problematic node and then scaled it out again. The TiFlash process is now running normally and is currently synchronizing data. I followed the scaling procedure in this document:
Using TiUP to Scale TiDB Cluster | PingCAP Documentation Center
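
For anyone repeating this, the scale-in and scale-out sequence can be sketched as a dry run that only prints the TiUP commands so they can be reviewed first. This is a sketch, not the exact commands used in this thread: the cluster name, node address, and topology file name are placeholders, and the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: print (do not run) the TiUP commands for removing a TiFlash
# node and adding it back. All three arguments are placeholders:
#   $1 cluster name, $2 node as host:port, $3 scale-out topology file.
print_tiflash_replace_plan() {
  local cluster="$1" node="$2" topology="$3"
  cat <<EOF
tiup cluster display ${cluster}
tiup cluster scale-in ${cluster} --node ${node}
tiup cluster display ${cluster}
tiup cluster scale-out ${cluster} ${topology}
EOF
}
```

For example, `print_tiflash_replace_plan <cluster-name> <host>:<port> scale-out.yaml` lists the four steps in order. Before scaling in, the PingCAP scaling document asks you to make sure that no table has more TiFlash replicas than the number of TiFlash nodes that will remain, so check that first.
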
