TiFlash Crash Failure in Version 7.1.2

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 7.1.2tiflash宕机故障

| username: wenyi

When creating TiFlash replicas, I found that TiFlash crashed.


After restarting TiFlash, it crashes again within a few minutes, which is very strange. Even after TiFlash has crashed, the system load stays very high, and a large number of processes remain in I/O wait (wa) for a long time. The load only comes down after rebooting the machine.

| username: zhanggame1 | Original post link

Where is the TiFlash log?

| username: tidb菜鸟一只 | Original post link

Is there only TiFlash on this machine? Check the logs.

| username: wenyi | Original post link

These are all the TiFlash log files. One more log file is very large, so I will compress it and upload it.

| username: wenyi | Original post link

tiflash.zip (10.0 MB)

| username: wenyi | Original post link

Only TiFlash is deployed on this node.

| username: tidb菜鸟一只 | Original post link

Is Continuous Profiling enabled in your TiDB Dashboard? If it is, turn it off and try again.
Disable TiFlash continuous profiling in dashboard · Issue #1529 · pingcap/tidb-dashboard · GitHub

| username: TiDBer_小阿飞 | Original post link

Where is the configuration file for TiFlash?

| username: wenyi | Original post link

Not enabled.

| username: wenyi | Original post link

The configuration file is the default; no parameters have been modified.

| username: TiDBer_yyy | Original post link

Watching…
It looks like a problem in TiFlash itself, and apart from scaling the TiFlash nodes in and back out, I don't see a good solution.

| username: 有猫万事足 | Original post link

You should still share the configuration file for us to take a look.

I checked the logs and found:

!!!=========================Modifications in meta haven't persisted=========================!!!

This error is not the root cause of the problem: by the time it is logged, TiFlash as a whole is already crashing.
The issue seems more related to:

[2023/10/29 07:52:54.155 +08:00] [WARN] [StorageConfigParser.cpp:273] ["The configuration path is deprecated. Check [storage] section for new style."] [thread_id=1]
[2023/10/29 07:52:54.748 +08:00] [WARN] [Server.cpp:658] ["No TCP and HTTP servers are created"] [thread_id=1]

The missing TCP and HTTP servers look like the main issue, and that is probably related to the configuration-parsing warning just before it.
So it is more useful to check what the configuration file contains and whether TiFlash is pointed at the correct configuration file at startup.
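To find warnings like the two quoted above in a large log, something along these lines might help. It is only a rough sketch: the function name startup_warnings is made up for illustration, the log path is whatever your deployment uses, and it assumes every line starts with a bracketed timestamp followed by a bracketed level, as in the lines quoted above.

```shell
# Print WARN/ERROR lines from a TiFlash log that fall inside a given time window.
# $1: path to the log file (depends on your deployment, not taken from this thread)
# $2: timestamp prefix to filter on, e.g. "2023/10/29 07:52"
# Assumes the "[timestamp] [LEVEL] [source] ..." layout seen in the quoted lines.
startup_warnings() {
    # First keep only lines whose timestamp starts with the given prefix,
    # then keep only those whose level field is WARN or ERROR.
    grep -F "[$2" "$1" | grep -E '\] \[(WARN|ERROR)\] \['
}
```

For example, startup_warnings /path/to/tiflash.log "2023/10/29 07:52" would list what was logged at WARN or ERROR level during that minute, which makes it easier to see what came right before the crash.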

| username: TiDBer_yyy | Original post link

I misread the post and replied incorrectly. I also encountered a similar issue. Both TiFlash instances went down…

| username: 有猫万事足 | Original post link

These two configuration files must exist, right? It wouldn’t be surprising if it doesn’t start without them.

bin/tiflash/tiflash server --config-file conf/tiflash.toml

If you check with ps -ef, you will see that the running tiflash process was also started with this configuration file as a parameter.
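The ps check can be scripted roughly like this. It is a sketch, not an official tool: the function names config_from_cmdline and check_running_tiflash are made up for illustration, and it assumes the process was started with a "tiflash server --config-file <path>" command line like the one above.

```shell
# Print the value that follows --config-file in a command line string.
# $1: a full command line, e.g. the command column of one ps line
config_from_cmdline() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i < NF; i++)
            if ($i == "--config-file") { print $(i + 1); exit }
    }'
}

# Report which configuration file each running "tiflash server" process was given.
check_running_tiflash() {
    # The [t] in the pattern keeps grep from matching its own process.
    ps -eo args | grep '[t]iflash server' | while IFS= read -r cmd; do
        cfg=$(config_from_cmdline "$cmd")
        if [ -n "$cfg" ]; then
            echo "tiflash started with config: $cfg"
        else
            echo "no --config-file found in: $cmd"
        fi
    done
}
```

Running check_running_tiflash on the TiFlash node prints one line per process. Note that a relative path such as conf/tiflash.toml is resolved against the working directory of the process, so check that the file exists there.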