Summary of TiKV Unexpected Crash and Restart Failure: Panic Mark File

  1. TiKV cannot restart
  2. Deleting the panic_mark_file and restarting still fails

Solution: first scale out a new TiKV instance into the cluster, then scale in the faulty instance.
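The replacement order can be sketched as a dry-run plan that only prints the `tiup cluster` commands instead of running them. The cluster name `my-cluster`, the topology file `scale-out.yaml`, and the node address `10.0.1.5:20160` are hypothetical placeholders, not values from this thread; substitute your own and review each command before running it.

```python
import shlex


def replacement_plan(cluster, topology_file, faulty_node):
    """Return the tiup commands, in order, for replacing a faulty TiKV node.

    All three arguments are caller-supplied; nothing here talks to a cluster.
    """
    return [
        # 1. Scale out: add the new TiKV instance described in the topology file.
        ["tiup", "cluster", "scale-out", cluster, topology_file],
        # 2. Scale in: remove the faulty instance, identified as host:port.
        ["tiup", "cluster", "scale-in", cluster, "--node", faulty_node],
        # 3. Display the cluster so the result can be checked by hand.
        ["tiup", "cluster", "display", cluster],
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for cmd in replacement_plan("my-cluster", "scale-out.yaml", "10.0.1.5:20160"):
        print(shlex.join(cmd))
```

Scaling out before scaling in means the cluster gains the new store before the faulty one is removed, which matches the order given in the solution above.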

[2024/01/25 20:46:37.618 +08:00] [FATAL] [] ["panic_mark_file tidb/tidb-data/tikv-20160/panic_mark_file exists, there must be something wrong with the db. Do not remove the panic_mark_file and force the TiKV node to restart. Please contact TiKV maintainers to investigate the issue. If needed, use scale in and scale out to replace the TiKV node. Scale a TiDB Cluster Using TiUP | PingCAP Docs"]

The log clearly states, “Do not remove the panic_mark_file and force the TiKV node to restart.”
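To spot this condition before attempting a restart, a small script can scan the TiKV log for the FATAL message and pull out the reported path. This is a minimal sketch that assumes the message wording matches the line quoted in this thread; the function name and regular expression are illustrative, not part of TiKV or TiUP.

```python
import re

# Captures the path between "panic_mark_file " and " exists" on a FATAL line.
# Assumption: the wording matches the log line quoted in this thread.
PANIC_MARK_RE = re.compile(r"\[FATAL\].*?panic_mark_file (\S+) exists")


def find_panic_mark_paths(log_lines):
    """Return the panic_mark_file paths reported by FATAL log lines."""
    paths = []
    for line in log_lines:
        match = PANIC_MARK_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths
```

If the returned list is non-empty, the log's own advice applies: leave the file in place and replace the node by scaling out and scaling in.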

It is really commendable that TiDB error messages include the solution, and something worth learning from! :+1:

Do not remove the panic_mark_file and force the TiKV node to restart.

Logs are an important tool for troubleshooting.

Forcing a restart is useless; it will result in an error.

A forced restart is useless. I have run into this before, and neither restarting nor deleting files helps.

The logs are quite detailed.

Learned something new again.

Learned something; bookmarking this.

