TiKV Unable to Start

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIKV 无法启动

| username: Steve阿辉

[TiDB Usage Environment] Production Environment
[TiDB Version] A TiKV server crashed a few days ago. After it was removed and re-added, its data was still being replenished. The cluster was restarted at noon for business reasons, and afterwards the TiKV instance on machine 80 (apparently 172.16.16.80:20160, judging by the log below) could not start. The startup log is as follows.
[Reproduction Path] What operations were performed to cause the issue
[Encountered Problem: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

[2023/03/04 02:20:05.231 +08:00] [INFO] [store.rs:1761] [“merged peer receives a stale message”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2907”] [region_id=85025388]
[2023/03/04 02:20:05.231 +08:00] [INFO] [store.rs:570] [“raft message is stale, tell to gc”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2907”] [region_id=85025388]
[2023/03/04 02:20:05.686 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:06.688 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:07.689 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:07.691 +08:00] [INFO] [store.rs:1761] [“merged peer receives a stale message”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2899”] [region_id=85025358]
[2023/03/04 02:20:07.691 +08:00] [INFO] [store.rs:570] [“raft message is stale, tell to gc”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2899”] [region_id=85025358]
[2023/03/04 02:20:07.867 +08:00] [INFO] [pd.rs:1461] [“try to merge”] [merge=“target { id: 169045 start_key: 7480000000000001FF855F728000000000FF5EF47D0000000000FA end_key: 7480000000000001FF865F72FFFFFFFFFFFFFFFFFF0000000000FB region_epoch { conf_ver: 11 version: 887 } peers { id: 169046 store_id: 1 } peers { id: 169047 store_id: 4 } peers { id: 237644 store_id: 212649 } }”] [region_id=237623]
[2023/03/04 02:20:07.867 +08:00] [WARN] [peer.rs:4217] [“skip proposal”] [error_code=KV:Raftstore:Unknown] [err=“Other("[components/raftstore/src/store/peer.rs:3983]: log gap too large, skip merge: matched: 2318, committed: 2318, last index: 2331, last_snapshot: 2304")”] [peer_id=237628] [region_id=237623]
[2023/03/04 02:20:07.867 +08:00] [INFO] [pd.rs:1461] [“try to merge”] [merge=“target { id: 85025030 start_key: 7480000000000005FFD15F698000000000FF0000010380000000FF0000000301423037FF5637573550FF4A4AFF000000000000F901FF73756D6D6572206FFFFF75746669742066FF6FFF7220776F6D65FF6E00FE0419A89200FF0000000000000000FB end_key: 7480000000000005FFD15F698000000000FF0000010380000000FF0000000301423038FF3732424A47FF3932FF000000000000F901FF7061636B206F6620FFFF776F726B6F7574FF20FF746F70732066FF6F72FF20776F6D65FF6E0000FD0419A91AFF0000000000000000FC region_epoch { conf_ver: 11 version: 2893 } peers { id: 85025031 store_id: 1 } peers { id: 85025032 store_id: 4 } peers { id: 85025033 store_id: 212649 } }”] [region_id=85025112]
[2023/03/04 02:20:07.867 +08:00] [WARN] [peer.rs:4217] [“skip proposal”] [error_code=KV:Raftstore:Unknown] [err=“Other("[components/raftstore/src/store/peer.rs:3983]: log gap too large, skip merge: matched: 500, committed: 500, last index: 1645, last_snapshot: 6")”] [peer_id=85025115] [region_id=85025112]
[2023/03/04 02:20:08.462 +08:00] [INFO] [store.rs:1761] [“merged peer receives a stale message”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2887”] [region_id=85024977]
[2023/03/04 02:20:08.462 +08:00] [INFO] [store.rs:570] [“raft message is stale, tell to gc”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 2887”] [region_id=85024977]
[2023/03/04 02:20:08.693 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:09.692 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:10.059 +08:00] [INFO] [pd.rs:1461] [“try to merge”] [merge=“target { id: 85025030 start_key: 7480000000000005FFD15F698000000000FF0000010380000000FF0000000301423037FF5637573550FF4A4AFF000000000000F901FF73756D6D6572206FFFFF75746669742066FF6FFF7220776F6D65FF6E00FE0419A89200FF0000000000000000FB end_key: 7480000000000005FFD15F698000000000FF0000010380000000FF0000000301423038FF3732424A47FF3932FF000000000000F901FF7061636B206F6620FFFF776F726B6F7574FF20FF746F70732066FF6F72FF20776F6D65FF6E0000FD0419A91AFF0000000000000000FC region_epoch { conf_ver: 11 version: 2893 } peers { id: 85025031 store_id: 1 } peers { id: 85025032 store_id: 4 } peers { id: 85025033 store_id: 212649 } }”] [region_id=85025112]
[2023/03/04 02:20:10.059 +08:00] [WARN] [peer.rs:4217] [“skip proposal”] [error_code=KV:Raftstore:Unknown] [err=“Other("[components/raftstore/src/store/peer.rs:3983]: log gap too large, skip merge: matched: 500, committed: 500, last index: 1645, last_snapshot: 6")”] [peer_id=85025115] [region_id=85025112]
[2023/03/04 02:20:10.671 +08:00] [INFO] [peer.rs:1164] [“deleting applied snap file”] [snap_file=21481_38_67822] [peer_id=214639307] [region_id=21481]
[2023/03/04 02:20:10.671 +08:00] [INFO] [peer.rs:1164] [“deleting applied snap file”] [snap_file=31885_39_80127] [peer_id=214639311] [region_id=31885]
[2023/03/04 02:20:10.671 +08:00] [INFO] [peer.rs:1164] [“deleting applied snap file”] [snap_file=117033_41_1225] [peer_id=214639306] [region_id=117033]
[2023/03/04 02:20:10.671 +08:00] [INFO] [peer.rs:1164] [“deleting applied snap file”] [snap_file=132049_40_43987] [peer_id=214639301] [region_id=132049]
[2023/03/04 02:20:10.671 +08:00] [INFO] [peer.rs:1164] [“deleting applied snap file”] [snap_file=188485_36_76] [peer_id=214639300] [region_id=188485]
[2023/03/04 02:20:10.671 +08:00] [INFO] [snap.rs:674] [“set_snapshot_meta total cf files count: 3”]
[2023/03/04 02:20:10.675 +08:00] [INFO] [snap.rs:674] [“set_snapshot_meta total cf files count: 3”]
[2023/03/04 02:20:10.680 +08:00] [INFO] [snap.rs:674] [“set_snapshot_meta total cf files count: 3”]
[2023/03/04 02:20:10.689 +08:00] [INFO] [snap.rs:674] [“set_snapshot_meta total cf files count: 4”]
[2023/03/04 02:20:10.695 +08:00] [INFO] [snap.rs:674] [“set_snapshot_meta total cf files count: 3”]
[2023/03/04 02:20:10.695 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:11.694 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:12.703 +08:00] [INFO] [advance.rs:296] [“check leader failed”] [to_store=5] [error=“"[get tikv client] store is tombstone \"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone version: \\\"6.1.2\\\" status_address: \\\"172.16.16.80:20180\\\" git_hash: \\\"cbdd34e96961a0be2adff9db050b8d50b13e5b0d\\\" start_timestamp: 1677561910 deploy_path: \\\"/data/tidb-deploy/tikv-20160/bin\\\" last_heartbeat: 1677593725357880105 node_state: Removed\""”]
[2023/03/04 02:20:12.954 +08:00] [INFO] [store.rs:1761] [“merged peer receives a stale message”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 40506”] [region_id=211857]
[2023/03/04 02:20:12.955 +08:00] [INFO] [store.rs:570] [“raft message is stale, tell to gc”] [msg_type=MsgRequestPreVote] [current_region_epoch=“conf_ver: 12 version: 40506”] [region_id=211857]
[2023/03/04 02:20:13.366 +08:00] [INFO] [pd.rs:1461] [“try to merge”] [merge=“target { id: 169045 start_key: 7480000000000001FF855F728000000000FF5EF47D0000000000FA end_key: 7480000000000001FF865F72FFFFFFFFFFFFFFFFFF0000000000FB region_epoch { conf_ver: 11 version: 887 } peers { id: 169046 store_id: 1 } peers { id: 169047 store_id: 4 } peers { id: 237644 store_id: 212649 } }”] [region_id=237623]

| username: h5n1

Is this the final log? Based on the provided log, there is a tombstone TiKV store, so a scale-in operation must have been performed earlier. Run tiup cluster display to check whether the node is still listed, and run tiup cluster prune if necessary. If it does not appear in tiup, check the stores with pd-ctl store. You can clean up the tombstone store with the following command:

pd-ctl -u http://pd_ip:2379 store remove-tombstone
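
To confirm which store the "store is tombstone" messages refer to before pruning, the log can be summarized with a short script. This is only a sketch: the function name `tombstone_stores` is hypothetical, and the pattern assumes the message layout seen in the log above (`store is tombstone ... id: <n> address: <host:port>`), tolerating whatever quoting or escaping surrounds the fields.

```python
import re
from collections import Counter

# Matches the "store is tombstone" message shown in the log above.
# \W+ and \W* skip the spaces, backslashes and quotes around the fields,
# so straight and curly quotes are both tolerated.
TOMBSTONE_RE = re.compile(
    r"store is tombstone\W+id: (\d+) address: \W*([\w.\-]+:\d+)"
)

def tombstone_stores(log_lines):
    """Count tombstone messages per (store_id, address) pair."""
    counts = Counter()
    for line in log_lines:
        match = TOMBSTONE_RE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts

if __name__ == "__main__":
    # One line shortened from the log in this thread, used as a demo input.
    demo = [
        r'[advance.rs:296] ["check leader failed"] [to_store=5] '
        r'[error="\"[get tikv client] store is tombstone '
        r'\"id: 5 address: \\\"172.16.16.80:20160\\\" state: Tombstone"]'
    ]
    for (store_id, address), n in tombstone_stores(demo).items():
        print(f"store {store_id} at {address}: {n} message(s)")
```

If every match points at the same store ID and address, that is the store to look for in tiup cluster display and pd-ctl store.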
| username: xfworld

It’s best to first confirm the overall cluster status and the exact state of the TiKV node that was removed and re-added, then follow @h5n1’s instructions to handle it.
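
One way to check the exact store states is to save the JSON printed by pd-ctl store and list every store that is not Up. The sketch below rests on an assumption about that output: a top-level "stores" array whose entries carry a "store" object with "id", "address" and "state_name" fields. Verify this against your PD version before relying on it. The function name `non_up_stores` is hypothetical.

```python
import json

def non_up_stores(pd_store_json):
    """Return (id, address, state_name) for every store whose state is not Up.

    Assumes the JSON layout described in the text above; missing fields
    are reported as None or "Unknown" rather than raising.
    """
    data = json.loads(pd_store_json)
    result = []
    for entry in data.get("stores", []):
        store = entry.get("store", {})
        state = store.get("state_name", "Unknown")
        if state != "Up":
            result.append((store.get("id"), store.get("address"), state))
    return result

if __name__ == "__main__":
    # Made-up sample in the assumed layout; 10.0.0.1 is a placeholder address.
    sample = json.dumps({
        "count": 2,
        "stores": [
            {"store": {"id": 1, "address": "10.0.0.1:20160", "state_name": "Up"}},
            {"store": {"id": 5, "address": "172.16.16.80:20160", "state_name": "Tombstone"}},
        ],
    })
    for store_id, address, state in non_up_stores(sample):
        print(f"store {store_id} ({address}) is {state}")
```

A store reported as Tombstone here is a candidate for the cleanup described above; a store in any other non-Up state should be investigated first.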