TiKV Abnormal Node Cannot Be Properly Taken Offline

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv异常节点无法正常下线

| username: 天明0829

A TiKV node became abnormal and could not be started. I took it offline by scaling in, and then forced it offline. A day later, the logs still contain errors from attempts to connect to that node. Under what circumstances does this happen, and how can it be resolved?

| username: 像风一样的男子 | Original post link

Check the status of each store in pd-ctl.
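
A minimal sketch of how that check could be scripted, assuming the JSON printed by pd-ctl's `store` command (or returned by PD's `/pd/api/v1/stores` endpoint) has been saved to a string. The function name and the sample data are hypothetical; the field names `store.id`, `store.address`, and `store.state_name` are assumed to match your PD version.

```python
import json


def summarize_stores(stores_json: str) -> list[tuple[int, str, str]]:
    """Return (store_id, address, state_name) for every store, sorted by id."""
    data = json.loads(stores_json)
    rows = []
    for entry in data.get("stores", []):
        store = entry.get("store", {})
        rows.append((
            store.get("id", -1),                   # -1 if the id is missing
            store.get("address", ""),
            store.get("state_name", "Unknown"),    # e.g. Up, Offline, Tombstone
        ))
    return sorted(rows)


# Hypothetical sample shaped like `pd-ctl store` output (not real cluster data).
SAMPLE_STORES = json.dumps({
    "count": 2,
    "stores": [
        {"store": {"id": 4, "address": "10.0.0.4:20160", "state_name": "Offline"}},
        {"store": {"id": 1, "address": "10.0.0.1:20160", "state_name": "Up"}},
    ],
})

for store_id, address, state in summarize_stores(SAMPLE_STORES):
    print(f"store {store_id} at {address}: {state}")
```

A store that stays in `Offline` long after the scale-in, or that still appears at all after being forced offline, is the one to look at first.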

| username: Fly-bird | Original post link

At the point when the TiKV node became abnormal and could not start, did anyone manually restart the TiKV service on that server?

| username: 普罗米修斯 | Original post link

Check whether this node still has leaders that are stuck on it or regions that are waiting to be migrated.
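
As a rough illustration of this check, the sketch below scans the same kind of store JSON for stores that are no longer `Up` but still report leaders or regions. It assumes each entry has a `status` object with `leader_count` and `region_count` (treated as 0 when absent); the function name and sample values are hypothetical.

```python
import json


def stores_with_pending_data(stores_json: str) -> list[tuple[int, str, int, int]]:
    """List (store_id, state_name, leader_count, region_count) for stores
    that are not Up but still hold at least one leader or region."""
    data = json.loads(stores_json)
    pending = []
    for entry in data.get("stores", []):
        store = entry.get("store", {})
        status = entry.get("status", {})
        state = store.get("state_name", "Unknown")
        leaders = status.get("leader_count", 0)   # assumed 0 when omitted
        regions = status.get("region_count", 0)   # assumed 0 when omitted
        if state != "Up" and (leaders > 0 or regions > 0):
            pending.append((store.get("id", -1), state, leaders, regions))
    return sorted(pending)


# Hypothetical sample (not real cluster data).
SAMPLE_PENDING = json.dumps({
    "stores": [
        {"store": {"id": 1, "state_name": "Up"},
         "status": {"leader_count": 10, "region_count": 30}},
        {"store": {"id": 4, "state_name": "Offline"},
         "status": {"leader_count": 0, "region_count": 12}},
        {"store": {"id": 5, "state_name": "Tombstone"}, "status": {}},
    ],
})

for store_id, state, leaders, regions in stores_with_pending_data(SAMPLE_PENDING):
    print(f"store {store_id} ({state}): {leaders} leaders, {regions} regions left")
```

An empty result suggests nothing is left to migrate off the non-Up stores; a non-empty one points at the store whose regions have not finished moving.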

| username: 路在何chu | Original post link

The store also needs to be removed in pd-ctl: `store remove-tombstone`

| username: TiDBer_小阿飞 | Original post link

You need to stop and restart all related nodes one after another (a rolling restart).