How to Handle Regions in PENDING State Due to TiKV Node Disconnection and Replica Loss

Because two physical nodes went offline, some regions lost replicas and are now in a pending state. How can this be resolved?

How many TiKV instances does the cluster have, and how many replicas?

3 replicas, approximately 60 KV instances

:upside_down_face: In this case, some regions may have had 2 of their 3 replicas on those two physical nodes, which means they lost their majority and need an unsafe recovery.

P.S.: You can first follow the documentation to locate the affected regions, and if that doesn’t work, contact the vendor for guidance. :joy:
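The reasoning above can be sketched as a simple quorum check. This is only an illustration, not output from any TiDB tool: the function name, the `region_peers` mapping (region ID to the store IDs holding its replicas), and the sample IDs are all hypothetical.

```python
from typing import Dict, List, Set


def regions_without_quorum(
    region_peers: Dict[int, List[int]], failed_stores: Set[int]
) -> List[int]:
    """Return IDs of regions whose surviving replicas are not a strict majority."""
    lost = []
    for region_id, stores in sorted(region_peers.items()):
        alive = sum(1 for store in stores if store not in failed_stores)
        # Raft needs more than half of all replicas to elect a leader.
        if alive * 2 <= len(stores):
            lost.append(region_id)
    return lost


# Hypothetical layout: 3 replicas per region, stores 1 and 2 are down.
sample = {1: [1, 2, 3], 2: [1, 4, 5], 3: [1, 2, 6]}
print(regions_without_quorum(sample, {1, 2}))
```

Regions 1 and 3 each keep only one of three replicas, so they cannot recover on their own. Region 2 still has two replicas and can re-elect a leader normally.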

Check this out~

Three key strategies can be referenced here:

Is the original node unable to start?

I have already recovered using the commands from the Online Unsafe Recovery usage documentation (PingCAP Docs). Thank you, everyone.
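For readers who want a rough idea of what those commands look like, here is a minimal sketch that only builds the command lines and does not run them. It assumes the `unsafe remove-failed-stores` subcommand of pd-ctl as described in the Online Unsafe Recovery documentation. The helper name, PD address, version string, and store IDs are placeholders, so check the documentation for your own version before running anything.

```python
from typing import List


def build_unsafe_recovery_commands(
    pd_addr: str, ctl_version: str, failed_store_ids: List[int]
) -> List[List[str]]:
    """Build (but do not execute) pd-ctl command lines for Online Unsafe Recovery."""
    if not failed_store_ids:
        raise ValueError("at least one failed store ID is required")
    base = ["tiup", f"ctl:{ctl_version}", "pd", "-u", pd_addr]
    # pd-ctl takes the failed store IDs as one comma-separated argument.
    ids = ",".join(str(i) for i in sorted(set(failed_store_ids)))
    return [
        base + ["unsafe", "remove-failed-stores", ids],     # start the recovery
        base + ["unsafe", "remove-failed-stores", "show"],  # check its progress
    ]


# Placeholder values; replace them with your own PD address and store IDs.
for cmd in build_unsafe_recovery_commands("http://127.0.0.1:2379", "v6.1.0", [5, 4]):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to double-check the store IDs, since removing the wrong stores can discard healthy replicas.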

Is there any data loss during recovery?

Not at the moment.

  • This feature was introduced starting from version v6.1.0. In TiDB versions below v6.1, it is an experimental feature and its behavior differs from what is described in this document, thus it is not recommended to use it. When using this feature in other versions, please refer to the corresponding version documentation.

So this shows how important it is to upgrade the cluster in a timely manner.

Normally, a single remaining replica cannot be used on its own, but from a technical perspective, targeted repairs are still possible.

