How to Handle Regions in PENDING State Due to TiKV Node Disconnection and Replica Loss

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIKV 节点掉线导致副本丢失,如何处理状态为PENDING的region

| username: 饭光小团

Two physical nodes went offline, so some regions lost replicas and are now in a PENDING state. How can this be resolved?

| username: 托马斯滑板鞋 | Original post link

How many TiKV instances do you have, and how many replicas per region?

| username: 饭光小团 | Original post link

3 replicas, approximately 60 TiKV instances

| username: 托马斯滑板鞋 | Original post link

:upside_down_face: In that case, some regions probably had 2 of their 3 replicas on those two physical nodes. Those regions have lost their majority, so an unsafe recovery is required.

P.S. You can first follow the documentation to find the affected regions. If that doesn't work, contact the vendor (PingCAP) for guidance. :joy:
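
As a rough illustration of "finding the regions", the sketch below filters region metadata for regions whose failed replicas are at least as many as the surviving ones, which means no majority is left. It assumes the input is JSON shaped like the output of `pd-ctl region` (a top-level `regions` list whose entries carry `id` and `peers[].store_id`). The function name and the store IDs are hypothetical, and peer roles such as learners are ignored for simplicity.

```python
import json


def regions_without_majority(regions_json, failed_store_ids):
    """Return IDs of regions that lost their majority on the failed stores.

    regions_json: JSON text assumed to look like `pd-ctl region` output.
    failed_store_ids: iterable of store IDs that are permanently offline.
    """
    failed = set(failed_store_ids)
    data = json.loads(regions_json)
    affected = []
    for region in data.get("regions", []):
        stores = [peer["store_id"] for peer in region.get("peers", [])]
        failed_count = sum(1 for s in stores if s in failed)
        healthy_count = len(stores) - failed_count
        # A majority is lost once the failed replicas are at least as many
        # as the healthy ones (e.g. 2 of 3, or 2 of 4).
        if failed_count > 0 and failed_count >= healthy_count:
            affected.append(region["id"])
    return affected


if __name__ == "__main__":
    # Hypothetical sample: stores 1 and 2 are the two lost nodes.
    sample = json.dumps({
        "count": 3,
        "regions": [
            {"id": 101, "peers": [{"store_id": 1}, {"store_id": 2}, {"store_id": 3}]},
            {"id": 102, "peers": [{"store_id": 1}, {"store_id": 4}, {"store_id": 5}]},
            {"id": 103, "peers": [{"store_id": 4}, {"store_id": 5}, {"store_id": 6}]},
        ],
    })
    print(regions_without_majority(sample, [1, 2]))
```

Regions that lost only one of three replicas still have a majority and are normally repaired by PD on its own once the failed stores are marked offline; only the regions returned here need unsafe recovery.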

| username: Billmay表妹 | Original post link

Check this out~

| username: 普罗米修斯 | Original post link

Three key strategies can be referenced here:

| username: xingzhenxiang | Original post link

Is the original node unable to start?

| username: 饭光小团 | Original post link

I have already recovered the cluster using the commands from the Online Unsafe Recovery usage documentation (PingCAP Docs). Thank you, everyone.
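
The thread does not show the exact commands that were run. As a hedged sketch, the documented entry point for Online Unsafe Recovery is `pd-ctl unsafe remove-failed-stores`, with a `show` subcommand to check progress. The helper below only builds those command lines and does not execute anything; the PD address and store IDs are placeholders, and the helper names are hypothetical.

```python
import shlex


def remove_failed_stores_cmd(pd_addr, failed_store_ids):
    """Build the pd-ctl command that starts Online Unsafe Recovery.

    pd_addr and failed_store_ids are placeholders supplied by the caller;
    the IDs must be the stores that can never come back.
    """
    ids = sorted(set(int(s) for s in failed_store_ids))
    if not ids:
        raise ValueError("at least one failed store ID is required")
    id_list = ",".join(str(s) for s in ids)
    return ["pd-ctl", "-u", pd_addr, "unsafe", "remove-failed-stores", id_list]


def show_recovery_progress_cmd(pd_addr):
    """Build the pd-ctl command that reports recovery progress."""
    return ["pd-ctl", "-u", pd_addr, "unsafe", "remove-failed-stores", "show"]


if __name__ == "__main__":
    pd = "http://127.0.0.1:2379"  # placeholder PD address
    print(shlex.join(remove_failed_stores_cmd(pd, [1, 2])))
    print(shlex.join(show_recovery_progress_cmd(pd)))
```

Building the argument list separately makes it easy to review the store IDs before anything is run. This matters because unsafe recovery discards the replicas on the listed stores and may lose recent writes.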

| username: heiwandou | Original post link

Was any data lost during the recovery?

| username: 饭光小团 | Original post link

None that we have found so far.

| username: Jellybean | Original post link

  • This feature became generally available (GA) in v6.1.0. In versions earlier than v6.1.0 it is an experimental feature whose behavior differs from what this document describes, so using it there is not recommended. When using this feature in other versions, refer to the documentation for that version.

So this shows how important it is to upgrade the cluster in a timely manner.

| username: dba远航 | Original post link

Normally, a region with only a single remaining replica cannot be used, but from a technical perspective it can be repaired in specific cases.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.