Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 求助:对PD,TIKV灾难演练碰到的一些问题 (Help: some issues encountered in a PD and TiKV disaster recovery drill)

[TiDB Usage Environment]
Test Environment
[TiDB Version]
v4.0.13
[Reproduction Path]
I use PD and TiKV to store some cluster ID information and KV data, respectively. I recently ran a disaster recovery drill, using pd-recover and tikv-br for backup and recovery, and ran into some issues during testing. I have read part of the documentation but still could not find a solution. I hope someone can help, thank you.
Disaster Recovery Scenario 1: Complete loss of TiKV data
Solution: Use tikv-br to back up the data remotely and restore from the backup
Testing Process:
1) Create a three-node cluster and write 5 KV pairs. The TiKV store IDs are 1, 4, and 5.
2) Stop all TiKV services and delete the TiKV data directories.
3) Restart the TiKV services. The restart fails. After checking the logs, it seems that the old TiKV stores went down without being converted to Tombstone before being removed from the cluster.
4) Stop all TiKV services and, for each old store ID, run: curl -X POST "http://192.168.3.40:16000/pd/api/v1/store/$storeid/state?state=Tombstone"
5) Restart the PD services, then start the TiKV services. TiKV starts successfully, but the store IDs are now 8, 9, and 10.
6) With store IDs 8, 9, and 10, I cannot run tikv-br restore on these nodes, because the restore requires the original store IDs 1, 4, and 5.
7) How can I start a new TiKV with the original store ID? Or can tikv-br restore to the new store IDs?
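The Tombstone step above can be sketched as a small script. This is only an illustration under assumptions: the PD address and store IDs 1, 4, and 5 are taken from this post, the helper names (stores_url, tombstone_url, mark_tombstone) are made up for the example, and the script prints the curl commands instead of sending them unless APPLY=1 is set.

```shell
# Sketch of the store check and Tombstone steps from Scenario 1.
# PD_ADDR is the PD address used in this post; adjust it for your cluster.
PD_ADDR="${PD_ADDR:-http://192.168.3.40:16000}"

# URL that lists all stores known to PD (useful for checking store IDs and states).
stores_url() {
  echo "${PD_ADDR}/pd/api/v1/stores"
}

# URL that sets one store to the Tombstone state (same endpoint as in the steps above).
# Argument: store ID.
tombstone_url() {
  echo "${PD_ADDR}/pd/api/v1/store/$1/state?state=Tombstone"
}

# Print the curl command for each store ID by default.
# Set APPLY=1 to actually send the requests.
mark_tombstone() {
  for store_id in "$@"; do
    if [ "${APPLY:-0}" = "1" ]; then
      curl -sS -X POST "$(tombstone_url "$store_id")"
    else
      echo "curl -X POST '$(tombstone_url "$store_id")'"
    fi
  done
}

# Dry run for the three original stores from this post.
mark_tombstone 1 4 5
```

Running `curl "$(stores_url)"` before and after the change is one way to compare the store IDs and states that PD reports. Forcing a store to Tombstone discards PD's record of that store, so this is meant for a test environment like the one described here.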
Disaster Recovery Scenario 2: Complete loss of PD data
Solution: Use pd-recover to repair PD and restore the original cluster ID
- Stop the PD services and delete the PD data directory.
- Start the PD services. PD starts successfully and initializes a new cluster ID.
- Run ./pd-recover --endpoints http://192.168.3.40:16000 --cluster-id <original cluster ID> --alloc-id=01
- The cluster ID is restored. However, I had also written some cluster ID information into PD as data. Can that data be backed up and restored?
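The pd-recover step can be sketched as follows, assuming the original cluster ID is still available in an old PD log. The log line format shown here (an "init cluster id" message followed by a cluster-id field) is an assumption based on the format shown in the pd-recover documentation and may differ between versions. The helper names, the sample log line, and the alloc ID 10000 are placeholders for illustration. The pd-recover documentation asks for an alloc ID larger than any ID PD has already allocated, so the right value depends on your cluster.

```shell
# Sketch of the pd-recover step from Scenario 2.
# PD_ADDR is the PD address used in this post; adjust it for your cluster.
PD_ADDR="${PD_ADDR:-http://192.168.3.40:16000}"

# Read PD log lines on stdin and print the cluster ID from the first
# "init cluster id" line. Assumed line format:
#   [...] ["init cluster id"] [cluster-id=6747551640615446306]
extract_cluster_id() {
  grep 'init cluster id' | sed -n 's/.*cluster-id=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Build (but do not run) the pd-recover command used in this post.
# Arguments: cluster ID, alloc ID.
pd_recover_cmd() {
  echo "./pd-recover --endpoints ${PD_ADDR} --cluster-id $1 --alloc-id=$2"
}

# Example with a made-up log line and a placeholder alloc ID.
sample='[2019/10/14 10:35:38.880 +00:00] [INFO] [server.go:212] ["init cluster id"] [cluster-id=6747551640615446306]'
cluster_id="$(printf '%s\n' "$sample" | extract_cluster_id)"
pd_recover_cmd "$cluster_id" 10000
```

In practice the log would be piped in from a file, for example `extract_cluster_id < pd.log`, where pd.log is a PD log written before the data directory was deleted.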
[Encountered Issues: Problem Phenomenon and Impact]
- After the TiKV store IDs have changed, how can I use tikv-br to restore the data?
- When PD data is lost, can the data stored in PD be backed up and restored, in addition to restoring the PD cluster ID?