Help, TiDB won't start

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 求救,tidb起不来

| username: Hacker_qeCXjAN2

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.1.7
[Reproduction Path] What operations were performed that led to the issue
[Encountered Issue: Problem Phenomenon and Impact] Another service misbehaved and filled up the disks. The cluster originally had three TiKV nodes: 1.3, 1.4, and 1.5. The disks on 1.4 and 1.5 were full, so I forcibly removed those two nodes. After that, the TiDB node went down (unrelated to the cluster; its disk was damaged). After replacing the disk, I scaled TiDB in and then back out, but TiDB fails to start and still reports connection errors for nodes 1.4 and 1.5, even though both were forcibly removed. Is there any way to recover from this situation? Thank you!
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]

| username: 路在何chu | Original post link

Please also post the logs, buddy.

| username: zhang_2023 | Original post link

Restart it.

| username: zhanggame1 | Original post link

Use tiup cluster display to check for any lag and see how many TiKV nodes are online.

| username: 啦啦啦啦啦 | Original post link

Two out of three nodes are down, the majority of replicas are gone, and a leader cannot be elected. You might need to use unsafe recovery. Refer to this link for guidance:
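The majority argument above can be checked with a little arithmetic. This is only a sketch, assuming the default of three replicas per Region with one replica on each of the three TiKV nodes; the function names are made up for illustration.

```python
def raft_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a Raft majority."""
    return replicas // 2 + 1


def can_elect_leader(replicas: int, alive: int) -> bool:
    """A Region can elect a leader only if a majority of its replicas is alive."""
    return alive >= raft_quorum(replicas)


# Three replicas, two of them on the force-removed nodes: one survivor.
print(raft_quorum(3))          # 2
print(can_elect_leader(3, 1))  # False
print(can_elect_leader(3, 2))  # True
```

With only node 1.3 left, every Region that had replicas on 1.4 and 1.5 is below quorum, which is why ordinary recovery paths stop working.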

| username: Hacker_qeCXjAN2 | Original post link

They’re all like this.

| username: TiDBer_jYQINSnf | Original post link

You force-scaled in two nodes? Use unsafe-recover.

| username: 这里介绍不了我 | Original post link

Online Unsafe Recovery Documentation | PingCAP Documentation Center
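For orientation, Online Unsafe Recovery is driven through pd-ctl's `unsafe remove-failed-stores` subcommand. The sketch below only builds the command line and does not run it. The PD address and the store IDs 4 and 5 are hypothetical placeholders (the thread never gives the real store IDs), and the helper name is invented for illustration. Check the documentation for your version before running anything like this, since the operation discards data.

```python
from typing import List


def build_unsafe_recovery_cmd(pd_addr: str, failed_store_ids: List[int]) -> List[str]:
    """Build (but do not execute) the pd-ctl command that removes failed stores."""
    if not failed_store_ids:
        raise ValueError("at least one failed store ID is required")
    # pd-ctl expects the store IDs as a single comma-separated argument.
    ids = ",".join(str(store_id) for store_id in failed_store_ids)
    return ["pd-ctl", "-u", pd_addr, "unsafe", "remove-failed-stores", ids]


# Hypothetical values: replace with the real PD address and store IDs.
cmd = build_unsafe_recovery_cmd("http://127.0.0.1:2379", [4, 5])
print(" ".join(cmd))
# pd-ctl -u http://127.0.0.1:2379 unsafe remove-failed-stores 4,5
```

Keeping the command as a list rather than a shell string makes it easy to review before anyone actually executes it.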

| username: 小龙虾爱大龙虾 | Original post link

A forced scale-in is very dangerous. The tool warns you that data may be lost, yet you went ahead anyway. I suggest you don't make any further moves for now. The advice everyone has given is correct, but find someone who understands TiDB to carry it out.

| username: Hacker007 | Original post link

Consider whether it would be faster to set up a new system and resynchronize the data.

| username: tidb菜鸟一只 | Original post link

Refer to this column: Three Strategies for Handling Abnormal TiKV Scale-In and Offline | TiDB Community

| username: 哈喽沃德 | Original post link

Did you purchase the enterprise edition? Get official technical support to take a look.

| username: Soysauce520 | Original post link

If you have backups and incremental backups, it is recommended to rebuild and restore; you won’t lose data. If not, follow the suggestion above: unsafe recovery, which will result in data loss.

| username: ffeenn | Original post link

Is the data on the two TiKV servers that were scaled down retained?

| username: 江湖故人 | Original post link

Data will definitely be lost.

| username: h5n1 | Original post link

This is the correct answer.

| username: TiDBer_小阿飞 | Original post link

After reading this document, it feels very complicated.

| username: Kongdom | Original post link

Yes, but data recovery itself is a meticulous task.

| username: db_user | Original post link

This requires data repair. With three replicas, data will not be lost. After repair, you can start it. Ensure that the number of TiKV instances is greater than or equal to the number of replicas before performing any operations.
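The rule of thumb in the last sentence can be written as a simple pre-check. This is a sketch only: the function names are hypothetical, and it assumes the replica count is PD's `max-replicas` setting, which defaults to 3.

```python
def tikv_nodes_to_add(up_tikv_count: int, max_replicas: int = 3) -> int:
    """How many more TiKV instances are needed before every replica has a home."""
    return max(0, max_replicas - up_tikv_count)


def safe_to_operate(up_tikv_count: int, max_replicas: int = 3) -> bool:
    """True when the number of running TiKV instances is at least the replica count."""
    return tikv_nodes_to_add(up_tikv_count, max_replicas) == 0


# The situation in this thread: only node 1.3 is still running.
print(safe_to_operate(1))    # False
print(tikv_nodes_to_add(1))  # 2
print(safe_to_operate(3))    # True
```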

| username: heiwandou | Original post link

Safe recovery.