[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.1.7
[Encountered Issue: Problem Phenomenon and Impact] Another service misbehaved and filled the disks. The cluster originally had three TiKV nodes: 1.3, 1.4, and 1.5. The disks on 1.4 and 1.5 were full, so I manually force-removed those two nodes. After that, the TiDB node crashed (unrelated to the cluster; its disk was damaged). After replacing the disk, I scaled TiDB in and then back out, but TiDB fails to start and still reports connection errors to nodes 1.4 and 1.5, even though both were forcibly removed. Is there any way to recover from this situation? Thank you!
Please also post the logs, buddy.

Restart it.

Run tiup cluster display to check each component's status and see how many TiKV nodes are online.

Two out of three nodes are down, the majority of replicas are gone, and a leader cannot be elected. You might need to use unsafe recovery. Refer to this link for guidance:

They’re all like this.

You force-scaled in two nodes? Use unsafe recovery.

Online Unsafe Recovery Documentation | PingCAP Documentation Center
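
For anyone wondering why losing two of three TiKV nodes leaves the cluster unable to elect leaders, here is a minimal sketch of the Raft majority arithmetic. The function names are made up for illustration and are not part of any TiDB tool; it assumes the usual three replicas per Region.

```python
def raft_majority(replicas: int) -> int:
    """Smallest number of replicas that forms a Raft majority."""
    return replicas // 2 + 1


def can_elect_leader(replicas: int, alive: int) -> bool:
    """A Region can elect a leader only if a majority of its replicas are alive."""
    return alive >= raft_majority(replicas)


# The situation in this thread: 3 replicas per Region, only the one on 1.3 survives.
print(raft_majority(3))        # 2
print(can_elect_leader(3, 1))  # False
print(can_elect_leader(3, 2))  # True
```

With one survivor out of three, no Region reaches the majority of two, which is why a normal restart cannot bring the cluster back and unsafe recovery is being suggested.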

Forcing a scale-in is very dangerous. The tool warns you that data loss may occur, yet you went ahead anyway. I suggest you make no further changes for now. The advice given here is correct, but find someone who understands TiDB to carry it out.

Consider whether it would be faster to set up a new system and resynchronize the data.

Refer to this Column - Three Strategies for Handling Abnormal TiKV Scale-Down Offline | TiDB Community

Did you purchase the enterprise edition? Get official technical support to take a look.

If you have backups and incremental backups, it is recommended to rebuild and restore; you won’t lose data. If not, follow the suggestion above: unsafe recovery, which will result in data loss.

Is the data on the two TiKV servers that were scaled in still retained?

Data will definitely be lost.

This is the correct answer.

After reading this document, it feels very complicated.

Yes, but data recovery itself is a meticulous task.

This requires data repair. With three replicas, data will not be lost. After repair, you can start it. Ensure that the number of TiKV instances is greater than or equal to the number of replicas before performing any operations.
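
The last rule can be written as a small pre-check. This is only a sketch: the store names and states are typed in by hand as an assumption, not read from PD, and max_replicas is assumed to mirror the cluster's replica count (3 here).

```python
from typing import Dict


def up_store_count(store_states: Dict[str, str]) -> int:
    """Count the TiKV stores whose state is "Up"."""
    return sum(1 for state in store_states.values() if state == "Up")


def enough_stores(store_states: Dict[str, str], max_replicas: int = 3) -> bool:
    """True when the number of Up TiKV instances is >= the number of replicas."""
    return up_store_count(store_states) >= max_replicas


# Hand-written example matching this thread: only 1.3 is still up.
stores = {"1.3": "Up", "1.4": "Down", "1.5": "Down"}
print(enough_stores(stores))  # False: scale out new TiKV nodes before doing anything else
```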

Safe recovery.