pd-ctl cannot delete a Tombstone TiKV store

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd-ctl不能删除tikv-Tombstone

| username: 奋斗的大象

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.1.0
[Reproduction Path] One machine was down for a long time and failed to restart
[Problem Encountered] TiDB fails to start
[Resource Configuration] * *
[Attachments: Screenshots/Logs/Monitoring] [raft_client.rs:516] ["connection aborted"] [addr=10.114.26.112:20162] [receiver_err="Some(RpcFailure(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: }))"] [sink_error="Some(RpcFinished(Some(RpcStatus { code: 14-UNAVAILABLE, message: "failed to connect to all addresses", details: })))"] [store_id=183060412]

| username: yytest

Based on the information you provided, a TiKV node cannot establish a connection to one of its peers. The entry [raft_client.rs:516] ["connection aborted"] [addr=10.114.26.112:20162] comes from TiKV's Raft client, not from TiDB. It indicates that a connection to the TiKV store at address 10.114.26.112 on port 20162 (store_id=183060412) was aborted. The receiver_err and sink_error fields both report code 14-UNAVAILABLE with the message "failed to connect to all addresses", which confirms that the address could not be reached.
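To see which peer is unreachable, the address and store ID can be pulled out of the log line. The following is a minimal Python sketch, not an official tool. The function name extract_peer is hypothetical, and LOG_LINE is a shortened copy of the line from the original post with the receiver_err and sink_error fields left out.

```python
import re

# Shortened copy of the log line from the original post (assumption:
# the receiver_err and sink_error fields are omitted for brevity).
LOG_LINE = (
    '[raft_client.rs:516] ["connection aborted"] '
    '[addr=10.114.26.112:20162] [store_id=183060412]'
)


def extract_peer(line):
    """Return (addr, store_id) from a 'connection aborted' line, or None."""
    addr = re.search(r"\[addr=([^\]]+)\]", line)
    store = re.search(r"\[store_id=(\d+)\]", line)
    if not addr or not store:
        return None
    return addr.group(1), int(store.group(1))


print(extract_peer(LOG_LINE))  # ('10.114.26.112:20162', 183060412)
```

The store ID found this way is the identifier of the store that the reporting node failed to reach.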

To resolve this issue, you can try the following steps:

  1. Check Network Connection: Ensure that the network between all cluster nodes is working. The ping command tests basic connectivity between hosts, but it does not show whether a specific port accepts connections.
  2. Check Firewall Settings: Confirm that no firewall is blocking the ports the cluster nodes use to communicate with each other, such as port 20162 shown in the log.
  3. Check TiKV Status: Log in to the problematic TiKV node and check whether the TiKV process is running and what its own log reports. The tikv-ctl tool can help inspect the node.
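The network and firewall checks above can be approximated with a plain TCP connection test. This is a minimal Python sketch using only the standard library. The function name port_is_reachable and the 3-second timeout are assumptions for illustration, and the address is the one from the log line in the original post.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port taken from the log line in the original post.
    host, port = "10.114.26.112", 20162
    state = "reachable" if port_is_reachable(host, port) else "unreachable"
    print(f"{host}:{port} is {state}")
```

Run it from another cluster node. An "unreachable" result only shows that nothing accepted the connection. It does not distinguish a stopped TiKV process from a blocked port.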
| username: 奋斗的大象

The firewall is turned off, but TiDB still won’t start:
[ERROR] [tidb.go:89] ["[ddl] init domain failed"] [error="[tikv:9005]Region is unavailable"]

| username: wakaka

What status does tiup show for the cluster nodes? Even if one TiKV node is broken and cannot be started, that alone should not make the entire cluster unusable.

| username: tidb菜鸟一只

It looks like something went wrong when your TiKV node was taken offline. Just use the three-step method…
Column - Three-Step Method for Handling TiKV Scaling Down and Offline Exceptions | TiDB Community

| username: Kongdom

Has the TiKV node already been removed? If it has been removed and TiDB still cannot start, you can refer to the three-step method mentioned above.