Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb节点tikv节点缩容后、状态为NA (After scaling in a TiKV node, its status is NA)
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact]
As shown in the picture, 106 is a TiKV node. After scaling it in, its status became Tombstone. Then, after executing pd-ctl store remove-tombstone, it changed to N/A status. How can I completely decommission this node?
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]
What version are you on? If taking the node offline fails, back up the data first!!! Use the force parameter to force it offline only as a last resort.
Refer to the following articles:
Version 5.3.0. The node has been offline for a long time and the physical machine has already been removed; it's just that the node still shows up in the cluster information.
First confirm that the store no longer holds any data, back up the cluster just in case, and then use the force parameter to delete it.
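For example, a full backup with BR before any forced deletion might look like this (a minimal sketch; the PD address and backup path are placeholders, and the tiup br component is assumed to be available for your version):

    # Hypothetical sketch: take a full cluster backup with BR first.
    # Replace the PD address and storage path with your own.
    tiup br backup full --pd "192.168.90.100:2379" \
        --storage "local:///data/backup/full-before-force-delete"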
Check if this store still exists using pd-ctl store.
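For example, assuming a PD endpoint of 192.168.90.100:2379 and a hypothetical store ID of 4:

    # List all stores registered in PD; a lingering store will still appear here.
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 store
    # Inspect one store to check its state_name and region_count.
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 store 4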
There is a high probability that the store still exists. This situation typically arises when the physical data is removed without going through the proper offline procedure.
No, it's not there; I've already checked.
Try deleting this node's entry from the tiup topology file.
Make a backup before editing it.
You can force offline using the force parameter.
The physical machines are no longer available, so you can use the --force option of scale-in to clean up this kind of stale topology information. For certain components, the service is not stopped immediately and the data is not deleted right away; instead, after data scheduling completes, the user needs to manually run the tiup cluster prune command to clean up.
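Assuming the cluster is named tidb-test and the dead node is 192.168.90.106:20160 (both placeholders), the sequence would look roughly like this:

    # Force-remove the node from the topology even though the machine is gone.
    tiup cluster scale-in tidb-test --node 192.168.90.106:20160 --force
    # Clean up components left in Tombstone state after scale-in.
    tiup cluster prune tidb-test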
Forcing it offline will work.
After the TiKV status changes to Tombstone, you should run tiup cluster prune to clean it up. If residual store records remain in PD, use pd-ctl store remove-tombstone. At this point you can directly run scale-in -N xxx --force to force-delete the node.
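If tombstone store records still linger in PD after pruning, they can be cleared like this (the PD address is a placeholder):

    # Remove all stores in Tombstone state from PD's store list.
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 store remove-tombstone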
After executing scale-in -N --force, the following error is prompted: Error: failed to scale in: cannot find node id '192.168.90.106' in topology. It still cannot be deleted~~
Back up the file at ~/.tiup/storage/cluster/clusters/{cluster-name}/meta.yaml, and then delete the corresponding node's entry inside it.
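As a hypothetical illustration, the entry to delete would look roughly like the following (host, ports, and directories are placeholders, and the exact layout can differ by version):

    # In ~/.tiup/storage/cluster/clusters/{cluster-name}/meta.yaml,
    # under the topology section, remove the whole block for the dead node:
    tikv_servers:
    - host: 192.168.90.106        # <- delete this entire entry
      port: 20160
      status_port: 20180
      deploy_dir: /tidb-deploy/tikv-20160
      data_dir: /tidb-data/tikv-20160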
Enter the PD console and check store limit to see whether the node is still listed. If it is, first confirm that the node is truly no longer needed before performing any unsafe operation: try a normal delete first, and only fall back to unsafe recovery if that fails.
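For example (the PD address and store ID are placeholders; unsafe recovery is a last resort, and its availability depends on your version):

    # Check per-store limits; a lingering store will still be listed here.
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 store limit
    # Try a normal delete first; this moves the store toward Offline/Tombstone.
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 store delete 4
    # Only if that fails and the data is confirmed unneeded:
    tiup ctl:v5.3.0 pd -u http://192.168.90.100:2379 unsafe remove-failed-stores 4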
Okay, thank you~ I'll try it during off-peak hours.