Cluster Unavailable After Scaling Down PD Node with Dual IPs

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 双IP的PD节点缩容后集群不可用

| username: Kongdom

[TiDB Usage Environment] Production Environment
[TiDB Version] V5.1.0
[Encountered Issue] Cluster becomes unavailable after scaling down PD nodes
[Reproduction Path] Scale down problematic PD nodes
[Issue Phenomenon and Impact]
Servers: 210, 211, 212. Server 212 has two IPs: one is 212, the other is 213.
During deployment, PD, TiDB, and TiKV nodes were co-deployed on each of the IPs 211, 212, and 213.

After discovering that one server was hosting components under two IPs, we tried to scale in the components on one of those IPs (213). As a result, the cluster became unavailable.

Question 1: How can we fix this quickly and make the cluster available again in the shortest time?
Question 2: Does deploying on a dual-IP server create two sets of components, or only one?
Question 3: How do we remove one set of components from a dual-IP server?

| username: Kongdom

  1. Recover the PD cluster with PD Recover (pd-recover).
  2. Only one set of components was actually deployed.
  3. First scale out to a sufficient number of nodes, counting the dual-IP server as a single node, and then scale in.
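
The advice in point 3 to count a dual-IP server as a single node can be illustrated with a small majority calculation. This is only a sketch under stated assumptions: PD keeps its membership in an embedded etcd group that needs a majority of members to stay available, the host labels `server-211` and `server-212` are hypothetical, and the helper names are made up for illustration. It does not claim to reproduce the exact cause of the outage in this thread.

```python
# Hypothetical helper names; the IP keys mirror the short forms used in the post.

def quorum(members: int) -> int:
    """Majority required by a Raft/etcd-style group with `members` voters."""
    return members // 2 + 1


def survives_host_loss(ip_to_host: dict, lost_host: str) -> bool:
    """Return True if the members left after losing one host still form a majority."""
    total = len(ip_to_host)
    remaining = sum(1 for host in ip_to_host.values() if host != lost_host)
    return remaining >= quorum(total)


# Three PD members by IP, but 212 and 213 live on the same physical server
# (host labels are assumptions for illustration).
pd_members = {"211": "server-211", "212": "server-212", "213": "server-212"}

for host in sorted(set(pd_members.values())):
    print(host, survives_host_loss(pd_members, host))
# prints:
# server-211 True
# server-212 False
```

Counted by IP, the group looks like three independent members, but counted by host it tolerates the loss of `server-211` only. That is why scaling out to enough genuinely separate nodes first, and scaling in afterwards, is the safer order.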