Failed to Join PD Cluster

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd join集群失败

| username: Timber

[TiDB Usage Environment] Testing
[TiDB Version] v6.1.0
[Reproduction Path] The existing cluster has one PD node, pd-1, and I am adding a new PD node, pd-2. Following the official documentation, I added pd-2 to pd-1 using the --join parameter and started pd-2. It fails to start and reports this error:

[2023/06/12 10:17:28.553 +00:00] [WARN] [stream.go:682] ["request sent was ignored by remote peer due to cluster ID mismatch"] [remote-peer-id=e91fc2eb575225bd] [remote-peer-cluster-id=9d361620aeb0a397] [local-member-id=2c9fc5b208a8cbdb] [local-member-cluster-id=d22fbedef770b55c] [error="cluster ID mismatch"]

At the same time, pd-1 logs a warning:

[2023/06/12 10:18:01.953 +00:00] [WARN] [http.go:543] ["request cluster ID mismatch"] [local-member-id=e91fc2eb575225bd] [local-member-cluster-id=9d361620aeb0a397] [local-member-server-version=3.4.3] [local-member-server-minimum-cluster-version=3.0.0] [remote-peer-server-name=2c9fc5b208a8cbdb] [remote-peer-server-version=3.4.3] [remote-peer-server-minimum-cluster-version=3.0.0] [remote-peer-cluster-id=d22fbedef770b55c]

[Encountered Problem: Problem Phenomenon and Impact] Unable to add the new PD node to the existing cluster.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
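
The two warnings above can be compared mechanically. The following is a minimal sketch, assuming the log keeps the bracketed key=value layout shown in this post; the helper name parse_fields is hypothetical, and the sample line is the pd-2 warning with its quoted message fields left out for brevity.

```python
import re

# Matches bracketed key=value fields such as [local-member-cluster-id=d22f...].
# Brackets without "=" (timestamp, level, source file) are skipped.
FIELD = re.compile(r"\[([A-Za-z0-9-]+)=([^\]]*)\]")


def parse_fields(line):
    """Return the key=value fields of one log line as a dict."""
    return dict(FIELD.findall(line))


# The pd-2 warning from this thread, without the quoted message fields.
pd2_warn = (
    "[2023/06/12 10:17:28.553 +00:00] [WARN] [stream.go:682] "
    "[remote-peer-id=e91fc2eb575225bd] "
    "[remote-peer-cluster-id=9d361620aeb0a397] "
    "[local-member-id=2c9fc5b208a8cbdb] "
    "[local-member-cluster-id=d22fbedef770b55c]"
)

fields = parse_fields(pd2_warn)
local_id = fields["local-member-cluster-id"]
remote_id = fields["remote-peer-cluster-id"]
mismatch = local_id != remote_id

print("local cluster ID: ", local_id)
print("remote cluster ID:", remote_id)
print("mismatch:", mismatch)
```

Running the same function over the pd-1 warning gives the same two IDs with local and remote swapped, which suggests each node was initialized as its own cluster before the join was attempted.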

| username: Timber

I changed the approach and joined the new pd-2 to the existing pd-1; pd-2 now starts normally. However, the logs of the two PDs show that they belong to different clusters. Checking with pd-ctl also shows that each one belongs to its own cluster, which is not what I expected.
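
The pd-ctl check described above can also be scripted. This is a rough sketch, not a confirmed procedure: it assumes each PD serves cluster information at the HTTP path /pd/api/v1/cluster and that the JSON response carries the cluster ID in an "id" field. Both the path and the response shape are assumptions to verify against your PD version, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def extract_cluster_id(payload):
    """Pull the cluster ID out of a decoded response (assumed to have an "id" key)."""
    return payload["id"]


def fetch_cluster_id(pd_url, timeout=5):
    """Ask one PD endpoint for its cluster ID.

    The /pd/api/v1/cluster path is an assumption; adjust it if your PD differs.
    """
    with urlopen(f"{pd_url}/pd/api/v1/cluster", timeout=timeout) as resp:
        return extract_cluster_id(json.load(resp))


def same_cluster(cluster_ids):
    """True when every PD reported the same cluster ID."""
    return len(set(cluster_ids)) == 1
```

To use it, call fetch_cluster_id once per PD client URL (for example the addresses of pd-1 and pd-2) and pass the results to same_cluster. A False result matches the symptom in this post: both PDs are running, but each one has formed its own cluster.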

| username: Billmay表妹

According to your description, the issue is caused by inconsistent cluster IDs between pd-1 and pd-2: in the logs, pd-1 reports cluster ID 9d361620aeb0a397 and pd-2 reports d22fbedef770b55c. The cluster ID is the unique identifier of a TiDB cluster, and nodes whose cluster IDs differ reject each other's requests.

To resolve this issue, ensure that the cluster IDs of pd-1 and pd-2 are consistent. You can follow these steps:

  1. Execute the command tiup cluster display <cluster-name> on the pd-1 node to check the current cluster ID.

  2. Execute the command tiup cluster edit-config <cluster-name> on the pd-2 node to edit the cluster configuration file.

  3. Add the following content to the configuration file:

cluster-id = "<cluster-id>"

where <cluster-id> is the cluster ID you checked on the pd-1 node.

  4. Save the configuration file and exit the editor.

  5. Execute the command tiup cluster reload <cluster-name> -R pd on the pd-2 node to reload the cluster configuration.

  6. Execute the command tiup cluster start <cluster-name> -R pd on the pd-2 node to start the pd-2 node.

If the above method does not resolve the issue, you may try reinitializing the TiDB cluster.