Process and Considerations for Switching PD Leader

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd切换leader流程及注意事项

| username: TiDBer_Y2d2kiJh

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] v5.4.0
[Reproduction Path] We plan to scale in and decommission the old servers; only the PD nodes are left on them now. How do we switch the PD leader, and what should we watch out for during the process?
[Encountered Issues: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: Fly-bird | Original post link

It switches automatically, right? Just maintain 3 instances and switch them one by one.

| username: TiDBer_Y2d2kiJh | Original post link

Is it okay to first scale down the two old PDs that are not leader nodes, and then switch the last PD after adding three new PDs?

| username: 像风一样的男子 | Original post link

During off-peak hours, you can scale in all three old PDs directly, and the leader will switch automatically.

| username: tidb菜鸟一只 | Original post link

Scaling in directly may cause brief errors for the business. It is recommended to first scale in the two old PD nodes that are not the leader, then transfer the leader away from the last old PD. After the transfer is complete, take the last old PD offline. This way, the business will not be affected.

| username: 有猫万事足 | Original post link

You can directly use the command to switch:

tiup ctl:v{version} pd member

to check the name of each member. Then,

tiup ctl:v{version} pd member leader transfer {target pd name}

Transfer the leader to one of the new PDs, then scale in the old ones.

If you want to be more meticulous, you can set a higher priority for the new PD. Once you set the priority, any future elections will likely occur on the new PD, unless all the new PDs fail.

tiup ctl:v{version} pd member leader_priority {pd name} {priority number}

The priority for those without a set priority should be 0. Set all new PDs to 1, so any future elections will only produce a PD leader from the new PDs. With the priority set, the leader should automatically switch to the new PD, and you won’t need to transfer it again manually.
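As a rough illustration of the priority approach above, here is a minimal dry-run sketch. It only prints the pd-ctl commands rather than running them. The PD address, the version, and the member names (`pd-new-1` and so on) are placeholders assumed for this example; take the real names from the `pd member` output.

```shell
# Dry-run sketch: prints the commands instead of executing them.
# PD_VERSION, PD_ADDR and the member names are assumed placeholders.
PD_VERSION="v5.4.0"
PD_ADDR="http://127.0.0.1:2379"

# Print one leader_priority command per PD member name.
# Usage: priority_cmds <priority> <pd name>...
priority_cmds() {
  priority="$1"
  shift
  for name in "$@"; do
    echo "tiup ctl:${PD_VERSION} pd -u ${PD_ADDR} member leader_priority ${name} ${priority}"
  done
}

# Raise the priority of the new PDs to 1.
priority_cmds 1 pd-new-1 pd-new-2 pd-new-3

# Afterwards, check which member currently holds the leader.
echo "tiup ctl:${PD_VERSION} pd -u ${PD_ADDR} member leader show"
```

Once the printed commands look right, they can be run one by one; checking `member leader show` before scaling in the old PDs confirms that the leader has actually moved.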

| username: TiDBer_小阿飞 | Original post link

Keep the PD leader node, scale-in the other nodes, then scale-out new PD nodes, move the leader node to the new node, and finally delete the last old node.
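The order described above could be sketched roughly as follows. This is a dry run that only prints the commands in sequence. The cluster name, the addresses, the member name, and the `scale-out.yaml` topology file are all placeholders assumed for illustration, not values from this thread.

```shell
# Dry-run sketch: prints the planned commands in order; nothing is executed.
# CLUSTER, the addresses, the member name and scale-out.yaml are assumed placeholders.
CLUSTER="my-cluster"
PD_VERSION="v5.4.0"

# Usage: migration_plan <old leader host:port> <new pd member name> <old non-leader host:port>...
migration_plan() {
  old_leader="$1"
  new_pd_name="$2"
  shift 2
  # 1. Scale in the old PDs that are not the leader.
  for node in "$@"; do
    echo "tiup cluster scale-in ${CLUSTER} --node ${node}"
  done
  # 2. Scale out the new PDs described in the topology file.
  echo "tiup cluster scale-out ${CLUSTER} scale-out.yaml"
  # 3. Move the leader to a new PD and confirm that it moved.
  echo "tiup ctl:${PD_VERSION} pd -u http://${old_leader} member leader transfer ${new_pd_name}"
  echo "tiup ctl:${PD_VERSION} pd -u http://${old_leader} member leader show"
  # 4. Remove the last old PD and review the final topology.
  echo "tiup cluster scale-in ${CLUSTER} --node ${old_leader}"
  echo "tiup cluster display ${CLUSTER}"
}

migration_plan 10.0.0.1:2379 pd-10.0.0.4-2379 10.0.0.2:2379 10.0.0.3:2379
```

Printing the plan first makes it easy to review the order before touching a production cluster. The last old PD is only scaled in after the transfer and the check.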

| username: zhanggame1 | Original post link

You can scale out or scale in directly; there shouldn’t be any issues.

| username: xingzhenxiang | Original post link

It’s already version 5+, just go ahead and do it.

| username: xingzhenxiang | Original post link

Or you can force the current leader to step down:

member leader resign
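For reference, a minimal sketch of how that subcommand might be invoked through tiup, assuming the same `tiup ctl` form used earlier in the thread. The version and PD address are placeholders, and the helper only prints the command lines.

```shell
# Dry-run sketch: builds and prints pd-ctl invocations; nothing is executed.
# PD_VERSION and PD_ADDR are assumed placeholders.
PD_VERSION="v5.4.0"
PD_ADDR="http://127.0.0.1:2379"

# Prefix a pd-ctl subcommand with the tiup invocation.
pd_ctl_cmd() {
  echo "tiup ctl:${PD_VERSION} pd -u ${PD_ADDR} $*"
}

# Ask the current leader to step down, then check who took over.
pd_ctl_cmd member leader resign
pd_ctl_cmd member leader show
```

Unlike a transfer, a resign does not name a target, so the new leader is whichever member wins the election.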

| username: Kongdom | Original post link

On my side, we leave everything to the cluster and never switch manually. The leader switches automatically during scale-in.

| username: ajin0514 | Original post link

It switches automatically.

| username: xingzhenxiang | Original post link

Whether to switch manually depends on the version. As I mentioned earlier, v5.x can switch automatically. Here’s a picture for reference:

| username: ajin0514 | Original post link

Upgrade the version, and it will be fine.

| username: 普罗米修斯 | Original post link

Yes. I previously tested switching PD by scaling out first and then scaling in, with no issues.

| username: TiDBer_小阿飞 | Original post link

Keep one old node, start a new node, transfer the leader to the new node, stop the old node, scale it in, and then restart the entire cluster.

| username: 路在何chu | Original post link

This method works.

| username: ti-tiger | Original post link

  • Ensure that the new PD node is in normal working condition and has sufficient resources and performance to take on the Leader role.
  • During the Leader switch, there may be a brief period of access interruption or performance degradation, so it is necessary to perform the operation during an appropriate time window to minimize the impact on the business.
  • It is best to back up the current PD node data before switching the Leader to prevent unexpected situations.
  • Before performing the PD Leader switch, it is recommended to first verify in a test environment or POC environment to ensure the safety and correctness of the operation.

| username: 路在何chu | Original post link

I haven’t paid much attention to this. During off-peak periods, we scale in directly, and the leader switches on its own. There’s no requirement for which node must be the leader; the cluster can decide for itself.