How Many PDs Should Be Deployed?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd 数量的部署问题

| username: Raymond

Recently, I discovered an interesting phenomenon:

  1. Deploying 1 PD, the cluster can still be used.
  2. Deploying 2 PDs, the cluster can still be used.
  3. When deploying 2 PDs and shutting down 1 PD, the entire cluster becomes unavailable, even though the other PD process and port are still there.
  4. When deploying 2 PDs and using tiup scale-in to scale down 1 PD, the cluster remains available.

Can any expert explain this strange phenomenon? Logically, shouldn't at least 2 PDs be deployed so that a leader can be elected?

| username: Hi70KG | Original post link

Shutting down 1 PD is considered a failure.
When the PD leader key lease times out, a leader re-election is triggered. Since you deployed 2 PDs, the cluster has recorded 2 members. With fewer than 2 members able to take part in the election, anomalies are likely. I will verify this tomorrow.
Using tiup scale-in to remove 1 PD is equivalent to having deployed only 1 PD. The membership information is updated, so the single remaining PD becomes the leader.
Deploying 1 PD means no election is needed. It makes sense that none is triggered: when there is only one villager, that villager is the village chief. :smirk:

| username: ddhe9527 | Original post link

PD embeds etcd, so the etcd rule applies here. The majority (quorum) of an etcd cluster is floor(N/2) + 1, where N is the total number of nodes in the etcd cluster. Quorum is the minimum number of nodes that must be running normally for etcd to provide service. With fewer than 3 nodes the cluster cannot tolerate any failure, because the majority of 1 is 1, the majority of 2 is 2, and the majority of 3 is also 2…
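
To make the quorum rule concrete, here is a minimal Python sketch. The function names `quorum` and `fault_tolerance` are made up for illustration and are not part of etcd, PD, or TiUP; the code only restates the arithmetic.

```python
def quorum(n: int) -> int:
    """Smallest majority of an n-member cluster: floor(n / 2) + 1."""
    if n < 1:
        raise ValueError("a cluster needs at least one member")
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """How many members may fail while a majority is still running."""
    return n - quorum(n)


if __name__ == "__main__":
    # Print members, quorum, and tolerated failures for 1 to 7 members.
    print("members  quorum  tolerated failures")
    for n in range(1, 8):
        print(f"{n:>7}  {quorum(n):>6}  {fault_tolerance(n):>18}")
```

Note that 2 members tolerate no failure at all, and 4 members tolerate no more failures than 3, which is why an odd member count is the usual choice.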

| Nodes (instances) | Quorum |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 2 |
| 4 | 3 |
| 5 | 3 |
| 6 | 4 |
| 7 | 4 |

| username: 啦啦啦啦啦 | Original post link

It’s not surprising, it’s normal. The election of the PD leader follows the Raft mechanism. When nothing is down, a leader can be elected with either 1 or 2 PDs. But if one of 2 PDs goes down, the remaining PD cannot form a majority, so no leader can be elected. When you deploy 2 PDs and then use tiup scale-in to remove 1 PD, the cluster has 1 PD, which puts it back in the first scenario.
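
The four cases from the original question can be replayed with a toy model. This is only a sketch under stated assumptions: `PDCluster`, `stop_one`, and `scale_in_one` are hypothetical names, and the model tracks nothing but the number of registered members and the number of running ones. It is not how PD or etcd is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PDCluster:
    members: int  # PDs registered in the cluster membership
    alive: int    # PDs that are actually running

    def stop_one(self) -> "PDCluster":
        # A crash or shutdown: the membership list is unchanged.
        return PDCluster(self.members, self.alive - 1)

    def scale_in_one(self) -> "PDCluster":
        # A scale-in: the (running) PD is also removed from the membership.
        return PDCluster(self.members - 1, self.alive - 1)

    def can_elect_leader(self) -> bool:
        # A leader needs votes from a majority of the registered members.
        return self.alive >= self.members // 2 + 1


if __name__ == "__main__":
    cases = {
        "1 PD": PDCluster(1, 1),
        "2 PDs": PDCluster(2, 2),
        "2 PDs, 1 shut down": PDCluster(2, 2).stop_one(),
        "2 PDs, 1 scaled in": PDCluster(2, 2).scale_in_one(),
    }
    for name, cluster in cases.items():
        print(f"{name}: leader possible = {cluster.can_elect_leader()}")
```

The difference between the last two cases is the membership count: shutting a PD down leaves 2 registered members with 1 running, while scaling in leaves 1 registered member with 1 running.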

| username: Raymond | Original post link

I always thought that for this kind of Raft component, at least 2 should be deployed.

| username: Raymond | Original post link

  1. Deploying 1 PD, this PD will be the leader.
  2. Deploying 2 PDs, one of them can be elected as the leader.
  3. However, when there are 2 PDs and one of them goes down, a leader cannot be elected, and the PD component will become unusable.

Is this understanding correct?

| username: Raymond | Original post link

Thank you for your reply, much appreciated.

| username: cs58_dba | Original post link

A triangle is the most stable. An odd number of PDs is recommended: an even number can still elect a Leader, but the extra node raises the quorum without adding any fault tolerance.
