Dear experts, in the technical documentation for dual-region multi-AZ deployment
it states:
1. A three-AZ deployment across two regions requires setting 5 replicas. Why 5 replicas?
2. The parameters are set as follows:
raftstore.raft-min-election-timeout-ticks: 1000
raftstore.raft-max-election-timeout-ticks: 1200
How should these be understood? What are these two values based on?
What does the term “tick” mean? Is it a node?
3. The document mentions “In the rac1 rack of AZ1, one server deploys TiDB and PD services, and the other two servers deploy TiKV services. Each TiKV server deploys two TiKV instances (tikv-server), and rac2, rac4, rac5, and rac6 are similar.” What is the purpose of deploying two TiKV instances on each TiKV server?
To answer the second question, about the two parameters:
If a Region Follower does not receive a heartbeat from the Leader within the raft-election-timeout interval, it will determine that the Leader has failed and initiate a new election.
raft-election-timeout = raft-base-tick-interval * raft-election-timeout-ticks
raft-base-tick-interval defaults to 1s
When raft-min-election-timeout-ticks is 0, it takes the value of raft-election-timeout-ticks
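The formula above can be worked through with a small sketch. The function name is hypothetical, the default of 10 for raft-election-timeout-ticks is an assumption, and the fallback of twice raft-election-timeout-ticks when the max is 0 is my understanding of TiKV's behavior rather than something stated in this thread, so please check it against the configuration reference for your version.

```python
def election_timeout_range(base_tick_interval_s, election_timeout_ticks=10,
                           min_ticks=0, max_ticks=0):
    """Return (min_seconds, max_seconds) of the election timeout window."""
    # From this thread: a min of 0 takes the value of raft-election-timeout-ticks.
    if min_ticks == 0:
        min_ticks = election_timeout_ticks
    # Assumption (not from this thread): a max of 0 takes 2x that value.
    if max_ticks == 0:
        max_ticks = 2 * election_timeout_ticks
    # raft-election-timeout = raft-base-tick-interval * ticks
    return (base_tick_interval_s * min_ticks, base_tick_interval_s * max_ticks)


# Assumed defaults: 1s base tick, 10 election timeout ticks.
default_window = election_timeout_range(1.0)
# The values from the question: min 1000 ticks, max 1200 ticks.
configured_window = election_timeout_range(1.0, min_ticks=1000, max_ticks=1200)

print(default_window)     # (10.0, 20.0)
print(configured_window)  # (1000.0, 1200.0)
```

With a 1s base tick, the values from the question would make a Follower wait roughly 1000 to 1200 seconds without a heartbeat before it starts an election, far longer than a node using the assumed defaults. So a tick is a unit of time driving the Raft state machine, not a node.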
For the third question, my guess is that it is to reduce the number of Regions on a single TiKV instance.
I think this depends on the available resources. On a high-spec physical machine, deploying 2 instances makes full use of the hardware.
The more replicas, the better the high availability. Using 5 replicas is not mandatory; 3 replicas also work, but then only one replica can fail, because Raft needs a majority of replicas to remain available. With 5 replicas, two can fail.
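The replica arithmetic can be sketched as follows. This is only the general Raft majority rule, and the function names are hypothetical; it does not model how replicas are placed across AZs.

```python
def quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority still survives."""
    return replicas - quorum(replicas)


for n in (3, 5):
    print(f"{n} replicas: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
# 3 replicas: quorum=2, tolerated failures=1
# 5 replicas: quorum=3, tolerated failures=2
```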
Second question:
I saw it in a video tutorial somewhere, but I can’t find it now. It’s related to leader election. Can tick be understood as the number of heartbeats?
Third question:
I looked at the documentation carefully. According to the configuration file below, this description is incorrect. It should be that 2 TiKV servers are deployed on each rack, with one instance deployed on each server.
The following passage mentions the tick message and heartbeat. You can refer to it.
raft-base-tick-interval is the time interval at which the Raftstore drives the Raft state machine of each Region, meaning that a tick message needs to be sent to the Raft state machine at this interval. Increasing this interval can effectively reduce the number of messages from Raftstore.
If a Region Follower does not receive a heartbeat from the Leader within the raft-election-timeout interval, it will assume that the Leader has failed and initiate a new election. raft-heartbeat-interval is the interval at which the Leader sends heartbeats to the Followers. Therefore, increasing the raft-base-tick-interval can reduce the number of network messages sent by Raft within a unit of time, but it will also increase the time Raft takes to detect a Leader failure.
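The trade-off described here can be illustrated with a rough sketch. It assumes that raft-heartbeat-interval = raft-base-tick-interval * raft-heartbeat-ticks, and it uses assumed defaults of 2 heartbeat ticks and 10 election timeout ticks. The function name is hypothetical, and the figures are per Leader and per Follower, not measurements.

```python
def raft_timing(base_tick_interval_s, heartbeat_ticks=2, election_timeout_ticks=10):
    """Derive heartbeat rate and failure detection time from the base tick."""
    # Assumed relation: heartbeat interval = base tick * heartbeat ticks.
    heartbeat_interval_s = base_tick_interval_s * heartbeat_ticks
    # From the thread: election timeout = base tick * election timeout ticks.
    election_timeout_s = base_tick_interval_s * election_timeout_ticks
    return {
        "heartbeat_interval_s": heartbeat_interval_s,
        "heartbeats_per_minute": 60 / heartbeat_interval_s,
        "election_timeout_s": election_timeout_s,
    }


for base in (1.0, 2.0):
    print(base, raft_timing(base))
# Doubling the base tick from 1s to 2s halves the heartbeats per minute
# (30 -> 15) and doubles the time to detect a failed Leader (10s -> 20s).
```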