How to Understand Several Parameters in Dual-Region Multi-AZ Deployment?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 双区域多AZ部署中的几个参数该怎么理解?

| username: 逍遥_猫

Dear experts, I have a few questions about the technical documentation for dual-region multi-AZ deployment:

  1. It says a dual-region, three-AZ deployment requires 5 replicas. Why 5 replicas?
  2. It sets the parameters raftstore.raft-min-election-timeout-ticks: 1000 and
    raftstore.raft-max-election-timeout-ticks: 1200.
    How should these be understood? On what basis are these two values chosen?
    What does the term “tick” mean? Is it a node?
  3. The document mentions “In the rac1 rack of AZ1, one server deploys TiDB and PD services, and the other two servers deploy TiKV services. Each TiKV server deploys two TiKV instances (tikv-server), and rac2, rac4, rac5, and rac6 are similar.” What is the purpose of deploying two TiKV instances on each TiKV server?
| username: 逍遥_猫 | Original post link

Answering the second question about the two parameters:
If a Region Follower does not receive a heartbeat from the Leader within the raft-election-timeout interval, it determines that the Leader has failed and initiates a new election.
raft-election-timeout = raft-base-tick-interval * raft-election-timeout-ticks
raft-base-tick-interval defaults to 1s, so a tick is a unit of time, not a node.
When raft-min-election-timeout-ticks is 0, it takes the value of raft-election-timeout-ticks.
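
To make the arithmetic concrete, here is a minimal Python sketch of the formula above. The function name is hypothetical, and two values are assumptions rather than statements from this thread: raft-election-timeout-ticks is taken as 10, and a maximum of 0 is assumed to fall back to twice raft-election-timeout-ticks.

```python
def election_timeout_window(base_tick_interval_s, election_timeout_ticks,
                            min_ticks=0, max_ticks=0):
    """Return (min_seconds, max_seconds) for the election timeout window."""
    # As stated in the post: a minimum of 0 falls back to election_timeout_ticks.
    if min_ticks == 0:
        min_ticks = election_timeout_ticks
    # Assumption: a maximum of 0 falls back to 2 * election_timeout_ticks.
    if max_ticks == 0:
        max_ticks = 2 * election_timeout_ticks
    # timeout in seconds = tick interval in seconds * number of ticks
    return (base_tick_interval_s * min_ticks, base_tick_interval_s * max_ticks)


# Values from the question, with a 1s tick: the follower waits 1000s to 1200s.
print(election_timeout_window(1.0, 10, min_ticks=1000, max_ticks=1200))
# With both limits left at 0, the window comes from election_timeout_ticks alone.
print(election_timeout_window(1.0, 10))
```

Under these assumptions, a TiKV instance configured with 1000 and 1200 ticks would wait far longer than one with the fallback values before starting an election.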

For the third question, my guess is that it is to reduce the number of Regions on a single TiKV instance.

| username: 我是咖啡哥 | Original post link

I think this depends on the available resources. On a high-spec physical machine, deploying 2 instances makes fuller use of the hardware.

| username: 我是咖啡哥 | Original post link

First question:

The more replicas, the better the high availability. It is not mandatory to use 5 replicas; 3 replicas also work, but then the cluster can tolerate the failure of only one replica.
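
The replica counts above follow from Raft's majority rule: a Region stays available only while more than half of its replicas are alive. A small Python sketch (the function names are illustrative only):

```python
def quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can fail while a majority still survives."""
    return replicas - quorum(replicas)


for n in (3, 5):
    print(f"{n} replicas: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

So 3 replicas survive one failure and 5 replicas survive two.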

Second question:
I saw it in a video tutorial somewhere, but I can’t find it now. It’s related to leader election. Can tick be understood as the number of heartbeats?

Third question:
I read the documentation carefully. Judging by the configuration file in it, this description is incorrect: each rack has 2 TiKV servers, with one instance deployed on each server.


You can see from the configuration file that each instance has a different IP.

| username: 我是咖啡哥 | Original post link

The following passage mentions the tick message and the heartbeat. You can refer to it.

raft-base-tick-interval is the time interval at which the Raftstore drives the Raft state machine of each Region, meaning that a tick message needs to be sent to the Raft state machine at this interval. Increasing this interval can effectively reduce the number of messages from Raftstore.

If a Region Follower does not receive a heartbeat from the Leader within the raft-election-timeout interval, it will assume that the Leader has failed and initiate a new election. raft-heartbeat-interval is the interval at which the Leader sends heartbeats to the Followers. Therefore, increasing the raft-base-tick-interval can reduce the number of network messages sent by Raft within a unit of time, but it will also increase the time Raft takes to detect a Leader failure.
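
As a rough illustration of how ticks drive failure detection, here is a simplified Python model of a follower's election timer. It is not TiKV's implementation: the class and function names are hypothetical, the randomized timeout between the minimum and maximum tick counts is an assumption borrowed from general Raft practice, and the heartbeat every 2 ticks is also assumed.

```python
import random


class Follower:
    """Toy follower: counts ticks since the last heartbeat (illustrative only)."""

    def __init__(self, min_ticks, max_ticks, rng=None):
        self.rng = rng or random.Random()
        self.min_ticks = min_ticks
        self.max_ticks = max_ticks
        self.elapsed = 0
        self._reset_timeout()

    def _reset_timeout(self):
        # Assumption: the timeout is randomized in [min_ticks, max_ticks).
        self.timeout = self.rng.randrange(self.min_ticks, self.max_ticks)

    def on_heartbeat(self):
        # A heartbeat from the Leader restarts the countdown.
        self.elapsed = 0

    def tick(self):
        """Advance one tick; return True if the follower starts an election."""
        self.elapsed += 1
        if self.elapsed >= self.timeout:
            self.elapsed = 0
            self._reset_timeout()
            return True
        return False


def ticks_until_election(follower, leader_alive_ticks, heartbeat_ticks):
    """Leader heartbeats every heartbeat_ticks, then fails; count ticks to election."""
    for t in range(1, leader_alive_ticks + 1):
        if follower.tick():
            raise RuntimeError("election started while the Leader was alive")
        if t % heartbeat_ticks == 0:
            follower.on_heartbeat()
    waited = 0
    while True:
        waited += 1
        if follower.tick():
            return waited


waited = ticks_until_election(Follower(1000, 1200), leader_alive_ticks=100, heartbeat_ticks=2)
print(f"election started {waited} ticks after the Leader failed")
```

In this model the wait is counted in ticks, so a longer raft-base-tick-interval stretches both the heartbeat spacing and the time needed to notice a failed Leader, which matches the trade-off described above.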