Multi-Replica Single-Cluster Disaster Recovery Solution: How to Scale In and Out

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 基于多副本的单集群容灾方案—如何进行扩缩容

| username: Hacker_eZSjet7O

[TiDB Usage Environment] Production Environment
[TiDB Version] v7.5

For the multi-replica single-cluster disaster recovery solution, how do I scale the cluster in and out? The documentation does not cover this part.

| username: Jasper | Original post link

I don’t quite understand what you mean. TiDB defaults to 3 replicas. How is your environment and configuration different from the usual?

| username: Hacker_eZSjet7O | Original post link

In a multi-replica deployment the instances are placed differently, by AZ, rack, and other labels. How do you handle these labels when scaling in or out?

| username: Hacker_eZSjet7O | Original post link

Single Region Dual AZ Deployment for TiDB | PingCAP Documentation Center

| username: 有猫万事足 | Original post link

When scaling in or out, just make sure the labels are set correctly. There’s nothing else to worry about.

| username: Jasper | Original post link

Oh, I see what you mean now: you have labels configured. The scaling operation itself is the same; just make sure that after scaling, the topology keeps the number of TiKV instances in each zone/AZ/rack consistent. Otherwise it may lead to data imbalance.
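
For illustration, here is a minimal scale-out topology sketch. The host IP and label values are placeholders; the label keys must match the location-labels already configured in PD for the cluster.

```bash
# Minimal sketch of a scale-out topology entry for one new TiKV node.
# The host IP and label values are placeholders; keep the label keys
# (zone/rack/host) identical to the location-labels PD already uses.
cat > scale-out.yaml <<'EOF'
tikv_servers:
  - host: 10.0.1.15
    config:
      server.labels:
        zone: az2
        rack: r2
        host: tikv-az2-r2-01
EOF
```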

| username: dba远航 | Original post link

Check out video 303; it specifically covers the various scale-out and scale-in operations.

| username: tidb菜鸟一只 | Original post link

Labels are mainly for distinguishing different racks or data centers. If you want to scale out in a specific rack or data center, you can specify the corresponding label. When scaling in, however, you need to check whether enough nodes remain under that label in the rack or data center. For example, if a label has only one node left, and you have three labels corresponding to three replicas in total, then scaling in the only node under that label is definitely not feasible.
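
As a rough way to check this before scaling in (the PD endpoint is a placeholder, and the ctl version tag should match your cluster version), you can list each TiKV store's labels from pd-ctl and see how many stores remain under each label:

```bash
# List every TiKV store's address and labels from pd-ctl, then inspect the
# result to confirm enough stores remain under the affected label after
# the scale-in. PD endpoint and version tag are placeholders.
tiup ctl:v7.5.0 pd -u http://127.0.0.1:2379 store \
  | jq '[.stores[].store | {address, labels}]'
```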

| username: Jellybean | Original post link

The most important point is to clearly understand, before you act, how adding or removing the node will affect the cluster. If that evaluation is done well, there should be no major issues.

  • If it is scaling out, you should carefully set the label for the new node and prepare an accurate configuration file.
  • If it is scaling in, you must ensure that after scaling in, there are enough remaining nodes at the corresponding label level to accept the migrated data.

Then execute the scaling with tiup cluster scale-out or tiup cluster scale-in.
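
For reference, the commands look roughly like this; the cluster name, topology file, and node address are placeholders:

```bash
# Scale out with a prepared topology file (cluster name is a placeholder).
tiup cluster scale-out tidb-prod scale-out.yaml

# Scale in one TiKV node by address:port; PD migrates its regions to the
# remaining stores under the same label constraints before it is removed.
tiup cluster scale-in tidb-prod --node 10.0.1.15:20160
```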

| username: Billmay表妹 | Original post link

You can take a look at this solution

Disaster Recovery Solution Based on Multi-Replica Single Cluster

| username: Aaronz | Original post link

As long as you maintain the minimum requirements for Raft high availability, you should be able to follow the scale-in documentation, specifying the correct nodes and role information.
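
One way to sanity-check those Raft replication requirements before removing anything (PD endpoint and ctl version tag are placeholders) is to look at PD's replication settings, which show max-replicas and location-labels:

```bash
# Show PD's replication settings (max-replicas, location-labels, isolation-level)
# so you can confirm enough labeled stores remain to satisfy max-replicas
# after the scale-in. PD endpoint and version tag are placeholders.
tiup ctl:v7.5.0 pd -u http://127.0.0.1:2379 config show replication
```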

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.