Is it possible to scale in multiple TiKV nodes simultaneously when using scale-in?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 使用 scale-in 缩容 TiKV 时可以同时缩容好几个 TiKV 吗

| username: TiDBer_KkruFifg

We have a v5.1.x cluster with dozens of TiKV nodes. The business data volume is relatively small, so we need to scale in 10 TiKV nodes.
Is it possible to run the scale-in with several nodes in one command, like this:
tiup cluster scale-in xxxx_cluster -N,,…
Or is it recommended to scale down one by one as shown below:
tiup cluster scale-in xxxx_cluster -N
tiup cluster scale-in xxxx_cluster -N
tiup cluster scale-in xxxx_cluster -N

My understanding is that even if several TiKV nodes are specified in one scale-in command, tiup internally removes them one at a time, because otherwise the cluster would malfunction.
However, I am not sure whether specifying several TiKV nodes at once can cause unexpected issues.

A follow-up question: is it possible to scale out in batches? For example, can a cluster add 6 TiKV nodes at once?

| username: DBRE | Original post link

It is possible to scale down multiple TiKV nodes simultaneously, but it is not recommended. It is advisable to scale down one by one.

| username: DBAER | Original post link

Normally, it is safer to scale in sequentially, because you can check after each node whether the scale-in completed normally. Batch scaling increases the risk.
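The sequential approach can be sketched as a small helper that prints one scale-in command per node, each followed by a display step for the manual check. This is only an illustration: the cluster name and node addresses are hypothetical placeholders, and the `Tombstone` status name is what `tiup cluster display` is expected to show for a fully offlined TiKV node.

```shell
#!/usr/bin/env bash
# Print one scale-in command per node so each can be reviewed and run on its own.
# The cluster name and node addresses below are hypothetical placeholders.
print_scale_in_plan() {
  local cluster="$1"
  shift
  local node
  for node in "$@"; do
    printf 'tiup cluster scale-in %s -N %s\n' "$cluster" "$node"
    # Manual checkpoint: inspect the node status before moving to the next one.
    printf 'tiup cluster display %s   # wait until %s shows Tombstone\n' "$cluster" "$node"
  done
}

print_scale_in_plan xxxx_cluster 10.0.1.11:20160 10.0.1.12:20160
```

Nothing is executed against the cluster here; the output is a plan to copy and run step by step.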

| username: Hacker_PtIIxHC1 | Original post link

It is possible but not recommended. Concurrent scaling down will not be faster than sequential scaling down because scheduling will be more frequent, which may also affect TiDB’s response time.

| username: dba远航 | Original post link

Batch scaling in carries significant risks, primarily because the I/O may not be able to keep up.

| username: zhanggame1 | Original post link

This is not recommended; the risk is too high.

| username: Kamner | Original post link

It’s better to be cautious in the production environment, but you can do whatever you want in the testing environment.

| username: TiDBer_KkruFifg | Original post link

May I ask if it is possible to scale out in batches? For example, can a cluster be expanded by 10 TiKV nodes at once? Thanks.

| username: 有猫万事足 | Original post link

Scaling out in batches is no problem.

Scaling in requires re-creating the replicas that lived on the removed nodes, which needs write I/O. That write I/O pressure falls on the remaining machines that are not being scaled in.

When scaling out, however, a replica added on a new node first joins the existing Raft group as a learner, so the new write I/O does not fall on the original machines. Once the learner has caught up on data, it is promoted to a regular voting replica and the surplus replica is deleted. Leaders are then transferred as needed, which completes the rebalancing.
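To put a rough number on the scale-in pressure described above, here is a back-of-the-envelope sketch. It assumes data is spread evenly across stores both before and after the scale-in, which is a simplification, and the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Percentage of extra data each remaining store has to absorb after a scale-in,
# assuming data is evenly distributed across stores (a simplifying assumption).
extra_load_percent() {
  local total="$1" removed="$2"
  local remaining=$(( total - removed ))
  if [ "$remaining" -le 0 ]; then
    echo "error: at least one store must remain" >&2
    return 1
  fi
  # Integer arithmetic: removed / remaining, expressed as a percentage.
  echo $(( removed * 100 / remaining ))
}

extra_load_percent 30 10   # 30 stores, 10 removed: prints 50
```

Under that assumption, going from 30 to 20 stores means each remaining store ends up holding about 50% more data, all of which arrives as additional write I/O during the migration.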

| username: Kongdom | Original post link

It is also supported; you can expand multiple at once.
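`tiup cluster scale-out` takes a topology file, so adding several TiKV nodes at once amounts to listing several hosts under `tikv_servers`. A minimal sketch that generates such a file follows; the host addresses are hypothetical, and ports and directories are left to their defaults.

```shell
#!/usr/bin/env bash
# Emit a scale-out topology that lists several TiKV hosts in a single file.
# Host addresses are hypothetical; ports and directories fall back to defaults.
make_scale_out_topology() {
  local host
  echo "tikv_servers:"
  for host in "$@"; do
    printf '  - host: %s\n' "$host"
  done
}

make_scale_out_topology 10.0.1.21 10.0.1.22 10.0.1.23
# Save the output to a file such as scale-out.yaml, review it, then run:
#   tiup cluster scale-out xxxx_cluster scale-out.yaml
```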

| username: 呢莫不爱吃鱼 | Original post link

It is possible, but not recommended.

| username: 友利奈绪 | Original post link

No problem.

| username: TiDBer_21wZg5fm | Original post link

The risk of batch expansion is a bit high; it’s better to expand gradually and observe for any anomalies.

| username: kelvin | Original post link

I suggest tackling them one by one.

| username: RyanHowe | Original post link

For cloud TiDB, scaling down TiKV is done automatically one by one. During the process of scaling down a store, the leaders and regions within the store will be balanced to other TiKVs. Once the store’s status changes to tombstone, another TiKV will be automatically scaled down.
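The same "wait for tombstone, then continue" check can be approximated by hand on a tiup-managed cluster. The sketch below assumes the node ID (host:port) is the first column of the `tiup cluster display` output and that the status text contains `Tombstone`. The sample rows are made up and trimmed to a few columns.

```shell
#!/usr/bin/env bash
# Succeeds if the row for the given node ID (read from stdin) mentions Tombstone.
# Assumes the node ID (host:port) is the first whitespace-separated column.
node_is_tombstone() {
  awk -v id="$1" '$1 == id && /Tombstone/ { found = 1 } END { exit !found }'
}

# Hypothetical sample rows, trimmed to a few columns for illustration.
sample='10.0.1.11:20160  tikv  10.0.1.11  20160/20180  Tombstone
10.0.1.12:20160  tikv  10.0.1.12  20160/20180  Pending Offline'

if printf '%s\n' "$sample" | node_is_tombstone 10.0.1.11:20160; then
  echo "10.0.1.11:20160 is Tombstone; ready for the next scale-in"
fi
```

In real use the input would come from `tiup cluster display xxxx_cluster` instead of the sample text.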

| username: xiaoqiao | Original post link

It’s better to take it one step at a time.

| username: 饭光小团 | Original post link

It is not recommended to scale down multiple instances at once, as it can lead to multiple regions being scheduled simultaneously, which can affect performance.

| username: 舞动梦灵 | Original post link

It can be done, but it may affect performance. I am also planning to scale in from 30 TiKV nodes to 20, and I intend to do it one node at a time. If I remove 10 at once, I am afraid it will affect the business. Moreover, if it does have an impact, I am not sure whether interrupting the scale-in process would cause further issues.
