Can a table still be accessed while TiFlash replicas are being adjusted (from 3 replicas to 2)?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiflash调整副本(由3副本调整为2副本)时候表能不能访问?

| username: jingyesi3401

[TiDB Usage Environment] Production environment, testing

[TiDB Version] v5.1.0

[Encountered Problem: Problem Phenomenon and Impact] The production TiFlash cluster currently has four nodes, each with 158 GB of memory, 48 cores, and 3 to 3.5 TB of storage. Tables are currently set to three replicas. As the table data volume grows, the available disk space keeps shrinking, so we are preparing to reduce TiFlash from 3 replicas to 2. Can the tables be accessed normally during the adjustment?

| username: xfworld

You can refer to the assistant’s reply, which is very detailed.

When you need to adjust the number of TiFlash replicas, execute the ALTER TABLE ... SET TIFLASH REPLICA statement in TiDB. It runs as a DDL statement: TiDB converts it into a series of DDL operations and sends them to the TiFlash instance. The TiFlash instance periodically starts a subprocess to handle adding or deleting TiFlash replicas, so it is normal to occasionally see a non-resident process named tiflash_cluster_manager (referred to as “pd buddy” on the official website) in the process list. Its logs are written to tiflash_cluster_manager.log. See [1] for what happens in the cluster at each stage of adding TiFlash replicas.
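As a minimal sketch of the statement mentioned in this reply, the Python helper below builds the DDL text for one table. The function name `build_set_tiflash_replica` and the table `test.orders` are hypothetical and chosen only for illustration; the helper only produces a string and does not connect to a cluster.

```python
def build_set_tiflash_replica(schema: str, table: str, count: int) -> str:
    """Return the DDL text that sets the TiFlash replica count for one table."""
    if count < 0:
        raise ValueError("replica count must be non-negative")

    def quote(ident: str) -> str:
        # Backtick-quote identifiers so reserved words do not break the DDL.
        return "`" + ident.replace("`", "``") + "`"

    return f"ALTER TABLE {quote(schema)}.{quote(table)} SET TIFLASH REPLICA {count};"


if __name__ == "__main__":
    # Hypothetical table name; going from 3 replicas to 2 means passing 2 here.
    print(build_set_tiflash_replica("test", "orders", 2))
```

The generated statement would then be run through any MySQL-compatible client connected to TiDB.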

Note that the command to build TiFlash replicas by database actually performs a series of DDL operations on the user's behalf, which is relatively resource-intensive. If execution is interrupted, the operations that have already succeeded are not rolled back, and the operations that have not yet run are not continued. Watch the cluster's resource usage when performing this operation to avoid problems. See [2] and [3] for details.

When you add replicas, TiFlash automatically synchronizes the existing data to the new replicas without cleaning the original data. When you reduce replicas, TiFlash synchronizes the existing data to other replicas and then deletes the replica, also without cleaning the original data. See [1] and [2] for details.

Note that synchronizing TiFlash replicas takes time, and how long depends on the data size and the network bandwidth. TiFlash replicas may be unavailable while synchronization is in progress, so pay attention to cluster availability when adjusting replicas.
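One way to watch replica availability during the adjustment is to poll TiDB's `information_schema.tiflash_replica` table, whose `AVAILABLE` and `PROGRESS` columns report the replica state per table. The sketch below is an assumption about how that check could be wrapped: the helper `pending_tables`, the shape of `rows` (a list of dicts from whatever MySQL-compatible client is in use), and the sample rows are all hypothetical.

```python
# Query against TiDB's information_schema.tiflash_replica system table.
REPLICA_STATUS_SQL = (
    "SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS "
    "FROM information_schema.tiflash_replica;"
)


def pending_tables(rows):
    """Return 'schema.table' names whose TiFlash replicas are not fully available.

    `rows` is assumed to be a list of dicts keyed by the selected column names.
    """
    pending = []
    for row in rows:
        # AVAILABLE is 1 once the replicas are usable; PROGRESS ranges from 0 to 1.
        if int(row["AVAILABLE"]) != 1 or float(row["PROGRESS"]) < 1.0:
            pending.append(f'{row["TABLE_SCHEMA"]}.{row["TABLE_NAME"]}')
    return pending


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"TABLE_SCHEMA": "test", "TABLE_NAME": "orders",
         "REPLICA_COUNT": 2, "AVAILABLE": 1, "PROGRESS": 1.0},
        {"TABLE_SCHEMA": "test", "TABLE_NAME": "events",
         "REPLICA_COUNT": 2, "AVAILABLE": 0, "PROGRESS": 0.4},
    ]
    print(REPLICA_STATUS_SQL)
    print(pending_tables(sample))
```

An empty result from `pending_tables` would suggest that every listed table reports its replicas as available.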

You can refer to this content:

| username: jingyesi3401

Also, on version 5.1.0, will disk space be freed when I adjust from 3 replicas to 2 replicas?

| username: xfworld

I recommend using version 6.1.x, as it has more bug fixes…

Disk space will be released asynchronously by TiFlash’s scheduling…

Two replicas of TiFlash are sufficient… Having more doesn’t really help…
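For a rough sense of how much space the change from 3 replicas to 2 might eventually release, the arithmetic can be sketched as follows. This assumes every TiFlash table uses the same replica count and that each replica takes about the same space; the function name `tiflash_space_after_change` and the 9 TB usage figure are hypothetical, and real numbers depend on compression and on how quickly TiFlash reclaims space.

```python
def tiflash_space_after_change(used_tb, old_replicas, new_replicas):
    """Estimate TiFlash disk usage and freed space after a replica-count change.

    Assumes all replicas are the same size and all tables share one replica count.
    """
    if old_replicas <= 0 or new_replicas < 0:
        raise ValueError("old_replicas must be positive, new_replicas non-negative")
    per_replica_tb = used_tb / old_replicas      # size of a single copy of the data
    new_used_tb = per_replica_tb * new_replicas  # usage once the change settles
    return new_used_tb, used_tb - new_used_tb


if __name__ == "__main__":
    # Hypothetical: 9 TB currently used across the TiFlash nodes with 3 replicas.
    new_used, freed = tiflash_space_after_change(9.0, 3, 2)
    print(f"estimated usage: {new_used} TB, estimated freed: {freed} TB")
```

Under these assumptions, dropping one of three replicas frees about one third of the space TiFlash currently uses.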

| username: redgame

This needs to be tested in practice, as results can differ significantly between environments. Run performance and stress tests in a test environment, not in production, to evaluate how adjusting the number of replicas affects cluster performance.
