In TiDB v5.4.0 and later versions, why do backup tasks slow down when they run on a cluster under a heavy workload?

Starting from TiDB v5.4.0, BR introduces the auto-tune feature for backup tasks. For clusters in v5.4.0 or later versions, this feature is enabled by default. When the cluster workload is heavy, the feature limits the resources used by backup tasks to reduce the impact on the online cluster. For more information, refer to Backup Auto-Tune.

TiKV supports dynamically configuring the auto-tune feature. You can enable or disable the feature using the following methods without restarting your cluster:

  • Disable auto-tune: Set the TiKV configuration item backup.enable-auto-tune to false.
  • Enable auto-tune: Set the TiKV configuration item backup.enable-auto-tune to true. For clusters upgraded from v5.3.x to v5.4.0 or later versions, the auto-tune feature is disabled by default, so you need to enable it manually.

To use tikv-ctl to enable or disable auto-tune, refer to Use auto-tune.

In addition, auto-tune reduces the default number of threads used by backup tasks. For details, see backup.num-threads in the TiKV configuration file documentation. As a result, the backup speed, CPU usage, and I/O resource utilization shown on the Grafana Dashboard are lower than in versions earlier than v5.4.0. Before v5.4.0, the default value of backup.num-threads was CPU * 0.75, that is, backup tasks used threads equal to 75% of the logical CPU cores, with a maximum value of 32. Starting from v5.4.0, the default value of this configuration item is CPU * 0.5, and its maximum value is 8.
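
To make the difference concrete, the following Python sketch computes the default thread count under both rules. It is only an illustration of the arithmetic described above: the function name default_backup_threads is hypothetical, and rounding down with a floor of one thread is an assumption, not something this page states.

```python
def default_backup_threads(logical_cpus: int, v5_4_or_later: bool = True) -> int:
    """Estimate the default backup.num-threads for a TiKV node.

    Assumptions (not stated on this page): the product is rounded down,
    and the result is never lower than 1.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    # v5.4.0 and later: CPU * 0.5, capped at 8.
    # Earlier versions: CPU * 0.75, capped at 32.
    factor, cap = (0.5, 8) if v5_4_or_later else (0.75, 32)
    return max(1, min(int(logical_cpus * factor), cap))


if __name__ == "__main__":
    for cpus in (4, 16, 64):
        before = default_backup_threads(cpus, v5_4_or_later=False)
        after = default_backup_threads(cpus, v5_4_or_later=True)
        print(f"{cpus} logical cores: {before} threads before v5.4.0, {after} from v5.4.0")
```

Under these assumptions, a node with 16 logical cores drops from 12 default backup threads to 8, and a node with 64 cores drops from 32 to 8, which is consistent with the lower resource usage seen on the dashboard.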

To speed up backup tasks on an offline cluster, you can use tikv-ctl to set backup.num-threads to a larger value.
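
As a rough sketch of what such a change might look like, the following Python helper assembles a tikv-ctl command line without running it. It assumes the `modify-tikv-config -n <name> -v <value>` form of tikv-ctl and uses a hypothetical TiKV address (127.0.0.1:20160); check the tikv-ctl documentation for your version, including any TLS options your cluster requires, before using it.

```python
import shlex
from typing import List, Union


def tikv_ctl_modify_args(host: str, name: str, value: Union[bool, int, str]) -> List[str]:
    """Build (but do not run) a tikv-ctl command that changes one config item.

    The subcommand and flags are assumed from the tikv-ctl documentation;
    verify them against the version you have installed.
    """
    if isinstance(value, bool):
        # TiKV configuration uses lowercase booleans.
        value = "true" if value else "false"
    return ["tikv-ctl", "--host", host, "modify-tikv-config", "-n", name, "-v", str(value)]


if __name__ == "__main__":
    host = "127.0.0.1:20160"  # hypothetical TiKV address
    # Raise the backup thread count on an offline cluster.
    print(shlex.join(tikv_ctl_modify_args(host, "backup.num-threads", 16)))
    # Turn auto-tune off on the same node.
    print(shlex.join(tikv_ctl_modify_args(host, "backup.enable-auto-tune", False)))
```

The helper only prints the commands so that they can be reviewed first. Because tikv-ctl connects to one TiKV instance at a time, the command would need to be repeated for each TiKV node.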