How to Optimize the Speed of TiKV Pushing CDC Data?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 如何优化 TiKV 推送 CDC数据的速度?

| username: 迷人的Ti

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.4.3
I am subscribing to CDC events from TiKV over gRPC. The cluster has 2400+ Regions and the CDC change volume is 5 million+. It takes a total of 20 minutes from subscribing until every Region has sent its first heartbeat.

Is there any way to shorten this time, for example by adjusting gRPC parameters or TiKV CDC parameters?
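
For reference, the numbers above can be turned into a rough back-of-envelope estimate. This is only a sketch: it rounds the Region count to 2400, and it assumes Region initialization is bounded by a fixed number of parallel incremental scans. The `concurrency=6` value and both function names are illustrative assumptions, not measurements from this cluster.

```python
def implied_region_rate(regions: int, total_seconds: float) -> float:
    """Average number of Regions that finished initializing per second."""
    return regions / total_seconds


def implied_scan_seconds(regions: int, total_seconds: float, concurrency: int) -> float:
    """Average time per Region scan, ASSUMING `concurrency` scans ran in
    parallel for the whole window (a simplifying assumption)."""
    return total_seconds * concurrency / regions


regions = 2400            # "2400+" in the post, rounded down
total_seconds = 20 * 60   # 20 minutes from subscription to last first-heartbeat

rate = implied_region_rate(regions, total_seconds)
per_region = implied_scan_seconds(regions, total_seconds, concurrency=6)  # 6 is illustrative

print(f"{rate:.1f} Regions/s overall, ~{per_region:.1f} s per Region at concurrency 6")
```

If the measured per-Region time is far from an estimate like this, the bottleneck is probably somewhere other than scan concurrency, which would be consistent with raising the scan parameters having no effect.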

| username: 迷人的Ti | Original post link

I have already tried increasing three parameters, cdc.incremental-scan-concurrency, cdc.incremental-scan-speed-limit, and cdc.incremental-scan-threads, but saw no improvement.

| username: Billmay表妹 | Original post link

You can try the following methods for optimization:

  1. Adjust TiCDC parameters: You have already tried adjusting cdc.incremental-scan-concurrency, cdc.incremental-scan-speed-limit, and cdc.incremental-scan-threads without seeing any improvement. Besides these, there are other parameters you can try, such as cdc.region-concurrency and cdc.region-split-check-diff. Adjust them one at a time based on your actual situation to find the best configuration.
  2. Adjust TiKV parameters: TiCDC performance is also affected by TiKV, so you can try adjusting TiKV parameters to improve performance. For example, you can adjust parameters such as raftstore.apply-pool-size to increase TiKV’s concurrent processing capability.
  3. Increase TiCDC instances: If your TiCDC instance’s resources (CPU, memory, network bandwidth, etc.) have reached their limits, consider increasing the number of TiCDC instances to improve concurrent processing capability. You can deploy multiple TiCDC instances on different machines and use multiple TiCDC instances to subscribe to different Regions to distribute the subscription pressure.
  4. Adjust network configuration: Network latency may also affect TiCDC subscription latency. You can try optimizing network configuration, such as adjusting network bandwidth and reducing network congestion, to improve TiCDC subscription performance.
  5. Use TiDB Binlog: If your business scenario allows, consider using TiDB Binlog instead of TiCDC. TiDB Binlog is a log-based incremental data subscription method that can provide lower latency and higher throughput compared to TiCDC. You can evaluate whether using TiDB Binlog can optimize subscription latency based on actual needs.

| username: Jellybean | Original post link

Typically, TiKV pushing data is not the performance bottleneck. The bottleneck is more likely to be in TiCDC's internal processing or in synchronization to the downstream.
You can refer to my scenario optimization and give it a try.

| username: dba远航 | Original post link

CDC involves both the source and the destination. Both need to keep up in terms of performance, and so does the network between them.

| username: 像风一样的男子 | Original post link

The CDC cluster should be deployed according to the officially recommended configuration.

| username: kkpeter | Original post link

Huh? Didn’t they say that CDC performance is better? Are they planning to remove the TiDB Binlog component later?