6.5.1 TiCDC Synchronization Latency Optimization

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 6.5.1 ticdc 同步延迟优化

| username: Vincent_Wang

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.1
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Problem Phenomenon and Impact]
TiCDC replication delay is more than 1 second. Previously, when testing with version 6.5, the delay was mostly under one second. Now the delay has increased, and it appears to be caused by lag in the changefeed resolved ts, which is also more than 1 second behind. Most of the time is spent in changefeed resolved ts.
[Resource Configuration]
TiCDC servers: 24 (scaled out from 12), but the delay is still more than 1 second.
changefeed worker-num: 48 (raised from 24), but the delay is still more than 1 second.

-- changefeed configuration

# Replicate tables without indexes
force-replicate = true

[filter]
# Ignore transactions with the specified start_ts
ignore-txn-start-ts = [1, 2]

# Filter rules
# Filter rule syntax: Table Filter | PingCAP Docs
rules = ['xxxxx.*']

[mounter]
# Number of mounter threads, used to decode data output from TiKV
worker-num = 48

[sink]

-- changefeed creation statement
tiup cdc cli changefeed create --pd=http://192.168.xx.xx:2379 --sink-uri="mysql://xx:xx@192.168.xx.xx:4000/?worker-count=32&max-txn-row=5000&transaction-atomicity=none" --changefeed-id="mysql-xx" --config=/opt/soft/scale/changefeed_xx.toml

[Attachment: Screenshot/Log/Monitoring]

| username: Meditator

  1. Is there a large table with high TPS in the upstream TiDB?
  2. TiCDC schedules work to processors at table granularity. If one table has very high TPS, it is still handled as a single unit, and since all TiCDC servers are identical, adding more TiCDC servers cannot solve the issue.
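
The point about table-granularity scheduling can be illustrated with a toy model. This is only a sketch under stated assumptions, not TiCDC's actual scheduler: the function `max_server_load`, the greedy placement, and the TPS figures are all hypothetical. It shows that when a table cannot be split across servers, the busiest server is never lighter than the hottest table, however many servers are added.

```python
import heapq
from typing import List


def max_server_load(table_tps: List[int], num_servers: int) -> int:
    """Place whole tables on the least-loaded server (greedy, largest first)
    and return the load of the busiest server.

    Toy model only: a table is never split across servers.
    """
    loads = [0] * num_servers
    heapq.heapify(loads)
    for tps in sorted(table_tps, reverse=True):
        least = heapq.heappop(loads)          # current least-loaded server
        heapq.heappush(loads, least + tps)    # assign the whole table to it
    return max(loads)


if __name__ == "__main__":
    # Hypothetical workload: one hot table plus 40 small tables.
    tables = [50_000] + [500] * 40
    for n in (12, 24):
        print(n, max_server_load(tables, n))
```

In this model, doubling the server count from 12 to 24 leaves the busiest server at 50,000 TPS, which matches the observation in the original post that scaling out did not reduce the delay.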

| username: Vincent_Wang

Yes, there are large tables with high TPS. How should this be handled? The sink parameter transaction-atomicity=none means that single-table transactions are split. This parameter is already set, so how can replication be made faster?

| username: sdojjy

TiKV has a configuration item, min_ts_interval, that adjusts how frequently resolved ts advances. Note that changing it may affect CDC performance when there are a large number of Regions.

set config tikv `cdc.min_ts_interval`="200ms"

Additionally, you can check the “Slow Table” item in Grafana to see which module is introducing significant delays.
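
To put a number on the resolved ts lag discussed in this thread, a TSO value (such as a resolved ts or checkpoint ts reported for a changefeed) can be converted to wall-clock time. The sketch below relies on the TSO layout used by TiDB, where the low 18 bits are a logical counter and the higher bits are the physical time in Unix milliseconds. The helper names and the sample value are hypothetical and for illustration only.

```python
import time
from typing import Optional

# Low 18 bits of a TSO are the logical counter; the rest is physical time (ms).
LOGICAL_BITS = 18


def tso_to_unix_ms(tso: int) -> int:
    """Return the physical part of a TSO as Unix milliseconds."""
    return tso >> LOGICAL_BITS


def lag_seconds(tso: int, now_ms: Optional[int] = None) -> float:
    """Seconds between the TSO's physical time and now_ms (default: local clock)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - tso_to_unix_ms(tso)) / 1000.0


if __name__ == "__main__":
    # Hypothetical resolved ts, constructed to be 1.5 s behind a fixed "now".
    now_ms = 1_700_000_000_000
    resolved_ts = ((now_ms - 1500) << LOGICAL_BITS) | 7
    print(lag_seconds(resolved_ts, now_ms))
```

When `now_ms` is omitted, the result depends on the local clock, so clock skew between the machine running the script and PD shows up as apparent lag.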