Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: [Q&A Review] TiCDC Source Code Reading #4 | TiCDC Scheduler Working Principle Analysis

This article is the fourth installment of the TiCDC source code reading series: an analysis of how the TiCDC Scheduler works. It collects the live Q&A, the video replay, and the shared slides for download.
- Video replay: TiCDC Source Code Reading #4: TiCDC Scheduler Working Principle Analysis (Bilibili)
- Slides: TiCDC Source Code Reading #4 TiCDC Scheduler Working Principle Analysis.pdf (3.3 MB)
- Written version: Column - TiCDC Source Code Reading (4): TiCDC Scheduler Working Principle Analysis | TiDB Community
- Full series resources: [Resource Summary] The Most Complete Resource Collection for the TiCDC Source Code Reading Series
Q&A Review of This Issue
Below is the Q&A review for this session, "TiCDC Scheduler Working Principle Analysis":
Q: Can TiCDC be upgraded across major versions? For example, from 5.4 to 6.4, or from 5.4.0 to 5.4.3?
A: Yes, it can be upgraded across major versions, but rolling upgrades are not supported.
Q: Upgrading TiCDC before TiDB v6.3 seems very slow.
A: The upgrade itself completes quickly: the node is simply taken offline and then brought back online. What takes time is restarting the replication tasks afterward. Under heavy workloads, replication lag increases noticeably after the node upgrade completes.
Q: What is the difference between TiCDC and DM?
A: TiCDC replicates incremental data changes from an upstream TiDB cluster to a downstream target; here TiDB is the data source. DM uses MySQL as the data source, merges sharded databases and tables, and writes the result to a downstream TiDB; here TiDB is the data destination.
Q: If the upstream TiCDC and TiDB cluster crash completely and become inaccessible, such as in a data center disaster, how can the synchronization point be determined on the downstream side?
A: There are several solutions:
- If the downstream is MySQL / TiDB, you can use the sync point feature to obtain the last synchronized checkpoint. If the downstream is Kafka, you can check the CommitTs of the last message written to the topic, which corresponds to the checkpoint.
- You can also consider using the Redo Log feature provided by TiCDC to restore data to the downstream in case of a disaster.
Q: What learning materials are recommended for this session?
A: The Scheduler module has very high unit test coverage, so running the unit tests is a good way to understand the internal implementation details. See all files ending with _test.go under tiflow/cdc/scheduler/internal/v3 at v6.4.0 · pingcap/tiflow · GitHub.
Announcement of Live Interaction Winners
Congratulations to the following users who participated in the interaction and won prizes! Each winner receives 100 TiDB community points.
To redeem the points, the users below should add the WeChat account Oneandtwii before January 15, 2023, and send their video account nickname together with their TiDB community nickname.
| No. | Video Account User |
| --- | --- |
| 1 | Magic Wings |
| 2 | Potato |
If you have more to discuss about this issue, feel free to leave a comment below.