Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TICDC的同步任务可以暂停多长时间? (How long can a TiCDC synchronization task be paused?)
After pausing a synchronization task through the OpenAPI:
POST /api/v2/changefeeds/{changefeed_id}/pause
How long can this synchronization task be paused? For example, if the database continues to have data changes, will it fill up the cache or disk?
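For reference, here is a minimal Python sketch of the pause call from the question. The server address `http://127.0.0.1:8300`, the empty JSON body, and the helper names `build_pause_request` and `pause_changefeed` are assumptions for illustration; only the path `/api/v2/changefeeds/{changefeed_id}/pause` comes from the post.

```python
import urllib.parse
import urllib.request

# Assumption: a TiCDC server is reachable at this address; adjust to your deployment.
TICDC_ADDR = "http://127.0.0.1:8300"


def build_pause_request(changefeed_id: str, base_url: str = TICDC_ADDR) -> urllib.request.Request:
    """Build (but do not send) the POST request that pauses one changefeed."""
    # Percent-encode the ID so unusual characters cannot break the path.
    safe_id = urllib.parse.quote(changefeed_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v2/changefeeds/{safe_id}/pause"
    # Assumption: an empty JSON object is sent as the request body.
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def pause_changefeed(changefeed_id: str, base_url: str = TICDC_ADDR) -> int:
    """Send the pause request and return the HTTP status code."""
    request = build_pause_request(changefeed_id, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `pause_changefeed("my-feed")` would send the request; `build_pause_request` is kept separate so the URL can be inspected without touching the cluster.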
This depends on how long you can tolerate data inconsistency in your business. Additionally, it depends on whether resources are sufficient.
As long as garbage collection (GC) has not run, the task can stay paused for as long as you want; it only occupies disk space.
That’s right. If it is paused for too long and GC occurs, the only option is to load a fresh full copy of the data into the downstream offline so that upstream and downstream are consistent, and then create a new changefeed task.
Based on how TiCDC works, it monitors the Raft log of each Region in the upstream TiKV in real time and, from the differences before and after each transaction, generates the corresponding data changes as multiple SQL statements. So the limit depends on how long the Region Raft log information can be retained.
It is determined by tidb_gc_life_time (default 10m0s) and tidb_gc_max_wait_time (default 86400 seconds, that is, 24 hours).
Do not pause for longer than the gc-ttl configured in TiCDC.
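As a rough illustration of the gc-ttl advice above, the following Python sketch computes how much pause time is left. The value 86400 seconds, the function names, and the choice to start the clock at a caller-supplied `paused_at` time are all assumptions for the example; check the gc-ttl your own TiCDC servers were started with.

```python
from datetime import datetime, timedelta, timezone

# Assumption: gc-ttl expressed in seconds; 86400 is only an illustrative value.
GC_TTL_SECONDS = 86400


def pause_deadline(paused_at: datetime, gc_ttl_seconds: int = GC_TTL_SECONDS) -> datetime:
    """Latest time the task should be resumed, counted from paused_at."""
    return paused_at + timedelta(seconds=gc_ttl_seconds)


def remaining_pause_budget(
    paused_at: datetime, now: datetime, gc_ttl_seconds: int = GC_TTL_SECONDS
) -> timedelta:
    """Time left before the pause exceeds gc-ttl; negative once it has been exceeded."""
    return pause_deadline(paused_at, gc_ttl_seconds) - now


if __name__ == "__main__":
    paused_at = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)  # hypothetical pause time
    left = remaining_pause_budget(paused_at, datetime.now(timezone.utc))
    print("resume before:", pause_deadline(paused_at).isoformat())
    print("budget exceeded" if left < timedelta(0) else f"time left: {left}")
```

A monitoring script could run this periodically and raise an alert once the remaining budget drops below some margin.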
Limit memory usage: When calling the interface to pause synchronization tasks, you can specify a parameter to limit memory usage. When memory usage reaches this limit, the synchronization task will automatically stop to avoid memory overflow.
Regularly check memory and disk usage: During the pause of synchronization tasks, regularly check memory and disk usage to ensure they do not exceed limits. If memory or disk usage is found to be approaching the limit, measures can be taken, such as clearing memory or increasing disk space.
Set a timeout for synchronization tasks: When calling the interface to pause synchronization tasks, you can specify a timeout. When the pause time of the synchronization task exceeds this timeout, the synchronization task will automatically resume. This way, even if memory and disk space are sufficient, you can avoid the synchronization task being paused for too long.
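The regular disk check suggested above could be sketched in Python as follows. This is only an illustration using the standard library: the 0.8 threshold, the function names, and the path to check are assumptions, and the path should point at whatever data directory you actually want to watch.

```python
import shutil


def usage_ratio(used: int, total: int) -> float:
    """Fraction of capacity in use, between 0.0 and 1.0."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used / total


def disk_near_limit(path: str, threshold: float = 0.8) -> bool:
    """Return True when the filesystem holding path is at or above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return usage_ratio(usage.used, usage.total) >= threshold


if __name__ == "__main__":
    # Assumption: "." stands in for the data directory being monitored.
    if disk_near_limit(".", threshold=0.8):
        print("disk usage is approaching the limit; free space or resume the task")
    else:
        print("disk usage is within the limit")
```

Running it from cron or a similar scheduler during the pause gives an early warning before the disk fills up.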