Using CDC for data synchronization: what to do if the start ts that must be specified is already earlier than the GC safe point?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 使用cdc做数据同步,担心需要指定的start ts已经早于gc safe point,怎么处理

| username: TiDBer_20QjYTLl

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.5.0
When using CDC for data synchronization, if something unexpected happens, such as downtime or a changefeed task failure, and resynchronization is needed, I am concerned that the start ts I have to specify is already earlier than the GC safe point. How should this be handled?
Can tikv_gc_life_time be adjusted?
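
For context on the comparison being asked about: a TiDB timestamp (TSO) stores a physical time in milliseconds in its high bits and an 18-bit logical counter in its low bits, so a start ts can be decoded into wall-clock time and checked against the GC safe point. The following is a minimal Python sketch of that check, not an official tool; the function names and the sample values are hypothetical.

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # the low 18 bits of a TSO hold a logical counter


def tso_to_datetime(tso: int) -> datetime:
    """Decode the physical (millisecond) part of a TSO into UTC time."""
    physical_ms = tso >> LOGICAL_BITS
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)


def datetime_to_tso(moment: datetime) -> int:
    """Build a TSO with logical part 0 from a timezone-aware datetime."""
    physical_ms = int(moment.timestamp() * 1000)
    return physical_ms << LOGICAL_BITS


def is_earlier_than_safe_point(start_ts: int, gc_safe_point: int) -> bool:
    """True if start_ts falls before the GC safe point (both as TSO values)."""
    return start_ts < gc_safe_point


# Hypothetical values: a safe point at 12:00 UTC and a start ts at 11:50 UTC.
safe_point = datetime_to_tso(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
start_ts = datetime_to_tso(datetime(2024, 1, 1, 11, 50, tzinfo=timezone.utc))
too_old = is_earlier_than_safe_point(start_ts, safe_point)
```

Decoding both values with tso_to_datetime also shows how far apart they are in wall-clock time, which is easier to reason about than the raw integers.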

| username: WalterWj | Original post link

If a CDC task fails, it blocks GC for 24 hours by default.

| username: TiDBer_20QjYTLl | Original post link

I want to know if there is a way to adjust the frequency of GC. Currently, the database parameter tikv_gc_life_time is 10 minutes, and tikv_gc_safe_point changes every 10 minutes. If the frequency can be lowered, will it affect the overall data storage and memory?

| username: zhanggame1 | Original post link

Extend tikv_gc_life_time. For example, we currently use 24 hours, so there are no issues as long as you resynchronize within 24 hours.

Note that if you extend tikv_gc_life_time and the workload performs a lot of updates and deletes, old versions of the data will pile up in TiKV, which affects query performance.

| username: TiDBer_20QjYTLl | Original post link

Where can I adjust tikv_gc_life_time? Can I directly modify the mysql.tidb table?

| username: WalterWj | Original post link

System Variables | PingCAP Documentation Center. In newer versions this is a system variable, so you can change it directly. You can search for it on the official website.

| username: zhanggame1 | Original post link

The answer above is misleading. Log in to the database as the root user and execute:
SET GLOBAL tidb_gc_life_time = '24h';

| username: TiDBer_5cwU0ltE | Original post link

I think we can do some preparation work to ensure that no unexpected situations occur and there is no downtime…

| username: dba远航 | Original post link

You can adjust tikv_gc_life_time.

| username: Jasper | Original post link

You can adjust gc-ttl for CDC. Its priority is higher than tikv_gc_life_time. This means if your CDC task is stuck, and tikv_gc_life_time is set to 10 minutes while gc-ttl is set to 24 hours, the data will not be garbage collected within 24 hours.
You can refer to the complete configuration below, which includes an explanation of gc-ttl:

| username: redgame | Original post link

The gc-ttl setting mentioned by the expert above is the more useful one.
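
The interaction described above (a stuck CDC task holding GC back for up to gc-ttl, even when the GC life time is only 10 minutes) can be illustrated with a simplified Python model. This is only a sketch of the behavior as described in this thread, not TiCDC's actual implementation: the function and parameter names are hypothetical, and counting the TTL from the moment the task got stuck is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def modeled_gc_safe_point(
    now: datetime,
    gc_life_time: timedelta,
    gc_ttl: timedelta,
    cdc_stuck_at: Optional[datetime] = None,
) -> datetime:
    """Return the point in time up to which old versions may be collected.

    Without CDC, GC may advance to `now - gc_life_time`. A stuck CDC task
    holds the safe point at the time it stopped advancing, but only while
    it has been stuck for no longer than `gc_ttl` (assumed rule).
    """
    by_life_time = now - gc_life_time
    if cdc_stuck_at is None:
        return by_life_time
    if now - cdc_stuck_at > gc_ttl:
        # The TTL has run out, so the CDC task no longer blocks GC.
        return by_life_time
    return min(by_life_time, cdc_stuck_at)


# Scenario from the thread: life time 10 minutes, gc-ttl 24 hours,
# and a CDC task that has been stuck for 12 hours.
now = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
stuck_at = now - timedelta(hours=12)
safe_point = modeled_gc_safe_point(
    now, timedelta(minutes=10), timedelta(hours=24), stuck_at
)
```

In this model the safe point stays at the moment the task got stuck, so the changefeed can still resume from there; once the task has been stuck for longer than gc-ttl, the safe point falls back to what the GC life time alone allows.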

| username: zhang_2023 | Original post link

The GC time can be adjusted.
