Issues with tick gc-ttl Settings

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tick gc-ttl设置问题

| username: jeff

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Encountered Problem: Problem Phenomenon and Impact]
[CDC:ErrSnapshotLostByGC] fail to create or maintain changefeed due to snapshot loss caused by GC. checkpoint-ts 444498506308386858 is earlier than GC safepoint at 44451646197963163

After adjusting gc-ttl to 48 hours, the error still reports a GC safepoint from about 24 hours ago, so the change does not seem to have taken effect. How can I check TiCDC's GC safepoint time?

[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

| username: asddongmen | Original post link

  1. The error cannot be recovered by adjusting gc-ttl after the fact, because GC has already run and reclaimed the data.
  2. You can check with the pd-ctl command, for example:
./pd-ctl service-gc-safepoint
{
  "service_gc_safe_points": [
    {
      "service_id": "gc_worker",
      "expired_at": 9223372036854775807,
      "safe_point": 444539433238396928
    },
    {
      "service_id": "ticdc-default-15674009460217235928",
      "expired_at": 1695802828,
      "safe_point": 444521874323931139
    }
  ],
  "gc_safe_point": 444521874323931139
}
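The safe_point values above are TiDB TSO timestamps, not Unix times. Assuming the standard TiDB TSO layout (the high bits hold a physical Unix timestamp in milliseconds, above an 18-bit logical counter), a minimal sketch to convert one to wall-clock time:

```python
from datetime import datetime, timezone

def tso_to_datetime(tso: int) -> datetime:
    """Decode a TiDB TSO: the physical part is milliseconds since the
    Unix epoch, stored in the bits above the 18-bit logical part."""
    physical_ms = tso >> 18
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)

# The ticdc service safepoint from the pd-ctl output above:
print(tso_to_datetime(444521874323931139))  # a time in late September 2023
```

Comparing this decoded time with the `expired_at` Unix timestamp (1695802828) shows how far GC protection extends past the safepoint.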
| username: jeff | Original post link

My current output:

service-gc-safepoint
{
  "service_gc_safe_points": [
    {
      "service_id": "gc_worker",
      "expired_at": 9223372036854775807,
      "safe_point": 444494231631036416
    },
    {
      "service_id": "ticdc",
      "expired_at": 1695870438,
      "safe_point": 444539610081263657
    }
  ],
  "gc_safe_point": 444494231631036416
}

Why is the ticdc safe_point at the current time? I didn't adjust anything; logically it should be 24 hours ago.

| username: jeff | Original post link

I backed up some data from TiDB and restored it to the downstream MySQL. When creating a CDC task, it reports that the checkpoint is earlier than the GC safepoint. The gc-ttl for CDC is set to 48 hours, but it didn't take effect.

| username: Jasper | Original post link

First, TiCDC's gc-ttl only takes effect after TiCDC is set up. Its purpose is to ensure that data is not garbage-collected for a certain period after a TiCDC failure, so that TiCDC can resume its previous replication tasks. For more details, see: TiCDC FAQs and Troubleshooting | PingCAP Docs

Therefore, when you create a new task, it is governed by tidb_gc_life_time. You need to ensure that the interval between the full backup and the time you set up TiCDC does not exceed the tidb_gc_life_time configured upstream.

For example, if you start a backup today at 17:00 and set up ticdc synchronization tomorrow at 17:00, your tidb_gc_life_time must be set to at least 24 hours to ensure that the kv change log within this day is not lost.
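The rule above comes down to simple arithmetic: the upstream tidb_gc_life_time must cover the gap between the backup's snapshot time and the moment the changefeed is created. A small sketch with hypothetical timestamps:

```python
from datetime import datetime

# Hypothetical timestamps: backup taken today at 17:00,
# changefeed created tomorrow at 17:00 (as in the example above).
backup_time = datetime(2023, 9, 26, 17, 0)
changefeed_create_time = datetime(2023, 9, 27, 17, 0)

# tidb_gc_life_time must be at least this long, or the KV change
# log between backup and changefeed creation is garbage-collected.
gap = changefeed_create_time - backup_time
hours = gap.total_seconds() / 3600

print(f"Minimum tidb_gc_life_time: {hours:.0f}h")  # prints "Minimum tidb_gc_life_time: 24h"
```

In TiDB this would correspond to something like `SET GLOBAL tidb_gc_life_time = '24h';` (or longer, to leave headroom) before taking the backup.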

| username: Fly-bird | Original post link

Did the GC time setting take effect?

| username: 路在何chu | Original post link

It could be this issue. We have backed up again, let’s see what happens next.

| username: ajin0514 | Original post link

Check if the GC is effective.