GC cannot advance normally in v6.5.3

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: v6.5.3版本gc不能正常推进

| username: du拉松

【TiDB Usage Environment】Production Environment
【TiDB Version】Upgraded from v5.4.0 to v6.5.3
【Reproduction Path】A changefeed existed in TiCDC. After removing the changefeed directly and scaling in the related TiCDC nodes, GC did not run for more than 20 hours.
【Encountered Problem: Phenomenon and Impact】GC currently cannot advance. The GC leader logs the following:

[2023/07/27 08:53:26.921 +08:00] [INFO] [gc_worker.go:625] ["[gc worker] there's another service in the cluster requires an earlier safe point. gc will continue with the earlier one"] [uuid=625ddd430bc000a] [ourSafePoint=443111232467894272] [minSafePoint=443110872819695626]
[2023/07/27 08:53:26.921 +08:00] [INFO] [gc_worker.go:601] ["[gc worker] last safe point is later than current one. No need to gc. This might be caused by manually enlarging gc lifetime"] ["leaderTick on"=625ddd430bc000a] ["last safe point"=2023/07/26 09:10:34.914 +08:00] ["current safe point"=2023/07/26 09:10:34.914 +08:00]

The output of service-gc-safepoint shows a service safe point that is still registered for TiCDC:

{
  "service_gc_safe_points": [
    {
      "service_id": "gc_worker",
      "expired_at": 9223372036854775807,
      "safe_point": 443111201010876416
    },
    {
      "service_id": "ticdc-default-4589598632202768367",
      "expired_at": 1690420235,
      "safe_point": 443110872819695626
    }
  ],
  "gc_safe_point": 443110872819695626
}

However, after scaling TiCDC back out, cdc shows no changefeeds in its list.
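
To make the output above easier to read, here is a minimal Python sketch that picks the service holding the smallest safe point and decodes it into a wall-clock time. It assumes the usual TiDB TSO layout (physical milliseconds shifted left by 18 bits, plus a logical counter); the helper names `decode_tso` and `blocking_service` are made up for illustration and are not part of any TiDB tool.

```python
import json
from datetime import datetime, timedelta, timezone

# Output of service-gc-safepoint, copied from this post.
PD_OUTPUT = """
{
  "service_gc_safe_points": [
    {"service_id": "gc_worker",
     "expired_at": 9223372036854775807,
     "safe_point": 443111201010876416},
    {"service_id": "ticdc-default-4589598632202768367",
     "expired_at": 1690420235,
     "safe_point": 443110872819695626}
  ],
  "gc_safe_point": 443110872819695626
}
"""

TSO_LOGICAL_BITS = 18                  # assumption: standard TSO layout
CST = timezone(timedelta(hours=8))     # the logs use +08:00


def decode_tso(tso):
    """Split a TSO into (wall-clock time, logical counter)."""
    physical_ms = tso >> TSO_LOGICAL_BITS
    logical = tso & ((1 << TSO_LOGICAL_BITS) - 1)
    # Integer arithmetic avoids float rounding on the millisecond part.
    wall = datetime.fromtimestamp(physical_ms // 1000, tz=CST)
    wall += timedelta(milliseconds=physical_ms % 1000)
    return wall, logical


def blocking_service(pd_output):
    """Return the entry with the smallest safe point."""
    points = json.loads(pd_output)["service_gc_safe_points"]
    return min(points, key=lambda p: p["safe_point"])


blocker = blocking_service(PD_OUTPUT)
wall, logical = decode_tso(blocker["safe_point"])
print(blocker["service_id"], wall.isoformat(timespec="milliseconds"), logical)
```

The decoded time for the ticdc entry is 2023-07-26 09:10:34.914 +08:00, which matches the "current safe point" in the second log line.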



| username: 裤衩儿飞上天

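The default gc-ttl for TiCDC is 24 hours. gc-ttl is the TTL of the service GC safe point that TiCDC sets in PD, so a leftover entry can hold GC back until it expires. See: TiCDC FAQ | PingCAP Documentation Center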
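
As a rough check of the 24-hour explanation, the sketch below compares the ticdc entry's expired_at with its safe point, using the values from the first post. It assumes expired_at is a Unix timestamp in seconds and that the TSO carries physical milliseconds in the bits above the low 18; the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))   # the logs use +08:00

EXPIRED_AT = 1690420235              # ticdc entry, assumed Unix seconds
SAFE_POINT = 443110872819695626      # ticdc entry, TSO
GC_TTL = timedelta(hours=24)         # default gc-ttl mentioned in this reply

# When the ticdc service safe point is due to expire.
expires = datetime.fromtimestamp(EXPIRED_AT, tz=CST)

# Wall-clock time encoded in the safe point (assumed TSO layout).
safe_point_ms = SAFE_POINT >> 18
safe_point_time = datetime.fromtimestamp(safe_point_ms // 1000, tz=CST)
safe_point_time += timedelta(milliseconds=safe_point_ms % 1000)

# Timestamp of the GC leader log lines in the first post (second precision).
log_time = datetime(2023, 7, 27, 8, 53, 26, tzinfo=CST)

print("expires at:          ", expires.isoformat())
print("expiry - safe point: ", expires - safe_point_time)
print("expiry - log time:   ", expires - log_time)
```

Under these assumptions the entry expires almost exactly 24 hours after its safe point, and the log lines were captured a little before that expiry, which is consistent with the gc-ttl explanation.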

| username: h5n1

Running tiup cdc:v5.1.0 cli --pd=<PD_ADDRESS> unsafe reset clears the CDC tasks. Since you have already removed the changefeed, this should have little impact.

| username: du拉松

Yes, the changefeed list is empty, but the TiCDC entry in service_gc_safe_points still exists.
After running: tiup cdc:v6.5.3 cli --pd=http://172.16.105.24:2379 unsafe reset, the TiCDC entry in service_gc_safe_points was gone and the GC leader logs returned to normal.
However, I still don't know why this happened.
