Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: TICDC新建changefeed总是报etcd超时 (Creating a new TiCDC changefeed always reports an etcd timeout)

[TiDB Usage Environment] Production Environment / Testing / Poc
Production environment. TiDB is deployed on Kubernetes. The TiDB instance holds about 5,000-10,000 tables, but only about 150 of them are configured for TiCDC synchronization. The total data volume in TiDB is small, less than 10 GB.
The environment as a whole is still being tested, so the tables configured for TiCDC see little traffic, and each has only a few hundred thousand rows. DDL operations are frequent in this environment, and truncate table is run often.
[TiDB Version]
TiDB 5.4
[Reproduction Path] What operations were performed when the issue occurred
Creating a changefeed through TiCDC's OpenAPI fails, always with an etcd timeout.
[Encountered Issue: Problem Phenomenon and Impact]
curl -X POST http://127.0.0.1:8301/api/v1/changefeeds -d '{"changefeed_id":"k1","sink_uri":"kafka://broker-kafka-test-az1-0.jvessel-open-hb.jdcloud.com:9092/tidb_version_test?protocol=canal-json&kafka-version=2.4.0&max-message-bytes=1073741824", "filter_rules":["test.test1"]}'
Returns: [CDC:ErrPDEtcdAPIError]etcd api call error: context deadline exceeded
The failure occurs almost 100% of the time, and each failing HTTP request returns after about 12-14 seconds.
However, creating the same changefeed through the cdc cli does not have this issue.
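To make the failure easier to reproduce and to time each attempt, the same request can be sent from a small script. This is only a sketch: the URL and the three payload fields are copied from the curl command above, while the function names, the `Content-Type: application/json` header, and the 30-second client timeout are my own assumptions rather than anything the TiCDC documentation requires.

```python
import json
import time
import urllib.error
import urllib.request

# Endpoint copied from the curl command above.
API_URL = "http://127.0.0.1:8301/api/v1/changefeeds"


def build_payload(changefeed_id, sink_uri, filter_rules):
    """Build the JSON body with the same three fields used in the curl call."""
    return json.dumps({
        "changefeed_id": changefeed_id,
        "sink_uri": sink_uri,
        "filter_rules": list(filter_rules),
    })


def create_changefeed(payload, url=API_URL, timeout=30.0):
    """POST the payload and return (status, body, elapsed_seconds).

    The header and the timeout value are assumptions made for this sketch.
    The timeout is set well above the observed 12-14 s so the server-side
    error is received instead of a client-side timeout.
    """
    req = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Non-2xx responses still carry a body with the error message.
        status = err.code
        body = err.read().decode("utf-8", "replace")
    return status, body, time.monotonic() - start


# Example call (needs a reachable TiCDC server, so it is left commented out):
# payload = build_payload("k1", "kafka://<broker>:9092/<topic>?protocol=canal-json", ["test.test1"])
# status, body, elapsed = create_changefeed(payload)
# print(f"HTTP {status} after {elapsed:.1f}s: {body}")
```

Running it several times in a row and comparing the elapsed time against the 12-14 seconds seen with curl would show whether the delay is constant across attempts.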
[Resource Configuration]
TiCDC resources: 8 CPU cores, 16 GB memory. Start command:
/cdc server --addr=0.0.0.0:8301 --advertise-addr=tidb-test-ticdc-0tidb-test-ticdc-peer.tidb-test.svc:8301 --gc-ttl=86400 --log-file=/tmp/cdc_data/log/cdc.log --log-level=info --pd=http://tidb-test-pd:2379
[Attachments: Screenshots/Logs/Monitoring]
When creation fails, the following appears in the log:
[2022/12/28 09:08:12.272 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:12.272 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:12.272 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:12.272 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:16.273 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:16.274 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:16.274 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:16.274 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:20.274 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:20.274 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:20.274 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:20.274 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:24.275 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:24.276 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:24.276 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:24.276 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:28.277 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:28.277 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:28.277 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:28.277 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:32.279 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:32.279 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
[2022/12/28 09:08:32.279 +00:00] [INFO] [client.go:730] ["[pd] tso stream is not ready"] [dc=global]
[2022/12/28 09:08:32.279 +00:00] [INFO] [acquirer.go:71] ["get time from pd failed, retry later"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(
[2022/12/28 09:08:36.281 +00:00] [ERROR] [client.go:502] ["[pd] tso request is canceled due to timeout"] [dc-location=global] [error="[PD:client:ErrClientGetTSOTimeout]get TSO timeout"]
[2022/12/28 09:08:36.281 +00:00] [ERROR] [client.go:786] ["[pd] getTS error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]rpc error: code = Canceled desc = context canceled: rpc error: code = Canceled desc = context canceled"]
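
In the excerpt above, the ErrClientGetTSOTimeout line repeats roughly every 4 seconds. To check whether that spacing holds across a longer cdc.log, the timestamps of those lines can be extracted and the gaps computed. A minimal sketch, assuming the timestamp layout shown in the excerpt; the function names are mine, and the UTC offset is ignored on the assumption that every line uses the same one.

```python
import re
from datetime import datetime

# Leading timestamp of a log line, e.g. [2022/12/28 09:08:12.272 +00:00]
TS_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) [+-]\d{2}:\d{2}\]"
)
# Error code that marks a TSO timeout in the excerpt above.
MARKER = "ErrClientGetTSOTimeout"


def tso_timeout_times(lines):
    """Return the timestamps of all lines that contain the TSO timeout error."""
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        match = TS_RE.match(line)
        if match:
            times.append(
                datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")
            )
    return times


def gaps_seconds(times):
    """Return the gap in seconds between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Example use against the log file named in the --log-file flag:
# with open("/tmp/cdc_data/log/cdc.log", encoding="utf-8", errors="replace") as f:
#     print(gaps_seconds(tso_timeout_times(f)))
```

A steady gap would suggest a fixed retry or timeout interval in the PD client rather than random network failures, though that reading is a guess on my part.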