TiCDC unable to synchronize data, error: etcd client outCh blocking too long

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ticdc 无法同步数据,报错:etcd client outCh blocking too long

| username: Hacker_ojLJ8Ndr

TiCDC cannot synchronize. No error is reported, but the log keeps printing the following warnings:

[2023/08/03 17:29:00.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=10.599113445s] [role=processor]
[2023/08/03 17:29:01.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=11.599196548s] [role=processor]
[2023/08/03 17:29:02.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=12.600323247s] [role=processor]
[2023/08/03 17:29:03.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=13.59911505s] [role=processor]
[2023/08/03 17:29:04.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=14.599893366s] [role=processor]
[2023/08/03 17:29:05.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=15.599842259s] [role=processor]
[2023/08/03 17:29:06.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=16.600315897s] [role=processor]
[2023/08/03 17:29:07.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=17.599149941s] [role=processor]
[2023/08/03 17:29:08.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=18.600282456s] [role=processor]
[2023/08/03 17:29:09.182 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=19.600627829s] [role=processor]
[2023/08/03 17:29:10.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=20.599791854s] [role=processor]
[2023/08/03 17:29:11.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=21.599458008s] [role=processor]
[2023/08/03 17:29:12.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=22.599127427s] [role=processor]
[2023/08/03 17:29:13.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=23.599763841s] [role=processor]
[2023/08/03 17:29:14.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=24.599174768s] [role=processor]
[2023/08/03 17:29:15.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=25.600263766s] [role=processor]
[2023/08/03 17:29:16.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=26.599329814s] [role=processor]
[2023/08/03 17:29:17.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=27.599595961s] [role=processor]
[2023/08/03 17:29:18.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=28.599149714s] [role=processor]
[2023/08/03 17:29:19.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=29.59949637s] [role=processor]
[2023/08/03 17:29:20.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=30.599115826s] [role=processor]
[2023/08/03 17:29:21.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=31.59928108s] [role=processor]
[2023/08/03 17:29:22.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=32.599275573s] [role=processor]
[2023/08/03 17:29:23.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=33.599748676s] [role=processor]
[2023/08/03 17:29:24.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=34.599139836s] [role=processor]
[2023/08/03 17:29:25.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=35.599114923s] [role=processor]
[2023/08/03 17:29:26.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=36.599141394s] [role=processor]
[2023/08/03 17:29:27.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=37.599736012s] [role=processor]
[2023/08/03 17:29:28.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=38.599553315s] [role=processor]
[2023/08/03 17:29:29.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=39.59912175s] [role=processor]
[2023/08/03 17:29:30.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=40.599826815s] [role=processor]
[2023/08/03 17:29:31.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=41.599205771s] [role=processor]
[2023/08/03 17:29:32.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=42.600046078s] [role=processor]
[2023/08/03 17:29:33.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=43.59911869s] [role=processor]
[2023/08/03 17:29:34.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=44.599363853s] [role=processor]
[2023/08/03 17:29:35.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=45.599124334s] [role=processor]
[2023/08/03 17:29:36.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=46.599703028s] [role=processor]

Similar to this topic: "TiCDC cannot synchronize and keeps printing the warning log 'etcd client outCh blocking too long'" - TiDB Q&A community

| username: 爱白话的晓辉 | Original post link

Please post the parameters and configuration file used to create the changefeed.

| username: Billmay表妹 | Original post link

Is this the bug?

| username: Hacker_ojLJ8Ndr | Original post link

It looks like it from the description.

| username: Billmay表妹 | Original post link

That’s strange. This bug should have been fixed by now; I don’t know why it resurfaced.

| username: Hacker_ojLJ8Ndr | Original post link

The configuration should be fine; I’ve always used this configuration, but this problem appeared for the first time today.

| username: Hacker_ojLJ8Ndr | Original post link

There is also this error:

[2023/08/03 17:41:29.762 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:29.763 +08:00] [WARN] [manager.go:162] ["processor close took too long"] [namespace=default] [changefeed=sync-tidb-test] [capture=879ea6bf-e014-4c93-aa0c-6edef50d1c46] [duration=4m0.489560052s]
[2023/08/03 17:41:29.763 +08:00] [WARN] [etcd_worker.go:293] ["EtcdWorker reactor tick took too long"] [duration=4m0.489780512s] [role=processor]
[2023/08/03 17:41:29.763 +08:00] [WARN] [server.go:271] ["handler not found"] [topic=changefeed/default/sync-tidb-test/agent]
[2023/08/03 17:41:29.856 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:29.857 +08:00] [WARN] [server.go:271] ["handler not found"] [topic=changefeed/default/sync-tidb-test/scheduler]
[2023/08/03 17:41:29.966 +08:00] [WARN] [server.go:447] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:249\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
[2023/08/03 17:41:40.528 +08:00] [WARN] [replication_manager.go:503] ["schedulerv3: cannot advance checkpoint since missing table"] [namespace=default] [changefeed=sync-tidb-test] [tableID=13120]
| username: Billmay表妹 | Original post link

Based on the information you provided, the issue might be due to TiCDC being unable to connect to etcd or etcd responding too slowly. You can try the following methods to resolve this issue:

  1. Check whether etcd is running normally. etcd is embedded in PD, so you can check the member status with pd-ctl:

    tiup ctl:v5.1.1 pd -u http://<pd-address>:<pd-port> member
    

    If any member is abnormal, you can restart the PD nodes with:

    tiup cluster restart <cluster-name> -R pd
    

    Here, <cluster-name> is the name of the TiDB cluster.

  2. Check whether the configuration of TiCDC is correct. You can view the changefeed configuration with:

    tiup ctl:v5.1.1 cdc cli changefeed query --pd=http://<pd-address>:<pd-port> -c <changefeed-id>
    

    Here, <pd-address> and <pd-port> are the address and port of PD, and <changefeed-id> is the ID of the TiCDC replication task.

    Verify that the TiCDC version and the changefeed configuration are as expected.

  3. Check the logs of TiCDC. cdc cli does not provide a subcommand for viewing logs; instead, inspect the cdc.log file in the log directory of each TiCDC node (for a tiup deployment this is usually under the node's deploy directory, e.g. log/cdc.log; the exact path depends on your topology).

    Look there for any other errors or abnormal behavior.
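When scanning the TiCDC logs, a small script can surface how long the etcdWorker has actually been stuck. A minimal sketch, assuming log lines in the exact format shown in the warnings above (the function name and sample lines are illustrative, not part of TiCDC):

```python
import re

# Extract the blocking duration (in seconds) from TiCDC warning lines like:
# [WARN] [client.go:259] ["etcd client outCh blocking too long, ..."] [duration=10.599s] [role=processor]
DURATION_RE = re.compile(r'outCh blocking too long.*?\[duration=([0-9.]+)s\]')

def max_blocking_seconds(log_lines):
    """Return the largest reported blocking duration, or 0.0 if none found."""
    durations = [float(m.group(1))
                 for line in log_lines
                 if (m := DURATION_RE.search(line))]
    return max(durations, default=0.0)

sample = [
    '[2023/08/03 17:29:00.180 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=10.599113445s] [role=processor]',
    '[2023/08/03 17:29:36.181 +08:00] [WARN] [client.go:259] ["etcd client outCh blocking too long, the etcdWorker may be stuck"] [duration=46.599703028s] [role=processor]',
]
print(max_blocking_seconds(sample))  # 46.599703028
```

A steadily growing maximum, as in the log excerpt above, suggests a single stuck reactor tick rather than transient etcd latency.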

| username: Hacker_ojLJ8Ndr | Original post link

The problem has been identified: a table in the changefeed does not exist in the target environment, yet tiup cdc cli changefeed list shows no error for it.
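A quick way to spot such a mismatch is to diff the table lists of the two clusters. A minimal sketch (the table lists below are hypothetical; in practice you would populate them from information_schema.tables on each cluster):

```python
# Compare upstream and downstream table lists to find tables the
# changefeed replicates but the target cluster is missing.

def missing_downstream(upstream, downstream):
    """Return (schema, table) pairs present upstream but absent downstream."""
    return sorted(set(upstream) - set(downstream))

# Hypothetical results of:
#   SELECT table_schema, table_name FROM information_schema.tables
# run against each cluster.
upstream_tables = [("test", "orders"), ("test", "users"), ("test", "events")]
downstream_tables = [("test", "orders"), ("test", "users_renamed"), ("test", "events")]

print(missing_downstream(upstream_tables, downstream_tables))
# [('test', 'users')]
```

A table renamed downstream shows up as missing upstream-to-downstream, which matches the "cannot advance checkpoint since missing table" warning in the log above.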

| username: fubinz | Original post link

Was this non-existent table created upstream after the changefeed was created, or did it exist before the changefeed start-ts?

| username: Hacker_ojLJ8Ndr | Original post link

No, the table name downstream has been changed.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.