Around 10 AM, all TiCDC tasks suddenly stopped with the same error.
Looking at the logs, it’s quite confusing:
[2023/07/31 16:35:46.219 +08:00] [ERROR] [changefeed.go:240] ["an error occurred in Owner"] [namespace=default] [changefeed=tencent-sync-qm-task] [error="[CDC:ErrSnapshotSchemaNotFound]schema 41766 not found in schema snapshot"] [errorVerbose="[CDC:ErrSnapshotSchemaNotFound]schema 41766 not found in schema snapshot\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:164\ngithub.com/pingcap/tiflow/cdc/puller.(*ddlJobPullerImpl).handleRenameTables\n\tgithub.com/pingcap/tiflow/cdc/puller/ddl_puller.go:252\ngithub.com/pingcap/tiflow/cdc/puller.(*ddlJobPullerImpl).handleJob\n\tgithub.com/pingcap/tiflow/cdc/puller/ddl_puller.go:360\ngithub.com/pingcap/tiflow/cdc/puller.(*ddlJobPullerImpl).Run.func2\n\tgithub.com/pingcap/tiflow/cdc/puller/ddl_puller.go:123\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.1.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594"]
Running ADMIN SHOW DDL JOBS showed that the error was triggered right after a RENAME DDL was executed. The statement was RENAME TABLE a TO a_bak, a_new TO a;, and the schema_id of the database containing these tables is exactly 41766.
I tried deleting the changefeed, but the error persists. Is there any other way to recover?
Additionally, none of the existing changefeeds actually replicates the database with schema_id 41766…
There is a workaround. The error occurs because no changefeed replicates the database with schema_id 41766, so TiCDC does not recognize this schema. You can add any table from this database to every changefeed first, and then remove it.
The commands to add the table are:
tiup cdc cli changefeed pause -c tencent-sync-qm-task
# Edit the changefeed config: add a rule that replicates any one table from the affected database (create the same empty table downstream in advance)
tiup cdc cli changefeed update --config cdc-qm.toml -c tencent-sync-qm-task
tiup cdc cli changefeed resume -c tencent-sync-qm-task
Once the changefeed is running normally again, repeat the same steps to remove the newly added table from the changefeed.
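
For the config edit step, here is a minimal sketch of what the filter section of cdc-qm.toml might look like while the temporary table is added. The database and table names (synced_db, db_41766, placeholder) are hypothetical and only for illustration; substitute the rules your changefeed already has and the real name of the database whose schema_id is 41766.

```toml
# cdc-qm.toml (sketch; every database and table name here is hypothetical)
[filter]
rules = [
    'synced_db.*',           # existing rule(s) of the changefeed, kept unchanged
    'db_41766.placeholder',  # temporary: one table from the database with schema_id 41766
]
```

To undo it afterwards, delete the temporary rule from the file and run the same pause, update, resume sequence again.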