Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: ticdc多个changefeed包含相同的表 (TiCDC: multiple changefeeds containing the same table)
【Encountered Problem】
I created two changefeeds named cdc1 and cdc2.
Both changefeeds are configured to replicate the same table, testdb.test.
In this case, does only one changefeed replicate the data in this table at any given time?
I haven't tested the case of multiple sinks.
In my environment, there is only one downstream TiDB environment, and two changefeeds replicate the same upstream table to it at the same time. I haven't seen any anomalies.
Use changefeed query to check the task status and see whether the checkpoint keeps advancing.
No. Both changefeeds replicate the data in this table at the same time.
For example, when a row is inserted into the test table, what internal mechanism deduplicates it during downstream replication? Otherwise, the insert would be applied twice.
But the problem is that you did configure two changefeeds, didn’t you? Isn’t that twice?
I recreated the table without a primary key and confirmed that the downstream ended up with 2 identical records. The CDC logs also showed the DDL being executed twice when the table was created.
A table with a primary key is also replicated downstream successfully, but I found no primary key conflict messages in the logs.
It won’t deduplicate; changefeeds are independent of each other. They don’t know which tables other changefeeds are capturing or where they are syncing to.
This does happen. If two changefeeds have the same downstream, both DML and DDL will be synced redundantly.
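The behavior reported in this thread can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TiCDC code: SQLite stands in for the downstream database, the function names (apply_insert, replicate) are hypothetical, and it assumes each changefeed writes inserts as REPLACE INTO, which is how TiCDC's MySQL sink behaves in safe mode. Whether safe mode was active in the environment described above is not confirmed in the thread; it is only one possible explanation for the missing primary key conflicts.

```python
import sqlite3


def apply_insert(conn, table, row, safe_mode):
    # Assumption: in "safe mode" the sink writes REPLACE INTO, which is
    # idempotent only when the table has a primary or unique key.
    verb = "REPLACE" if safe_mode else "INSERT"
    conn.execute(f"{verb} INTO {table} (id, val) VALUES (?, ?)", row)


def replicate(table_ddl, table, rows, changefeeds=("cdc1", "cdc2"), safe_mode=True):
    """Apply the same upstream rows once per changefeed; return the row count."""
    conn = sqlite3.connect(":memory:")  # stand-in for the shared downstream
    conn.execute(table_ddl)
    # Each changefeed is independent: it replays every row without knowing
    # that another changefeed already wrote the same row.
    for _ in changefeeds:
        for row in rows:
            apply_insert(conn, table, row, safe_mode)
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    conn.close()
    return count


# Table without a primary key: every changefeed adds its own copy.
no_pk = replicate("CREATE TABLE test (id INTEGER, val TEXT)", "test", [(1, "a")])

# Table with a primary key: the second REPLACE overwrites the first row.
with_pk = replicate(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, val TEXT)", "test", [(1, "a")]
)

print(no_pk, with_pk)  # 2 1
```

In this model, calling replicate with safe_mode=False on the primary key table raises sqlite3.IntegrityError on the second write, which corresponds to the primary key conflict the original poster expected to see. The model does not cover DDL, which in the thread was also executed once per changefeed.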
So this setup does not follow the usage guidelines, and it will cause a lot of problems.