Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 搭建tidb备库,先pitr恢复,再用cdc同步增量数据异常 (Setting up a TiDB standby cluster: restore with PITR first, then anomalies when syncing incremental data with CDC)
[TiDB Usage Environment] Production Environment
[TiDB Version] v7.1.3
[Reproduction Path]
We are preparing to set up a TiDB standby database. In previous versions, creating a CDC changefeed while replication latency was high would often cause the changefeed to fail to start, so this time we adopted the following strategy:
- Deploy BR log (incremental) backup on the primary database.
- Take a full backup of the primary database.
- Perform a PITR full + incremental restore on the standby database, followed by several more PITR incremental restores; all of them reported successful recovery with no anomalies.
- Create CDC on the primary database to replicate incremental data to the standby database.
- Various errors occurred after CDC was created.
- Continued PITR incremental restores were still successful.
[Encountered Issues: Problem Phenomenon and Impact]
In particular, the error mentions a primary key conflict on WIND_TB_OBJECT_6489_OPLOG. This table doesn't even have a primary key, only a few regular indexes, and it is a partitioned table.
I see the error is "Duplicate entry," which means duplicate values exist. Although the message mentions "PRIMARY," that may just be how the conflict is reported. TiCDC best practices require tables to have a valid index, that is, a primary key or a unique index. Could your table contain duplicate rows because it lacks a primary key? You can GROUP BY all columns to check whether any duplicates exist; a sketch of such a check follows below.
TiCDC Overview | PingCAP Documentation Center
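For reference, a minimal sketch of such a duplicate check, assuming hypothetical column names col1, col2, col3 (the real columns of WIND_TB_OBJECT_6489_OPLOG were not shared in the thread):

```sql
-- Hypothetical columns; replace col1, col2, col3 with the table's full column list.
SELECT col1, col2, col3, COUNT(*) AS cnt
FROM WIND_TB_OBJECT_6489_OPLOG
GROUP BY col1, col2, col3
HAVING COUNT(*) > 1;
```

Any rows returned mean the table already contains fully duplicated records, which a changefeed cannot distinguish without a unique key.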
TiDB has an implicit primary key, right? Is it rowid?
Is it possible that CDC did not obtain the correct initial synchronization position from the primary database?
This table is a partitioned log-record table without a primary key or unique index. The CDC changefeed has force-replicate enabled to support replicating tables that lack a primary key and unique index.
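As a side check, here is a hedged sketch for confirming that the table really has no primary key or unique index (the schema name 'db' is a placeholder):

```sql
-- Any row returned is a primary key or unique index; an empty result confirms the
-- table has neither, which is why force-replicate is needed for this changefeed.
SELECT INDEX_NAME, NON_UNIQUE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'db'                         -- placeholder schema name
  AND TABLE_NAME   = 'WIND_TB_OBJECT_6489_OPLOG'
  AND NON_UNIQUE   = 0;
```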
That shouldn't be the issue. The TSO was set based on the restore-to value output when the downstream restore completed successfully.
What is the _tidb_rowid allocation on the downstream cluster after the restore?
You can check by executing the SQL: SHOW TABLE test.t NEXT_ROW_ID;
Without a primary key there can be problems: _tidb_rowid values might be duplicated, and data accuracy cannot be guaranteed either.
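If it helps, a sketch for checking whether any hidden row IDs are duplicated in the restored table, assuming the hidden _tidb_rowid column is exposed for this non-clustered table:

```sql
-- Any rows returned would mean duplicated hidden row IDs in the standby copy.
SELECT _tidb_rowid, COUNT(*) AS cnt
FROM WIND_TB_OBJECT_6489_OPLOG
GROUP BY _tidb_rowid
HAVING COUNT(*) > 1;
```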
The returned value is the same on the primary and standby databases: 958148143. It doesn't seem to have much to do with the primary key error mentioned above.
Can you see the erroneous DML statement in the CDC log? Is it an insert or an update?
I didn’t see it in the logs, but for this table, we only have insert operations, no updates or deletions.
Could you briefly describe the schema and the corresponding recovery steps? We will try to reproduce it.
- All component versions are 7.1.3 by default.
- Table structure (without a primary key or unique index) and the insert statements; a hypothetical sketch is given after this list.
- Steps used to set up the upstream and downstream clusters?
- The post describes multiple restores, so what time points were chosen for the log-backup restores in each case?
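To make the request concrete, here is a purely hypothetical minimal schema in the spirit of what was described (partitioned, no primary key or unique index, insert-only); the real table definition was not posted in the thread:

```sql
-- Hypothetical stand-in for WIND_TB_OBJECT_6489_OPLOG; column names and
-- partitioning scheme are illustrative only.
CREATE TABLE oplog_demo (
    obj_id  BIGINT NOT NULL,
    op_type VARCHAR(16),
    op_time DATETIME NOT NULL,
    payload VARCHAR(255),
    KEY idx_obj_id (obj_id),
    KEY idx_op_time (op_time)
) PARTITION BY RANGE (TO_DAYS(op_time)) (
    PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION pmax    VALUES LESS THAN (MAXVALUE)
);

-- Insert-only workload, as described by the original poster.
INSERT INTO oplog_demo (obj_id, op_type, op_time, payload)
VALUES (1, 'INSERT', '2024-01-15 10:00:00', 'example row');
```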
The "PRIMARY" in the error message has nothing to do with the rowid, right?
Hello, I'd like to confirm once more: when creating the changefeed, was the procedure to first stop PITR and then use PITR's restore-to time as the changefeed's startTs?
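As a side note, if the restore-to point is known only as a wall-clock time rather than a TSO, one common conversion (an assumption on my part, not something stated in the thread; the timestamp below is illustrative) is to shift the physical time in milliseconds left by 18 bits:

```sql
-- Derive a startTs candidate from an illustrative restore-to time.
SELECT UNIX_TIMESTAMP('2024-01-15 12:00:00') * 1000 << 18 AS start_ts;

-- Sanity check: TIDB_PARSE_TSO converts a TSO back to a readable timestamp.
SELECT TIDB_PARSE_TSO(UNIX_TIMESTAMP('2024-01-15 12:00:00') * 1000 << 18) AS ts;
```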
@porpoiselxj Hello! Could you help organize a complete minimal reproducible example? For instance, including table creation, data import, etc. If we can reproduce this issue internally, we might be able to identify the root cause more quickly.