After importing with TiDB Lightning, business writes report primary key conflicts? Is it a bug?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb lightning导入后,业务写入报主键冲突?是bug吗? ("After importing with TiDB Lightning, business writes report primary key conflicts — is it a bug?")

| username: DBRE

[TiDB Usage Environment] Production
[TiDB Version] v5.2.3
[Issue Phenomenon and Impact]
After using Dumpling to export data from TiDB v4.0.0 and then using Lightning to import it into TiDB v5.2.3, the business occasionally reports primary key conflicts. The business SQL does not specify the primary key ID. After restarting the tidb-server, the primary key conflicts no longer occur.

Error log:
[2022/08/10 09:35:17.654 +08:00] [INFO] [conn.go:1007] ["command dispatched failed"] [conn=9368065] [connInfo="id:9368065, addr: status:10, collation:utf8_general_ci, user:xxxxxx_wr"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="INSERT INTO axxxxy ( x,y,z,… ) VALUES ( '2022-08-10 09:35:17.678','2022-08-10 09:35:17.678',0,… )"] [txn_mode=PESSIMISTIC] [err="[kv:1062]Duplicate entry '2432514' for key 'PRIMARY'"]

| username: 啦啦啦啦啦 | Original post link

If the business does not have a retry mechanism, it is best to restart the TiDB nodes after the import. You can refer to this article: Column - "A Discussion of Auto_Increment Primary Key Conflicts Triggered by DM Synchronization" (专栏 - 记一场DM同步引发的Auto_Increment主键冲突漫谈) | TiDB Community

| username: DBRE | Original post link

The situation described in that article is not quite the same as ours. The business does not specify the primary key when inserting, and the data was exported from TiDB 4.0.0 and imported into TiDB 5.2.3. Business traffic was switched directly from 4.0.0 to 5.2.3, and even several days after the switch, primary key conflicts were still occurring.

| username: forever | Original post link

What is being used as the primary key?

| username: DBRE | Original post link

id bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Primary Key',

| username: jansu-dev | Original post link

You can run admin check table against the problematic table to verify that its data and indexes are consistent. It is possible that the import left the data and indexes inconsistent for some reason, causing the Duplicate entry error.
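As a concrete sketch of this suggestion (using the table name `axxxxy` taken from the error log above):

```sql
-- Verify that row data and index data agree for the table in the error log.
-- Returns an error if any inconsistency is found; otherwise returns nothing.
ADMIN CHECK TABLE axxxxy;
```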

| username: db_user | Original post link

I think this is expected behavior. TiDB allocates auto-increment IDs per instance: each tidb-server caches its own range of IDs. Let me illustrate this issue: suppose there are three TiDB instances, 192.1(a), 192.2(b), and 192.3(c).

  • a’s current auto-increment value: 101
  • b’s current auto-increment value: 3001
  • c’s current auto-increment value: 6001

Our previous data was migrated and synchronized from MySQL, which means the ID values were specified explicitly on insert. Suppose we now insert a record with ID 6002 (equivalent to Lightning importing a record with ID 6002). If this statement is executed on instance c, then c rebases its allocator and its next auto-increment value becomes 6003, which is fine. However, if the insertion is executed on instance b, c's next auto-increment value remains 6002. A subsequent insert on instance c that does not specify the primary key will then try to use 6002 and hit a primary key conflict.
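The scenario above can be sketched in SQL (this is an illustration only; the table `t` and the routing of each statement to a specific tidb-server are assumptions, since in practice statements are routed by the load balancer):

```sql
-- Hypothetical table with an auto-increment primary key, as in the thread
CREATE TABLE t (
  id BIGINT(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  v  INT
);

-- Case 1: explicit-ID insert lands on instance c.
-- c rebases its local allocator past 6002, so no conflict later.
INSERT INTO t (id, v) VALUES (6002, 1);

-- Case 2: the same explicit-ID insert lands on instance b instead.
-- Only b's allocator is rebased; c's cached range still contains 6002.
-- A later auto-assigned insert routed to c can then collide:
INSERT INTO t (v) VALUES (2);
-- on c → ERROR 1062: Duplicate entry '6002' for key 'PRIMARY'
```

Restarting the tidb-server forces it to fetch a fresh ID range beyond the imported maximum, which is why the conflicts stopped after a restart.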

| username: buchuitoudegou | Original post link

If you use Dumpling to export from TiDB, the exported data itself includes the auto-increment column, so Lightning does not reassign auto-increment IDs for the imported table; the new table's auto-increment IDs should be the same as the old table's. The probability of an error occurring during the import itself seems relatively small. :thinking:

| username: forever | Original post link

You can check the current largest ID in the table, and then set the table's auto-increment starting value to be larger than it.
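This fix can be sketched as follows (again using the table name `axxxxy` from the log; the value 2500001 is a placeholder that must exceed the actual `MAX(id)` you observe):

```sql
-- 1. Find the current maximum primary key value in the table
SELECT MAX(id) FROM axxxxy;

-- 2. Advance the table's auto-increment counter past that maximum
--    (2500001 is a placeholder; substitute MAX(id) + 1 or higher)
ALTER TABLE axxxxy AUTO_INCREMENT = 2500001;
```

Note that with multiple tidb-server instances, each instance may still hold a previously cached ID range, so a rolling restart of the tidb-servers (as suggested earlier in the thread) is the reliable way to make all instances allocate from the new value.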

| username: alfred | Original post link

The auto-increment counters of all affected tables need to be advanced beyond the maximum primary key values that existed before the migration.