LOAD DATA with REPLACE INTO reports an error when a primary key conflict occurs

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 执行LOAD DATA语句,如果指定 REPLACE INTO,当发生主键冲突时报错

| username: TiDBer_Bo0lt2rY

[TiDB Usage Environment] Production Environment / Testing
[TiDB Version] v6.5
[Reproduction Path]

  1. Execute the SQL statement: LOAD DATA LOCAL INFILE 'B_PRODCONSTI.csv' REPLACE INTO TABLE B_PRODCONSTI FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES (TENANT, PRODUID, CONSTIPRODUID, CAMOUNT, STATE, UPTIME)
  2. If the imported data contains a primary key conflict, the following error occurs:
    assertion failed: key: 748000000000001d9d5f72010031003900370030ff0033003400350000fd060a00800000025b, assertion: NotExist, start_ts: 441033509315805186, existing start ts: 441033501700259844, existing commit ts: 441033501713629190
  3. If there is no primary key conflict, or if IGNORE INTO is used instead of REPLACE INTO, the import succeeds.

| username: dba-kit

This is probably a bug. As a workaround, you can import with tidb-lightning instead and configure there how conflicts are handled.

| username: TiDBer_Bo0lt2rY

You’re right, tidb-lightning can be used for the import. However, it is better suited to manual operations, while LOAD DATA is easier to run from a program. For now, I first load the file into a temporary table and then use REPLACE INTO to copy the rows into the actual table. If the data volume is not too large, this approach is barely acceptable.
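
The two-step workaround described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the staging table name B_PRODCONSTI_STAGING is hypothetical, an ordinary table is used in place of the temporary table mentioned in the post, and the column list is copied from the LOAD DATA statement in the report. It also assumes the CSV file itself contains no rows that share a primary key; if it does, the load into the staging table may need separate handling.

```sql
-- Hypothetical staging table with the same schema as the target table.
CREATE TABLE IF NOT EXISTS B_PRODCONSTI_STAGING LIKE B_PRODCONSTI;
-- Start from an empty staging table on every run.
TRUNCATE TABLE B_PRODCONSTI_STAGING;

-- Step 1: load the CSV into the staging table without the REPLACE
-- keyword, so the failing LOAD DATA ... REPLACE path is not used.
LOAD DATA LOCAL INFILE 'B_PRODCONSTI.csv'
INTO TABLE B_PRODCONSTI_STAGING
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(TENANT, PRODUID, CONSTIPRODUID, CAMOUNT, STATE, UPTIME);

-- Step 2: copy the staged rows into the real table; rows whose
-- primary key already exists there are replaced.
REPLACE INTO B_PRODCONSTI
  (TENANT, PRODUID, CONSTIPRODUID, CAMOUNT, STATE, UPTIME)
SELECT TENANT, PRODUID, CONSTIPRODUID, CAMOUNT, STATE, UPTIME
FROM B_PRODCONSTI_STAGING;
```

Because the REPLACE INTO ... SELECT step runs as a single statement, its cost grows with the number of staged rows, which matches the remark that this is only acceptable for moderate data volumes.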

| username: dba-kit

Okay, I’ll move this to the bug feedback section so the official team can investigate. However, even if the bug is confirmed, the fix will probably only arrive in the next version. :joy: