Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: When conflicts occur in physical import mode, tidb-lightning only deduplicates the main table data and does not delete the corresponding index values
As the title says, even though the conflict strategy is set to remove, if the source file contains conflicting rows, the main table data ends up normal after the import, but ADMIN CHECK TABLE reports an error.
Normally, when a record is deleted from the main table, the entries for that record in every index tree should be deleted as well.
PS: Trying to repair the index also reported an error:
mysql> admin check table t1;
ERROR 8223 (HY000): data inconsistency in table: t1, index: idx_1, handle: {SA0CFCSR03GR3PS, 002420, 1844474736059351040}, index-values:"handle: {SA0CFCSR03GR3PS, 002420, 1844474736059351040}, values: [KindMysqlTime 2016-03-18 KindString SA0CFCSR03GR3PS KindString 002420 KindString SA0CFCSR03GR3PS KindString 002420 KindMysqlTime 2016-03-18]" != record-values:""
mysql> ADMIN RECOVER INDEX t1 idx_1;
ERROR 1105 (HY000): [components/tidb_query_executors/src/table_scan_executor.rs:422]: Data is corrupted, missing data for NOT NULL column (offset = 0)
mysql>
I don’t know if the conflict strategy in the new version 8.0 has solved this problem.
@ShawnYan Try it out in version 8.0~
The logs also contain “resolve duplicate rows completed,” and the final marker is [“tidb lightning exit”] [finished=true].
Is it a single Lightning node import or multiple Lightning nodes importing in parallel?
Could you please share the Lightning logs?
It was imported on a single node. However, the logs seem to contain S3 AK information, so it’s not convenient to provide all of them. Can I just provide the warning information after the import is completed?
Here is the log after the import was successful. The logs during the import phase were all normal; I also confirmed with the developer that the file he provided had issues with duplicate data.
Is it feasible to delete all indexes before importing and rebuild the indexes after importing?
As long as the source data contains no duplicates, this issue should not occur. If it does, you can try ADMIN RECOVER INDEX t1 idx_1; to repair the index. My case was unusual: the dirty data happened to put a NULL value into a NOT NULL column, so the repair failed and I had to drop and rebuild all the indexes.
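Since the inconsistency only shows up when the source file itself contains duplicate keys, one option is to check the file before handing it to Lightning. Below is a minimal sketch of such a check, assuming the source is a CSV file with a header row. The column names `id` and `code`, the sample rows, and the helper `find_duplicate_keys` are all made up for illustration and are not taken from the actual table in this thread.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return {key_tuple: count} for every key that appears more than once.

    rows: iterable of dicts (for example from csv.DictReader).
    key_columns: names of the columns forming the primary/unique key
                 (hypothetical here; use the real key of the target table).
    """
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample data standing in for the real source file.
SAMPLE_CSV = """id,code,trade_date
A1,0001,2016-03-18
A1,0001,2016-03-18
A1,0002,2016-03-18
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    duplicates = find_duplicate_keys(reader, ["id", "code"])
    for key, n in duplicates.items():
        print(key, "appears", n, "times")
```

This keeps every key in memory, so for very large source files a sort-based or chunked approach would probably be needed instead.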
Got it. Let me check if version 8.0 has this issue.