Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: DM安全模式的insert语句疑问 (A question about INSERT statements in DM safe mode)
So that replayed statements can safely be executed more than once, DM rewrites INSERT as REPLACE when it runs in safe mode (for example, after a restart), which is equivalent to executing DELETE followed by INSERT. I assume this is to prevent duplicate inserts, right? If we used the INSERT IGNORE INTO ... syntax instead, wouldn't that solve the same problem and be more efficient? Can someone clarify this?
From a purely semantic perspective, REPLACE INTO guarantees that the most recent write ends up in the database, whereas INSERT IGNORE INTO keeps whichever row was inserted first, so it may not work if the row changes later. In addition, IGNORE does not only suppress duplicate-key errors; it can also mask other errors, such as data-length problems. Are there potential compatibility differences between TiDB and other databases here? Ignoring such errors could be problematic.
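The "last write wins" versus "first write wins" difference described above can be reproduced in a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in, not TiDB or MySQL, and SQLite spells the second statement INSERT OR IGNORE rather than INSERT IGNORE. The table `t` and the helper `final_value` are hypothetical names.

```python
import sqlite3


def final_value(second_stmt: str) -> str:
    """Insert a row, write the same primary key again, return the stored value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'first')")
    # Second write to the same primary key, using the statement under test.
    conn.execute(second_stmt)
    (value,) = conn.execute("SELECT v FROM t WHERE id = 1").fetchone()
    conn.close()
    return value


# REPLACE removes the conflicting row and inserts the new one.
replace_result = final_value("REPLACE INTO t VALUES (1, 'second')")
# SQLite's analogue of INSERT IGNORE: the conflicting insert is skipped.
ignore_result = final_value("INSERT OR IGNORE INTO t VALUES (1, 'second')")

print(replace_result, ignore_result)
```

Whether this difference matters for DM depends on whether the replayed binlog also contains the later changes to the row, which is the point the next reply makes.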
With normal data, an INSERT should only ever be executed once. After a restart, a portion of the binlog is consumed again, and in that case INSERT IGNORE and REPLACE have the same effect. Other errors are not being considered for now.
REPLACE INTO deletes the existing row and then inserts the new one; INSERT IGNORE INTO skips the insert when the row already exists, which seems to cost less.
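A toy model can make the comparison concrete. The sketch below is not DM's implementation: it models the downstream table as a Python dict, assumes that safe mode rewrites UPDATE as DELETE plus REPLACE and leaves DELETE unchanged, and counts row writes as a rough proxy for cost. The function `replay` and the event tuples are hypothetical.

```python
def replay(events, table, insert_mode):
    """Apply binlog-like events to `table` (primary key -> value).

    insert_mode is "replace" or "ignore". Returns the number of row writes.
    """
    writes = 0
    for kind, key, value in events:
        if kind == "insert" and insert_mode == "ignore":
            # INSERT IGNORE: skip the row if the key already exists.
            if key not in table:
                table[key] = value
                writes += 1
        elif kind in ("insert", "update"):
            # REPLACE (and the assumed UPDATE rewrite): delete, then insert.
            if key in table:
                del table[key]
                writes += 1
            table[key] = value
            writes += 1
        elif kind == "delete":
            if key in table:
                del table[key]
                writes += 1
    return writes


# A binlog segment that the downstream had already applied before the restart.
events = [("insert", 1, "a"), ("update", 1, "b"), ("insert", 2, "c")]

table_replace = {1: "b", 2: "c"}
table_ignore = {1: "b", 2: "c"}
writes_replace = replay(events, table_replace, "replace")
writes_ignore = replay(events, table_ignore, "ignore")

print(table_replace == table_ignore, writes_replace, writes_ignore)
```

In this model both strategies converge to the same final table once the whole segment has been replayed, and the INSERT IGNORE variant performs fewer writes. The model says nothing about real execution cost in TiDB or about errors other than duplicate keys.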
Temporarily closed, unresolved.
@lance6716 previously used TLA+ to do a related verification; see the GitHub repository lance6716/DM-safe-mode-model (TLA+ model of the "safe mode" feature of PingCAP's DM). The rewrite can be changed to INSERT IGNORE, but this hasn't been updated in the repo yet. If you have time, you could try to contribute.