[TiDB Usage Environment] Production Environment
[TiDB Version] 5.4
[Reproduction Path] Save or update operation
[Encountered Problem: Problem Phenomenon and Impact]
[Resource Configuration]
The database generates completely identical duplicate data every day, with either 2 or 3 duplicates. The SQL logic is to update first, and if the update returns 0, then insert.
My understanding is that it should be caused by concurrency.
Session 1
SQL1: update test set a=1 where a=0; — returns result 0
SQL2: insert into test(a) values (1);
Session 2
SQL1: update test set a=1 where a=0; — if executed after session 2 SQL2, returns result 1; — if executed before session 2 SQL2, returns result 0
SQL2: insert into test(a) values (1); — if executed after session 2 SQL2, does not execute; if executed before session 2 SQL2, causes duplicate insertion
So as long as SQL1 of session 1 and session 2 are executed in parallel, SQL2 will both execute.
You need to note that the default transaction isolation level of TiDB is RR, not RC.
When the transaction isolation level is Repeatable Read, only the data modified by other transactions that have been committed at the start of the transaction can be read. Uncommitted data or data committed by other transactions after the transaction starts is not visible. For this transaction, the transaction statements can see the modifications made by previous statements.