Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb并发的时候拿的数据不是最新的
[TiDB Usage Environment] Production Environment / Testing
[TiDB Version] v5.4.2
[Reproduction Path] What operations were performed to cause the issue
There are two existing businesses, A and B:
Business A starts a transaction with the following SQL statements:
SELECT * from t1 where id=1 FOR UPDATE;
...
UPDATE t1 SET num = num-1 WHERE id=1;
Business B also has a query in the transaction with the following SQL statement:
SELECT * from t1 where id=1 FOR UPDATE;
Execute businesses A and B simultaneously or sequentially.
[Encountered Issue: Problem Phenomenon and Impact]
When business A has not completed, business B will block and wait. After business A completes and the transaction commits, business B will continue to run, but the data retrieved by SELECT * from t1 where id=1 FOR UPDATE
in business B is the data before business A’s modification.
Repeated testing shows that after one transaction commits and releases the lock, the other transaction still retrieves the old data. This causes many data issues.
Could you please explain the reason for this and suggest any solutions?
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
This issue is related to MVCC.
Transactions A and B start, and the data version obtained by both transactions is the data version when the first SQL statement is executed.
If transaction B wants to query or modify data based on transaction A, then transaction B should start after transaction A commits.
It was originally designed this way.
It’s easy to solve. You haven’t changed the transaction level, right? TiDB’s default transaction level is RR. Change it to RC level. In the default RR level, after transaction A starts, if transaction B starts later, the changes made by transaction A will not be visible to transaction B.
This does not conform to the situation of phantom reads. If another transaction is started after committing, then the lock would be useless.
How should this situation be resolved, expert?
Currently using the default settings. Would there be any other impacts if the rc level is used?
It doesn’t have much impact, and the performance of RC is slightly better. RR isolation is not very useful as it cannot access data committed by other transactions. MySQL uses RR due to historical reasons related to binlog, while Oracle uses the RC level.
However, MySQL’s repeatable read doesn’t have this issue either.
Just set global transaction_isolation='READ-COMMITTED';
and it will work.
I have studied it, and the two RR levels are different. In TiDB, once transaction A starts, transaction B will not be able to see the changes made by transaction A if it starts afterward. In MySQL, if transaction A makes changes, transaction B can still see the changes made by transaction A even if it starts before transaction A.
I misunderstood, “for update” retrieves the latest current data.
Transaction B should get the data submitted by Transaction A.
This situation is not a phantom read.
I tested it on my side and got the latest data. I’m not sure what version and isolation level of TiDB you are using.
TiDB version: v4.0.8
Isolation level: REPEATABLE-READ
The official documentation provides a detailed description of the monitoring indicators: TiDB 监控框架概述 | PingCAP 文档中心.
REPEATABLE-READ
TiDB version: v5.4.2
Hold on, I’ll test it with your script and post a screenshot.
It depends on the business requirements. Choose one among RR, optimistic locking, and pessimistic locking.
Have the test results come out? I tested version 7.1, and with “for update” I got the latest value, but without “for update” it was the old value.
How should one generally choose when there are requirements for sales, inventory, and financial transactions?
TiDB version: v5.4.2
Isolation level: REPEATABLE-READ
Transaction B will be blocked until transaction A commits and retrieves the committed data.
May I ask, if transaction A has tables t1 and t2, and transaction B also has tables t1 and t2. In the previous scenario, if transaction A adds a “for update” lock on table t1 and updates tables t1 and t2, will transaction B see the data in table t2 before or after the changes?
That is expected. “For update” is a current read, so it will get the latest data. Your previous question was that after transaction 1 updates, transaction 2’s “for update” query gets the old data. This issue does not exist.