A question: when TiDB executes a transaction, does it first need to read the data involved in the transaction from TiKV into the memory of the TiDB server? My understanding is that this read from TiKV is a current read (it reads the latest data directly) and is not filtered by start_ts. However, the official documentation states that during a transaction:
TiDB retrieves the data corresponding to the start_ts version from TiKV.
Does this statement refer to a simple read-only transaction, or to a write transaction in which data is read first and then modified? My understanding is that in a TiDB write transaction (such as an update), the read should be a current read (lock the row and read the latest data) and should not be filtered by start_ts. Is my understanding correct?
Whether the transaction is optimistic or pessimistic, read requests and write requests can arrive at the same point in time. This raises two questions:
A read sees the value X. Should a later write in the same transaction still be based on X?
A write is based on the value X. Should a later read in the same transaction still see X?
With N concurrent requests, several may fall on the same point in time or on adjacent points in time, so each one has to retrieve the latest valid value for its own time range.
Because pessimistic and optimistic transactions handle this differently, it is better to consider these points together rather than one at a time.
I recommend you check out Tong Mu’s interpretation of transactions:
The concept of a current read applies only to pessimistic transaction mode. In optimistic transaction mode, all reads are snapshot reads, and locking and conflict detection happen during the prewrite phase.
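To make the two kinds of read concrete, here is a minimal sketch in Python. It is a toy multi-version store, not TiKV's actual implementation; the class ToyMVCC and its methods are hypothetical names made up for illustration. A snapshot read returns the newest version committed at or before the reader's start_ts, while a current read returns the newest committed version regardless of start_ts.

```python
class ToyMVCC:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        # key -> list of (commit_ts, value), kept sorted by commit_ts
        self.versions = {}

    def commit(self, key, commit_ts, value):
        """Record a committed version of `key`."""
        history = self.versions.setdefault(key, [])
        history.append((commit_ts, value))
        history.sort(key=lambda item: item[0])

    def snapshot_read(self, key, start_ts):
        """Newest value whose commit_ts is not later than start_ts."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= start_ts]
        return visible[-1] if visible else None

    def current_read(self, key):
        """Newest committed value, ignoring the reader's start_ts."""
        history = self.versions.get(key, [])
        return history[-1][1] if history else None


store = ToyMVCC()
store.commit("k", commit_ts=10, value="old")
store.commit("k", commit_ts=30, value="new")

# A transaction that started at ts=20 cannot see the version committed at 30.
snapshot_value = store.snapshot_read("k", start_ts=20)
# A current read sees the latest committed version.
current_value = store.current_read("k")
```

With these toy timestamps, the snapshot read at start_ts=20 returns the version committed at 10, and the current read returns the version committed at 30.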
Oh, so TiDB does not support current reads in optimistic transaction mode? In other words, in optimistic mode all reads are snapshot reads? So current reads such as “select xxx for update” are not supported?
Optimistic transactions do support select for update, but it is still a snapshot read, so it returns the same result as a plain select. The only difference is that the selected rows go through the lock and unlock process during COMMIT.
Hello, teacher.
Actually, I still don’t quite understand. In optimistic transaction mode, when an update statement is executed, the data must first be read into the TiDB server’s mem buffer. Is that read also a snapshot read (filtered by the transaction’s start_ts)? If so, the data read might not be the latest. Could that cause any problems?
For example, when MySQL updates data, it reads the latest version of the data directly with a current read and applies the update to that latest version.
There is no issue. The prewrite phase includes a version check: for each key being written, it checks whether the write column family contains a record whose commit_ts is later than the transaction’s own start_ts. If it does, there is a version conflict (another transaction has committed to that key in the meantime), and the entire transaction is aborted. So although optimistic transactions use snapshot reads, a transaction can commit successfully only if no other transaction has committed to the same keys after it started. This means that for a transaction that does commit, its snapshot read of those keys was in fact a read of the latest data. Also, this cannot really be compared with MySQL, because MySQL does not have optimistic transactions.
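The version check in prewrite can be sketched roughly as follows. This is a simplified toy model in Python, not TiKV's real code: ToyStore, WriteConflict, and try_prewrite are hypothetical names, real prewrite also deals with primary keys, lock TTLs, and rollback records, and the toy uses unique integer timestamps, so rejecting a commit_ts at or after start_ts amounts to the same thing as "later than".

```python
class WriteConflict(Exception):
    """Raised when prewrite detects a newer committed version or a foreign lock."""


class ToyStore:
    """Toy model of the lock and write records kept per key (hypothetical)."""

    def __init__(self):
        self.write = {}  # key -> list of commit_ts of committed versions
        self.lock = {}   # key -> start_ts of the transaction holding the lock

    def prewrite(self, key, start_ts):
        # Version check: has anyone committed to this key since we started?
        if any(commit_ts >= start_ts for commit_ts in self.write.get(key, [])):
            raise WriteConflict(f"{key}: newer commit than start_ts={start_ts}")
        # Lock check: is another transaction currently prewriting this key?
        if key in self.lock and self.lock[key] != start_ts:
            raise WriteConflict(f"{key}: locked by start_ts={self.lock[key]}")
        self.lock[key] = start_ts

    def commit(self, key, start_ts, commit_ts):
        # Only the lock owner may commit; then the lock is released.
        if self.lock.get(key) != start_ts:
            raise RuntimeError("lock not held by this transaction")
        del self.lock[key]
        self.write.setdefault(key, []).append(commit_ts)


def try_prewrite(store, key, start_ts):
    """Return "ok" or "conflict" instead of raising, for easy inspection."""
    try:
        store.prewrite(key, start_ts)
        return "ok"
    except WriteConflict:
        return "conflict"


store = ToyStore()
# T1 starts at ts=10 and reads its snapshot; T2 starts at ts=20.
store.prewrite("row1", start_ts=20)           # T2 prewrites first
store.commit("row1", start_ts=20, commit_ts=30)
t1_result = try_prewrite(store, "row1", 10)   # T1's snapshot is now stale
t3_result = try_prewrite(store, "row1", 40)   # T3 started after T2 committed
```

In this run T1 is rejected, because a version with commit_ts=30 exists after its start_ts=10, while T3, which started at ts=40, passes the check. The stale snapshot never reaches commit, which is the point made above.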
So it can be understood this way: in the optimistic transaction model, an update reads its data with a snapshot read, but if the transaction commits successfully, the data it read was in fact the latest.