Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 查询流程疑问
[TiDB Usage Environment] Production Environment
[TiDB Version] 6.6
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Problem Description and Impact] The two-phase commit returns a success after the primary commit. However, at this point, the secondary may not have completed the commit. When querying the related key at this time, is it possible to only retrieve the data from the previous transaction?
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
What mode is used for data isolation?
And, what is the scenario you envision?
“The secondary might not have completed the commit,” at this point, the Lock of the secondary still exists and points to the primary. When a new transaction queries, it will first see the existence of the Lock and then query the status of the primary. If it finds that the primary has already committed, it will automatically commit the secondary and then return the data. Therefore, there won’t be a situation where “only part of the data committed by the previous transaction can be seen.”
The two-phase commit process in TiDB:
- TiDB selects a Key from the current data to be written as the Primary Key for the current transaction.
- TiDB obtains the write routing information for all data from PD and categorizes all Keys according to the routing.
- TiDB concurrently initiates prewrite requests to all involved TiKV nodes. Upon receiving the prewrite data, TiKV checks for data version conflicts or expiration. Data that meets the criteria will be locked.
- TiDB receives all prewrite responses and ensures all prewrites are successful.
- TiDB obtains a second globally unique incrementing version number from PD, defined as the
commit_ts
for this transaction.
- TiDB initiates the second phase of the commit to the TiKV node where the Primary Key resides. Upon receiving the commit operation, TiKV checks data validity and cleans up the locks left from the prewrite phase.
- TiDB receives the success information for the two-phase commit.
The situation you mentioned does not exist.
Here, TiKV first processes the submission of the Primary Key. After the submission is successful, TiDB receives the successful submission. However, at this time, the secondary keys have not been submitted to TiKV, so they cannot be queried.
According to the previous response, it seems that when a passive query is made, it is found that the secondary lock has not been released. After querying the primary status, it is then submitted, and the data is returned. Are there two processes: passive query submission and active asynchronous submission?
It seems that both RC/SI levels have this issue. During the time period when the primary in Percolator has completed the commit but the secondary asynchronous commit is not yet finished, the KV modified by this transaction appears to be partially visible externally.
The reading process is as follows:
- When reading a key, if no conflicting lock is found, return the corresponding value and finish.
- If a lock is found and the current lock corresponds to the
primary
key: If the lock has not timed out, wait. If the lock has timed out, Rollback it and retrieve the information from the previous version, then finish.
- If a lock is found and the current lock corresponds to the
secondary
key, find the information of the primary
specified in the lock, determine whether the current transaction has been successfully committed based on the status of the primary
, and return the corresponding specific value.
TiDB only references the arguments from the Percolator paper. For optimistic and pessimistic transactions, it considers actual application scenarios when implementing them. Whether it’s 2PC or 1PC, the primary key submission is the main focus.
The third state you mentioned won’t occur; otherwise, transactions can’t be guaranteed. How would we proceed then?
The essence of your question is actually about how TiDB ensures transaction consistency and isolation. Let me answer first: it won’t.
Here’s the explanation:
You mentioned the primary and secondary in two-phase commit, indicating that you want to understand a write operation transaction involving multiple rows of data. Here, the primary accurately refers to the primary key of the main row of this transaction, and the secondary refers to the secondary keys of other rows in the transaction besides the main row.
In the second phase of 2PC, after the main row clears the lock information and commits successfully according to commit_ts, to reduce user transaction latency, it will immediately return to the user indicating that the current write transaction has been successfully committed. At this point, the transaction has ended on the user side, and the data has also been persisted in TiKV (it can self-recover even if there is a failure). While returning to the user that the transaction has been successfully committed, TiKV is also concurrently committing the secondary keys according to commit_ts, including clearing the lock information in the lock CF and writing the commit information to the write CF. This is the basic process of writing.
Then your question is “at this time, querying the related key, can only part of the data from the previous transaction be queried,” indicating that during the process where the primary key has been committed but the secondary keys have not, a read transaction comes to read the data of the same primary key and secondary keys.
- First, this read transaction can read the primary key data because it has been committed and the lock information has been cleared.
- Then the key point is whether it can read the latest secondary keys data, which it actually can.
The process is roughly as follows: when this read transaction reads the secondary keys and finds a lock, which is a secondary lock, it then checks the status of the primary lock it points to and finds that the primary lock has been cleared and the main row has been committed, indicating that this transaction has been committed. Therefore, it can read the latest data. At this point, it will interact with the TiKV system to complete the commit and lock clearing of the secondary keys, ensuring that the transaction is fully committed, and then return the data of the secondary keys. Thus, this read transaction can correctly read the latest data, and the situation where only part of the data can be read, as you mentioned, will not occur.
Thank you for the patient and detailed explanation.
The locks on secondary keys should be row locks, which will block other commit operations on that row.
THX.
The lock here is not a row lock, but rather a marker written into the LOCK ColumnFamily in RocksDB.
The locks for secondary keys will add a lock to each key-value that needs to be updated. Internally, RocksDB will write lock information for each key in the lock CF to implement this.
Simply put, for the user, the effect can be understood as a row lock on the data row.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.