Issues Related to Delete Within Transactions

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 事务内Delete相关问题

| username: TiDBer_U58GZgGJ

According to the official blog, all DML operations are held in the transaction's local buffer. Does that mean a DELETE also only touches the buffer and does not actually delete anything from the KV store before commit? If so, wouldn't a SELECT issued after the DELETE, in the same transaction, still read the deleted data from the KV store? In practice this does not happen. How is this resolved?

| username: WinterLiu | Original post link

It is recommended to look into the implementation of distributed transactions and MVCC. TiDB ensures consistent reads of data.

| username: tidb狂热爱好者 | Original post link

Deleting within a transaction can be very slow. It is recommended to use batch mode for deletion.
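As a rough illustration of what batch deletion can look like, the sketch below deletes rows in small chunks and commits after each chunk, so no single transaction grows large. It is only a sketch under stated assumptions: the standard-library `sqlite3` module stands in for a TiDB connection so the example is runnable, and the table `t`, the column `expired`, and the function `delete_in_batches` are hypothetical names. Against TiDB you would use a MySQL-compatible driver, and the batch could be written as a `DELETE ... WHERE ... LIMIT n` statement instead of the subquery used here.

```python
import sqlite3


def delete_in_batches(conn, batch_size=1000):
    """Delete rows with expired = 1 in small transactions; return rows deleted."""
    total = 0
    while True:
        # sqlite3 stand-in: pick one batch of ids with a subquery.
        cur = conn.execute(
            "DELETE FROM t WHERE id IN "
            "(SELECT id FROM t WHERE expired = 1 LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # each batch is its own short transaction
        if cur.rowcount == 0:
            break  # nothing left to delete
        total += cur.rowcount
    return total


# Hypothetical demo data: 10 rows, the odd ids are marked as expired.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, expired INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, i % 2) for i in range(10)])
conn.commit()

deleted = delete_in_batches(conn, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(deleted, remaining)
```

The trade-off is that the deletion as a whole is no longer atomic: if the loop stops halfway, the batches already committed stay deleted.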

| username: zhanggame1 | Original post link

The delete operation is itself the insertion of a key-value pair. If you delete and then select, the lookup in TiKV finds a delete record for the original key, which indicates that the key has been deleted.

| username: TiDBer_U58GZgGJ | Original post link

You mentioned that a delete is also the insertion of a KV pair. Is this KV pair written to the transaction buffer before commit, or directly into TiKV?

| username: yiduoyunQ | Original post link

Which blog?

| username: zhaokede | Original post link

Consistency ensures that deleted data will not be read.

| username: TiDBer_U58GZgGJ | Original post link

The one about transactions.

| username: TIDB-Learner | Original post link

A delete is not a true deletion; it only marks the data as deleted. If the deleting transaction has not yet been committed, other sessions can still read the data. Once it has been committed, transactions that started before the delete was committed can still read the data from their snapshot; transactions that start afterwards cannot. This follows from MVCC (snapshot reads) and the ACID properties of transactions. It is worth studying TiDB's distributed read-write principles and its timestamp, memory, and log management mechanisms. Let's encourage each other.

| username: Soysauce520 | Original post link

When scanning KV pairs, the scan skips versions that are not visible at the read timestamp as well as those with delete markers. Your EXPLAIN output will show these skips in the execution plan.

| username: TiDBer_tvqzG8Dk | Original post link

Transaction consistency

| username: tidb狂热爱好者 | Original post link

Deleting within a transaction can be very slow.

| username: wangcw | Original post link

Deleting within a transaction is too slow and does not immediately release space, so usually only logical deletion is performed.

| username: Soysauce520 | Original post link

In the execution plan, you will see the skip operation, which refers to the records marked for deletion. This means they are filtered out during the reading phase.

| username: 迷途小书童 | Original post link

Looking forward to an article from the expert explaining it clearly.

| username: xiaohaozifeifeifei | Original post link

Consistent read, marked for deletion.

| username: yg_1988 | Original post link

On commit, data is written in the LSM-tree manner. A delete operation also writes a new record, and historical data is only truly cleared during the merge (compaction) process.
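To make the LSM-tree point concrete, here is a minimal toy sketch of the idea: a delete appends a tombstone entry rather than removing anything, reads let the newest entry win, and only a merge pass physically drops the old value and the tombstone. The class `TinyLSM` and all of its methods are hypothetical names for illustration; real engines such as RocksDB are far more involved, and in TiDB the MVCC garbage collection also plays a part in deciding which old versions may be removed.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class TinyLSM:
    """Toy LSM store: a list of immutable runs, oldest first."""

    def __init__(self):
        self.runs = []  # each run is a dict of key -> value or TOMBSTONE

    def write_run(self, entries):
        # Writes (including deletes) are appended as a new run.
        self.runs.append(dict(entries))

    def get(self, key):
        for run in reversed(self.runs):  # the newest run wins
            if key in run:
                value = run[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        merged = {}
        for run in self.runs:  # later runs overwrite earlier ones
            merged.update(run)
        # Only now are deleted keys and their tombstones physically dropped.
        self.runs = [{k: v for k, v in merged.items() if v is not TOMBSTONE}]

    def stored_entries(self):
        return sum(len(run) for run in self.runs)


lsm = TinyLSM()
lsm.write_run({"a": 1, "b": 2})
lsm.write_run({"a": TOMBSTONE})  # the delete is just another write
before = lsm.stored_entries()    # old value and tombstone both still stored
lsm.compact()
after = lsm.stored_entries()     # space is reclaimed only after the merge
print(before, after, lsm.get("a"), lsm.get("b"))
```

This is also why a delete does not release space immediately: until the merge runs, the store holds more entries than before the delete.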

| username: TiDBer_pOZcnpJA | Original post link

TiDB uses MVCC to handle concurrent reads and writes. When performing a DELETE operation, it actually creates a new version of the row marked as deleted. Subsequent SELECT operations will read data based on transaction visibility rules and typically will not see the deleted version.
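The visibility rule described here can be sketched in a few lines. This is a simplified model, not TiDB's actual code: `MVCCStore`, `commit_put`, `commit_delete`, and `snapshot_get` are hypothetical names, timestamps are plain integers, and commits are assumed to arrive in increasing timestamp order. A delete is stored as one more version, and a reader sees the newest version whose commit timestamp is not later than its own start timestamp.

```python
DELETED = None  # a version whose value is None marks the row as deleted


class MVCCStore:
    """Toy multi-version store keyed by commit timestamp."""

    def __init__(self):
        # key -> list of (commit_ts, value), appended in ascending commit_ts
        self.versions = {}

    def commit_put(self, key, value, commit_ts):
        self.versions.setdefault(key, []).append((commit_ts, value))

    def commit_delete(self, key, commit_ts):
        # A delete does not remove old versions; it adds a delete marker.
        self.commit_put(key, DELETED, commit_ts)

    def snapshot_get(self, key, start_ts):
        # Return the newest version visible at start_ts. None means the row
        # is deleted or did not exist yet at that snapshot.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= start_ts:
                return value
        return None


store = MVCCStore()
store.commit_put("row1", "v1", commit_ts=10)
store.commit_delete("row1", commit_ts=20)

old_reader = store.snapshot_get("row1", start_ts=15)  # started before the delete
new_reader = store.snapshot_get("row1", start_ts=25)  # started after the delete
print(old_reader, new_reader)
```

A reader whose snapshot predates the delete still sees the old value, while a later reader lands on the delete marker and gets nothing back.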

| username: 有猫万事足 | Original post link

You can take a look at this article. Before commit, this KV pair is in the cache.

Additionally, TiDB does not support READ-UNCOMMITTED. So you don’t need to worry about other threads needing to access this part of the data in the write cache.
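This answers the original question, and a small sketch may help. The model below is an assumption-laden simplification, not TiDB's implementation: `Txn` and its methods are hypothetical names, a plain dict stands in for the committed data in TiKV, and commit is a simple loop rather than a real two-phase commit. The point it shows is that a delete writes a tombstone into the transaction's own buffer, and reads inside that transaction check the buffer before falling back to committed data.

```python
TOMBSTONE = b""  # assumption for this sketch: an empty value marks a buffered delete


class Txn:
    """Toy transaction with a private write buffer over shared committed data."""

    def __init__(self, committed):
        self.committed = committed  # dict standing in for the KV store
        self.buffer = {}            # uncommitted writes, private to this transaction

    def put(self, key, value):
        self.buffer[key] = value

    def delete(self, key):
        # Nothing is removed from the store yet; only the buffer changes.
        self.buffer[key] = TOMBSTONE

    def get(self, key):
        if key in self.buffer:      # the transaction's own writes win
            value = self.buffer[key]
            return None if value == TOMBSTONE else value
        return self.committed.get(key)

    def commit(self):
        # Simplified: apply the buffer to the stand-in store in one step.
        for key, value in self.buffer.items():
            if value == TOMBSTONE:
                self.committed.pop(key, None)
            else:
                self.committed[key] = value
        self.buffer.clear()


storage = {"k1": b"v1"}
txn = Txn(storage)
txn.delete("k1")

own_read = txn.get("k1")             # the deleting transaction no longer sees k1
other_read = Txn(storage).get("k1")  # another transaction still sees the committed value
txn.commit()
after_commit = Txn(storage).get("k1")
print(own_read, other_read, after_commit)
```

Because the tombstone lives only in the deleting transaction's buffer until commit, no other transaction can observe it, which matches the absence of READ-UNCOMMITTED.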