Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb 删除语句在tikv里面怎么存储的
【TiDB Usage Environment】Production Environment
【TiDB Version】
【Reproduction Path】What operations were performed when the issue occurred
【Encountered Issue: Issue Phenomenon and Impact】
When deleting a piece of data in TiDB, how is it stored in TiKV?
Key_delete → Value?
Does anyone know the principle? Please share.
【Resource Configuration】
Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】
TiDB uses TiKV to store data at the underlying level, and TiKV employs the LSM tree data structure, which is an append only
model. This means that all changes to the data are reflected in appends. When a piece of data is deleted, the key of that data is marked as deleted rather than being directly removed from the disk. This key marked for deletion will be cleaned up during subsequent GC processes. Therefore, the actual operation of deleting a piece of data is to insert the key of that data into TiKV and set its value to empty. This operation generates a Key_delete with an empty corresponding value.
In TiDB, the data key is not the row_id, but is composed of the table’s primary key and indexes. Data in TiDB is stored in rows, and each row of data has a unique primary key. The value of the primary key can be of any type, including integers, strings, etc. In TiDB, both primary keys and indexes are stored in the form of B+ trees. Each node contains multiple key-value pairs, where the key is the value of the primary key or index, and the value is a pointer to the data row. When TiDB executes a query operation, it searches the B+ tree based on the query conditions to find the corresponding key-value pair, and then uses the pointer to locate the corresponding data row. Therefore, the data key in TiDB is not the row_id, but is composed of the primary key and indexes.
If you are interested in the source code, you can check out:
This is a continuous learning process. If you are interested, make sure to keep learning and keep watching.
I understand that when deleting,
the key is
tablePrefix{TableID}_recordPrefixSep{RowID}delete{time}
This can be understood in many ways,
but it must conform to the semantics of the RocksDB compaction filter or the compaction process.
TiDB uses a transaction model based on Percolator, abstracting a row of data into three CF (column families) for storage: default, write, and lock. Among them:
- default CF stores the actual data
${key}_${start_ts} --> ${value}
- write CF stores the version information of the data, where commit_ts represents the actual version of a record
${key}_${commit_ts} --> ${start_ts}
- lock CF stores lock information. Transactions in the process of being committed will add a lock, which includes the location of the primary lock
${key} --> ${start_ts, primary_key, ..etc}
The image is not available for translation. Please provide the text content directly for translation.
At the region level, TiKV will append a record that has been deleted for a day. At the LSM level, during splitting or merging, the deleted data will be physically removed.
Mark the key of the data as deleted.
Add more versions, then perform garbage collection (GC).