Why does UnsafeDestroyRange sometimes fail to delete all the data?

translator_bot · June 23, 2024, 4:29am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 为何UnsafeDestroyRange有一定概率删数据删不干净?

| username: kuiper

[TiDB Usage Environment] Production Environment
[TiDB Version] Not using TiDB, only using TiKV for KV storage, accessed through TxnClient
[Encountered Issues]

UnsafeDestroyRange sometimes fails to delete all the data in its range. During testing, with a probability of about 6%, data that should have been destroyed by DestroyRange could still be scanned with txn.Iter. Why does this happen? I briefly reviewed the TiKV code. The current implementation first calls Rocksdb::DeleteFilesInRange to try to free up space quickly. If that fails, it iterates over the range and writes tombstones one key at a time. Even so, I don’t understand how leftover data can appear.

Also, could the individual Deletes be replaced with Rocksdb::DeleteRange, which writes range tombstones? Iterating and deleting key by key looks costly, and the GCWorker’s queue fills up easily.

In addition, running UnsafeDestroyRange on a very large range (TB scale or above) saturates TiKV’s IO for several hours. Is there a good solution for this?
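
To make the two steps described above concrete, here is a toy model in Go. It is only a sketch, not TiKV or RocksDB code: `sstFile` and `destroyRange` are hypothetical names, and it assumes the first step only drops SST files that lie entirely inside the range, so keys in files that merely overlap the range boundary are left for the key-by-key step.

```go
package main

import "fmt"

// sstFile is a toy stand-in for a RocksDB SST file: a sorted run of keys.
type sstFile struct {
	keys []string
}

// destroyRange models the two steps for the half-open range [start, end).
// Step 1 drops every file whose first and last keys both fall inside the range.
// Step 2 counts one tombstone per remaining in-range key and hides that key.
func destroyRange(files []sstFile, start, end string) (kept []sstFile, dropped, tombstones int) {
	inRange := func(k string) bool { return k >= start && k < end }
	for _, f := range files {
		if len(f.keys) > 0 && inRange(f.keys[0]) && inRange(f.keys[len(f.keys)-1]) {
			dropped++ // whole file is inside the range, so the file itself is removed
			continue
		}
		var live []string
		for _, k := range f.keys {
			if inRange(k) {
				tombstones++ // this key needs its own delete marker
				continue
			}
			live = append(live, k)
		}
		kept = append(kept, sstFile{keys: live})
	}
	return kept, dropped, tombstones
}

func main() {
	files := []sstFile{
		{keys: []string{"a1", "a2", "b1"}}, // straddles the start of the range
		{keys: []string{"b2", "b3", "b4"}}, // fully inside the range
		{keys: []string{"b5", "c1"}},       // straddles the end of the range
	}
	kept, dropped, tombstones := destroyRange(files, "b", "c")
	fmt.Println("files kept:", len(kept), "files dropped:", dropped, "tombstones:", tombstones)
}
```

In this model the cost of the second step grows with the number of keys in boundary files, which is the part a range tombstone would replace with a single write.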

I hope someone from the official team can clarify these issues. Thank you very much!

| username: jiyf

When I previously looked at the TiDB GC code, I also wondered why UnsafeDestroyRange needs to be called twice:

The first call executes UnsafeDestroyRange according to the safepoint.
The second call runs 24 hours after the safepoint.

This RFC explains why the first cleanup may not remove everything:

And why do we need to check it one more time after 24 hours? After deleting the range the first time, if coincidentally PD is trying to move a Region or something, some data may still appear in the range. So check it one more time after a proper time to greatly reduce the possibility.
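
One possible reading of that explanation can be sketched as a toy model in Go. Everything here is hypothetical (`store`, `destroyRange`, `applySnapshot`, `scan` are illustrative names, not TiKV or client-go APIs), and it assumes the leftover data arrives when a Region peer is moved onto a store after that store has already finished its first pass.

```go
package main

import "fmt"

// store is a toy stand-in for the local key space of one TiKV node.
type store map[string]string

// destroyRange removes every key in the half-open range [start, end).
func (s store) destroyRange(start, end string) {
	for k := range s {
		if k >= start && k < end {
			delete(s, k) // deleting during range iteration is safe in Go
		}
	}
}

// scan counts the keys in [start, end) that are still visible.
func (s store) scan(start, end string) int {
	n := 0
	for k := range s {
		if k >= start && k < end {
			n++
		}
	}
	return n
}

// applySnapshot models a Region peer being moved onto this store: the
// incoming data is written as-is, even into an already destroyed range.
func (s store) applySnapshot(snap map[string]string) {
	for k, v := range snap {
		s[k] = v
	}
}

func main() {
	s := store{"b1": "v", "b2": "v", "c1": "v"}

	s.destroyRange("b", "c") // first pass
	fmt.Println("after first pass:", s.scan("b", "c"))

	s.applySnapshot(map[string]string{"b3": "v"}) // Region moved in afterwards
	fmt.Println("after region move:", s.scan("b", "c"))

	s.destroyRange("b", "c") // second pass, run later
	fmt.Println("after second pass:", s.scan("b", "c"))
}
```

The second pass only helps if it runs after such moves have settled, which matches the RFC's wording that it greatly reduces the possibility rather than eliminating it.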

The IO saturation might be caused by compaction:

After deleting all files in a very-large range, it may trigger RocksDB’s compaction (even if it is not really needed) and even cause stalling.

| username: kuiper

Okay, thank you for your reply~
