Why does UnsafeDestroyRange sometimes fail to delete all data?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 为何UnsafeDestroyRange有一定概率删数据删不干净?

| username: kuiper

[TiDB Usage Environment] Production Environment
[TiDB Version] Not using TiDB; only using TiKV for KV storage, accessed through TxnClient
[Encountered Issues]

  1. UnsafeDestroyRange sometimes fails to delete all data. In testing, about 6% of the time, data that should have been destroyed by DestroyRange could still be scanned with txn.Iter. Why does this happen? I briefly reviewed the TiKV code: the current implementation first calls Rocksdb::DeleteFilesInRange to try to free space quickly and, if that fails, iterates over the range and writes tombstones one by one. I don’t see how this can leave residual data. Also, could the individual Deletes be replaced with Rocksdb::DeleteRange, which writes range tombstones? Iterating and deleting key by key seems costly, and the GCWorker’s queue fills up easily.

  2. Running UnsafeDestroyRange on a very large range (TB scale and above) saturates TiKV’s IO for several hours. Is there a good solution for this?
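
To make the two-step deletion described in item 1 concrete, here is a toy Python model. It is only a sketch and not TiKV or RocksDB code: `SstFile`, `delete_files_in_range`, and `delete_remaining_keys` are hypothetical names, and the model assumes the file-level step can only drop files that lie entirely inside the range, so keys in files that straddle a boundary are left for the key-by-key step. Real RocksDB writes tombstones instead of rewriting files; the toy simply removes the keys and counts the point deletes.

```python
from dataclasses import dataclass


@dataclass
class SstFile:
    """Toy stand-in for an SST file: a non-empty, sorted list of keys."""
    keys: list

    @property
    def smallest(self):
        return self.keys[0]

    @property
    def largest(self):
        return self.keys[-1]


def delete_files_in_range(files, start, end):
    """Step 1: drop only the files whose keys all lie inside [start, end)."""
    return [f for f in files if not (start <= f.smallest and f.largest < end)]


def delete_remaining_keys(files, start, end):
    """Step 2: remove leftover keys in [start, end) one at a time.

    Returns the surviving files and the number of point deletes issued.
    """
    survivors, point_deletes = [], 0
    for f in files:
        kept = [k for k in f.keys if not (start <= k < end)]
        point_deletes += len(f.keys) - len(kept)
        if kept:
            survivors.append(SstFile(kept))
    return survivors, point_deletes


if __name__ == "__main__":
    files = [
        SstFile(["a1", "a2"]),
        SstFile(["a9", "b1", "b2"]),  # straddles the start of the range
        SstFile(["b3", "b4"]),        # entirely inside the range
        SstFile(["b9", "c1"]),        # straddles the end of the range
    ]
    after_files = delete_files_in_range(files, "b", "c")
    after_keys, n = delete_remaining_keys(after_files, "b", "c")
    print([f.keys for f in after_files])
    print([f.keys for f in after_keys], n)
```

In this model only one of the four files is removed by the file-level step, and the remaining three keys in the range each need their own delete, which is the cost the question is concerned about for large ranges.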

I hope someone from the official team can clarify these issues. Thank you very much!

| username: jiyf

When I previously looked at the TiDB GC code, I also wondered why UnsafeDestroyRange needs to be called twice:

  1. The first call executes UnsafeDestroyRange based on the safepoint.
  2. The second call comes 24 hours after the safepoint.

This RFC explains why the first cleanup may be incomplete:

And why do we need to check it one more time after 24 hours? After deleting the range the first time, if coincidentally PD is trying to move a Region or something, some data may still appear in the range. So check it one more time after a proper time to greatly reduce the possibility.
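
The effect of the second call can be pictured with a toy Python model. This is a sketch under simplifying assumptions, not how TiDB or TiKV implement it: the store is a plain dict, `destroy_range`, `scan`, and `destroy_twice` are hypothetical helpers, and the `between_passes` callback stands in for whatever happens during the waiting period, such as a Region move bringing stale data back into the range.

```python
def destroy_range(store, start, end):
    """Delete every key in [start, end) from a dict-backed toy store."""
    for key in [k for k in store if start <= k < end]:
        del store[key]


def scan(store, start, end):
    """Return the sorted keys still visible in [start, end)."""
    return sorted(k for k in store if start <= k < end)


def destroy_twice(store, start, end, between_passes=None):
    """Run one destroy pass, let other activity happen, then run a second pass.

    Returns the keys visible before the second pass and after it.
    """
    destroy_range(store, start, end)
    if between_passes is not None:
        between_passes(store)  # stands in for the waiting period
    after_first = scan(store, start, end)
    destroy_range(store, start, end)
    after_second = scan(store, start, end)
    return after_first, after_second


if __name__ == "__main__":
    store = {"t1_a": 1, "t1_b": 2, "t2_a": 3}

    def region_move(s):
        # Assumed scenario: stale data reappears in the destroyed range.
        s["t1_b"] = 2

    print(destroy_twice(store, "t1_", "t2_", region_move))
    print(store)
```

A scan taken between the two passes can still see the reintroduced key, which matches the residue observed with txn.Iter; in this model the second pass removes it.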

The IO saturation might be caused by compaction:

After deleting all files in a very-large range, it may trigger RocksDB’s compaction (even if it is not really needed) and even cause stalling.

| username: kuiper

Okay, thank you for your reply~
