Deleting Data from a 1.1 Billion Row Table Becomes Slower Over Time

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 11亿表的数据删除数据越删越慢

| username: zzw6776

[TiDB Usage Environment] Production Environment

When deleting around 600 million of the 1.1 billion rows in this table with the SQL below, the delete statement keeps getting slower: each batch now takes more than 3 seconds, whereas initially it took only 0.5 seconds. I searched the forum and found no solution, and I have not modified the GC time. My guess is that the previously deleted data has not been GC'd yet, so the current deletes have to scan a large amount of dead data.

Executing SQL:

EXPLAIN ANALYZE DELETE FROM h_equip_bin_box_status_history WHERE gmt_create < '2023-07-01 00:00:00' LIMIT 10000;

Execution Plan:

Delete_5	N/A	0	root		time:3.44s, loops:1	N/A	172.7 KB	N/A
└─Limit_9	10000.00	10000	root		time:3.34s, loops:11	offset:0, count:10000	N/A	N/A
  └─TableReader_14	10000.00	10000	root		time:3.34s, loops:10, cop_task: {num: 645, max: 654.6ms, min: 440.1µs, avg: 300.1ms, p95: 505.5ms, max_proc_keys: 992, p95_proc_keys: 224, tot_proc: 3m12.2s, tot_wait: 621ms, rpc_num: 645, rpc_time: 3m13.6s, copr_cache_hit_ratio: 0.09, distsql_concurrency: 60}	data:Limit_13	3.15 MB	N/A
    └─Limit_13	10000.00	10208	cop[tikv]		tikv_task:{proc max:767ms, min:1ms, avg: 327ms, p80:392ms, p95:504ms, iters:720, tasks:645}, scan_detail: {total_process_keys: 10208, total_process_keys_size: 1321747, total_keys: 715402150, get_snapshot_time: 51.6ms, rocksdb: {delete_skipped_count: 1204377, key_skipped_count: 716605941, block: {cache_hit_count: 1022827, read_count: 924, read_byte: 55.0 MB, read_time: 36.8ms}}}	offset:0, count:10000	N/A	N/A
      └─Selection_12	10000.00	10208	cop[tikv]		tikv_task:{proc max:767ms, min:1ms, avg: 327ms, p80:392ms, p95:504ms, iters:720, tasks:645}	lt(lyzhhw4.h_equip_bin_box_status_history.gmt_create, 2023-07-01 00:00:00.000000)	N/A	N/A
        └─TableFullScan_11	12383.59	10208	cop[tikv]	table:h_equip_bin_box_status_history	tikv_task:{proc max:767ms, min:0s, avg: 327ms, p80:392ms, p95:504ms, iters:720, tasks:645}	keep order:false	N/A	N/A
| username: tidb菜鸟一只 | Original post link

I suggest referring to this article:

| username: h5n1 | Original post link

GC is not a physical deletion; it only writes deletion marks in RocksDB. The data is truly removed when RocksDB compacts.

On 6.5.1 you can use non-transactional DML and delete in batches.
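
As a minimal sketch of that suggestion (non-transactional DML needs TiDB v6.1 or later; sharding on an integer primary key `id` is an assumption about this table's schema):

```sql
-- TiDB splits this into many small autocommit transactions, each
-- deleting up to 10000 rows, so no single huge transaction builds up.
-- `id` as the shard column is an assumption; use the table's real key.
BATCH ON id LIMIT 10000
DELETE FROM h_equip_bin_box_status_history
WHERE gmt_create < '2023-07-01 00:00:00';
```

Prefixing the DELETE with `DRY RUN` previews how the batches would be split without executing them.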

| username: zzw6776 | Original post link

When exactly is RocksDB compaction triggered? Is there any related documentation?

| username: ShawnYan | Original post link

Documentation for related parameters
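
For quick reference, the GC-related settings and the current safe point can also be checked from SQL; this query against the standard `mysql.tidb` system table is generic TiDB, not specific to this thread:

```sql
-- Shows tikv_gc_life_time, tikv_gc_run_interval, tikv_gc_safe_point,
-- tikv_gc_last_run_time, and similar rows. Versions deleted before the
-- safe point are GC candidates, but disk space comes back only after
-- RocksDB compaction removes the tombstones.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME LIKE 'tikv_gc%';
```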

| username: h5n1 | Original post link

You can take a look at this first

| username: TiDBer_vfJBUcxl | Original post link

Check this out:

| username: Kongdom | Original post link

Did this statement fail to use the index?

Wasn't it said earlier that this value represents already-deleted versions?

| username: zhanggame1 | Original post link

This is a characteristic of TiDB: the more you delete, the slower it gets, and regions are not released after deletion. It's best to use partitioned tables and drop partitions to remove old data.

The official documentation explains why deletes get slower: TiDB Best Practices | PingCAP Documentation Center
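
A hedged sketch of what that could look like here (the schema is hypothetical beyond `id` and `gmt_create`; note that TiDB requires the partitioning column to appear in every unique key, hence the composite primary key):

```sql
-- Hypothetical partitioned layout: old months are then removed with
-- DROP PARTITION, a DDL that releases the regions immediately instead
-- of leaving hundreds of millions of MVCC tombstones behind.
CREATE TABLE h_equip_bin_box_status_history_p (
  id BIGINT NOT NULL,
  gmt_create DATETIME NOT NULL,
  PRIMARY KEY (id, gmt_create)
) PARTITION BY RANGE (TO_DAYS(gmt_create)) (
  PARTITION p202306 VALUES LESS THAN (TO_DAYS('2023-07-01')),
  PARTITION p202307 VALUES LESS THAN (TO_DAYS('2023-08-01')),
  PARTITION pmax VALUES LESS THAN (MAXVALUE)
);

ALTER TABLE h_equip_bin_box_status_history_p DROP PARTITION p202306;
```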

| username: zzw6776 | Original post link

There is an index, but it is not being used. I have analyzed the table and it still isn't picked. Even forcing it with FORCE INDEX(idx_gmt_create) gives the same result: it still takes a long time.

| username: tidb菜鸟一只 | Original post link

Change the time condition to gmt_create < '2023-07-01 00:00:00' AND gmt_create > '2023-06-30 00:00:00', so each batch only scans a bounded range. You can refer to the article I forwarded above to generate the delete script automatically.
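
Concretely, the bounded batch could look like this (the one-day window is illustrative; a driving script would advance the window as batches complete):

```sql
-- Bounding both ends keeps each batch on a narrow range, so the scan
-- does not start from the oldest data and wade through the tombstones
-- left by every earlier batch.
DELETE FROM h_equip_bin_box_status_history
WHERE gmt_create > '2023-06-30 00:00:00'
  AND gmt_create < '2023-07-01 00:00:00'
LIMIT 10000;
```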

| username: redgame | Original post link

Getting slower and slower the more you delete? I'd say that's not common.

| username: tony5413 | Original post link

I'd like to ask: since you are deleting all the data before July 1, 2023, why not create a new table instead of deleting such a large amount of data? Won't there be a lot of fragmentation left after the deletion?

| username: 像风一样的男子 | Original post link

Deleting a large amount of data is a pain point for TiDB.

| username: Kongdom | Original post link

:sweat_smile: Uh… suddenly found a shortcut~

| username: zhanggame1 | Original post link

After looking into this today, I found that delete operations do not release the regions already occupied, do not shrink region size, and cannot free the occupied disk space. The deletion process itself writes a large amount of data, so the more you delete, the more disk space gets occupied.

| username: zzw6776 | Original post link

Because the underlying principle of deletion is just appending a deletion mark… space is only truly freed when compaction runs. However, I don't quite understand when compaction is triggered. It's been several days since I deleted the data, and the number of regions hasn't decreased.

| username: zzw6776 | Original post link

Since there are more than 300 million records after July 1st, even copying into a new table wouldn't be easy to pull off… I tried INSERT INTO … SELECT before and found it couldn't make any progress at all.
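
If the copy-to-a-new-table route is ever revisited, non-transactional DML might make it workable; this is only a sketch under assumptions (TiDB v6.5+ for batched INSERT … SELECT, an assumed shard column `id`, and hypothetical `_new`/`_old` table names):

```sql
-- Clone the schema, copy the rows to keep in small batches, then swap.
CREATE TABLE h_equip_bin_box_status_history_new
  LIKE h_equip_bin_box_status_history;

-- The shard column may need a db.table.column qualifier per the
-- non-transactional DML syntax; `id` is an assumed primary key.
BATCH ON h_equip_bin_box_status_history.id LIMIT 10000
INSERT INTO h_equip_bin_box_status_history_new
SELECT * FROM h_equip_bin_box_status_history
WHERE gmt_create >= '2023-07-01 00:00:00';

RENAME TABLE h_equip_bin_box_status_history TO h_equip_bin_box_status_history_old,
             h_equip_bin_box_status_history_new TO h_equip_bin_box_status_history;
```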

| username: cy6301567 | Original post link

Has such a large table been partitioned? Without partitioning it will definitely be slow; with partitions, deleting data is much faster.

| username: h5n1 | Original post link

> Because the underlying principle of deletion is just appending a deletion mark… space is only truly freed when compaction runs. However, I don't quite understand when compaction is triggered. It's been several days since I deleted the data, and the number of regions hasn't decreased.

Manually compact the TiKV instances one by one during off-peak hours.
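
For reference, the documented tool for this is tikv-ctl's compact command; the address below is a placeholder for a real TiKV instance, and forcing the bottommost level is what actually reclaims the space held by tombstones:

```shell
# Run against each TiKV instance in turn, during off-peak hours.
tikv-ctl --host 127.0.0.1:20160 compact -d kv -c write --bottommost force
tikv-ctl --host 127.0.0.1:20160 compact -d kv -c default --bottommost force
```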