Abnormal TiDB Data Growth, Suspected GC Not Cleaning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb数据增长异常,怀疑gc没有清理

| username: tug_twf

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.1.7
[Reproduction Path]
[Encountered Problem] The storage space has recently increased significantly, but there has been no major change in the business workload.

From the GC perspective, it seems to be progressing normally.

However, querying some frequently updated tables shows that they occupy a lot of space, so I still suspect that GC is not progressing normally.

| username: zhanggame1 | Original post link

Run a GROUP BY on the table below to see the total region size for each table.
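
As a rough sketch of that grouping: the table referred to above is not preserved in this thread, so the query below assumes it is INFORMATION_SCHEMA.TIKV_REGION_STATUS, where APPROXIMATE_SIZE is the per-region size in MB. The helper `total_size_by_table` is a hypothetical name and does the same aggregation on rows that have already been fetched. A region that spans several tables or indexes may appear in more than one row, so treat the totals as approximate.

```python
from collections import defaultdict

# Assumed query (the original table name is not shown in the thread).
# Run it with any MySQL-compatible client connected to TiDB.
REGION_SIZE_SQL = """
SELECT DB_NAME, TABLE_NAME, SUM(APPROXIMATE_SIZE) AS TOTAL_MB
FROM INFORMATION_SCHEMA.TIKV_REGION_STATUS
GROUP BY DB_NAME, TABLE_NAME
ORDER BY TOTAL_MB DESC
"""


def total_size_by_table(rows):
    """Client-side equivalent of the GROUP BY above.

    rows: iterable of (db_name, table_name, approximate_size_mb) tuples.
    Returns [((db_name, table_name), total_mb), ...], largest first.
    """
    totals = defaultdict(int)
    for db_name, table_name, size_mb in rows:
        totals[(db_name, table_name)] += size_mb
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Tables whose total is far larger than their logical data size are the ones worth checking for uncollected old versions.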

| username: TIDB-Learner | Original post link

Could it be that the regions were not merged or compacted properly?

| username: Soysauce520 | Original post link

You can check the QPS to preliminarily confirm whether there is a change in the business volume. The statement about the region size being 1T seems to be incorrect.

| username: tug_twf | Original post link

Current troubleshooting progress: Confirmed that there is an issue with GC.

  1. Checked the region information in PD, no empty regions.
  2. Found the related table, but the key is an index, and the index with ID 2 cannot be found.
  3. Admin check on the related table shows no anomalies.
  4. pd-ctl region check (down-peer and miss-peer) shows no anomalies.

Next step:
Tried rebuilding the related table. After the rebuild, GC cleanup still does not proceed.

| username: h5n1 | Original post link

Set gc.enable-compaction-filter: false to disable TiKV’s compaction filter GC and use the old GC mode for multi-version GC.
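
A minimal sketch of how this could be applied online, assuming the cluster accepts dynamic TiKV configuration through SQL (worth verifying against the documentation for 6.1.x). The two statements are held as strings so they can be run with any MySQL client; `compaction_filter_disabled` is a hypothetical helper that checks the SHOW CONFIG result, whose rows are assumed to be (Type, Instance, Name, Value) with string values.

```python
# Statements to run against TiDB; the backticks are needed because of the hyphens.
CHECK_SQL = "SHOW CONFIG WHERE type = 'tikv' AND name LIKE '%enable-compaction-filter%'"
DISABLE_SQL = "SET CONFIG tikv `gc.enable-compaction-filter` = false"


def compaction_filter_disabled(show_config_rows):
    """Return True only if every TiKV instance reports the filter as disabled.

    show_config_rows: iterable of (type, instance, name, value) tuples
    taken from the result of CHECK_SQL.
    """
    values = [
        value
        for _type, _instance, name, value in show_config_rows
        if name == "gc.enable-compaction-filter"
    ]
    # An empty result means nothing was verified, so report False.
    return bool(values) and all(v.lower() == "false" for v in values)
```

To make the setting survive restarts, the same key would also need to go into the TiKV configuration file under the gc section.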

| username: zhang_2023 | Original post link

There seems to be an issue with region merging.

| username: zhanggame1 | Original post link

There might be an issue with compaction. Try running a manual compaction during a period of low load.
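
For illustration, a manual compaction could be scripted roughly as follows. This is a sketch that assumes tikv-ctl is on the PATH and that its compact subcommand accepts -d, -c and --bottommost as I recall from the tikv-ctl documentation; please confirm with `tikv-ctl compact --help` on your version. The function name and the example address are hypothetical. The code only builds the argument list and does not run anything by itself.

```python
import shlex

VALID_CFS = ("default", "lock", "write")


def compact_argv(tikv_addr, cf="write", bottommost="force"):
    """Build the argv for compacting one column family on one TiKV node.

    tikv_addr: "ip:port" of the TiKV instance (hypothetical value in the demo).
    cf: column family; old MVCC versions are recorded in the write CF.
    bottommost: "force" asks the compaction to include bottommost files.
    """
    if cf not in VALID_CFS:
        raise ValueError(f"unknown column family: {cf}")
    return [
        "tikv-ctl", "--host", tikv_addr,
        "compact", "-d", "kv", "-c", cf,
        "--bottommost", bottommost,
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(compact_argv("127.0.0.1:20160")))
```

Running this one node and one column family at a time (write first, then default) keeps the extra I/O manageable; the list can be passed to subprocess.run once the printed command has been reviewed.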

| username: zhaokede | Original post link

Run the compaction manually during idle periods, or restart the cluster.

| username: Sword | Original post link

Upgrade to the latest version and see whether the problem persists.

| username: TiDBer_H5NdJb5Q | Original post link

How about using ctl to check the GC safepoint?
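
A small sketch for interpreting the result: pd-ctl (for example its service-gc-safepoint command) reports safe points as TSO integers, and a TSO stores the physical time in milliseconds in its high bits with an 18-bit logical counter in the low bits. The helper names below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
LOGICAL_BITS = 18  # low bits of a TSO hold the logical counter


def tso_to_datetime(tso):
    """Convert a TSO to the UTC wall-clock time of its physical part."""
    physical_ms = tso >> LOGICAL_BITS
    return EPOCH + timedelta(milliseconds=physical_ms)


def safepoint_lag(safe_point_tso, now=None):
    """Return how far the safe point is behind `now` (defaults to current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - tso_to_datetime(safe_point_tso)
```

With the default tidb_gc_life_time of 10m, a lag of hours or days suggests something is holding the safe point back, such as a long-running transaction or a service safe point registered by a backup or replication tool.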

| username: TiDBer_QYr0vohO | Original post link

Manually compact it during an idle period.