GC is running normally, but the count of "versions deleted or overwritten but not yet GC'd" is still too high

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: GC正常但还是显示“含已删除或覆盖但未 GC 的版本”数太多

| username: wakaka

[TiDB Usage Environment] Production
[TiDB Version] 5.2.2
[Encountered Problem] Simple queries are very slow
[Reproduction Path]


[Problem Phenomenon and Impact]



| username: xfworld | Original post link

This kind of problem can only be avoided by preventing full table scans. I shared this before :cowboy_hat_face:

| username: wakaka | Original post link

Running “select count(1) from t” is very fast and takes less than a second, while the query with conditions shown in the picture above takes around 5 seconds.

| username: wakaka | Original post link

Could you please share the article again? I couldn’t find it.

| username: xfworld | Original post link

“select count(1) from t” directly uses the primary key index and will not perform a full table scan.

Offline event in Wuhan,

| username: wakaka | Original post link

I read the expert's article, which talks about avoiding full table scans and about GC. My entire table currently has only 18,000 rows, and GC is functioning normally, so I don’t understand why there are still so many expired keys. Moreover, when I run the same statement again, the execution plan is identical, but on the runs that finish quickly there is no key_skipped_count.
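To compare a slow run with a fast one, it can help to put key_skipped_count next to the number of keys actually processed. The following is a minimal sketch, assuming the execution info printed by EXPLAIN ANALYZE contains a scan_detail fragment with total_process_keys, total_keys and key_skipped_count fields. The sample string is made up for illustration and is not taken from this cluster, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Counter names assumed to appear in the scan_detail part of the execution info.
COUNTER_NAMES = ("total_process_keys", "total_keys", "key_skipped_count")


def scan_counters(execution_info: str) -> dict:
    """Pull the integer counters listed in COUNTER_NAMES out of a text blob."""
    counters = {}
    for name in COUNTER_NAMES:
        match = re.search(rf"\b{name}:\s*(\d+)", execution_info)
        if match:
            counters[name] = int(match.group(1))
    return counters


def skipped_per_processed(counters: dict) -> Optional[float]:
    """Return skipped keys per processed key, or None if nothing was processed."""
    processed = counters.get("total_process_keys", 0)
    if processed == 0:
        return None
    return counters.get("key_skipped_count", 0) / processed


if __name__ == "__main__":
    # Hypothetical execution info for a slow run on an 18,000-row table.
    sample = (
        "scan_detail: {total_process_keys: 18000, total_keys: 2018001, "
        "rocksdb: {delete_skipped_count: 0, key_skipped_count: 2000000}}"
    )
    counters = scan_counters(sample)
    print(counters, skipped_per_processed(counters))
```

A ratio far above 1 on a small table would suggest that the scan is stepping over many old versions per live row, which matches the symptom described here.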

| username: h5n1 | Original post link

You may have hit this bug: the data does not meet the conditions for GC, so a manual compaction is needed.

| username: xfworld | Original post link

The pitfalls have been pointed out to you, make sure to avoid them in time :rofl:

| username: wakaka | Original post link

I see that the regions of this table are distributed across all TiKV and TiFlash nodes… Do I need to perform this tikv-ctl operation on all 10+ machines?

| username: wakaka | Original post link

Yes, I see it. Thank you! Do I have to use tikv-ctl to execute commands on each TiKV node? Is there a quicker way to handle this? This table is frequently truncated.

| username: xfworld | Original post link

You can also run the compaction against the whole cluster at once.

Manually compact the data of the entire TiKV cluster

The compact-cluster command can manually compact the entire TiKV cluster. The meaning and usage of the parameters of this command are the same as those of the compact command.
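As a rough illustration of what such a call could look like, here is a minimal sketch that only assembles and prints a compact-cluster command line without running it. The PD endpoint is a placeholder, the helper name is hypothetical, and the flags (-d, -c, --threads, --bottommost) are assumed to behave as documented for the compact command, so they should be checked against the help output of the installed tikv-ctl version before use.

```python
import shlex

# Values assumed to be accepted by --bottommost, per the compact command docs.
BOTTOMMOST_MODES = {"default", "skip", "force"}


def compact_cluster_command(pd_endpoint, threads=1, column_family="write",
                            bottommost="force"):
    """Build (but do not run) a tikv-ctl compact-cluster argument list."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if bottommost not in BOTTOMMOST_MODES:
        raise ValueError(f"unknown bottommost mode: {bottommost}")
    return [
        "tikv-ctl", "--pd", pd_endpoint,
        "compact-cluster",
        "-d", "kv",                   # compact the kv RocksDB instance
        "-c", column_family,          # column family to compact
        "--threads", str(threads),    # low value to limit impact on traffic
        "--bottommost", bottommost,   # whether to include bottommost files
    ]


if __name__ == "__main__":
    # 127.0.0.1:2379 is a placeholder PD address, not one from this thread.
    print(shlex.join(compact_cluster_command("127.0.0.1:2379", threads=1)))
```

Printing the command first, rather than executing it, leaves room to review it and try it on a test cluster before touching production.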

| username: weixiaobing | Original post link

What is the impact of compacting the entire cluster on the business, and what should be noted?

| username: wakaka | Original post link

I don’t know how big the risk is and how long the execution time will be; the cluster is very large.

| username: xfworld | Original post link

You can lower the concurrency, and it’s best to test in a UAT environment first; that would be the most reliable approach.

| username: RaftSnail | Original post link

May I ask if the issue has been resolved? Was it resolved through manual compaction? Please share, thanks.

| username: alfred | Original post link

Work around the bug for now and wait for a fix. :+1:

| username: wakaka | Original post link

The cluster is too large and the impact would be hard to control, so I didn’t run the compaction.
