Error Message: TiDB Node Reports a Large Number of [gc worker] delete range failed Errors

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB 节点大量[gc worker] delete range failed 报错信息

| username: OnTheRoad

【TiDB Usage Environment】Production Environment
【TiDB Version】v5.3.0
【Deployment Environment】CentOS 7.9; 3 TiDB / 3 PD / 3 TiKV, deployed on separate machines
【Encountered Issue】One TiDB node shows a large number of [gc worker] errors in the Dashboard

[gc_worker.go:713] ["[gc worker] delete range failed on range"] [uuid=60a807a27f00012] [startKey=7480000000000017ac] [endKey=7480000000000017ad] [error="[gc worker] destroy range finished with errors: [unsafe destroy range failed on store 1: gc worker is too busy, unsafe destroy range failed on store 3: gc worker is too busy, unsafe destroy range failed on store 2: gc worker is too busy]"]

【Reproduction Path】No operations performed, running normally
【Phenomenon and Impact】
No obvious impact observed

| username: h5n1 | Original post link

The previous GC task probably hasn’t been completed yet.

| username: OnTheRoad | Original post link

I checked this node’s logs; it has been reporting this same error continuously for almost a week.

| username: h5n1 | Original post link

Check the TiDB logs for any additional information, execute `select * from mysql.tidb` to check the GC settings, and look at the TiKV-Details monitoring dashboard to see whether the GC-related panels show any anomalies.
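
For reference, TiDB keeps its GC bookkeeping in the `mysql.tidb` table; a narrower form of that query (a sketch; variable names as used by recent TiDB versions) is:

```sql
-- GC leader, safe point, last run time, life time, run interval, etc.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME LIKE 'tikv_gc%';
```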

| username: OnTheRoad | Original post link

(The original reply contained three screenshots, which are not available for translation.)

| username: h5n1 | Original post link

TiDB 5.3.1 Release Notes

Fixes the issue that the GC worker cannot perform range deletion (that is, execute `unsafe_destroy_range`) after it becomes busy #11903
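
If it helps, you can confirm the exact version a cluster is running before deciding whether this fix applies (`tidb_version()` is a built-in TiDB function):

```sql
-- Prints the release version (e.g. v5.3.0) plus build details;
-- the fix referenced above shipped in v5.3.1.
SELECT tidb_version();
```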

| username: xiaohetao | Original post link

Learned something new. :+1: :+1: :+1: :+1:

| username: wakaka | Original post link

May I ask if this issue was resolved after upgrading to version 5.3.1?

| username: OnTheRoad | Original post link

We haven’t upgraded yet. There are plans to relocate the data center soon; once the relocation is done, we will start upgrading TiDB, probably to v5.4.2.

| username: wakaka | Original post link

It seems we also encountered this problem. How did you solve it? GC data not being reclaimed - #4, by h5n1 - TiDB Community Q&A

| username: OnTheRoad | Original post link

For now, I’m not dealing with it, but I’m keeping an eye on the cluster’s status at all times.
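
Concretely, what I keep watching is whether the GC safe point continues to advance (a sketch; just the two relevant rows from `mysql.tidb`):

```sql
-- If tikv_gc_safe_point stops advancing while tikv_gc_last_run_time
-- keeps updating, old MVCC versions accumulate and space is not reclaimed.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME IN ('tikv_gc_safe_point', 'tikv_gc_last_run_time');
```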

| username: wakaka | Original post link

The cluster’s disk usage is now growing too fast, about 300 GB per day (the actual business volume is nowhere near that), and the space just can’t be reclaimed.
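
For what it’s worth, the TiKV-side GC configuration can also be checked from SQL (a sketch, assuming TiDB v4.0 or later where `SHOW CONFIG` is available):

```sql
-- Lists gc.* settings on every TiKV store, e.g. gc.enable-compaction-filter;
-- assumes the WHERE clause of SHOW CONFIG accepts LIKE, as in current TiDB.
SHOW CONFIG WHERE type = 'tikv' AND name LIKE 'gc.%';
```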

| username: OnTheRoad | Original post link

We don’t have that much data here, and the old data has already been cleaned up. We mainly use TiDB for offline analysis, so even if the space isn’t reclaimed, the impact isn’t obvious for now. It will still need to be addressed sooner or later, though.

| username: wakaka | Original post link

Okay. For now there’s just no way to figure out how to solve it without upgrading. Thanks for your reply.

| username: OnTheRoad | Original post link

It seems that the official team hasn’t released the corresponding patch, so upgrading might be the only solution. We estimate that we can start the upgrade process after October. If you have any methods to resolve this issue in the meantime, please let us know.

| username: wakaka | Original post link

The main issue is that the cluster is too large, and I’m not sure if there will be other bugs after the upgrade, so I haven’t proceeded with the upgrade yet. Okay, if I do proceed, I’ll let you know.

| username: 特雷西-迈克-格雷迪 | Original post link

It is best to conduct tests when upgrading the cluster, especially for major version upgrades. Do you have scheduled jobs? Why is the GC so busy?

| username: OnTheRoad | Original post link

This triggered a bug.

| username: system | Original post link

This topic was automatically closed 1 minute after the last reply. No new replies are allowed.