After enabling cross-table merging, the number of empty regions still decreases very slowly

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 开启跨表合并后empty_region的数量仍然下降很慢

| username: SWCloud

Production environment, cluster version V5.1.0
The cluster currently has a very large number of Regions, nearly 8 million, of which nearly 1.6 million are empty according to PD monitoring.
To reduce the cluster load, we considered merging Regions. However, after enabling cross-table merge, the number of empty Regions is decreasing very slowly. What could be the reason for this?

Currently, after increasing the relevant parameters, the undersize_region_count has been decreasing steadily, by nearly 40,000 per hour, but the empty_region_count is decreasing very slowly, by only a few dozen per hour. This difference in speed is too large. Please help analyze the reason.
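
To put the two rates side by side, here is a rough back-of-the-envelope sketch in Python. It uses the figures quoted above; the value of 50 per hour is only an assumed stand-in for "a few dozen", and the calculation assumes a constant rate, which a real cluster will not have.

```python
def hours_to_drain(count: int, rate_per_hour: float) -> float:
    """Hours needed to remove `count` Regions at a constant hourly rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return count / rate_per_hour


EMPTY_REGIONS = 1_600_000      # from PD monitoring, as reported above
UNDERSIZE_RATE = 40_000        # observed undersize_region_count drop per hour
ASSUMED_EMPTY_RATE = 50        # assumption: stand-in for "a few dozen per hour"

for label, rate in [
    ("at the undersize rate", UNDERSIZE_RATE),
    ("at the assumed empty-Region rate", ASSUMED_EMPTY_RATE),
]:
    hours = hours_to_drain(EMPTY_REGIONS, rate)
    print(f"{label}: {hours:,.0f} hours (about {hours / 24:,.0f} days)")
```

Under these assumptions, clearing the empty Regions would take about 40 hours at the undersize rate, but about 32,000 hours at the current empty-Region rate.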

| username: WalterWj | Original post link

It looks like merging is in progress overall, so it may be worth waiting a bit longer. Another option is to lower the merge size and keys thresholds (`max-merge-region-size` and `max-merge-region-keys` in PD), so that merging prioritizes the empty Regions.
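
As a rough illustration of that suggestion, the sketch below only builds and prints the pd-ctl commands for the two merge thresholds; it does not run them. The PD address is a placeholder, the helper names are made up for this example, and the values are illustrative rather than recommendations.

```python
import shlex
from typing import List

PD_ADDR = "http://127.0.0.1:2379"  # assumption: placeholder PD endpoint
CTL_VERSION = "v5.1.0"             # cluster version mentioned in this thread


def pd_ctl_config_set(key: str, value: object) -> List[str]:
    """Build (not run) a `tiup ctl ... pd config set` argument list."""
    return ["tiup", f"ctl:{CTL_VERSION}", "pd", "-u", PD_ADDR,
            "config", "set", key, str(value)]


def merge_tuning_commands(size_mib: int, keys: int) -> List[List[str]]:
    """Commands that lower the two PD merge thresholds."""
    return [
        pd_ctl_config_set("max-merge-region-size", size_mib),
        pd_ctl_config_set("max-merge-region-keys", keys),
    ]


if __name__ == "__main__":
    # Illustrative values only; review each command before running it by hand.
    for argv in merge_tuning_commands(size_mib=5, keys=200_000):
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the change reviewable, which matters on a production cluster.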

| username: xfworld | Original post link

In a production environment, it is better not to adjust these settings, to avoid putting extra pressure on the cluster.

If you want to make adjustments, you can refer to the following documents:

| username: SWCloud | Original post link

Previously, we set the merge size to 5 and the merge keys to 200,000. At first, the empty Region count dropped by about 2,000 per hour, but after a few hours the rate fell to a few dozen per hour.

| username: SWCloud | Original post link

We have already tried all these methods.

| username: WalterWj | Original post link

I recommend waiting for the Regions to decrease gradually. As long as the overall trend is downward, it's fine. If the trend flattens out while there are still Regions you expect to merge, then take another look.
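
One way to make "the overall trend is decreasing" concrete is to sample the empty Region count at regular intervals and compare recent values. The sketch below is a minimal example; the sample numbers are hypothetical, and the three-interval window is an arbitrary choice.

```python
from typing import List


def hourly_deltas(samples: List[int]) -> List[int]:
    """Change between each pair of consecutive samples (negative = decreasing)."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def still_decreasing(samples: List[int], window: int = 3) -> bool:
    """True if the count fell, net, over the last `window` intervals."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    recent = samples[-(window + 1):]
    return recent[-1] < recent[0]


# Hypothetical hourly readings of empty_region_count, not real monitoring data.
samples = [1_600_000, 1_598_000, 1_596_100, 1_596_050, 1_596_010, 1_595_960]

print(hourly_deltas(samples))
print(still_decreasing(samples))
```

If `still_decreasing` turns false over a long enough window while empty Regions remain, that is the point at which a closer look is warranted.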

| username: xfworld | Original post link

Check how the PD operators are being scheduled.

If it is scheduling normally, it might be due to the large amount of data. In that case, we can only wait for a while.

| username: SWCloud | Original post link

Okay, let’s observe for a while longer.
