Production environment, cluster version V5.1.0
The cluster has a very large number of Regions, nearly 8 million, and PD monitoring shows that nearly 1.6 million of them are empty.
To reduce the cluster load, we considered merging Regions. However, after enabling cross-table merge, the number of empty Regions is decreasing very slowly. What could be the reason for this?
Currently, after increasing the relevant parameters, the undersize_region_count has been decreasing steadily, by nearly 40,000 per hour, but the empty_region_count is decreasing very slowly, by only a few dozen per hour. This difference in speed is too large. Please help analyze the reason.
It looks like merging is progressing overall, so it may just need more time. One option is to lower the merge size and keys thresholds (in PD, max-merge-region-size and max-merge-region-keys) so that the empty Regions are merged first.
Previously, the size was set to 5 and the keys were set to 200,000. Initially, the empty Region count decreased by about 2,000 per hour, but after a few hours, the speed dropped to a few dozen per hour.
I recommend you wait for the regions to gradually decrease. As long as the overall trend is decreasing, it’s fine. If the overall trend stops decreasing but you expect some regions to merge and they haven’t, then take another look.
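To make "watch the overall trend" concrete, here is a minimal Python sketch that turns two readings of a Region count into an hourly merge rate and a rough time to drain. The function name `drain_estimate` and the sample readings are hypothetical. The readings are built from the figures in this thread (about 1.6 million empty Regions, about 2,000 per hour at first), and 50 per hour is only an assumed stand-in for "a few dozen".

```python
def drain_estimate(samples):
    """Estimate merge rate and time to drain from monitoring readings.

    samples: list of (hour, region_count) pairs, oldest first.
    Returns (regions_merged_per_hour, hours_to_zero); hours_to_zero is
    None when the count is not decreasing.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    rate = (c0 - c1) / (t1 - t0)  # Regions merged per hour
    eta_hours = c1 / rate if rate > 0 else None
    return rate, eta_hours


# Illustrative readings only, not real monitoring data.
early = [(0, 1_600_000), (1, 1_598_000)]  # about 2,000 per hour
later = [(0, 1_600_000), (1, 1_599_950)]  # assumed 50 per hour

print(drain_estimate(early))
print(drain_estimate(later))
```

Under these assumptions, the early rate would clear the empty Regions in roughly 800 hours, while the later rate would take tens of thousands of hours. A decreasing trend alone may therefore not be enough; the rate matters as well.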