Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: region不均衡 (region imbalance)
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed that caused the issue
[Encountered Issue: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]
There is a TiDB cluster running v5.0.6.
The regions are unbalanced across the TiKV stores.
Self-check so far:
The labels are configured correctly.
Output of pd-ctl scheduler show:
[
  "balance-region-scheduler",
  "balance-leader-scheduler",
  "balance-hot-region-scheduler",
  "label-scheduler"
]
I have also checked the store scores and read the Scheduling section of the reference manual.
Seeking advice on how to handle this so that the regions become balanced. The score check I ran is sketched below.
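For reference, this is roughly how I pulled the scores; the PD address is a placeholder, and pd-ctl may also be invoked as tiup ctl:v5.0.6 pd on a TiUP deployment:

# Per-store counts, weights and scores; balance-region-scheduler works on region_score
pd-ctl -u http://<pd-ip>:2379 store | grep -E '"address"|region_count|region_weight|region_score'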
Are there many empty regions?
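If there are, region merge will gradually collapse them. A rough way to check, assuming pd-ctl points at your PD (the address is a placeholder):

# List regions whose approximate size and keys are 0
pd-ctl -u http://<pd-ip>:2379 region check empty-region
# The merge settings that control how fast empty/small regions get merged away
pd-ctl -u http://<pd-ip>:2379 config show | grep -E 'max-merge-region|merge-schedule-limit'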
Looking at the monitoring, the region scores are unbalanced, which in theory should trigger region migration scheduling. If no scheduling is happening:
- Confirm whether scheduling has been turned off; see the official documentation on using the pd-ctl tool.
- Check whether TiDB Lightning is being used and whether TiKV has been left in import mode; refer to the official documentation's FAQ on Lightning exiting abnormally.
If everything above is normal, check the output of pd-ctl store to see whether a region weight has been configured and whether the weights have been adjusted. The corresponding pd-ctl checks are sketched below.
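A minimal sketch of these checks with pd-ctl and tidb-lightning-ctl; the PD address and the Lightning config path are placeholders for your own deployment:

# 1. Scheduling turned off? A 0 here (e.g. region-schedule-limit) disables that kind of scheduling
pd-ctl -u http://<pd-ip>:2379 config show | grep -E 'schedule-limit'
pd-ctl -u http://<pd-ip>:2379 scheduler show        # balance-region-scheduler must still be listed
# 2. Lightning crashed and left TiKV in import mode? Switch it back to normal
tidb-lightning-ctl --config tidb-lightning.toml --switch-mode=normal
# 3. Per-store weights; a region_weight other than 1 skews the balance target
pd-ctl -u http://<pd-ip>:2379 store | grep -E 'region_weight|leader_weight'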
Looking at this picture, could it be that your TiKV deployment has two instances sharing one machine? The IPs are partially hidden, so I can't be sure whether they are the same.
Is the 3-replica distribution like this:
173:21187 and 173:21188 together hold one replica, about 2.5k regions each
174:21187 and 174:21188 together hold one replica, about 2.5k regions each
175:21187 holds one replica on its own, about 5k regions in total
Because the two TiKV instances on 173 (and likewise on 174) are on the same machine, if that machine fails they both fail, so PD will never place two replicas of the same region on that pair.
To balance it out, you could deploy a 21188 instance on 175 as well. Or you can leave it as is; with only three machines, the regions can only be distributed this way.
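A quick way to confirm this theory is to compare the host-level labels on the stores: PD will not place two replicas of one region on stores that share the same value for a location label. Rough commands, assuming the usual host label key (adjust to however the cluster was deployed):

# Which label keys PD uses for replica isolation (e.g. location-labels: host)
pd-ctl -u http://<pd-ip>:2379 config show replication
# Inspect "labels" and region_count for each store; the two instances on .173 should share one host label
pd-ctl -u http://<pd-ip>:2379 store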
That should be the explanation: the data is being distributed according to the labels.
It shouldn’t be like this. I’ve done it this way several times, and only this one has a problem.
Could someone with a normal distribution post a comparison chart, to see whether it also shows this 2-2-1 distribution and whether it can be balanced?
Isn't this exactly the kind of deployment I'm describing? If the two IPs are the same, those two instances can only share one replica between them. What is there to be confused about?