Region Imbalance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: region不均衡 (Region Imbalance)

| username: 田帅萌7

There is a TiDB cluster on version 5.0.6 whose regions are unbalanced.
Self-check:

  * Labels are correct.
  * `scheduler show` output:

```
[
  "balance-region-scheduler",
  "balance-leader-scheduler",
  "balance-hot-region-scheduler",
  "label-scheduler"
]
```
I checked the scores against the reference manual's section on scheduling.
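For reference, the per-store scores can also be read directly with pd-ctl; a minimal sketch, with the PD address as a placeholder:

```shell
# Per-store leader/region scores, which the balance schedulers compare.
pd-ctl -u http://<pd-addr>:2379 store --jq='.stores[] | {addr: .store.address, leader_score: .status.leader_score, region_score: .status.region_score}'
```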

Seeking advice on how to handle this to balance the regions.

| username: 裤衩儿飞上天 | Original post link

Are there many empty regions?
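One quick way to check, assuming your pd-ctl version supports the `empty-region` check (the PD address is a placeholder):

```shell
# List regions whose approximate size is 1 MiB or less, i.e., empty regions.
pd-ctl -u http://<pd-addr>:2379 region check empty-region
```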

| username: WalterWj | Original post link

Looking at the monitoring, the region scores are unbalanced, which in theory should trigger data-migration scheduling. If scheduling does not occur:

  1. Confirm that scheduling has not been turned off; see the official documentation on the pd-ctl tool (a quick check is sketched after this list).
  2. Check whether TiDB Lightning is in use and whether TiKV has entered import mode. Refer to the official documentation: the FAQ on Lightning exiting abnormally.
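A minimal check of the scheduling limits, assuming pd-ctl access (the PD address is a placeholder):

```shell
# If any of these limits is 0, the corresponding scheduling is effectively disabled.
pd-ctl -u http://<pd-addr>:2379 config show | grep -E '"(leader|region|replica|merge)-schedule-limit"'
```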
| username: WalterWj | Original post link

If everything is normal, check the output of `pd-ctl store` to see whether region weights are configured and whether they have been adjusted. :thinking:
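For instance, something like this (a sketch; the PD address is a placeholder), since any weight other than the default of 1 biases the balance scoring:

```shell
# Default leader_weight and region_weight are both 1; other values skew
# scheduling toward or away from that store.
pd-ctl -u http://<pd-addr>:2379 store --jq='.stores[] | {addr: .store.address, leader_weight: .status.leader_weight, region_weight: .status.region_weight}'
```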

| username: TiDBer_jYQINSnf | Original post link

Looking at this picture, could it be that two of your TiKV instances share one machine? The IPs are partially hidden, so I can't tell whether they are the same.

Is the 3-replica distribution like this?

  * 173:21187 and 173:21188 together hold one replica: 2.5k regions each
  * 174:21187 and 174:21188 together hold one replica: 2.5k regions each
  * 175:21187 holds one replica by itself: 5k regions in total

Because those two TiKV instances are on the same machine, a failure of that machine would take both down, so PD will never place two replicas of the same region on that pair.

To even things out, you could deploy a 21188 instance on 175 as well. Or just leave it as is; with only three machines, the regions can only be distributed this way.
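If you want to confirm this from PD's side, a sketch (the PD address is a placeholder):

```shell
# Replica count and the location labels PD uses for isolation.
pd-ctl -u http://<pd-addr>:2379 config show replication

# Each store's address and labels; two instances on one machine should carry
# the same host-level label, which limits that machine to one replica per region.
pd-ctl -u http://<pd-addr>:2379 store --jq='.stores[] | {addr: .store.address, labels: .store.labels}'
```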

| username: 田帅萌7 | Original post link

  1. Scheduling has not been turned off.
  2. I haven't used TiDB Lightning.

The weights have not been adjusted.

| username: WalterWj | Original post link

Then that should be the explanation: the data is being distributed according to the labels.

| username: 田帅萌7 | Original post link

It shouldn't be like this. I've deployed clusters this way several times, and only this one has the problem.

| username: yilong | Original post link

Could someone with a normal distribution post a comparison chart, so we can see whether it also shows this 2-2-1 distribution and whether it can be balanced?

| username: TiDBer_jYQINSnf | Original post link

Isn't this exactly the kind of deployment I described? If the two IPs are the same, those two instances can only hold one replica between them. What is there to be confused about?
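To verify the 2-2-1 split numerically, assuming pd-ctl access (the PD address is a placeholder):

```shell
# Region replica count per store; in the layout above this should show roughly
# 2.5k, 2.5k, 2.5k, 2.5k, and 5k across the five instances.
pd-ctl -u http://<pd-addr>:2379 store --jq='.stores[] | {addr: .store.address, regions: .status.region_count}'
```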