BR Restore Data: Severe Data Imbalance on a TiKV Node

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: br restore数据,TIKV某节点数据严重不均衡

| username: jaybing926

[TiDB Usage Environment] Production Environment
[TiDB Version]
Cluster version: v4.0.9
[Encountered Problem: Phenomenon and Impact]
I deployed a new TiDB cluster, backed up the data of the old cluster with BR, and then restored it to the new cluster. During the restore, I noticed that the amount of data stored on one TiKV node was severely unbalanced compared with the other nodes.
As shown in the figure below, the number of regions on the 192.168.241.74 node is very small, even though these TiKV nodes have exactly the same hardware and storage configuration. It is very strange that this problem occurs. How should I troubleshoot it? Thank you~
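
As a starting point, per-store region counts and scores can be read from PD's HTTP API (the same numbers that pd-ctl's `store` command prints). A minimal sketch in Go, assuming a placeholder PD address:

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// storesResp mirrors only the fields of /pd/api/v1/stores used here.
type storesResp struct {
	Stores []struct {
		Store struct {
			ID      uint64 `json:"id"`
			Address string `json:"address"`
		} `json:"store"`
		Status struct {
			RegionCount int     `json:"region_count"`
			LeaderCount int     `json:"leader_count"`
			RegionScore float64 `json:"region_score"`
			LeaderScore float64 `json:"leader_score"`
		} `json:"status"`
	} `json:"stores"`
}

func main() {
	// Placeholder PD address (assumption); substitute your own.
	resp, err := http.Get("http://127.0.0.1:2379/pd/api/v1/stores")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var s storesResp
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		panic(err)
	}
	for _, st := range s.Stores {
		fmt.Printf("store %d (%s): regions=%d leaders=%d region_score=%.2f leader_score=%.2f\n",
			st.Store.ID, st.Store.Address,
			st.Status.RegionCount, st.Status.LeaderCount,
			st.Status.RegionScore, st.Status.LeaderScore)
	}
}

Since the region score roughly tracks a store's region size and remaining space, a node holding far fewer regions than its peers will also report a much lower score.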


Below is a bit of the TiKV log from this node:

| username: jaybing926 | Original post link

These TiKV nodes have exactly the same configuration, so why is there such a big difference in their scores? What does this indicate?

| username: jaybing926 | Original post link

There are many empty regions. Is this the cause?
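
One way to verify this is to count regions whose approximate size is near zero. A minimal sketch, assuming a placeholder PD address and the usual 1 MiB threshold for an "empty" region (note that fetching every region can be slow on a large cluster):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// regionsResp mirrors only the fields of /pd/api/v1/regions used here.
type regionsResp struct {
	Regions []struct {
		ID              uint64 `json:"id"`
		ApproximateSize int64  `json:"approximate_size"` // in MiB
	} `json:"regions"`
}

func main() {
	// Placeholder PD address (assumption); substitute your own.
	resp, err := http.Get("http://127.0.0.1:2379/pd/api/v1/regions")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var r regionsResp
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		panic(err)
	}
	empty := 0
	for _, reg := range r.Regions {
		if reg.ApproximateSize <= 1 { // treat <=1 MiB as empty (assumption)
			empty++
		}
	}
	fmt.Printf("%d of %d regions look empty\n", empty, len(r.Regions))
}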

| username: buddyyuan | Original post link

Take a look at the scheduling settings; it feels like the scheduling has stopped.

// Schedulers represent region/leader schedulers which can impact performance.
Schedulers = map[string]struct{}{
	"balance-leader-scheduler":     {},
	"balance-hot-region-scheduler": {},
	"balance-region-scheduler":     {},
}
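
To check whether these schedulers are actually running, the active scheduler list can be fetched from PD (equivalent to pd-ctl's `scheduler show`). A minimal sketch, assuming a placeholder PD address:

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Placeholder PD address (assumption); substitute your own.
	resp, err := http.Get("http://127.0.0.1:2379/pd/api/v1/schedulers")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The endpoint returns a JSON array of active scheduler names.
	var names []string
	if err := json.NewDecoder(resp.Body).Decode(&names); err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}

If balance-region-scheduler is missing from the output, region balancing has been stopped; it can be re-added with pd-ctl's `scheduler add balance-region-scheduler`.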

| username: jaybing926 | Original post link

Although I'm not entirely sure, data balancing has started now; the problem was most likely caused by those empty regions. The score of the abnormal node was indeed low.

I'm using version 4.0.9, in which automatic merging of empty regions is not enabled by default. Everything works fine once it is enabled.
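
For reference, the default-off setting this presumably refers to is PD's `enable-cross-table-merge`, which is false in v4.0 (and true from v5.0 onward), so empty regions spanning table boundaries are not merged automatically. It can be switched on with pd-ctl's `config set enable-cross-table-merge true`, or through PD's config API; a minimal sketch, assuming a placeholder PD address:

package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Assumption: posting the key as JSON has the same effect as
	// `pd-ctl config set enable-cross-table-merge true`.
	body := bytes.NewBufferString(`{"enable-cross-table-merge": true}`)
	// Placeholder PD address (assumption); substitute your own.
	resp, err := http.Post("http://127.0.0.1:2379/pd/api/v1/config",
		"application/json", body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}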

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.