PD is continuously balancing regions

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd一直在balance region平衡

| username: TiDBer_yUoxD0vR

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] 3.0.12
[Encountered Issue: Problem Phenomenon and Impact]
Since yesterday afternoon there has been continuous balance-region activity, along with a large number of slow insert statements, with most of the time spent in the prewrite phase.


The region score keeps fluctuating. Why is the score constantly changing? This has never happened before. Is it because disk usage is above 60%?

Region balancing has been running for a day and still isn't finished. These two nodes show a sawtooth pattern, rising and then falling.

This has resulted in a large number of slow insert queries.

PD parameters are as follows:
```
» config show
{
  "replication": {
    "location-labels": "",
    "max-replicas": 3,
    "strictly-match-label": "false"
  },
  "schedule": {
    "disable-location-replacement": "false",
    "disable-make-up-replica": "false",
    "disable-namespace-relocation": "false",
    "disable-raft-learner": "false",
    "disable-remove-down-replica": "false",
    "disable-remove-extra-replica": "false",
    "disable-replace-offline-replica": "false",
    "enable-one-way-merge": "false",
    "high-space-ratio": 0.6,
    "hot-region-cache-hits-threshold": 3,
    "hot-region-schedule-limit": 4,
    "leader-schedule-limit": 4,
    "low-space-ratio": 0.8,
    "max-merge-region-keys": 200000,
    "max-merge-region-size": 20,
    "max-pending-peer-count": 16,
    "max-snapshot-count": 3,
    "max-store-down-time": "30m0s",
    "merge-schedule-limit": 8,
    "patrol-region-interval": "100ms",
    "region-schedule-limit": 8,
    "replica-schedule-limit": 8,
    "scheduler-max-waiting-operator": 3,
    "schedulers-v2": [
      {
        "args": null,
        "disable": false,
        "type": "balance-region"
      },
      {
        "args": null,
        "disable": false,
        "type": "balance-leader"
      },
      {
        "args": null,
        "disable": true,
        "type": "hot-region"
      },
      {
        "args": null,
        "disable": false,
        "type": "label"
      }
    ],
    "split-merge-interval": "1h0m0s",
    "store-balance-rate": 15,
    "tolerant-size-ratio": 5
  }
}
»
```

| username: ealam_小羽 | Original post link

It looks like this issue is quite similar. Refer to the hot scheduling issue:

| username: tidb菜鸟一只 | Original post link

What is the current disk usage rate of each of your TiKV nodes?

| username: TiDBer_yUoxD0vR | Original post link

All of my scheduler operator creates are balance-region; there are no hotspots.
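For reference, the pending operators and the enabled schedulers can also be checked from the same pd-ctl prompt shown in the config dump above (output format varies by version):

```
» operator show region
» scheduler show
```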


| username: TiDBer_yUoxD0vR | Original post link

The default high-space-ratio in v3.0 is 0.6. I just changed it to 0.7, and the maximum disk usage across the TiKV nodes is 64%.
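For anyone following along, that kind of change can be made online from pd-ctl; a minimal example with the value mentioned above:

```
» config set high-space-ratio 0.7
```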

| username: xfworld | Original post link

Considering an upgrade? 3.X… :face_with_spiral_eyes:

Although this was the first version I encountered, it’s honestly not as user-friendly as the current versions…


Region Score is an important metric that PD uses to schedule Regions. It is calculated from the current state of each Store and its Regions and is used to evaluate how well a Store is balanced relative to the others: the higher a Store's Region Score, the more likely PD is to move Regions off that Store and onto Stores with lower scores.

PD decides which Store a Region should be scheduled to by calculating a Region Score for each Store; for the detailed formula, refer to the official TiDB documentation [1].

The score is derived from the current state of the Store and its Regions: PD takes into account the Store's remaining space, load, and disk usage, as well as the size, replica count, and distribution of the Regions it holds, and finally divides the result by the Store's region weight to obtain the final Region Score.

PD decides which Store to schedule a Region to based on the Region Score. When PD finds that a Store’s Region Score is too high, it will schedule some Regions to other Stores to achieve load balancing. Conversely, when PD finds that a Store’s Region Score is too low, it will schedule some Regions to that Store to improve its utilization.
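As a rough sketch of that decision (simplified, not PD's actual code; the names and numbers below are illustrative), balance-region compares store scores and only moves a region when the gap between source and target exceeds a tolerance derived from tolerant-size-ratio:

```go
package main

import "fmt"

// Store is a simplified view of a TiKV store for this sketch
// (illustrative fields, not PD's real types).
type Store struct {
	ID           uint64
	RegionScore  float64 // score derived from region size and disk usage
	AvgRegionMiB float64 // average region size, used for the tolerant gap
}

// shouldBalance mimics the core idea of the balance-region scheduler:
// only move a region when the source score exceeds the target score by
// more than a tolerance based on tolerant-size-ratio.
func shouldBalance(source, target Store, tolerantSizeRatio float64) bool {
	tolerance := tolerantSizeRatio * source.AvgRegionMiB
	return source.RegionScore-target.RegionScore > tolerance
}

func main() {
	// The highest-scoring store is picked as the source, the lowest as the target.
	source := Store{ID: 1, RegionScore: 520000, AvgRegionMiB: 96}
	target := Store{ID: 4, RegionScore: 430000, AvgRegionMiB: 96}
	fmt.Println("move a region from store 1 to store 4?",
		shouldBalance(source, target, 5))
}
```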


I suspect it's the hot-region scheduling that broke this balance. You can try disabling that scheduler first.
Reference Q&A:


A simple description of the Score calculation method:

Reference documentation:


Core algorithm documentation reference:

| username: tidb菜鸟一只 | Original post link

When the disk usage of some TiKV nodes exceeds the high-space-ratio threshold, their scoring strategy changes significantly compared with the TiKV nodes still below it. Their scores can then become much higher than those of the nodes below the threshold, which triggers a large number of regions balancing from the high-usage TiKV nodes to the low-usage ones…
Try increasing this parameter and see whether the balancing settles down.
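A rough illustration of why the scores diverge so sharply (a simplified sketch, not PD's exact formula, which also accounts for compression amplification and store weights): while usage stays below high-space-ratio the score tracks the region data size, once usage passes low-space-ratio it is driven by the little space remaining, and the two regimes are joined by a linear transition in between:

```go
package main

import "fmt"

// regionScore sketches how a store's score behaves around high-space-ratio
// and low-space-ratio. All sizes are in MiB; the constant and inputs are
// illustrative, not taken from a real cluster.
func regionScore(regionSizeMiB, capacityMiB, availableMiB, highRatio, lowRatio float64) float64 {
	const maxScore = 1 << 30 // arbitrary large constant for the sketch
	used := capacityMiB - availableMiB
	switch {
	case used <= highRatio*capacityMiB:
		// Plenty of space: the score is just the region data size, so stores
		// converge toward holding similar amounts of data.
		return regionSizeMiB
	case used >= lowRatio*capacityMiB:
		// Nearly full: the score is dominated by the remaining space, so the
		// store looks much "heavier" and regions are pushed off it.
		return maxScore - availableMiB
	default:
		// Transition zone: interpolate linearly between the two regimes so
		// the score stays continuous as usage crosses the thresholds.
		x1, y1 := (1-highRatio)*capacityMiB, regionSizeMiB
		x2, y2 := (1-lowRatio)*capacityMiB, maxScore-(1-lowRatio)*capacityMiB
		k := (y2 - y1) / (x2 - x1)
		return y1 + k*(availableMiB-x1)
	}
}

func main() {
	// Same store at 55% vs. 65% disk usage with high-space-ratio = 0.6 and
	// low-space-ratio = 0.8: the score jumps once the threshold is crossed.
	fmt.Println(regionScore(300000, 1000000, 450000, 0.6, 0.8)) // ≈ 300000
	fmt.Println(regionScore(300000, 1000000, 350000, 0.6, 0.8)) // ≈ 2.7e8
}
```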

| username: Kongdom | Original post link

The version is a bit outdated; it is recommended to upgrade to the latest stable version.
The change in scores might be caused by region balancing.

| username: TiDBer_yUoxD0vR | Original post link

Does region balancing cause slow inserts? The prewrite phase of the inserts is taking a lot of time, and the report shows a large number of read-write conflicts (tikvLockFast) plus a small number of write-write conflicts (txnLock) and updateLeader.

| username: TiDBer_yUoxD0vR | Original post link

I set high-space-ratio to 0.75, and now the score is changing slowly. I want the scheduling to go even more slowly and have less impact on writes. Should I lower region-schedule-limit?
Scaling out the TiKV nodes should also solve this problem, right?

| username: xfworld | Original post link

Yes. Because regions are being scheduled, it can also cause backoff…

| username: tidb菜鸟一只 | Original post link

Lowering region-schedule-limit reduces the impact of balancing on the cluster. You can lower region-schedule-limit and then scale out TiKV: the high-usage TiKV nodes will slowly balance their regions to the newly added nodes without affecting the cluster's current workload, while also bringing down the disk usage of the other nodes.
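For reference, a minimal example of adjusting that limit online from pd-ctl (the value is only illustrative):

```
» config set region-schedule-limit 4
```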

| username: 像风一样的男子 | Original post link

Isn't the version just too old? I'm scaling in nodes right now, and even with region-schedule-limit set to 3600 it doesn't affect system usage.

| username: Kongdom | Original post link

:wink: Not only should you consider the version, but also the hardware~

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.