Can PD's hot-scheduler cause scheduling conflicts when it schedules based on read or write traffic?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: PD的hot-scheduler基于读流量或写流量进行调度,会导致调度冲突吗?

| username: TiDBer_UmTwzsCW

I have a question after reading the PD source code. The hot-scheduler in PD randomly balances hotspots based on the current cluster’s read or write traffic. From the code, it seems that when operators are generated, only one type of information, either read or write, is considered. Could this scheduling method lead to redundant scheduling, where regions keep migrating back and forth and never reach a balanced state?

If the above problem exists, can it be resolved by increasing the two parameters in the diagram to raise the scheduling threshold?

| username: Billmay表妹

PD’s Hot Region Scheduler schedules based on read or write traffic in a way that is meant to avoid scheduling conflicts. It identifies hot regions from the cluster’s current read or write traffic and generates scheduling tasks according to how those hot regions are distributed.

During operator generation, the Hot Region Scheduler considers both read and write information to better balance the hot regions. It calculates a heat value for each region from the cluster’s current read and write traffic, then uses those values to decide which regions are hot.
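To make the "heat value, then hot region" step concrete, here is a minimal toy sketch in Python. It is not PD's implementation: the function name, the byte threshold, and the report layout are all hypothetical, and the only idea borrowed from PD is that a region has to stay hot for several consecutive reporting intervals before it counts as a hotspot.

```python
from collections import defaultdict

# All names and numbers below are hypothetical; this is a toy model,
# not PD's actual data structures or thresholds.
HOT_BYTES_PER_INTERVAL = 1_000_000  # assumed traffic per interval that counts as "hot"
CACHE_HITS_THRESHOLD = 3            # assumed number of consecutive hot intervals required


def find_hot_regions(reports, kind):
    """Return sorted IDs of regions whose `kind` ("read" or "write") traffic
    has been hot for at least CACHE_HITS_THRESHOLD consecutive intervals,
    counting up to the most recent report."""
    hits = defaultdict(int)
    for interval in reports:  # oldest interval first
        for region_id, flow in interval.items():
            if flow[kind] >= HOT_BYTES_PER_INTERVAL:
                hits[region_id] += 1  # one more consecutive hot interval
            else:
                hits[region_id] = 0   # streak broken, start counting again
    return sorted(r for r, n in hits.items() if n >= CACHE_HITS_THRESHOLD)


# Three reporting intervals for three regions (made-up traffic in bytes).
reports = [
    {1: {"read": 2_000_000, "write": 10},
     2: {"read": 5, "write": 3_000_000},
     3: {"read": 1_500_000, "write": 0}},
    {1: {"read": 2_500_000, "write": 10},
     2: {"read": 5, "write": 2_000_000},
     3: {"read": 100, "write": 0}},
    {1: {"read": 1_200_000, "write": 10},
     2: {"read": 5, "write": 4_000_000},
     3: {"read": 1_800_000, "write": 0}},
]

read_hot = find_hot_regions(reports, "read")
write_hot = find_hot_regions(reports, "write")
```

In this toy data, region 1 is read-hot in every interval and region 2 is write-hot in every interval, while region 3 only spikes briefly and is filtered out. Read and write hotspots are tracked as separate lists here, which is the situation the original question is asking about.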

If you do see redundant scheduling or regions migrating back and forth, you can adjust the Hot Region Scheduler’s parameters. Increasing hot-region-cache-hits-threshold raises the bar for treating a region as hot, because a region has to stay hot for longer before it can be scheduled, which filters out short-lived spikes. Note that hot-region-schedule-limit controls how many hot-region scheduling tasks can run at the same time, so lowering it (not raising it) is what reduces hot-region scheduling. You can also adjust other parameters such as leader-schedule-limit and region-schedule-limit, which cap concurrent leader and region scheduling tasks, to suit the cluster’s current load.
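As a rough illustration of how these parameters might be changed, the sketch below only builds `pd-ctl config set` command lines and prints them; it does not run anything. The PD address and the values are placeholders chosen for the example, not recommendations, and it assumes a `pd-ctl` binary is available on your PATH (with TiUP the command is usually invoked through `tiup ctl` instead).

```python
import shlex

# Hypothetical PD endpoint; replace with a real PD address in your cluster.
PD_ADDR = "http://127.0.0.1:2379"

# Illustrative values only, not recommendations.
SETTINGS = {
    "hot-region-cache-hits-threshold": 5,  # require a longer hot streak
    "hot-region-schedule-limit": 2,        # allow fewer concurrent hot-region tasks
}


def pd_ctl_commands(settings, pd_addr=PD_ADDR):
    """Build `pd-ctl config set` argument lists without executing them."""
    return [
        ["pd-ctl", "-u", pd_addr, "config", "set", key, str(value)]
        for key, value in settings.items()
    ]


# Print the commands so they can be reviewed before anyone runs them.
for cmd in pd_ctl_commands(SETTINGS):
    print(shlex.join(cmd))
```

Checking the current values first with `pd-ctl config show` and changing one parameter at a time makes it easier to tell which change actually affected the scheduling behavior.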
