What task does split-bucket-scheduler represent in the scheduler of PD?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 请问pd 的 scheduler中 split-bucket-scheduler 代表的是什么任务

| username: GreenGuan

May I ask what task split-bucket-scheduler represents in PD's scheduler list? I couldn't find it on the official website.

| username: Billmay表妹

The split-bucket-scheduler is a scheduler in PD (the Placement Driver of a TiDB cluster) that issues split operations for Regions in TiKV, dividing a large Region into multiple smaller Regions to achieve better load balancing and improve query performance.

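Specifically, the split-bucket-scheduler splits Regions based on how data and load are distributed in TiKV. As its name suggests, it works with buckets, the finer-grained key ranges that TiKV tracks inside a Region, and asks for a split when a bucket within a Region is hot. It should not be confused with two related settings: `schedule.split-merge-interval` in PD sets the minimum time between a split and a later merge of the same Region, so a freshly split Region is not merged back right away, and size-based splitting is handled on the TiKV side, where a Region that grows past the configured split size (`coprocessor.region-split-size`) is divided into multiple smaller Regions.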
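The threshold idea mentioned above (a Region that grows past a configured size is divided into smaller Regions) can be illustrated with a toy model. This is only a sketch and not PD or TiKV code: the function name `split_region`, the MiB units, and the strategy of cutting fixed-size chunks are all assumptions made for illustration.

```python
from typing import List


def split_region(size_mib: int, split_size_mib: int) -> List[int]:
    """Toy model: return the sizes of the Regions left after a size-based split.

    Hypothetical helper for illustration only; real splitting works on key
    ranges, not on plain sizes.
    """
    if split_size_mib <= 0:
        raise ValueError("split_size_mib must be positive")
    if size_mib <= split_size_mib:
        # At or under the threshold: the Region is left as it is.
        return [size_mib]
    # Over the threshold: cut off chunks of split_size_mib, keep the remainder.
    full_chunks, remainder = divmod(size_mib, split_size_mib)
    return [split_size_mib] * full_chunks + ([remainder] if remainder else [])
```

For example, under these assumptions a 250 MiB Region with a 96 MiB split size ends up as three Regions, while a 50 MiB Region is not split at all.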

It is important to note that the split-bucket-scheduler is just one of the schedulers in PD and is not the only way to split Regions. A TiDB cluster also supports other methods of Region splitting, such as manual split and automatic split.
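To check whether split-bucket-scheduler is among the schedulers currently running on a cluster, one option is to ask PD for its scheduler list. The sketch below assumes that PD's HTTP API is reachable and serves the list as a JSON array of names at `/pd/api/v1/schedulers`; the address `http://127.0.0.1:2379` and the helper names `list_schedulers` and `is_enabled` are placeholders chosen for illustration.

```python
import json
from typing import List
from urllib.request import urlopen


def list_schedulers(pd_addr: str = "http://127.0.0.1:2379") -> List[str]:
    """Fetch the names of the schedulers PD reports (address is an assumption)."""
    with urlopen(f"{pd_addr}/pd/api/v1/schedulers", timeout=5) as resp:
        return json.load(resp)


def is_enabled(schedulers: List[str], name: str = "split-bucket-scheduler") -> bool:
    """Return True if the named scheduler appears in the given list."""
    return name in schedulers


# Usage against a live cluster (not executed here):
#   print(is_enabled(list_schedulers("http://<pd-address>:2379")))
```

The same list can be viewed with PD Control's `scheduler show` command if a script is not needed.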