Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: TiDB在建表时如何放置全新的Region?
How does PD determine which TiKV nodes to place a brand new Region on?
The split-table option is enabled by default, which means that creating a table splits off a new Region for it.
As for how this region is placed, it depends on the scheduling of TiKV.
Use show table xxx regions; to see which Regions this table has.
Then use pd-ctl to check the location of the regions.
Only the first Region is “created,” while the rest of the Regions are “split.” You can check the logs or track the system views to observe this phenomenon. When splitting, the new Regions will be on the same nodes as the original Region, and then PD will schedule them based on the situation.
The first region of each table is “created,” right? So when creating the first region of each table, how is it determined which nodes the replicas are placed on? Is it decided by PD? Is it placed randomly or according to various scheduling strategies? I’m just starting to learn TiDB and haven’t found the corresponding entry point in the source code.
Actually, I mainly want to know who determines the placement of each replica of the first region of each table on different TiKV nodes. Is it PD? Is it placed randomly or according to various scheduling strategies? I am new to TiDB and have not found the corresponding processing logic in the source code.
It wasn’t created, it was split. You can check through the logs.
There are many determining factors, mainly controlled by PD. TiDB does not care about how regions are placed.
You can check out region scheduling: placement-rule.
When creating a new table, how is it determined from which existing region the new table’s region will split?
I see you are all concerned about the creation issue, let me explain:
TiKV places all data within a range from “” to “”, which can be understood as the interval [-∞, +∞).
As data is inserted, some range, say from A to C, may end up holding too much data. If B lies right in the middle of it, the Region is split at B into two Regions: [-∞, B) and [B, +∞).
Suppose another table is created, and according to the encoding rules, assume the prefix for all data in this table is db1_tb1_. Then the range [B, +∞) will be split again into [B, db1_tb1_) and [db1_tb1_, +∞).
That’s basically how it works. All regions are connected end to end, always forming a continuous interval of [-∞, +∞).
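To make the picture above concrete, here is a minimal toy sketch in Python. It is not TiKV code: the KeySpace class and its methods are made up for illustration, the empty key stands for -∞, and None stands for +∞. It only shows that splitting adds a boundary and that the Regions always tile the whole key space.

```python
import bisect

class KeySpace:
    """Toy model (hypothetical, not TiKV's implementation):
    Regions are the gaps between sorted split points."""

    def __init__(self):
        # A single Region covering everything; b"" plays the role of -inf.
        self.starts = [b""]

    def split(self, key: bytes) -> None:
        """Split the Region containing `key` into [start, key) and [key, end)."""
        i = bisect.bisect_left(self.starts, key)
        if i < len(self.starts) and self.starts[i] == key:
            return  # `key` is already a Region boundary
        self.starts.insert(i, key)

    def regions(self):
        """Return (start, end) pairs; end is None for +inf."""
        ends = self.starts[1:] + [None]
        return list(zip(self.starts, ends))

    def locate(self, key: bytes):
        """Return the left-closed, right-open Region that holds `key`."""
        i = bisect.bisect_right(self.starts, key) - 1
        return self.regions()[i]

ks = KeySpace()
ks.split(b"B")          # too much data: split at B
ks.split(b"db1_tb1_")   # new table: split at its (illustrative) prefix
for start, end in ks.regions():
    print(start, end)
```

Each Region's end key is the next Region's start key, so no key is ever left uncovered, which is the "connected end to end" property described above.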
Awesome! You’ve cleared up my confusion.
According to your description, once the prefix of a table is determined, its position in the range is also determined, and the TiKV node where the “original region” of this table is located is also determined, right?
Additionally, how is the prefix of each table determined? And where can I learn more detailed explanations about the mechanism you described?
Creating a table will assign a table_id to the table. The key format of the table is fixed, so the range can be determined.
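As a rough sketch of why the range follows from the table_id: as far as I know, TiDB builds a table's key prefix from the byte t followed by the table ID as an 8-byte big-endian integer with the sign bit flipped, with _r for row keys and _i for index keys after it. The helper names below are my own, and the extra byte-level encoding TiKV applies to Region boundary keys (visible in the raw hex keys of the output below) is not reproduced here.

```python
import struct

SIGN_MASK = 0x8000000000000000

def encode_int(v: int) -> bytes:
    """Order-preserving int64 encoding: flip the sign bit, 8 bytes big-endian."""
    return struct.pack(">Q", (v & 0xFFFFFFFFFFFFFFFF) ^ SIGN_MASK)

def table_prefix(table_id: int) -> bytes:
    """Prefix shared by every key of one table (helper name is illustrative)."""
    return b"t" + encode_int(table_id)

def table_range(table_id: int):
    """Keys of a table fall in [prefix(table_id), prefix(table_id + 1))."""
    return table_prefix(table_id), table_prefix(table_id + 1)

def row_key(table_id: int, handle: int) -> bytes:
    """Row key for an integer handle, assuming the t{id}_r{handle} layout."""
    return table_prefix(table_id) + b"_r" + encode_int(handle)

start, end = table_range(110)
print(start.hex(), end.hex())
print(start <= row_key(110, 1066) < end)
```

Because the encoding preserves integer order under byte comparison, tables with larger IDs always sort after tables with smaller IDs, so a new table's position in the key space is fixed as soon as its table_id is assigned.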
MySQL [information_schema]> select region_id,tidb_decode_key(START_KEY),tidb_decode_key(END_KEY),min(table_id) from TIKV_REGION_STATUS group by region_id,START_KEY,END_KEY order by min(table_id);
+-----------+---------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------+---------------------+
| region_id | tidb_decode_key(START_KEY) | tidb_decode_key(END_KEY) | min(table_id) |
+-----------+---------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------+---------------------+
| 2 | 7800000100000000FB | | NULL |
| 14 | 7800000000000000FB | 7800000100000000FB | NULL |
| 262 | 6D00000000000000F8 | 6E00000000000000F8 | NULL |
| 258 | | 6D00000000000000F8 | NULL |
| 10 | 7200000000000000FB | 7200000100000000FB | NULL |
| 8 | 6E00000000000000F8 | 7200000000000000FB | NULL |
| 1101 | 7200000100000000FB | {"handle":{"c_d_id":"7","c_id":"1066","c_w_id":"1"},"table_id":110} | 4 |
| 1013 | {"handle":{"c_d_id":"1","c_id":"1479","c_w_id":"18"},"table_id":110} | {"table_id":112} | 110 |
| 1077 | {"handle":{"c_d_id":"10","c_id":"2710","c_w_id":"3"},"table_id":110} | {"handle":{"c_d_id":"7","c_id":"2326","c_w_id":"8"},"table_id":110} | 110 |
| 1109 | {"handle":{"c_d_id":"4","c_id":"1863","c_w_id":"13"},"table_id":110} | {"handle":{"c_d_id":"1","c_id":"1479","c_w_id":"18"},"table_id":110} | 110 |
| 1089 | {"handle":{"c_d_id":"7","c_id":"2326","c_w_id":"8"},"table_id":110} | {"handle":{"c_d_id":"4","c_id":"1863","c_w_id":"13"},"table_id":110} | 110 |
| 1069 | {"handle":{"c_d_id":"7","c_id":"1066","c_w_id":"1"},"table_id":110} | {"handle":{"c_d_id":"10","c_id":"2710","c_w_id":"3"},"table_id":110} | 110 |
| 1097 | {"table_id":112} | {"index_id":2,"index_vals":{"h_c_w_id":"16"},"table_id":112} | 112 |
| 1017 | {"index_id":2,"index_vals":{"h_c_w_id":"16"},"table_id":112} | {"table_id":114} | 112 |
| 1025 | {"table_id":114} | {"table_id":118} | 114 |
| 1029 | {"handle":{"ol_d_id":"1","ol_number":"7","ol_o_id":"404","ol_w_id":"18"},"table_id":118} | {"table_id":120} | 118 |
| 1073 | {"handle":{"ol_d_id":"9","ol_number":"4","ol_o_id":"1820","ol_w_id":"5"},"table_id":118} | {"handle":{"ol_d_id":"3","ol_number":"9","ol_o_id":"2730","ol_w_id":"8"},"table_id":118} | 118 |
| 1061 | {"table_id":118} | {"handle":{"ol_d_id":"5","ol_number":"3","ol_o_id":"967","ol_w_id":"3"},"table_id":118} | 118 |
| 1065 | {"handle":{"ol_d_id":"5","ol_number":"3","ol_o_id":"967","ol_w_id":"3"},"table_id":118} | {"handle":{"ol_d_id":"9","ol_number":"4","ol_o_id":"1820","ol_w_id":"5"},"table_id":118} | 118 |
| 1093 | {"handle":{"ol_d_id":"2","ol_number":"10","ol_o_id":"1341","ol_w_id":"13"},"table_id":118} | {"handle":{"ol_d_id":"6","ol_number":"7","ol_o_id":"2240","ol_w_id":"15"},"table_id":118} | 118 |
| 1081 | {"handle":{"ol_d_id":"3","ol_number":"9","ol_o_id":"2730","ol_w_id":"8"},"table_id":118} | {"handle":{"ol_d_id":"8","ol_number":"9","ol_o_id":"624","ol_w_id":"10"},"table_id":118} | 118 |
| 1085 | {"handle":{"ol_d_id":"8","ol_number":"9","ol_o_id":"624","ol_w_id":"10"},"table_id":118} | {"handle":{"ol_d_id":"2","ol_number":"10","ol_o_id":"1341","ol_w_id":"13"},"table_id":118} | 118 |
| 1105 | {"handle":{"ol_d_id":"6","ol_number":"7","ol_o_id":"2240","ol_w_id":"15"},"table_id":118} | {"handle":{"ol_d_id":"1","ol_number":"7","ol_o_id":"404","ol_w_id":"18"},"table_id":118} | 118 |
| 1053 | {"handle":{"s_i_id":"49025","s_w_id":"12"},"table_id":120} | {"handle":{"s_i_id":"23033","s_w_id":"15"},"table_id":120} | 120 |
| 1033 | {"handle":{"s_i_id":"42196","s_w_id":"18"},"table_id":120} | {"index_id":2,"index_vals":"20, 10, 3000, 1859, 20, 10, 1859, ","table_id":124} | 120 |
| 1049 | {"handle":{"s_i_id":"16535","s_w_id":"10"},"table_id":120} | {"handle":{"s_i_id":"49025","s_w_id":"12"},"table_id":120} | 120 |
| 1041 | {"handle":{"s_i_id":"67971","s_w_id":"3"},"table_id":120} | {"handle":{"s_i_id":"84050","s_w_id":"6"},"table_id":120} | 120 |
| 1037 | {"table_id":120} | {"handle":{"s_i_id":"67971","s_w_id":"3"},"table_id":120} | 120 |
| 1057 | {"handle":{"s_i_id":"23033","s_w_id":"15"},"table_id":120} | {"handle":{"s_i_id":"42196","s_w_id":"18"},"table_id":120} | 120 |
| 1045 | {"handle":{"s_i_id":"84050","s_w_id":"6"},"table_id":120} | {"handle":{"s_i_id":"16535","s_w_id":"10"},"table_id":120} | 120 |
| 2121 | {"index_id":2,"index_vals":"20, 10, 3000, 1859, 20, 10, 1859, ","table_id":124} | {"index_id":2,"index_vals":{"o_c_id":"1","o_d_id":"1","o_id":"2941","o_w_id":"1"},"table_id":1292} | 308 |
| 2141 | {"index_id":2,"index_vals":{"o_c_id":"1","o_d_id":"1","o_id":"2941","o_w_id":"1"},"table_id":1292} | {"index_id":2,"index_vals":{"o_c_id":"3000","o_d_id":"10","o_id":"1859","o_w_id":"20"},"table_id":1292} | 1292 |
| 2133 | {"index_id":2,"index_vals":{"o_c_id":"3000","o_d_id":"10","o_id":"1859","o_w_id":"20"},"table_id":1292} | {"table_id":1298} | 1292 |
| 2169 | {"table_id":1298} | {"handle":{"o_d_id":"1","o_id":"1989","o_w_id":"10"},"table_id":1298} | 1298 |
| 2165 | {"handle":{"o_d_id":"1","o_id":"1989","o_w_id":"10"},"table_id":1298} | {"table_id":281474976710654} | 1298 |
| 266 | {"table_id":281474976710654} | {"table_id":281474976710655} | 281474976710654 |
| 12 | {"table_id":281474976710655} | 7800000000000000FB | 4611686018427387906 |
+-----------+---------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------+---------------------+
37 rows in set, 14 warnings (0.01 sec)
Yes, the original location of the region is actually where it was before it split. After splitting, this region might be migrated by PD. If you turn off all the schedulers when initializing the cluster, even if there are many TiKV nodes and many regions in the entire cluster, these regions will all be concentrated on the initial TiKV node because they are all split from it. Without a scheduler, there will be no migration arrangements, so they won’t move out.
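A toy sketch of that last point, under heavy simplification: one replica per Region, no scheduler at all, and a Region class and split function that are purely hypothetical. A split hands both halves the parent's stores, so without any scheduling every Region stays where the first one started.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Region:
    start: bytes
    end: Optional[bytes]   # None stands for +inf
    stores: frozenset      # IDs of the stores holding a replica

def split(region: Region, key: bytes):
    """Split at `key`; both halves inherit the parent's stores."""
    if key <= region.start or (region.end is not None and key >= region.end):
        raise ValueError("split key must lie strictly inside the Region")
    left = Region(region.start, key, region.stores)
    right = Region(key, region.end, region.stores)
    return left, right

# The cluster may have many stores, but the first Region lives on store 1.
regions = [Region(b"", None, frozenset({1}))]
for key in [b"a", b"b", b"c"]:
    last = regions.pop()             # keys are ascending, so split the tail
    regions.extend(split(last, key))

peers = Counter(s for r in regions for s in r.stores)
print(len(regions), dict(peers))
```

In a real cluster PD's schedulers would then move replicas to other stores; this sketch deliberately leaves that step out to show what the split alone does.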
As for learning these, the posts by Xia Zong are quite good, and there are also blogs you can check out.
The explanation is very accurate; each region is a left-closed, right-open interval.
With the default configuration, creating a table gives it a brand new Region.