Issue with loadregion in limit 1 query

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: limit 1 查询 loadregion问题

| username: h5n1

[Question] In the test environment, when performing a Limit 1 operation on a table with 2171 non-index regions, trace SQL shows that during the first execution, it seems to load all region information.


b7f5ded710a03003a1b7c88678608f5

Why does it load all region information when performing a Limit 1 operation? Even if the region cache does not have the region information of the table, shouldn’t it only load a few regions sufficient for the limit?
The second execution does not load all regions.

| username: 我是咖啡哥 | Original post link

Just this one region.

| username: 我是咖啡哥 | Original post link

You can try removing the limit to see if many regions will be displayed here.

| username: jansu-dev | Original post link

Call chain (roughly): func Selectfunc (c *CopClient) Sendfunc buildCopTasksSplitKeyRangesByLocationsLocateKeyfindRegionByKeyloadRegion call point

  1. Why does it load all region information when limit 1 is used? Even if the region information for the table is not in the region cache, shouldn’t it be enough to load just a few regions sufficient for the limit?
    a. The region cache information is based on the ranges passed into the function. In this SQL case, the ranges for the tableReader operator are table start_key + end_key, which results in loading so many keys.
    b. However, loading is a relatively light operation, just sending a request to PD, so it’s not too bad.
    c. Actually, if you want to load fewer regions, you can manually change the scan range when constructing the operator within the where clause limit. After the content in the regionCache expires, it should also scan less.

  2. Why doesn’t it load the second time?
    In findRegionByKey, it first searches based on CachedRegion. If it doesn’t find it, it will then actually load it.

| username: h5n1 | Original post link

For a full table limit x like this, can it be optimized? The table is quite large, and loading the region takes more than a second. This scenario might not be common in actual business.

| username: jansu-dev | Original post link

I’ll raise an enhancement issue tonight. I can only try to explain the cause of this phenomenon, but whether to optimize it and how to optimize it still needs to be considered by the developers.
Later (before the moderator meeting): I’ll post the link.

| username: jansu-dev | Original post link

Enhancement issue address → Decrease loading Region info in RegionCache when using `select * from table limit 1` · Issue #39733 · pingcap/tidb · GitHub

| username: h5n1 | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.