Limit Pushdown

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: limit 下推

| username: TiDBer_QHSxuEa1

I encountered an issue where an SQL query with an ORDER BY clause on a completely empty field returns results in a different order when a LIMIT is applied compared to when it is not. I suspect this is because the LIMIT is pushed down to TiKV. I want to disable the LIMIT pushdown for this specific SQL query. The documentation only mentions global blacklists and limit-hint pushdown methods. I tried adding a join (select 1) and it worked. Are there any other ways to achieve this?

| username: tidb菜鸟一只 | Original post link

If the field used in the order by clause is completely empty, TiDB does not guarantee that the results will be consistent each time you query, because it is a distributed database. This has nothing to do with whether limit is pushed down or not. If you don’t add limit, can the results remain stable with multiple queries?

| username: TiDBer_QHSxuEa1 | Original post link

The strange thing is that without adding a limit, the result set is stable, but I haven’t tested other SQL queries.

| username: 有猫万事足 | Original post link

The returned result is related to the distribution of data on the storage nodes (TiKV). If multiple queries are performed, the speed at which different storage units (Regions) on the storage nodes (TiKV) return results will vary, causing the results to be unstable.

Because the field you are ordering by is a null value, it is essentially the same as not having an order by clause.

| username: tidb菜鸟一只 | Original post link

Is there some special handling logic for “order by” with empty values? Regardless, solving this by prohibiting limit pushdown seems a bit absurd. It would be more reasonable to have the development team ensure that the “order by” uses a unique field. Only in this case can TiDB guarantee that the results returned each time are unique…

| username: cassblanca | Original post link

In TiDB, the sorting rules for NULL in the ORDER BY statement are as follows:

  • When ORDER BY COLUMN ASC, NULL is placed at the beginning by default.
  • When ORDER BY COLUMN DESC, NULL is placed at the end by default.

If the sorting rule is not explicitly specified, the default is ASC. Additionally, because each TiKV returns results at different speeds, the result set can become unstable when involving NULL sorting.

| username: redgame | Original post link

Try adding SELECT /*+ NO_INDEX_MERGE() */.