Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 嵌套查询,in里层有序的主键,查询出的数据没有按照主键排序
[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.3
[Reproduction Path] select * from t1 where id in (select id from t1 where id > 432100 order by id asc limit 4000);
[Encountered Problem: Phenomenon and Impact]
- select * from t1 where id in (select id from t1 where id > 432100 order by id asc limit 4000);
- Batch SQL as above, the development code stores the query results in memory each time, then takes the id value of the last record as the maximum value for the next batch;
- Before upgrading, version 5.4.3 was normal, the last value was the maximum value. After upgrading to 6.5.3, batch stress testing encountered issues. After investigation, it was found that the last value was not the maximum value, leading to duplicate data in the entire business;
- Testing with version 6.5.5 found that the issue no longer occurred, indicating that the problem was fixed in 6.5.5;
- Is there any parameter in 6.5.3 that can control this sorting?
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]
After R&D printed the logs, it was found that the values in the IN clause were sorted in previous versions. However, in version 6.5.3, the values appear unsorted. This issue does not occur during manual testing but only appears when running parallel batch processes.
This has nothing to do with the version, right? select * from t1 where id in (select id from t1 where id > 432100 order by t1_id asc limit 4000);
Shouldn’t you just sort it again at the end of this SQL? select * from t1 where id in (select id from t1 where id > 432100 order by t1_id asc limit 4000) order by id asc;
Yes, making this change can solve the problem, but this is a batch process. Each modification requires a release and stress testing. I’m not sure if there is a parameter that can control this. Previously, in version 5.4.3, it was still fine.
Compare the execution plans. If the execution plans are consistent, it might be a probabilistic issue.
There really is a bit of a difference.
New version
Old version
This issue cannot rely on the execution plan for assurance. For example, hash index join cannot guarantee record order, but index join should be able to. He mentioned that there is a problem with version 6.5.3, which should use hash index join. However, this issue is not a database problem but a usage issue, and it can be handled using pagination functions.
From @人如其名
This query result is not sorted. Why would you consider the last row as the maximum value?
How can I force version 6.5.3 to use index_join?
Study the hint, add the index_join hint, but you need to modify the statement or use the binding execution plan method. Without parameters, directly disable index_hash_join. Another way is to modify the cost evaluation to influence the optimizer’s preference, so you don’t have to change the program. However, you might not be able to handle this, as it requires a deeper understanding of TiDB.
Found the reason, it’s due to the cost model. The parameter is tidb_cost_model_version. Version 6.5.3 uses 2, while 5.4.3 uses 1.
Setting it to 1 uses index join, and the sorting is correct. Setting it to 2 uses index hash join, and the sorting becomes incorrect.
The root cause is still the SQL issue. The SQL has been modified to add an order by clause in the outer layer.
There is a principle in writing SQL: except for ORDER BY, no statement guarantees order.
One is sorted and the other is unsorted, right?
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.