Why does a TiDB query with Limit 10 take 1 minute and 30 seconds?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 查询 Limit 10 为啥用时1分30秒?

| username: lxzkenney

The table has about 600 million rows. Why is a LIMIT 10 query so slow?

I just tried LIMIT 1 as well; it took 2 minutes and 30 seconds, and the entire cluster jittered.

Execution plan:

explain analyze:

| username: hey-hoho | Original post link

Post the complete execution plan for us to review.

| username: lxzkenney | Original post link

I've put it at the very bottom.

| username: TiDBer_pkQ5q1l0 | Original post link

Looking at this execution plan, it doesn’t seem right. Check the Dashboard to see where the time is actually being spent.

| username: BraveChen | Original post link

Try running EXPLAIN ANALYZE on the SQL to get the actual execution plan and see where the slowness is. Judging only by the operator structure of the plan, it looks fine to me.
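
A minimal sketch of that suggestion (the table name here is just a placeholder, not the poster's actual table):

```sql
-- Unlike plain EXPLAIN, EXPLAIN ANALYZE actually executes the statement and
-- reports per-operator runtime statistics (actRows, execution info, memory).
EXPLAIN ANALYZE SELECT * FROM sbtest1 LIMIT 10;
```

The execution info column then shows which operator the time is actually going to.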

| username: realcp1018 | Original post link

I have run into this problem as well.
Try trace format='row' select * from table limit 10; to check (see the sketch below). As I recall, the total time spent in loadRegion is quite long on the first query.
However, loadRegion is generally not slow; this usually happens when the cluster's overall data volume is huge, and in the past it was generally left unresolved.
This time I'm waiting for an answer as well.
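
A sketch of that trace command (the table name is a placeholder):

```sql
-- Print the query's trace events, including region loading and coprocessor
-- requests, as rows with their start time and duration.
TRACE FORMAT = 'row' SELECT * FROM sbtest1 LIMIT 10;
```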

| username: tidb菜鸟一只 | Original post link

Don't use LIMIT 10 by itself; try ORDER BY id LIMIT 10 instead.
Also, there seems to be an issue with the table statistics.
Check the statistics health:
SHOW STATS_HEALTHY WHERE db_name='sbtest' AND table_name='sbtest1';
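
Putting those two checks together, a quick sketch (reusing the sbtest.sbtest1 example names from above):

```sql
-- Check how healthy the table statistics are (100 means fully up to date).
SHOW STATS_HEALTHY WHERE db_name = 'sbtest' AND table_name = 'sbtest1';

-- If the health is low, refresh the statistics manually.
ANALYZE TABLE sbtest.sbtest1;

-- Re-check once the analyze has finished.
SHOW STATS_HEALTHY WHERE db_name = 'sbtest' AND table_name = 'sbtest1';
```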

| username: lxzkenney | Original post link

I ran ANALYZE manually and the health is now 100%, but the problem still exists.

I changed it to ORDER BY id, but the query can't return at all; it timed out and was killed.

| username: lxzkenney | Original post link

I've moved it to the bottom; you can take a look.

| username: BraveChen | Original post link

The rpc_time is high, and the full KV scan to retrieve the data takes a long time. It might be that some occasional operations have left TiKV without triggering compaction. You can try compacting manually to see whether it reduces the read amplification.

| username: lxzkenney | Original post link

The cause of this issue has been roughly identified. The table actually holds only a bit over 200,000 rows and belongs to an hourly data-warehouse task: all of the table's rows are deleted every hour. Because of TiDB's MVCC mechanism, those deletes leave behind many historical versions, and with GC running only every 12 hours, each row accumulates roughly 12 historical versions.

So when querying with LIMIT 10, does it have to scan those historical versions? I rebuilt the table, reinserted the data, and LIMIT 10 came back in 0.06 seconds. My understanding is that the query scans over all the deleted versions. Can someone explain the principle behind this?
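
If stale MVCC versions are the suspect, a couple of checks worth running (a sketch, not from the original post) are the GC retention setting and the GC progress recorded in mysql.tidb:

```sql
-- How long historical versions are retained before GC may remove them.
SHOW VARIABLES LIKE 'tidb_gc_life_time';

-- When GC last ran and which safe point it reached.
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME IN ('tikv_gc_life_time', 'tikv_gc_last_run_time', 'tikv_gc_safe_point');
```

Even after GC removes the old versions, the underlying RocksDB data only shrinks once compaction has cleaned up the delete tombstones, which is what the manual-compaction suggestion above is about.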

| username: Kongdom | Original post link

Is it here? If GC is working properly, it should delete historical versions, right?

| username: tidb菜鸟一只 | Original post link

I think it might be because you repeatedly delete and insert, which has left this table with far too many Regions.
SELECT * FROM INFORMATION_SCHEMA.TIKV_REGION_STATUS a WHERE a.db_name='' AND a.table_name='';
Check how many Regions correspond to this table. With a query like yours, TiDB has to scan every one of those Regions…
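
A concrete version of that check (the database and table names are placeholders for the poster's actual schema):

```sql
-- Count how many Regions hold data for this table; a plain LIMIT 10 without
-- a usable index may still have to walk through all of them.
SELECT COUNT(*)              AS region_count,
       SUM(APPROXIMATE_SIZE) AS approx_size_mib
FROM INFORMATION_SCHEMA.TIKV_REGION_STATUS
WHERE db_name = 'sbtest' AND table_name = 'sbtest1';
```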

| username: Raymond | Original post link

If you use the primary key index to search, you won’t encounter such a situation. You can give it a try.
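
For example, assuming the table has an integer primary key named id (a hypothetical column), a lookup constrained by that key avoids crawling the whole key range:

```sql
-- A point lookup on the primary key goes straight to the Region owning that
-- key instead of scanning the table.
SELECT * FROM sbtest1 WHERE id = 12345;

-- A bounded range on the primary key similarly limits how much gets scanned.
SELECT * FROM sbtest1 WHERE id BETWEEN 1 AND 10000 LIMIT 10;
```

Note that earlier in the thread a bare ORDER BY id LIMIT 10 still timed out, presumably because the scan from the start of the key range still had to step over all the delete tombstones, so a bounded predicate matters here.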

| username: lxzkenney | Original post link

1682 Regions, that's quite a lot, isn't it?

| username: jansu-dev | Original post link

  1. It seems the issue has been resolved, so I'll mark this as solved for now.
  2. You could also test whether the Paging feature in a newer version speeds up this scenario, since it's only a LIMIT 10 (see the sketch after this list). → System Variables | PingCAP Documentation Center. However, your version (v5.0.3) does not have this feature yet.
  3. This variable controls whether the IndexLookUp operator uses the paging method to send Coprocessor requests; the default value is OFF. For read requests that use IndexLookUp plus Limit, where the Limit cannot be pushed down to IndexScan, there can be high read-request latency and high CPU usage in TiKV's unified read pool. In such cases, because the Limit operator only needs a small portion of the data, enabling tidb_enable_paging reduces the amount of data processed, thereby reducing latency and resource consumption. → TiDB 5.4 Release Notes | PingCAP Documentation Center
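
A minimal sketch of what that would look like on a version that supports it (v5.4 and later, per the release note above); it has no effect on the poster's v5.0.3, and the table name is just a placeholder:

```sql
-- Enable Coprocessor paging for this session (available from TiDB v5.4).
SET SESSION tidb_enable_paging = ON;

-- Re-run the problem query and compare the latency.
SELECT * FROM sbtest1 LIMIT 10;
```
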
| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.