The execution of "select * from T limit 1" takes a long time

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: select * from T limit 1 执行很久

| username: Holland

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.1
[Reproduction Path] Operations performed that led to the issue
explain analyze SELECT * FROM label LIMIT 1; takes a long time to execute, nearly 2 minutes.

The actual table data volume doesn’t seem large.

Checked this post 专栏 - MVCC导致limit 1执行慢测试 | TiDB 社区

Changed gc from 24h to 30min, but it had no effect, still very slow. However, adding an index with a where condition returns quickly.

[Encountered Issue: Problem Phenomenon and Impact]
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]

| username: kkpeter | Original post link

It seems to be the same issue, but the PR has already been merged. Not sure if it has been reintroduced in the new version.

| username: kkpeter | Original post link

Looking at related posts, limit 1 will scan the entire table, but without sorting, it returns randomly. I don’t quite understand the logic of TiDB scanning the entire table here; it would be better to just return the first row read.

| username: zhanggame1 | Original post link

Is the data volume large? Your query is not useful; MVCC data is ignored. You need to look at the size and number of regions and the number of keys. You can query like this:

select sum(t.APPROXIMATE_SIZE), sum(t.APPROXIMATE_KEYS), count(*) 
where t.DB_NAME='XX' and t.TABLE_NAME='XXX'
| username: Holland | Original post link

| username: zhanggame1 | Original post link

Changing the GC from 24h to 30min, you first need to confirm that the GC is progressing normally. Additionally, the GC cannot delete the number of scanned keys; you need to wait for the database to compact, which may reduce the data volume and keys, but this process will take a long time.

| username: Holland | Original post link

It looks like GC is continuously progressing.

| username: zhanggame1 | Original post link

The total regions are 34G, so a full table scan is naturally very slow… You just changed the GC time not long ago, observe it for another day or two.

| username: Holland | Original post link

Okay, the GC was modified the night before yesterday.

| username: zhanggame1 | Original post link

The data volume is not large, it’s better to rebuild the table.

| username: TIDB-Learner | Original post link

How long does it take to directly select * from the table when the table doesn’t look big?

| username: zhanggame1 | Original post link

For clustered tables, limit 1 starts scanning from the beginning and returns the first row. The problem is that scanning to the first row may require scanning through many gigabytes of data.

| username: 路在何chu | Original post link

Is this occasional? Are the other tables like this too?

| username: 小龙虾爱大龙虾 | Original post link

Send the execution plan.

| username: Jellybean | Original post link

It seems that the previous discussions haven’t addressed the core issue. Please post the execution plan of the slow SQL, and we can analyze it.

| username: onlyacat | Original post link

I’ve encountered this as well. It chose the wrong execution plan. Either add a hint or add a binding.

| username: Kongdom | Original post link

Indeed, please share the execution plan. It looks quite strange.

| username: 有猫万事足 | Original post link

TiDB’s execution plan is quite detailed.
Without an execution plan, you can only rely on experience and guesswork.
In the absence of corroboration, these can only serve as a thought process or reference.

| username: 春风十里 | Original post link

For the issue of slow SQL, you still need to look at the details and analyze further. Post the execution plan for review.

| username: zhanggame1 | Original post link

It’s normal for the limit to be this slow. The data volume of this table and the region size differ by several orders of magnitude.