With the same WHERE condition, sometimes the query uses an index (IndexLookup), and sometimes it performs a full table scan (TableReader)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 同where条件,有的查询走索引(IndexLookup),有的时候扫表(TableReader)

| username: TiDBer_2tTuI9A5

[TiDB Usage Environment] Production Environment
[TiDB Version] v4.0.9
[Reproduction Path] MySQL client
[Encountered Problem: Phenomenon and Impact]
The same query statement, with the same WHERE condition (different values), sometimes uses an index (IndexLookup) and sometimes performs a table scan (TableReader).
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

| username: h5n1 | Original post link

How about manually collecting the statistics again and then checking?

| username: TiDBer_2tTuI9A5 | Original post link

Could you please let me know how I should provide the information you need? Thank you very much.

| username: h5n1 | Original post link

The actual execution plan may not be the one shown. You can check if there is this SQL in the slow SQL records to see the related execution plan. You can also use the explain analyze method to see what the execution plan looks like.

| username: TiDBer_2tTuI9A5 | Original post link

Hello, here is the analyze information:

| username: h5n1 | Original post link

How many records are there with the condition count(*)? It looks like a statistics issue. Try manually collecting the statistics again and see.

| username: TiDBer_2tTuI9A5 | Original post link

There is actually only one piece of data.

| username: TiDBer_2tTuI9A5 | Original post link

The following is the log from tidb_slow_query.log:

select * from objects where bucketname='test' and name='Thumbnails/img2021/0319/0828158e.jpg';
# Time: 2022-12-06T17:42:25.922607919+08:00
# Txn_start_ts: 437864292124983326
# User@Host: root[root] @ 172.15.104.247 [172.15.104.247]
# Conn_ID: 55490
# Query_time: 27.998234246
# Parse_time: 0.000012787
# Compile_time: 0.000254719
# Rewrite_time: 0.000072921
# Cop_time: 27.942105545 Process_time: 415.546 Wait_time: 0.332 Request_count: 859 Total_keys: 254780341 Process_keys: 254775997
# DB: yig
# Is_internal: false
# Digest: 8e443b2ca62a8b5d75388a297da66ed3f020a0065d60caae6db17a132a36966e
# Stats: gc:437864279070212131
# Num_cop_tasks: 859
# Cop_proc_avg: 0.483755529 Cop_proc_p90: 0.555 Cop_proc_max: 0.839 Cop_proc_addr: 172.15.104.241:20160
# Cop_wait_avg: 0.000386495 Cop_wait_p90: 0.001 Cop_wait_max: 0.003 Cop_wait_addr: 172.15.104.242:20160
# Mem_max: 11133148
# Prepared: true
# Plan_from_cache: false
# Has_more_results: false
# KV_total: 416.411025455
# PD_total: 0.000002569
# Backoff_total: 0
# Write_sql_response_total: 0.000018356
# Succ: true
# Plan: tidb_decode_plan('vwfAMAkxNV84CTAJNTAJeWlnLmdjLm10aW1lOmRlc2MsIG9mZnNldDowLCBjb3VudDo1MAEpBSHgMjcuOTk2MDc3NzIycywgbG9vcHM6MwkxMC41OTE0Njg4MTEwMzUxNTYgTUIJTi9BCjEJMzFfMTcJBWxEZGF0YTpUb3BOXzE2CTQwNTUwGVYcNTM3OTQ2ODkVVgg4M
TIBgfBbcF90YXNrOiB7bnVtOiA4NTksIG1heDogODQwLjE3MTAxN21zLCBtaW46IDUxMC43MzbCtXMsIGF2ZzogNDg0Ljc3NTk0OG1zLCBwOTU6IDU5MC44MDAyNzhtcywBSlhfcHJvY19rZXlzOiAzNTU0OTEsIHA5NTYXACA0MTI5LCB0b3QFFyg6IDZtNTUuNTQ2cwkVIHdhaXQ6IDMzMgFUDHJwY18J
rAg3MywFDiU6ATQENi4BSRQ3MzYxMnMF3HhyX2NhY2hlX2hpdF9yYXRpbzogMC4wMH0JMjYuODUzIUkMMjUgSylLLDIJMTVfMTYJMV8wCaK5AT1mBDBuNVwYMCwgdGlrdilbEHtwcm9jIQsMOjgzOQHCOG1pbjowcywgcDgwOjUzNAETITwENTgFHihpdGVyczoyNDkzOCEUMGFza3M6ODU5fQlOL0EBBBg
KMwkxXzE1BacwMjU0NzkzMzQyCWxlKC5qAlgsIDIwMjItMTItMDYgMDk6MjQ6NDAuMAUBACkFNxAxNTI5NUkfhrkABDc3BaYuuQAINDgzFbkEMzUBC4K5ABg0CTEwXzE0OroAZHRhYmxlOmdjLCBrZWVwIG9yZGVyOmZhbHNlBagQNzQwMzGiqAAANUaoAAQ2OAGdJWEEMTkBC4KoAA==')
# Plan_digest: 62c16bc589067c967847e02f0e69d05dff227531c22874ede526fa204a76e77b
| username: weixiaobing | Original post link

You can try to readjust the order of the composite index. It looks like the selectivity of “name” is better.

| username: TiDBer_2tTuI9A5 | Original post link

Currently, it is a composite index (bucketname, name, version).

So how should this issue be explained? And how can it be resolved or worked around? Modifying the index in a production environment is not very feasible.

| username: weixiaobing | Original post link

Add a hint to force using this index.

| username: buddyyuan | Original post link

You can only bind it first and force it to use the index. Specifically, you still need to look at the selectivity, histograms, and so on.

SHOW STATS_HISTOGRAMS

| username: TiDBer_2tTuI9A5 | Original post link

Hello, how do you add the hint you mentioned… using useindex?

| username: TiDBer_2tTuI9A5 | Original post link

Hello, why does the same where condition cause selectivity issues? What is the reason for this??? Clearly, the where condition matches the first two columns of the composite index.

| username: ealam_小羽 | Original post link

Image|690x247

| username: ealam_小羽 | Original post link

Recollect statistics (note to execute during off-peak business hours when the table is relatively large):

| username: 我是咖啡哥 | Original post link

This is generally caused by inaccurate statistics, histograms, and severe column skew. As mentioned by the expert above, the order of your index should be better like this: name, bucketname. You can create a new index and leave the original one unchanged for now.

| username: tidb菜鸟一只 | Original post link

I think if the values in the name field of your table are mostly different, you don’t need a composite index. Just create an index on the name field directly. If inaccurate statistics prevent the index from being used, you can directly specify it with a hint, like use_index.

| username: Jiawei | Original post link

I think it might be due to the cost of querying the two SQLs, because the optimizer ultimately decides how to query based on the cost of the two queries. You can check the cost of both.

| username: TiDBer_2tTuI9A5 | Original post link

How can I check the cost you mentioned?