Indexes in TiDB Severely Affect Query Performance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB 索引严重影响查询性能

| username: Royal

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.4.0
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Problem Phenomenon and Impact] Using an index actually made the performance worse
[Resource Configuration] 50 cores, 384GB
[Attachments: Screenshots/Logs/Monitoring]
TableFullScan turned out to be much more efficient than IndexRangeScan; I'm not sure whether indexes simply can't be used effectively at large data volumes.
Execution Plan:


| username: tidb菜鸟一只 | Original post link

Why use an index when the query touches about half of the data? Did the TiDB optimizer choose the index automatically?

| username: Royal | Original post link

Yes, the optimizer chooses the index based on the estimated data volume, but when my time range gets larger it switches to a TableFullScan (the second picture), and performance is better.

| username: 我是咖啡哥 | Original post link

When the data volume is large, the table lookup (reading full rows back from the table after the index scan) is too slow. It's faster to scan the entire table.

| username: Royal | Original post link

So, does that mean large amounts of data are not suitable for indexing? Aren’t we creating indexes to improve query efficiency?

| username: zhanggame1 | Original post link

Indexes are used to avoid full table scans, but accessing the index and then the table incurs additional overhead. If the amount of data accessed is too large a proportion of the table, a full table scan is faster. Generally, queries that use indexes should access less than 10% of the table.
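This trade-off can be measured directly with optimizer hints: force each access path and compare the actual execution times. A sketch, assuming a hypothetical index named idx_log_ts on the thread's table:

```sql
-- Force the range scan on the (hypothetical) index idx_log_ts, then force a
-- full table scan, and compare the execution info of the two plans.
EXPLAIN ANALYZE SELECT /*+ USE_INDEX(log_test, idx_log_ts) */ COUNT(*)
FROM sy_db.log_test
WHERE log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00';

EXPLAIN ANALYZE SELECT /*+ IGNORE_INDEX(log_test, idx_log_ts) */ COUNT(*)
FROM sy_db.log_test
WHERE log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00';
```

EXPLAIN ANALYZE actually runs the statement, so the reported times reflect real cost rather than optimizer estimates.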

| username: zhanggame1 | Original post link

If you only need to access a few columns of the table, you can consider using TiFlash, which will perform a full scan on only those columns.
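For reference, adding a columnar replica and steering a query to it looks like this; a sketch using the table name from the posted SQL (the replica count is an example):

```sql
-- Create one TiFlash (columnar) replica of the table.
ALTER TABLE sy_db.log_test SET TIFLASH REPLICA 1;

-- Once the replica is available, direct the query to TiFlash explicitly,
-- so only the referenced columns are scanned.
SELECT /*+ READ_FROM_STORAGE(TIFLASH[log_test]) */ info_id, COUNT(*)
FROM sy_db.log_test
WHERE log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00'
GROUP BY info_id;
```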

| username: 我是咖啡哥 | Original post link

It is not that large amounts of data are not suitable for indexing! Indexing is meant to quickly find data that meets the conditions from a large table. However, if the query conditions do not filter much data, then using an index may not necessarily be faster. If each query needs to return a large amount of data from this table, then the index is not very meaningful.

| username: Royal | Original post link

Yes, my business only reads a few columns, and a query may touch either a large or a small number of rows. It feels like the optimizer should handle this: for example, if the query touches less than 10% of the table's rows, use IndexRangeScan; otherwise, use TableFullScan.

| username: Royal | Original post link

That seems to be the optimizer's job. On the query side, the only choice is whether to add the index at all, and since we have both large and small scans, the compromise is to not add it.

| username: 我是咖啡哥 | Original post link

I don’t understand your actual scenario. Why compromise by not adding an index? Generally, you add an index, and for special SQL, you add a hint to fix the execution plan.

| username: Royal | Original post link

Well, after adding the index, the optimizer used IndexRangeScan, but queries got slower, so the compromise was to drop it. The purpose of my post is to discuss under what circumstances to add an index; right now it seems the optimizer needs improvement. Since our business runs fixed SQL, hints can only fix specific statements and cannot be applied generally.
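For fixed SQL whose text cannot be changed in the application, TiDB's SQL plan binding can pin an execution plan on the server side instead of editing the statement; a sketch, again assuming a hypothetical index name idx_log_ts:

```sql
-- Bind the problem statement to a full-scan plan without touching the
-- application's SQL. Literal values are normalized, so the binding also
-- matches the same statement with other date ranges.
CREATE GLOBAL BINDING FOR
  SELECT info_id, COUNT(*) FROM sy_db.log_test
  WHERE log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00'
  GROUP BY info_id
USING
  SELECT /*+ IGNORE_INDEX(log_test, idx_log_ts) */ info_id, COUNT(*)
  FROM sy_db.log_test
  WHERE log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00'
  GROUP BY info_id;
```

This keeps the index available for the selective queries while forcing the full scan only for the bound statement shape.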

| username: tidb菜鸟一只 | Original post link

Check if the statistics of this table are inaccurate:
SHOW STATS_HEALTHY WHERE db_name='' AND table_name='';

| username: zhanggame1 | Original post link

You can run ANALYZE TABLE on it and try again.
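Refreshing statistics for the thread's table is a single statement (table name taken from the posted SQL):

```sql
-- Re-collect table and index statistics so the optimizer's row-count
-- estimates (and hence its scan choice) are based on fresh data.
ANALYZE TABLE sy_db.log_test;

-- Verify the health score afterwards (100 means fully up to date).
SHOW STATS_HEALTHY WHERE db_name = 'sy_db' AND table_name = 'log_test';
```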

| username: 我是咖啡哥 | Original post link

Did you not collect statistical information?
Share the table structure, SQL, and execution plan for us to take a look.

| username: Royal | Original post link

SQL:
SELECT info_id, COUNT(*) AS log_call_event_count, MAX(log_test.error_message) AS error_message
FROM sy_db.log_test
WHERE log_ds >= '2023-07-03' AND log_ds <= '2023-07-04'
  AND log_ts >= '2023-07-03 15:00:00' AND log_ts <= '2023-07-04 15:00:00'
GROUP BY info_id

The execution plan is the first image.

| username: zhouzeru | Original post link

  • tidb_index_lookup_size: Controls the chunk size during the index lookup process; the default is 20000. If the query matches a large number of rows, you can increase this value to reduce the number of chunks.
  • tidb_index_scan_batch_size: Controls the batch size during the index scan process; the default is 256. If the batch size is too small, it causes frequent network round trips, which hurts query performance.
  • tidb_max_chunk_size: Controls the maximum number of rows in a chunk of the returned result set; the default is 1024. If the query returns a large result set, you can increase this value to reduce the number of chunks.
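These are ordinary system variables, so trying them out is just a SET; the values below are illustrative, not recommendations:

```sql
-- Session-scoped experiment; use SET GLOBAL instead to affect new sessions.
SET SESSION tidb_index_lookup_size = 40000;
SET SESSION tidb_index_scan_batch_size = 512;
SET SESSION tidb_max_chunk_size = 2048;
```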

| username: Royal | Original post link

Yes, the health of some partitions is below 100%.

| username: redgame | Original post link

Collect statistics