How to Optimize Queries for Single Tables with Hundreds of Millions of Rows

translator_bot · June 23, 2024, 12:10am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 单表亿级的查询如何优化

| username: TiDBer_tsKlwcSz

【TiDB Usage Environment】Production Environment
【TiDB Version】5.4
【Reproduction Path】Nearly 100 million rows of data in a single table in the production environment, performing pagination queries
【Encountered Problem: Phenomenon and Impact】 Data return time is over 3 seconds
【Resource Configuration】
【Attachment: Screenshot/Log/Monitoring】

    select
    inc.*
    from a  inc
    where 1=1  and inc.fault_status=1  and inc.error_alarm=0  and inc.block_state=0 ORDER BY inc.start_ts desc limit 10;

translator_bot · June 23, 2024, 12:10am

| username: 裤衩儿飞上天 | Original post link

Execution plan, table structure, table health status, etc.

translator_bot · June 23, 2024, 12:10am

| username: TiDBer_tsKlwcSz | Original post link

I found that it is particularly slow when using SELECT COUNT(1), taking up to 2 seconds.

translator_bot · June 23, 2024, 12:10am

| username: hooopo | Original post link

Looking at your diagram, you only created an index on start_ts. You should create a composite index, (start_ts, block_state, error_alarm, fault_status).

Additionally, if count is slow, you can try enabling the TiFlash node.

translator_bot · June 23, 2024, 12:10am

| username: forever | Original post link

The previous conditions are all equality queries, it’s better to put start_ts at the end.