Performance Optimization for Multi-Table Joins

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 多表关联性能优化

| username: Hacker_QfCMflZ4

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.0
[Reproduction Path] Multiple tables left join, each table has about 15 million rows of data
[Encountered Issues: Problem Phenomenon and Impact]

  1. Multiple tables left join, final result limit 20. From the execution plan, limit 20 is not pushed down to TiKV, and the entire SQL execution time is close to 2 seconds. Are there any optimization ideas? Normally, for a left join, it should be enough to get 20 rows from the first table.
  2. One of the tables is performing a full table scan, why is it not using the index?
    [Resource Configuration]

    [Attachments: Screenshots/Logs/Monitoring]
    SQL statement

Execution plan

| username: 人如其名 | Original post link

The sentences and execution plan are incomplete, how can anyone help analyze it?

| username: Hacker_QfCMflZ4 | Original post link


| username: realcp1018 | Original post link

First, without looking at the execution plan, for your SQL:

  1. Move the filter condition AND node.delete_type=0 to the WHERE clause.
  2. Check if t4 contains a composite index on (workflowid, delete_type).
  3. Check if all the fields on both sides of the join conditions have single-column indexes or composite indexes with them as prefixes.
    To make this SQL faster, focus on how quickly the t4 table can be queried (a composite index on (workflowid, delete_type) can speed this up) and how fast the joins are (complete indexes on both sides of the join).
| username: tidb菜鸟一只 | Original post link

You are using a left join, but you are filtering data from other tables. This way, the limit will definitely not be the first 20 rows of table t4. If you want to get the first 20 rows of the left table, please write it in this format:

select * from (select * from t4 where t4..... in .... and (t4....  or t4.... ) limit 0,20) t4
left join (select * from t1 where t1.....=false and t1..... is not null) on t4.... 
left join.....