How Table Analysis Affects TiFlash by a Factor of 5

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 表分析对tiflash是如何影响相关5倍

| username: 扬仔_tidb

【TiDB Usage Environment】Production Environment
【TiDB Version】v6.5.5
【Reproduction Path】Operations performed that led to the issue
【Encountered Issue: Problem Phenomenon and Impact】
There are two tables, one with 2.38 million rows (supplier_sub_category) and another with 30,000 rows. Before analyzing the supplier_sub_category table, the execution plan took 2.1 seconds, but after analyzing the table, the new execution plan shows 0.4 seconds.
What could cause such a significant difference in the execution plans before and after the analysis? This table was manually analyzed just yesterday, and the data usually doesn’t change much.
【Attachments: Screenshots/Logs/Monitoring】
Statement
explain analyze
SELECT
a.supplier_id,
a.brand_id,
a.category_id AS third_category_id,
max(b.bd_id) AS bd_id
FROM
supplier.supplier_sub_category a
LEFT JOIN supplier.supplier_brand_category b ON a.supplier_id = b.supplier_id
AND a.root_category_id = b.category_id
AND b.category_level = 0
AND a.brand_id = b.brand_id
WHERE
a.is_delete = 0
AND b.is_delete = 0
GROUP BY
a.supplier_id,
a.brand_id,
a.category_id;

| username: 江湖故人 | Original post link

Has the access path and the actual number of processed rows remained unchanged? Can you confirm if the cluster load is stable?

| username: Kongdom | Original post link

Is the execution plan exactly the same? Except for the time?

| username: FutureDB | Original post link

If the execution plans are identical but the times differ, it is possible that the cluster load and resource usage were different during the two executions.

| username: Kongdom | Original post link

If the execution plans are exactly the same, then it can only be due to cluster load and resource usage.

| username: dba远航 | Original post link

The execution plan is the same, but the statistics have changed, which will also affect the internal algorithms, such as the estimation of IO to be scanned, etc.

| username: tidb菜鸟一只 | Original post link

Could it simply be an issue with the busy state of TiFlash?

| username: 扬仔_tidb | Original post link

Thank you everyone, it might have been related to the performance of the TiFlash machine at that time. Now, after repeatedly executing it multiple times, it still takes 0.4 seconds.

| username: 烂番薯0 | Original post link

How is the performance of the TiFlash machine?

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.