Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 子查询sql,oracle 2秒查出结果,tidb 30分钟没查出结果
This SQL statement takes 2 seconds to produce results in Oracle, but it is very slow in TiDB.
TiDB execution plan
How large is the data volume of this table?
Can you share the structure?
The main reason for the slow speed is that the SQL statement is not optimized. You can use the EXPLAIN command to analyze the execution plan of the SQL statement and see if there are any full table scans or other inefficient operations. Additionally, you can check if there are any issues with the indexes, such as missing indexes or indexes not being used.
Well, the data volume is too small. Comparing it with MySQL might be more meaningful.
TiDB’s operator and plan optimization capabilities definitely can’t match Oracle’s, so without a certain data scale, no advantages can be demonstrated. There are ways to optimize it:
In the red box area, without operator support, it’s equivalent to aggregating to TiDB for partial processing, then completing a full table scan through TiFlash, and then aggregating again.
Aggregated twice… and there’s waiting in between… 
You can try the following methods:
- Remove
only_full_group_by
from sql_mode
, and then rewrite the SQL as follows:
SELECT
max(id),
patient_id,
card_number
FROM
base.T_DCH_PATIENT_CARD
WHERE
card_type = 4
AND card_state = 1
AND hosp_id = 1517
GROUP BY
patient_id;
- Under the default
sql_mode
, rewrite the SQL as follows:
SELECT
t.id,
t.patient_id,
t.card_number
FROM
base.T_DCH_PATIENT_CARD t
WHERE
t.id IN (
SELECT
max(id) AS id
FROM
base.T_DCH_PATIENT_CARD
WHERE
card_type = 4
AND card_state = 1
AND hosp_id = 1517
GROUP BY
patient_id
)
If you find that using TiFlash is slower, try disabling TiFlash first and then try again.
Your SQL query is definitely problematic when running on TiDB with TiFlash. I suggest using a hint to avoid TiFlash and try again. Additionally, you can rewrite the SQL as follows:
select patient_id, card_number from
(
select patient_id, card_number, rank() over (partition by patient_id order by id desc) as row_num
from base.T_DCH_PATIENT_CARD ca
where ca.card_type = 4
and ca.card_state = 1
and ca.hosp_id = 1517
)
where row_num = 1;
Can you post the execution plan for Oracle? It seems that TiDB hasn’t optimized the correlated subquery here. After filtering the ca table, there are still hundreds of thousands of results. Assuming the number of records to be de-correlated in the subquery is very small, optimizing away the subquery should significantly improve efficiency. However, it would be best to post the Oracle execution plan to see how it is executed.