SQL query results are inconsistent and may have missing data

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: SQL查询结果不一致,会少数据

| username: TiDBer_Lee

【TiDB Usage Environment】Production Environment
【TiDB Version】v6.5.3
Situation Description

  • Forced to use TiFlash MPP through parameter settings
  • Inconsistent query results multiple times, missing one fixed data row
  • If the cluster automatically selects the query engine, the results are normal
  • If the 3 left joins in the SQL are rewritten into two and then unioned, the results are normal

【explain analyze execution plan comparison】

SQL as follows

SELECT
        full_data.field1 AS field1,
        full_data.field2 AS field2,
    (CASE WHEN t1.total IS NULL THEN 0 ELSE t1.total END)   +(CASE WHEN t2.total IS NULL THEN 0 ELSE t2.total END) AS total
    FROM
    (SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2
    FROM dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
    GROUP BY field1,field2
    HAVING COUNT(t.pdoc_id) > 0
    ) full_data
    LEFT JOIN
        (
    SELECT field1,field2,COUNT(`pn`) AS total
    FROM (
    SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2,dimension1.`pn`
    FROM dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
        AND dimension1.`pn` IS NOT NULL
                AND dimension1.`country` IN ('US')
    GROUP BY t.pdoc_id, dimension1.`pn`,field1,field2
    ) T
    GROUP BYfield1,field2
    ORDER BY total DESC,field1 ASC,field2 ASC
        ) t1 ON
            t1.field1 = full_data.field1 AND 
            t1.field2 = full_data.field2
        LEFT JOIN
        (
    SELECT field1,field2,COUNT(`pn`) AS total
    FROM (
    SELECT
            CASE
            WHEN group0.group_id IS NOT NULL THEN CONCAT(group0.group_id, '(组)')
            ELSE dimension0.`patent_type`
            END AS field1,
            CASE
            WHEN group1.group_id IS NOT NULL THEN CONCAT(group1.group_id, '(组)')
            ELSE dimension1.`country`
            END AS field2,dimension1.`pn`
    FROM
    dw_ext.`t_1692603597355` t
            INNER JOIN ads_tab11 dimension0 ON dimension0.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group0 ON group0.group_value = dimension0.`patent_type` and group0.field = 'PATENT_TYPE' and group0.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
            INNER JOIN ads_tab33 dimension1 ON dimension1.pdoc_id = t.pdoc_id
            LEFT JOIN biz.ads_tab22 group1 ON group1.group_value = dimension1.`country` and group1.field = 'COUNTRY' and group1.identifier_id = '0fb4b7e7fe0642618a1e7288faf7d8ae' 
    WHERE 1=1
            AND dimension0.`patent_type` in ('A','B','U','D') AND dimension0.`patent_type` IS NOT NULL
            AND dimension1.`country` in ('US','CN') AND dimension1.`country` IS NOT NULL
        AND dimension1.`pn` IS NOT NULL
                AND dimension1.`country` IN ('CN')
    GROUP BY t.pdoc_id, dimension1.`pn`,field1,field2
    ) T
    GROUP BY field1,field2
    ORDER BY total DESC,field1 ASC,field2 ASC
        ) t2 ON
            t2.field1 = full_data.field1 AND 
            t2.field2 = full_data.field2   
    GROUP BY full_data.field1,full_data.field2
    HAVING total > 0
    ORDER BY total DESC,field1 ASC,field2 ASC
    LIMIT   30000;
| username: xfworld | Original post link

What about changing it to INNER JOIN?

| username: zhanggame1 | Original post link

Is there anything special about the missing one, such as null values?

| username: Kongdom | Original post link

Is the last part unnecessary? There is no aggregate function in the SELECT.

GROUP BY full_data.field1, full_data.field2
HAVING total > 0

Additionally, in this kind of statement, IS NOT NULL can be omitted. Adding it might not be efficient.

AND dimension0.`patent_type` in ('A', 'B', 'U', 'D') 
AND dimension0.`patent_type` IS NOT NULL

AND dimension1.`country` in ('US', 'CN') 
AND dimension1.`country` IS NOT NULL
| username: TiDBer_Lee | Original post link

No, sometimes this record can be found, and sometimes it cannot be found.

| username: zhanggame1 | Original post link

Suspect that the table data and index data are inconsistent.

| username: TiDBer_Lee | Original post link

Using TiFlash, no indexes. :sweat:

| username: 人如其名 | Original post link

Please attach the complete execution plans for both the correct and incorrect executions. The screenshot content is too limited.

| username: Fly-bird | Original post link

Is the query correct?

| username: Kongdom | Original post link

I see there is a “limit” in the statement, could it be related to this? Try removing the “limit”.

| username: TiDBer_Lee | Original post link

No problem, it seems to be caused by too many nested subqueries. This is also an issue with TiFlash, and it doesn’t occur if TiKV is used instead. It also doesn’t occur if the query is refactored using the WITH syntax.

| username: Kongdom | Original post link

:thinking: In that case, you can report a bug. Reporting bugs earns points. :yum:

| username: ajin0514 | Original post link

Is there anything special about the missing data?

| username: windtalker | Original post link

Is there a complete comparison for explain analyze? The screenshot is not comprehensive enough…

| username: Billmay表妹 | Original post link

left join query return different result · Issue #47477 · pingcap/tidb · GitHub Issue has been created.

If possible, please provide the reproduction steps for the issue on the issue page.

| username: TiDBer_Lee | Original post link

Is there anything else that needs to be added?

| username: Billmay表妹 | Original post link

Please provide the complete execution plans for both the correct and incorrect executions. The screenshot content is too limited.

| username: TiDBer_Lee | Original post link

This post has plans for correct and incorrect execution. Take a look. If it’s too long to steal in full, is there any other way?

| username: Kongdom | Original post link

:yum: My cousin means to provide a way to reproduce the issue.

| username: Billmay表妹 | Original post link

What kind of operations did you perform to encounter this issue? Please list the complete reproduction steps to facilitate troubleshooting.

Only by reproducing and locating the issue can we resolve it.