After upgrading to TiDB v6.4.0, some fields are empty when using the WITH statement to perform a LEFT JOIN

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 升级Tidb v6.4.0后,使用with语句做leftjoin,关联后字段部分为空

| username: TiDBer_uul2Yg3E

[TiDB Usage Environment] Test
[TiDB Version] V6.4.0
[Encountered Issue: Problem Phenomenon and Impact]
When using TiDB v6.3.0, executing the same SQL query returned correct data. However, after upgrading to v6.4.0, executing the same SQL query results in some fields being completely empty.

The executed SQL is as follows:

with tmp_base_data_pre as
(
    select 
        t1.tenantsid                  
        ,t1.information_enterprise_id            
        ,t1.template_idstr            
        ,t1.information_name  
        ,t1.information_id             
        ,t1.information_version        
        ,t1.information_items         
        ,t1.template_version  
        ,t1.template_name     
        ,t1.template_type     
        ,t1.id                         
        ,t1.complete                   
        ,t1.version                    
        ,t1.datas                      
        ,t1.create_by                  
        ,t1.create_date                
        ,t1.create_name                
        ,t1.create_program             
        ,t1.last_update_by             
        ,t1.last_update_date           
        ,t1.last_update_name           
        ,t1.last_update_program
    from
    (
        select 
            tenantsid                  
            ,information_name           
            ,information_enterprise_id  
            ,information_id             
            ,information_version        
            ,information_items         
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))      as template_idstr       
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.version'))    as template_version  
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.name'))       as template_name     
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.type'))       as template_type     
            ,id                         
            ,complete                   
            ,version                    
            ,datas                      
            ,create_by                  
            ,create_date                
            ,create_name                
            ,create_program             
            ,last_update_by             
            ,last_update_date           
            ,last_update_name           
            ,last_update_program
        from EVALUATION_KPI_QUESTIONNAIRE_RESULT_JSON_DATA t
        where t.complete='true' or t.complete is null
		
    )t1
    join 
    (
	    select 
            tenantsid                  
            ,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))                    as template_idstr           
            ,information_enterprise_id
            ,substring_index(max(concat(last_update_date,'_',id)),'_',1)          as last_update_date			
            ,substring_index(max(concat(last_update_date,'_',id)),'_',-1)         as id			
	    from EVALUATION_KPI_QUESTIONNAIRE_RESULT_JSON_DATA t
        where t.complete='true' or t.complete is null
	    group by tenantsid,information_enterprise_id,JSON_UNQUOTE(JSON_EXTRACT(t.template, '$.idStr'))
	)t2
    on(t1.tenantsid=t2.tenantsid and t1.template_idstr=t2.template_idstr and t1.information_enterprise_id=t2.information_enterprise_id and t1.last_update_date=t2.last_update_date and t1.id=t2.id)	
), 
tmp_value_list as 
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].isPrimary'))) as information_items_isprimary
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].way'))      ) as information_items_way
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].name'))     ) as information_items_name
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items,concat('\$\[',h.help_topic_id,'\].value'))    ) as information_items_value
    from tmp_base_data_pre t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items))
),
tmp_value_area AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,information_items_value_level
        ,information_items_value_adcode
        ,information_items_value_id
        ,information_items_value_name
		,ROW_NUMBER()over(partition by tenantsid order by template_idstr desc) as test_rank
    from (
	select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].level'))    ) as information_items_value_level
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].adcode'))    ) as information_items_value_adcode
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='地区选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='地区选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
	)t
),
tmp_value_industry AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,information_items_value_level
        ,information_items_value_code
        ,information_items_value_id
        ,information_items_value_name
		,ROW_NUMBER()over(partition by tenantsid order by template_idstr desc) as test_rank
	from (
	select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].level'))    ) as information_items_value_level
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].code'))    ) as information_items_value_code
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='行业选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='行业选择' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
	)t
),
tmp_value_scale AS
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr        
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].id'))    ) as information_items_value_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.information_items_value,concat('\$\[',h.help_topic_id,'\].name'))    ) as information_items_value_name
    from tmp_value_list t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.information_items_value) and t.information_items_way='企业规模' and (t.information_items_isprimary='true' or t.information_items_isprimary is null))
    where t.information_items_way='企业规模' and (t.information_items_isprimary='true' or t.information_items_isprimary is null)
),
tmp_indicator_list as 
(
    select 
        tenantsid
        ,information_enterprise_id
        ,template_idstr
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].id')))     as indicator_id
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].name')))   as indicator_name
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].unit')))   as indicator_unit
        ,JSON_UNQUOTE(JSON_EXTRACT(t.datas,concat('\$\[',h.help_topic_id,'\].result'))) as indicator_value
    from tmp_base_data_pre t
    inner join HELP_TOPIC h 
    on(h.help_topic_id<json_length(t.datas))
)
select 
    t1.tenantsid
    ,t1.information_name
    ,t1.information_enterprise_id
    ,t1.information_id
    ,t1.information_version
    ,t21.information_items_value_id                          as province_id 
    ,t21.information_items_value_adcode                      as province_adcode 
    ,t21.information_items_value_name                        as province_name 
    ,t22.information_items_value_id                          as city_id 
    ,t22.information_items_value_adcode                      as city_adcode 
    ,t22.information_items_value_name                        as city_name 
    ,t23.information_items_value_id                          as district_id 
    ,t23.information_items_value_adcode                      as district_adcode 
    ,t23.information_items_value_name                        as district_name 
    ,t31.information_items_value_code                        as category_code
    ,t31.information_items_value_name                        as category_name
    ,t32.information_items_value_code                        as division_code
    ,t32.information_items_value_name                        as division_name
	,t33.information_items_value_code                        as group_code
    ,t33.information_items_value_name                        as group_name
	,t34.information_items_value_code                        as class_code
    ,t34.information_items_value_name                        as class_name
	,t4.information_items_value_id                           as scale_id
    ,t4.information_items_value_name                         as scale_name
	,t1.template_idstr               
    ,t1.template_version          
    ,t1.template_name             
    ,t1.template_type             
    ,t1.id                                                   as questionnaire_id                      
    ,t1.complete                  
    ,t1.version                   
    ,t1.create_by                 
    ,t1.create_date               
    ,t1.create_name               
    ,t1.create_program            
    ,t1.last_update_by            
    ,t1.last_update_date          
    ,t1.last_update_name          
    ,t1.last_update_program
	,t5.indicator_id
    ,t5.indicator_name
    ,t5.indicator_unit
    ,t5.indicator_value
from tmp_base_data_pre t1
left join tmp_value_area t21
on(t1.tenantsid=t21.tenantsid and t1.template_idstr=t21.template_idstr and t1.information_enterprise_id=t21.information_enterprise_id and t21.information_items_value_level='province')
left join tmp_value_area t22
on(t1.tenantsid=t22.tenantsid and t1.template_idstr=t22.template_idstr and t1.information_enterprise_id=t22.information_enterprise_id and t22.information_items_value_level='city' and substr(t21.information_items_value_adcode,1,2)=substr(t22.information_items_value_adcode,1,2))
left join tmp_value_area t23
on(t1.tenantsid=t23.tenantsid and t1.template_idstr=t23.template_idstr and t1.information_enterprise_id=t23.information_enterprise_id and t23.information_items_value_level='district' and substr(t22.information_items_value_adcode,1,4)=substr(t23.information_items_value_adcode,1,4))
left join tmp_value_industry t31
on(t1.tenantsid=t31.tenantsid and t1.template_idstr=t31.template_idstr and t1.information_enterprise_id=t31.information_enterprise_id and t31.information_items_value_level='1')
left join tmp_value_industry t32
on(t1.tenantsid=t32.tenantsid and t1.template_idstr=t32.template_idstr and t1.information_enterprise_id=t32.information_enterprise_id and t32.information_items_value_level='2')
left join tmp_value_industry t33
on(t1.tenantsid=t33.tenantsid and t1.template_idstr=t33.template_idstr and t1.information_enterprise_id=t33.information_enterprise_id and t33.information_items_value_level='3')
left join tmp_value_industry t34
on(t1.tenantsid=t34.tenantsid and t1.template_idstr=t34.template_idstr and t1.information_enterprise_id=t34.information_enterprise_id and t34.information_items_value_level='4')
left join tmp_value_scale t4
on(t1.tenantsid=t4.tenantsid and t1.template_idstr=t4.template_idstr and t1.information_enterprise_id=t4.information_enterprise_id)
left join tmp_indicator_list t5
on(t1.tenantsid=t5.tenantsid and t1.template_idstr=t5.template_idstr and t1.information_enterprise_id=t5.information_enterprise_id)

Execution result:

| username: Billmay表妹 | Original post link

This might be due to some configuration file parameters being converted to system variables after upgrading to v6.4.0, causing changes in system behavior. It is recommended that you check whether the configuration files and system variables are correctly set after the upgrade. You can refer to the descriptions of configuration file parameters and system variables in the official TiDB documentation to ensure they are set correctly.

Additionally, you can try using TiDB’s logging feature to view detailed information when executing SQL, to better understand the issue. You can use TiDB’s slow query log and error log to view detailed information about SQL execution, as well as any possible errors or exceptions.

| username: TiDBer_uul2Yg3E | Original post link

Which specific configuration are you referring to? Is there a range? Because I haven’t manually changed the TiDB configuration; I’ve been using the official default settings. Moreover, I deployed v6.4.0 directly in another zone, and the effect is the same as upgrading to v6.4.0—there are still many empty fields.

| username: Billmay表妹 | Original post link

Which version did you upgrade from? Since 6.4.0 does not provide an iterative version, would you consider upgrading to 6.5.2 first?

| username: 有猫万事足 | Original post link

I have a guess, which may not be correct.

6.4 upgraded a new feature for join.
If 6.3 is normal but 6.4 is not, it is likely that this new feature introduction caused the issue.

Due to the enumeration process of the dynamic programming algorithm potentially consuming more time, the Join Reorder algorithm is currently controlled by the variable [tidb_opt_join_reorder_threshold](https://docs-archive.pingcap.com/zh/tidb/v6.4/system-variables#tidb_opt_join_reorder_threshold). When the number of nodes participating in Join Reorder exceeds this threshold, the greedy algorithm is chosen; otherwise, the dynamic programming algorithm is chosen.

Note that the default value of this tidb_opt_join_reorder_threshold mentioned in the documentation is 0.
This means that by default, the original greedy algorithm is still used.
I wonder if someone has adjusted this parameter, causing the join algorithm to change?

You can check the current value of this variable. If it is not 0, try setting it to 0.
Anyway, from the upgrade content, this area seems quite suspicious.

| username: redgame | Original post link

The person above seems to have guessed it somewhat correctly.

| username: zhanggame1 | Original post link

It’s unlikely that anyone would change this parameter, right?

| username: Anna | Original post link

It’s unlikely that anyone would change this parameter, isn’t this considered a flaw?

| username: 有猫万事足 | Original post link

I think it’s fine to consider it a defect. If there’s a way to reproduce it, you can directly report it on GitHub/forum.
The bug I reported last week has already been closed. The efficiency is still acceptable.

| username: Anna | Original post link

Fast and impressive