Can TiDB's temporary tables be used in this scenario?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB的临时表可以在该场景下使用吗?

| username: clearcodecn

To improve efficiency, please provide the following information. Clear problem descriptions can be resolved faster:
【TiDB Usage Environment】Production
【TiDB Version】5.7.25-TiDB-v5.4.0
【Encountered Problem】Too many sub-conditions in ‘in’, poor performance of inner join on large tables
【Reproduction Path】
Two tables, a and b, each with data volumes in the tens of millions. Both query statements include tenant indexes, as shown below:

select count(distinct a.customerId) as count from a inner join b on a.customerId = b.id and a.company_id = b.company_id where a.company_id = 'uuid-1' and b.uids in ('1','2'......up to 1000''); 

【Problem Phenomenon and Impact】

Cluster complexity around 60%, the query times out without returning, and CPU spikes to over 90%.

A few days ago, I saw a mention of temporary tables in the documentation, so I conducted an experiment:

CREATE TEMPORARY TABLE users (
    id varchar(255) not null primary key
);

insert into users (id) values ('1','2' ....)

The modified query SQL is:

select count(distinct a.customerId) as count from a inner join b on a.customerId = b.id and a.company_id = b.company_id where a.company_id = 'uuid-1' and b.uids in (select id from users);

Found that the result returned in 1 second.

Since I’m not familiar with using temporary tables, I would like to ask if it is appropriate to use temporary tables in this scenario? If using the temporary table solution, will creating many temporary tables during server operation put too much pressure on the server?

| username: ddhe9527 | Original post link

Why does it feel like the speed isn’t an issue with this temporary table? Please share the execution plans for both SQL queries.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.