TiFlash Performance Issues

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiflash性能问题

| username: TiDBer_8aM3dZG6

[TiDB Usage Environment] Production Environment
[TiDB Version] V6.1.6
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Problem Phenomenon and Impact]
A table with 130 million records takes 13 seconds to perform a complex select count query. The table has already been synchronized with TiFlash. This speed is somewhat slow. How can it be further optimized? Initially, the task was MPP, but after the number of records increased, it became cop.
explain analyze select count(1) from vm_company_derived_variables_df_1m
where new_status = 1
and ((c_city = ‘Xiamen’ or c_city = ‘Ningde’) or (c_province = ‘Shanghai’ and c_district = ‘Minhang’))
and c_industry_new in (‘Catering’, ‘Retail’, ‘Service’, ‘Culture and Sports’, ‘Trade’, ‘Food and Beverage’, ‘National Defense and Military’, ‘Tourism’, ‘Home Appliances’, ‘Media’,
‘Transportation’, ‘Power Equipment’, ‘Steel’, ‘Nonferrous Metals’, ‘Finance’, ‘Electronics’, ‘Chemicals’, ‘Energy’, ‘Machinery and Equipment’)
and (c_xw=1 or c_kjzxw=1 or c_new_fourth_board=1 or c_zgcun_ht=1 or c_new_three_board =1 or c_national_zjt = 1
or c_yingying_enterprise = 1 or c_torch_plan_project =1 or c_gazelle_enterprise=1 or c_private_technology=1)
and ((c_income_10v1 >= 10000000 and c_income_10v1 <= 60000000) or (c_income_25v1 >= 10000000 and c_income_25v1 <= 60000000)
or (c_income_50v1 >= 10000000 and c_income_50v1 <= 60000000) or (c_income_75v1 >= 10000000 and c_income_75v1 <= 60000000)
or (c_income_90v1 >= 10000000 and c_income_90v1 <= 60000000))

[Resource Configuration]
One physical machine, virtual machine, PD: 3 nodes, TiDB: 3 nodes, each with 32GB memory, TiKV: 3 nodes, each with 32GB memory. TiFlash: 1 node, SSD hard drive.
[Attachment: Screenshot/Log/Monitoring]
explain.txt (3.4 KB)

| username: 人如其名 | Original post link

threads:900 is too many!

Set tidb_max_tiflash_threads=6;
Then try executing this statement again?

| username: TiDBer_8aM3dZG6 | Original post link

Changing it to 6 didn’t work. Now with 200 million data, threads have become 1366. What is the reason?

| username: yilong | Original post link

Use 使用 PLAN REPLAYER 保存和恢复集群现场信息 | PingCAP 文档中心 to collect the information.

| username: zhanggame1 | Original post link

Using TiFlash might not be the optimal solution. TiFlash is a columnar storage, which is fast for filtering and judging a few columns in a table. With so many conditions, you might need to access too many columns, so it might be better to use KV.

| username: 有猫万事足 | Original post link

“TiFlash supports query execution in MPP mode, which introduces cross-node data exchange (data shuffle process) in computation.”

You only have one TiFlash. As mentioned above, my understanding is that the difference between MPP and cop should not be significant with a single TiFlash. (No matter how you cross nodes, there is only one, so at most it is data exchange with itself.) Moreover, some operators are not supported by MPP. If you think the query is slow due to MPP, you can force MPP mode to see if it speeds up as follows.

If you want TiDB to ignore the optimizer's cost estimation and force the use of MPP, you can set it with the following statements:

set @@session.tidb_allow_mpp=1;
set @@session.tidb_enforce_mpp=1;