Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: tilflash CPU使用率高- 慢SQL-sum() 90个列,400w数据
To improve efficiency, please provide the following information. Clear problem descriptions can be resolved faster:
【TiDB Usage Environment】Production Environment or Test Environment or POC
Production Environment
【TiDB Version】
v4.0.8
【Encountered Problem】
High CPU usage on the machine where TiFlash is located, able to pinpoint specific slow queries
select report_time, sum(1), sum(2), sum(3)…sum(90)
from t1
where report_time >= ‘2022-10-16 00:00:00’ and report_time < ‘2022-10-17 10:00:00’
group by report_time
The t1 table is a large wide table with about 90 fields. It needs to group by the report_time field and aggregate the values of the other 90 fields.
The data volume for one day is about 4 million, and the report_time range is just one day. The data volume is not very large, but there are 90 fields to aggregate.
Some simple tests have been done, reducing the number of sum fields can reduce the SQL execution time, but the business side needs all fields to be aggregated.
This has stalled our optimization efforts.
【Reproduction Path】What operations were performed to encounter the problem
【Problem Phenomenon and Impact】
【Attachments】 Related logs and monitoring (https://metricstool.pingcap.com/)
If the question is about performance optimization or fault troubleshooting, please download the script and run it. Please select all and copy-paste the terminal output results and upload them.