Adding One More 'If' Statement Can Affect Performance by Over 100 Seconds

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 多一行if就会影响100多秒性能

| username: Jjjjayson_zeng

【TiDB Usage Environment】Production Environment
【TiDB Version】
【Reproduction Path】What operations were performed when the problem occurred
【Encountered Problem: Problem Phenomenon and Impact】
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】


workflow.sql (19.7 KB)
Remove_if_execution_plan.xlsx (18.5 KB)
For_remove_execution_plan.xlsx (18.4 KB)


On the surface, adding an if statement will cause it to use the disk.

| username: tidb狂热爱好者 | Original post link

The calculation should be moved out of SQL. Your SQL is calculating each row and putting all the load on TiDB.

| username: Jjjjayson_zeng | Original post link

How to modify this SQL?

| username: xfworld | Original post link

Without an intermediate layer, can it help you perform these calculations, or must the scenario be implemented using SQL?

| username: Jjjjayson_zeng | Original post link

It must be done with SQL, this should be fine.

| username: Mingdr | Original post link

Comparing the two execution plans, it seems that the slowdown is caused by sorting and flushing to disk.

| username: Jjjjayson_zeng | Original post link

Why does it go through the sorting algorithm? I don’t understand.

| username: 哈喽沃德 | Original post link

Calculate the if condition in advance.

| username: Jjjjayson_zeng | Original post link

How to advance?

| username: Jjjjayson_zeng | Original post link

Can you give a simple example?

| username: tidb狂热爱好者 | Original post link

Create a virtual column to calculate the total duration, including holidays, and change it to a virtual column to calculate it in advance.

| username: Jjjjayson_zeng | Original post link

Here’s a simple example, I’m a bit confused, master.

| username: 有猫万事足 | Original post link

After removing

└─Sort_132	390.23	410417	root		time:36.3s, loops:402	heytea.orgstdstruct.unitcode, heytea.rbsj.gh, heytea.rbsj.sjrq	409.6 MB	0 Bytes

Without removing

└─Sort_132	390.23	410417	root		time:2m29s, loops:402	heytea.orgstdstruct.unitcode, heytea.rbsj.gh, heytea.rbsj.sjrq	410.2 MB	466.9 MB

When not removed, the sort spilled to disk. The disk size is 466.9 MB.

I checked, once the sort operator spills to disk, it uses external sorting.
From the performance report submitted at that time, it indeed causes a significant performance regression, roughly 4-12 times.

Additionally, I still think your SQL should run on TiFlash.

| username: Jjjjayson_zeng | Original post link

I added it, but it won’t run. What should I do?

| username: 有猫万事足 | Original post link

If it’s convenient, you can save a copy of the on-site execution information, and I can take a look at it on my own cluster. This includes some cluster parameters, table structures, and statistics. You can also open the zip file yourself and delete any content you feel is inappropriate.

If it’s not convenient, then never mind.
The key points to check are whether MPP is enabled (without MPP, hash join cannot be pushed down to TiFlash for execution) and whether there are any tables in the query that do not have TiFlash replicas.

| username: Jjjjayson_zeng | Original post link

It’s not convenient for me to download it here.

| username: 有猫万事足 | Original post link

Okay. Take a close look at the MPP documentation.

It shouldn’t actually be very complicated. Normally, if all the tables in the SQL query have TiFlash replicas and you set it at the session level to force MPP, you should be able to see that part of it is using MPP.

Your execution plan basically doesn’t have any TiFlash parts in it. I feel that one of these two conditions is not being met. This shouldn’t be hard to find.

| username: Jjjjayson_zeng | Original post link

The fact that TiFlash wasn’t used proves that the system determined TiFlash wasn’t necessary. That’s how I understand it. I feel that this issue might not necessarily be with TiFlash.

| username: 有猫万事足 | Original post link

Your SQL aggregates several tables, each with around 20 million rows, into 14,000 rows. This is a suitable scenario for TiFlash + MPP.

I’m not sure whether the optimizer didn’t choose this or if there’s some setting preventing it from doing so.

| username: Qiuchi | Original post link

How about trying to enforce MPP with tidb_enforce_mpp?