Optimization Issues with Count

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: count优化问题

| username: emerson_cai

The above SQL becomes slow when adding COUNT(instance.id) AS instance_count. How can this be optimized?

| username: Miracle | Original post link

Post the execution plans of the two SQL queries.

| username: tidb狂热爱好者 | Original post link

Statistics are suitable for TiFlash.

| username: emerson_cai | Original post link

Sort_11 1.00 root cmdb.model_group.id, cmdb.model.id
└─Projection_13 1.00 root cmdb.model_group.id, cmdb.model_group.unique_identification, cmdb.model_group.name, cmdb.model.id, cmdb.model.name, cmdb.model.sort_num, cmdb.model.model_type, cmdb.model.unique_identification, cmdb.model.model_icon
└─HashAgg_14 1.00 root group by:cmdb.model.id, cmdb.model_group.id, funcs:firstrow(cmdb.model_group.id)->cmdb.model_group.id, funcs:firstrow(cmdb.model_group.name)->cmdb.model_group.name, funcs:firstrow(cmdb.model_group.unique_identification)->cmdb.model_group.unique_identification, funcs:firstrow(cmdb.model.id)->cmdb.model.id, funcs:firstrow(cmdb.model.name)->cmdb.model.name, funcs:firstrow(cmdb.model.model_icon)->cmdb.model.model_icon, funcs:firstrow(cmdb.model.unique_identification)->cmdb.model.unique_identification, funcs:firstrow(cmdb.model.sort_num)->cmdb.model.sort_num, funcs:firstrow(cmdb.model.model_type)->cmdb.model.model_type
└─IndexJoin_19 0.02 root left outer join, inner:IndexLookUp_18, outer key:cmdb.model_group.id, inner key:cmdb.model.model_group_id, equal cond:eq(cmdb.model_group.id, cmdb.model.model_group_id)
├─TableReader_37(Build) 0.01 root data:Selection_36
│ └─Selection_36 0.01 cop[tikv] eq(cmdb.model_group.model_group_type, 0)
│ └─TableFullScan_35 15.00 cop[tikv] table:model_group keep order:false, stats:pseudo
└─IndexLookUp_18(Probe) 0.02 root
├─IndexRangeScan_16(Build) 0.02 cop[tikv] table:model, index:idx_model_group_id(model_group_id) range: decided by [eq(cmdb.model.model_group_id, cmdb.model_group.id)], keep order:false
└─TableRowIDScan_17(Probe) 0.02 cop[tikv] table:model keep order:false
| username: tidb狂热爱好者 | Original post link

The statistics plan has expired, analyze table.

| username: Miracle | Original post link

Execute explain analyze on your SQL,
and provide the execution plans for both with and without the count.

| username: caiyfc | Original post link

Isn’t it true that the SQL without COUNT(instance.id) AS instance_count doesn’t join with the instance table at all?

| username: emerson_cai | Original post link

Adding count:

Sort_12 1.00 root cmdb.model_group.id, cmdb.model.id
└─Projection_14 1.00 root cmdb.model_group.id, cmdb.model_group.unique_identification, cmdb.model_group.name, cmdb.model.id, cmdb.model.name, cmdb.model.sort_num, cmdb.model.model_type, cmdb.model.unique_identification, cmdb.model.model_icon, Column#36
└─HashAgg_15 1.00 root group by:cmdb.model.id, cmdb.model_group.id, funcs:count(cmdb.instance.id)->Column#36, funcs:firstrow(cmdb.model_group.id)->cmdb.model_group.id, funcs:firstrow(cmdb.model_group.name)->cmdb.model_group.name, funcs:firstrow(cmdb.model_group.unique_identification)->cmdb.model_group.unique_identification, funcs:firstrow(cmdb.model.id)->cmdb.model.id, funcs:firstrow(cmdb.model.name)->cmdb.model.name, funcs:firstrow(cmdb.model.model_icon)->cmdb.model.model_icon, funcs:firstrow(cmdb.model.unique_identification)->cmdb.model.unique_identification, funcs:firstrow(cmdb.model.sort_num)->cmdb.model.sort_num, funcs:firstrow(cmdb.model.model_type)->cmdb.model.model_type
└─IndexHashJoin_20 75.46 root left outer join, inner:IndexReader_17, outer key:cmdb.model.id, inner key:cmdb.instance.model_id, equal cond:eq(cmdb.model.id, cmdb.instance.model_id)
├─IndexJoin_31(Build) 0.02 root left outer join, inner:IndexLookUp_30, outer key:cmdb.model_group.id, inner key:cmdb.model.model_group_id, equal cond:eq(cmdb.model_group.id, cmdb.model.model_group_id)
│ ├─TableReader_49(Build) 0.01 root data:Selection_48
│ │ └─Selection_48 0.01 cop[tikv] eq(cmdb.model_group.model_group_type, 0)
│ │ └─TableFullScan_47 15.00 cop[tikv] table:model_group keep order:false, stats:pseudo
│ └─IndexLookUp_30(Probe) 0.02 root
│ ├─IndexRangeScan_28(Build) 0.02 cop[tikv] table:model, index:idx_model_group_id(model_group_id) range: decided by [eq(cmdb.model.model_group_id, cmdb.model_group.id)], keep order:false
│ └─TableRowIDScan_29(Probe) 0.02 cop[tikv] table:model keep order:false
└─IndexReader_17(Probe) 75.46 root index:IndexRangeScan_16
└─IndexRangeScan_16 75.46 cop[tikv] table:instance, index:uk_model_id_name(model_id, name) range: decided by [eq(cmdb.instance.model_id, cmdb.model.id)], keep order:false

Without adding count:

Sort_11 1.00 root cmdb.model_group.id, cmdb.model.id
└─Projection_13 1.00 root cmdb.model_group.id, cmdb.model_group.unique_identification, cmdb.model_group.name, cmdb.model.id, cmdb.model.name, cmdb.model.sort_num, cmdb.model.model_type, cmdb.model.unique_identification, cmdb.model.model_icon
└─HashAgg_14 1.00 root group by:cmdb.model.id, cmdb.model_group.id, funcs:firstrow(cmdb.model_group.id)->cmdb.model_group.id, funcs:firstrow(cmdb.model_group.name)->cmdb.model_group.name, funcs:firstrow(cmdb.model_group.unique_identification)->cmdb.model_group.unique_identification, funcs:firstrow(cmdb.model.id)->cmdb.model.id, funcs:firstrow(cmdb.model.name)->cmdb.model.name, funcs:firstrow(cmdb.model.model_icon)->cmdb.model.model_icon, funcs:firstrow(cmdb.model.unique_identification)->cmdb.model.unique_identification, funcs:firstrow(cmdb.model.sort_num)->cmdb.model.sort_num, funcs:firstrow(cmdb.model.model_type)->cmdb.model.model_type
└─IndexJoin_19 0.02 root left outer join, inner:IndexLookUp_18, outer key:cmdb.model_group.id, inner key:cmdb.model.model_group_id, equal cond:eq(cmdb.model_group.id, cmdb.model.model_group_id)
├─TableReader_37(Build) 0.01 root data:Selection_36
│ └─Selection_36 0.01 cop[tikv] eq(cmdb.model_group.model_group_type, 0)
│ └─TableFullScan_35 15.00 cop[tikv] table:model_group keep order:false, stats:pseudo
└─IndexLookUp_18(Probe) 0.02 root
├─IndexRangeScan_16(Build) 0.02 cop[tikv] table:model, index:idx_model_group_id(model_group_id) range: decided by [eq(cmdb.model.model_group_id, cmdb.model_group.id)], keep order:false
└─TableRowIDScan_17(Probe) 0.02 cop[tikv] table:model keep order:false
| username: emerson_cai | Original post link

Added and not added have all been supplemented and sent.

| username: caiyfc | Original post link

I suggest using a different way to display the execution plan; this one is too hard to read.

| username: caiyfc | Original post link

Your execution plan did not include analyze, so the actual data quantity is not visible. Based on the estimated rows in the execution plan, it can be inferred that the statistics of the related tables might be outdated, causing the optimizer to not choose the optimal execution plan. According to the execution plan, the SQL without count indeed does not join with the instance table, resulting in one less join and therefore faster execution.

| username: tidb菜鸟一只 | Original post link

Didn’t you notice that the execution plan without the count doesn’t include the instance table at all? If you query one less table, it will definitely be faster…

| username: dba远航 | Original post link

After adding count, the queried table is completely different, it queried the instance.