Questions about actrows in the TopN operator

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TopN 算子中的actrows 疑问

| username: Raymond

The version is 6.5.3
Data source is tpch

SQL statement:
 explain analyze select * from  orders where O_TOTALPRICE >=370000 order by O_ORDERDATE  asc limit 500,10;

Why is the actrows of the TopN operator pushed down to tikv in the execution plan 6099? I don’t quite understand this point, please explain it.

mysql> explain analyze select * from  orders where O_TOTALPRICE >=370000 order by O_ORDERDATE  asc limit 500,10;
+----------------------------------+----------+---------+-----------+------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------+----------+------+
| id                               | estRows  | actRows | task      | access object                                        | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | operator info                                  | memory   | disk |
+----------------------------------+----------+---------+-----------+------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------+----------+------+
| TopN_9                           | 10.00    | 10      | root      |                                                      | time:99.5ms, loops:2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | tpch2.orders.o_orderdate, offset:500, count:10 | 307.6 KB | N/A  |
| └─IndexLookUp_22                 | 510.00   | 6099    | root      |                                                      | time:97.9ms, loops:8, index_task: {total_time: 38ms, fetch_handle: 38ms, build: 6.63µs, wait: 6.81µs}, table_task: {total_time: 151.3ms, num: 5, concurrency: 5}, next: {wait_index: 3.64ms, wait_table_lookup_build: 231.9µs, wait_table_lookup_resp: 92.3ms}                                                                                                                                                                                                                                                                             |                                                | 1.31 MB  | N/A  |
|   ├─IndexRangeScan_19(Build)     | 14966.40 | 11447   | cop[tikv] | table:orders, index:index_O_TOTALPRICE(O_TOTALPRICE) | time:37.6ms, loops:15, cop_task: {num: 7, max: 12.9ms, min: 983.4µs, avg: 5.69ms, p95: 12.9ms, max_proc_keys: 4695, p95_proc_keys: 4695, tot_proc: 2ms, tot_wait: 4ms, rpc_num: 7, rpc_time: 39.6ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:2ms, min:0s, avg: 714.3µs, p80:1ms, p95:2ms, iters:35, tasks:7}, scan_detail: {total_process_keys: 11447, total_process_keys_size: 526562, total_keys: 11454, get_snapshot_time: 2.47ms, rocksdb: {key_skipped_count: 11447, block: {cache_hit_count: 56}}}       | range:[370000.00,+inf], keep order:false       | N/A      | N/A  |
|   └─TopN_21(Probe)               | 510.00   | 6099    | cop[tikv] |                                                      | time:146.1ms, loops:13, cop_task: {num: 15, max: 57ms, min: 5.8ms, avg: 22.7ms, p95: 57ms, max_proc_keys: 1967, p95_proc_keys: 1967, tot_proc: 150ms, tot_wait: 19ms, rpc_num: 15, rpc_time: 340.6ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:36ms, min:2ms, avg: 10.5ms, p80:25ms, p95:36ms, iters:19, tasks:15}, scan_detail: {total_process_keys: 11447, total_process_keys_size: 1732802, total_keys: 11463, get_snapshot_time: 7.62ms, rocksdb: {key_skipped_count: 32, block: {cache_hit_count: 42082}}} | tpch2.orders.o_orderdate, offset:0, count:510  | N/A      | N/A  |
|     └─TableRowIDScan_20          | 14966.40 | 11447   | cop[tikv] | table:orders                                         | tikv_task:{proc max:34ms, min:2ms, avg: 9.8ms, p80:24ms, p95:34ms, iters:19, tasks:15}                                                                                                                                                                                                                                                                                                                                                                                                                                                     | keep order:false                               | N/A      | N/A  |
+----------------------------------+----------+---------+-----------+------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------+----------+------+
5 rows in set (0.11 sec)

Table structure

mysql> show create table orders;
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| orders | CREATE TABLE `orders` (
  `O_ORDERKEY` bigint(20) NOT NULL,
  `O_CUSTKEY` bigint(20) NOT NULL,
  `O_ORDERSTATUS` char(1) NOT NULL,
  `O_TOTALPRICE` decimal(15,2) NOT NULL,
  `O_ORDERDATE` date NOT NULL,
  `O_ORDERPRIORITY` char(15) NOT NULL,
  `O_CLERK` char(15) NOT NULL,
  `O_SHIPPRIORITY` bigint(20) NOT NULL,
  `O_COMMENT` varchar(79) NOT NULL,
  PRIMARY KEY (`O_ORDERKEY`) /*T![clustered_index] CLUSTERED */,
  KEY `index_O_TOTALPRICE` (`O_TOTALPRICE`),
  KEY `index_a` (`O_ORDERSTATUS`,`O_TOTALPRICE`),
  KEY `indexO_CUSTKEY` (`O_CUSTKEY`),
  KEY `index_date` (`O_ORDERDATE`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

| username: xfworld | Original post link

If the estimated value is significantly different from the actual value, consider running ANALYZE TABLE on the affected table.

{total_process_keys: 11447, 
total_process_keys_size: 1732802,
 total_keys: 11463, get_snapshot_time: 7.62ms,
 rocksdb: {key_skipped_count: 32, block: {cache_hit_count: 42082}}} | tpch2.orders.o_orderdate, offset:0, count:510  | N/A      | N/A  |
total_process_keys: 11447, total_process_keys_size: 526562, total_keys: 11454

Has this table undergone a lot of deletions? Check the health of this table.

| username: h5n1 | Original post link

When TopN is pushed down to TiKV, each cop task performs TopN on the qualifying data when returning to the table. There are a total of 15 tasks, and according to the limit condition, 510 rows need to be scanned. However, each cop task may not necessarily reach 510 rows during scanning. For example, a certain region may only have 490 rows of qualifying data. These data are then returned to TiDB for re-sorting, and finally, 10 rows are returned.

| username: Jellybean | Original post link

When executing TopN, according to the actual execution plan, you can see that the Coprocessors’ parallel operator distsql_concurrency is 15, and limit 500,10 indicates that each operator will scan 510 rows. The estimate is based on a single scan, but in actual execution, you need to consider multi-concurrent execution, i.e., 15×510=7650 rows. During the actual operation, the data range scanned by some operators may vary, increasing (due to historically deleted data) or decreasing (if the operator’s data scan range is less than 510 rows).

As long as the actual number of scanned rows does not differ significantly, the execution plan is considered correct.

| username: Raymond | Original post link

Some operators may scan different data ranges, including more data (historically deleted data). Will the historical version data scanned be counted in the actRows of the execution plan?

| username: Raymond | Original post link

This reply is incorrect, please ignore it.

| username: Raymond | Original post link

The historical version data scanned will not be counted in the actRows of the execution plan.

| username: TIDB-Learner | Original post link

Happy New Year

| username: dba远航 | Original post link

Evaluation and reality are different.

| username: 春风十里 | Original post link

Why is it 15 here? How is this 15 calculated?


In the execution plan, one loops 15 and another loops 13, why is that?

| username: Jellybean | Original post link

The execution plan includes this cop_task num identifier.

| username: 春风十里 | Original post link

This table’s regions that meet the conditions are distributed across 15 nodes, so the concurrency is 15?

| username: h5n1 | Original post link

The number of cop tasks for each operator, with one cop task scanning one region, and the concurrency is related to the concurrency setting variables, default is 15.

| username: Jellybean | Original post link

The “loops” in the execution plan indicates the number of times the current operator is called by the parent operator. Therefore:

  • “loops 15” in the execution plan means that the operator IndexRangeScan_19(Build) was called 15 times by the parent operator IndexLookUp_22.
  • “loops 13” in the execution plan means that the operator TopN_21(Probe) was called 13 times by the parent operator IndexLookUp_22.

Scanning data regions is completed through Cop Task tasks, with each task scanning one region. The “num” in the cop task indicates the number of tasks. “loops:13, cop_task: {num: 15,” means scanning 15 regions. The distsql_concurrency mentioned earlier is not accurate for estimating computation; it refers to the concurrency within the Coprocessors scanning regions and is unrelated to this.

Additionally, when reviewing the execution plan, you can also focus on the execution time of the operators. The “time” in the execution info indicates the total execution time from entering to leaving the operator. If the operator is called multiple times by the parent operator (loops), this time is the cumulative total time. This can help identify slow points in the process and is a key metric in performance analysis.

| username: FutureDB | Original post link

So if there are more than 15 regions scanned in the table above, what is the actual number of cop_tasks if there are fewer than 15 regions?

| username: h5n1 | Original post link

It just so happens that there are 15 regions. The number of cop tasks corresponds to the number of regions that multiple people need to scan. You can take a look at the plan for a full table scan and the plan for querying just one row of data.