Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: tidb_enable_paging=ON的情况下tikv-client积压的RPC消息还是较大的疑惑

[TiDB Usage Environment] Poc
[TiDB Version] 5.7.25-TiDB-v6.5.2
Modified tidb_max_paging_size=256, tidb_distsql_scan_concurrency=1, other parameters are default.
In this case, executing: select a.* from a; observed memory usage is larger than expected.
Expected memory usage:
To simulate the slow execution of the upper operator causing the distsql request to backlog RPC requests, I directly modified the source code of tidb-server, delaying the explain analyze part to make tikv-client messages more likely to accumulate.
tidb/executor/explain.go at 29116c0256c52b224da2b34d712c1063d171c0ad · pingcap/tidb · GitHub
Added time.Sleep(time.Second) delay processing in each loop when fetching the result set.
Allowing RPC messages returned by tikv to accumulate, the maximum memory usage (basically only tikv-client accumulated messages occupying memory) is: tidb_distsql_scan_concurrency * tidb_max_paging_size * sum of all field definitions of the table
I think the memory usage should not exceed the above expected value, but in reality, the memory usage is much larger than this (it feels like the number of copTask requests returned to tikv-client cache is far more than tidb_distsql_scan_concurrency).
Specific scenario:
Table structure definition:
mysql> desc lineitem_bak;
+-----------------+---------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+------+---------+-------+
| L_ORDERKEY | bigint(20) | NO | PRI | NULL | |
| L_PARTKEY | bigint(20) | NO | | NULL | |
| L_SUPPKEY | bigint(20) | NO | | NULL | |
| L_LINENUMBER | bigint(20) | NO | PRI | NULL | |
| L_QUANTITY | decimal(15,2) | NO | | NULL | |
| L_EXTENDEDPRICE | decimal(15,2) | NO | | NULL | |
| L_DISCOUNT | decimal(15,2) | NO | | NULL | |
| L_TAX | decimal(15,2) | NO | | NULL | |
| L_RETURNFLAG | char(1) | NO | | NULL | |
| L_LINESTATUS | char(1) | NO | | NULL | |
| L_SHIPDATE | date | NO | MUL | NULL | |
| L_COMMITDATE | date | NO | | NULL | |
| L_RECEIPTDATE | date | NO | | NULL | |
| L_SHIPINSTRUCT | char(25) | NO | | NULL | |
| L_SHIPMODE | char(10) | NO | | NULL | |
| L_COMMENT | varchar(44) | NO | | NULL | |
| col1 | decimal(15,2) | YES | | NULL | |
| col2 | decimal(15,2) | YES | | NULL | |
| col3 | decimal(15,2) | YES | | NULL | |
| col4 | decimal(15,2) | YES | | NULL | |
| col5 | decimal(15,2) | YES | | NULL | |
| col6 | decimal(15,2) | YES | | NULL | |
| col7 | decimal(15,2) | YES | | NULL | |
| col8 | decimal(15,2) | YES | | NULL | |
| col9 | decimal(15,2) | YES | | NULL | |
| col10 | decimal(15,2) | YES | | NULL | |
| col11 | decimal(15,2) | YES | | NULL | |
| col12 | decimal(15,2) | YES | | NULL | |
| col13 | decimal(15,2) | YES | | NULL | |
| col14 | decimal(15,2) | YES | | NULL | |
| col15 | decimal(15,2) | YES | | NULL | |
| col16 | decimal(15,2) | YES | | NULL | |
| col17 | decimal(15,2) | YES | | NULL | |
| col18 | decimal(15,2) | YES | | NULL | |
| col19 | decimal(15,2) | YES | | NULL | |
| col20 | decimal(15,2) | YES | | NULL | |
| col21 | decimal(15,2) | YES | | NULL | |
| col22 | decimal(15,2) | YES | | NULL | |
| col23 | decimal(15,2) | YES | | NULL | |
| col24 | decimal(15,2) | YES | | NULL | |
| col25 | decimal(15,2) | YES | | NULL | |
| col26 | decimal(15,2) | YES | | NULL | |
| col27 | decimal(15,2) | YES | | NULL | |
| col28 | decimal(15,2) | YES | | NULL | |
| col29 | decimal(15,2) | YES | | NULL | |
| col30 | decimal(15,2) | YES | | NULL | |
| col31 | decimal(15,2) | YES | | NULL | |
| col32 | decimal(15,2) | YES | | NULL | |
| col33 | decimal(15,2) | YES | | NULL | |
| col34 | decimal(15,2) | YES | | NULL | |
| col35 | decimal(15,2) | YES | | NULL | |
| col36 | decimal(15,2) | YES | | NULL | |
| col37 | decimal(15,2) | YES | | NULL | |
| col38 | decimal(15,2) | YES | | NULL | |
| col39 | decimal(15,2) | YES | | NULL | |
| col40 | decimal(15,2) | YES | | NULL | |
| col41 | decimal(15,2) | YES | | NULL | |
| col42 | decimal(15,2) | YES | | NULL | |
| col43 | decimal(15,2) | YES | | NULL | |
| col44 | decimal(15,2) | YES | | NULL | |
| col45 | decimal(15,2) | YES | | NULL | |
| col46 | decimal(15,2) | YES | | NULL | |
| col47 | decimal(15,2) | YES | | NULL | |
| col48 | decimal(15,2) | YES | | NULL | |
| col49 | decimal(15,2) | YES | | NULL | |
| col50 | decimal(15,2) | YES | | NULL | |
+-----------------+---------------+------+------+---------+-------+
66 rows in set (0.00 sec)
Table data volume and statement execution plan:
mysql> select count(*) from lineitem_bak;
+-----------+
| count(*) |
+-----------+
| 179998372 |
+-----------+
1 row in set (16.14 sec)
mysql> explain select * from lineitem_bak;
+-----------------------+--------------+-----------+--------------------+----------------------+
| id | estRows | task | access object | operator info |
+-----------------------+--------------+-----------+--------------------+----------------------+
| TableReader_5 | 179998372.00 | root | | data:TableFullScan_4 |
| └─TableFullScan_4 | 179998372.00 | cop[tikv] | table:lineitem_bak | keep order:false |
+-----------------------+--------------+-----------+--------------------+----------------------+
2 rows in set (0.00 sec)
--Setting parameters:
mysql> set tidb_distsql_scan_concurrency=1;
Query OK, 0 rows affected (0.00 sec)
mysql> set tidb_max_paging_size=256;
Query OK, 0 rows affected (0.00 sec)
--Initiate statement execution:
mysql> explain analyze select * from lineitem_bak;
Wait at least 20 seconds to allow tikv data to fully return to tidb-server for accumulation, observe the processlist situation:
mysql> select * from information_schema.processlist\G
*************************** 1. row ***************************
ID: 4050963675563950485
USER: root
HOST: 192.168.31.200:42902
DB: tpch
COMMAND: Query
TIME: 120
STATE: autocommit
INFO: explain analyze select * from lineitem_bak
DIGEST: 474e5e529400b26f8147ce06f488416d9e5f948c01deb9f0244de806fd5438bf
MEM: 231428128
DISK: 0
TxnStart: 06-11 10:03:22.584(442092487333380097)
*************************** 2. row ***************************
ID: 4050963675563950487
USER: root
HOST: 192.168.31.200:49006
DB: tpch
COMMAND: Query
TIME: 0
STATE: autocommit
INFO: select * from information_schema.processlist
DIGEST: 4b5e7cdd5d3ed84d6c1a6d56403a3d512554b534313caf296268abdec1c9ea99
MEM: 0
DISK: 0
TxnStart:
2 rows in set (0.00 sec)
Since the lineitem_bak table has a total of 66 fields, and the maximum memory occupied by each field does not exceed L_COMMENT varchar(44), the maximum memory occupied by each row of records will not exceed 66 * 44 = 2904 bytes. Adding other placeholders for memory operations, the average memory usage per row will not exceed 3000 bytes.
The expected overall memory usage is:
tidb_distsql_scan_concurrency * tidb_max_paging_size * sum of all field definitions of the table = 1 * 256 * 3000 = 768000 bytes, approximately (750KB).
But in reality, the memory usage is 231428128 bytes, approximately (230MB), far exceeding the estimated memory amount.
Heap information:
So why is the memory usage still relatively large when tidb_enable_paging=ON?
Are there other concurrency conditions besides tidb_distsql_scan_concurrency?