Despite tidb_enable_paging=ON, there is still a significant backlog of RPC messages in the tikv-client

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb_enable_paging=ON的情况下tikv-client积压的RPC消息还是较大的疑惑

| username: 人如其名

[TiDB Usage Environment] PoC
[TiDB Version] 5.7.25-TiDB-v6.5.2
Set tidb_max_paging_size=256 and tidb_distsql_scan_concurrency=1; all other parameters are left at their defaults.
Under these settings, executing select a.* from a; shows memory usage larger than expected.
Expected memory usage:

To simulate a slow upstream operator causing distsql requests to back up as RPC messages, I modified the tidb-server source directly, delaying the explain analyze path so that tikv-client messages accumulate more easily.
tidb/executor/explain.go at 29116c0256c52b224da2b34d712c1063d171c0ad · pingcap/tidb · GitHub
I added a time.Sleep(time.Second) to each iteration of the loop that fetches the result set.

With the RPC messages returned by TiKV allowed to accumulate, the maximum memory usage (essentially just the accumulated tikv-client messages) should be: tidb_distsql_scan_concurrency * tidb_max_paging_size * (sum of the sizes of the table's field definitions).
I expected memory usage not to exceed that value, but in practice it is far larger (it looks as though the number of copTask responses cached in tikv-client is far greater than tidb_distsql_scan_concurrency).
Specific scenario:
Table structure definition:

mysql> desc lineitem_bak;
+-----------------+---------------+------+------+---------+-------+
| Field           | Type          | Null | Key  | Default | Extra |
+-----------------+---------------+------+------+---------+-------+
| L_ORDERKEY      | bigint(20)    | NO   | PRI  | NULL    |       |
| L_PARTKEY       | bigint(20)    | NO   |      | NULL    |       |
| L_SUPPKEY       | bigint(20)    | NO   |      | NULL    |       |
| L_LINENUMBER    | bigint(20)    | NO   | PRI  | NULL    |       |
| L_QUANTITY      | decimal(15,2) | NO   |      | NULL    |       |
| L_EXTENDEDPRICE | decimal(15,2) | NO   |      | NULL    |       |
| L_DISCOUNT      | decimal(15,2) | NO   |      | NULL    |       |
| L_TAX           | decimal(15,2) | NO   |      | NULL    |       |
| L_RETURNFLAG    | char(1)       | NO   |      | NULL    |       |
| L_LINESTATUS    | char(1)       | NO   |      | NULL    |       |
| L_SHIPDATE      | date          | NO   | MUL  | NULL    |       |
| L_COMMITDATE    | date          | NO   |      | NULL    |       |
| L_RECEIPTDATE   | date          | NO   |      | NULL    |       |
| L_SHIPINSTRUCT  | char(25)      | NO   |      | NULL    |       |
| L_SHIPMODE      | char(10)      | NO   |      | NULL    |       |
| L_COMMENT       | varchar(44)   | NO   |      | NULL    |       |
| col1            | decimal(15,2) | YES  |      | NULL    |       |
| col2            | decimal(15,2) | YES  |      | NULL    |       |
| col3            | decimal(15,2) | YES  |      | NULL    |       |
| col4            | decimal(15,2) | YES  |      | NULL    |       |
| col5            | decimal(15,2) | YES  |      | NULL    |       |
| col6            | decimal(15,2) | YES  |      | NULL    |       |
| col7            | decimal(15,2) | YES  |      | NULL    |       |
| col8            | decimal(15,2) | YES  |      | NULL    |       |
| col9            | decimal(15,2) | YES  |      | NULL    |       |
| col10           | decimal(15,2) | YES  |      | NULL    |       |
| col11           | decimal(15,2) | YES  |      | NULL    |       |
| col12           | decimal(15,2) | YES  |      | NULL    |       |
| col13           | decimal(15,2) | YES  |      | NULL    |       |
| col14           | decimal(15,2) | YES  |      | NULL    |       |
| col15           | decimal(15,2) | YES  |      | NULL    |       |
| col16           | decimal(15,2) | YES  |      | NULL    |       |
| col17           | decimal(15,2) | YES  |      | NULL    |       |
| col18           | decimal(15,2) | YES  |      | NULL    |       |
| col19           | decimal(15,2) | YES  |      | NULL    |       |
| col20           | decimal(15,2) | YES  |      | NULL    |       |
| col21           | decimal(15,2) | YES  |      | NULL    |       |
| col22           | decimal(15,2) | YES  |      | NULL    |       |
| col23           | decimal(15,2) | YES  |      | NULL    |       |
| col24           | decimal(15,2) | YES  |      | NULL    |       |
| col25           | decimal(15,2) | YES  |      | NULL    |       |
| col26           | decimal(15,2) | YES  |      | NULL    |       |
| col27           | decimal(15,2) | YES  |      | NULL    |       |
| col28           | decimal(15,2) | YES  |      | NULL    |       |
| col29           | decimal(15,2) | YES  |      | NULL    |       |
| col30           | decimal(15,2) | YES  |      | NULL    |       |
| col31           | decimal(15,2) | YES  |      | NULL    |       |
| col32           | decimal(15,2) | YES  |      | NULL    |       |
| col33           | decimal(15,2) | YES  |      | NULL    |       |
| col34           | decimal(15,2) | YES  |      | NULL    |       |
| col35           | decimal(15,2) | YES  |      | NULL    |       |
| col36           | decimal(15,2) | YES  |      | NULL    |       |
| col37           | decimal(15,2) | YES  |      | NULL    |       |
| col38           | decimal(15,2) | YES  |      | NULL    |       |
| col39           | decimal(15,2) | YES  |      | NULL    |       |
| col40           | decimal(15,2) | YES  |      | NULL    |       |
| col41           | decimal(15,2) | YES  |      | NULL    |       |
| col42           | decimal(15,2) | YES  |      | NULL    |       |
| col43           | decimal(15,2) | YES  |      | NULL    |       |
| col44           | decimal(15,2) | YES  |      | NULL    |       |
| col45           | decimal(15,2) | YES  |      | NULL    |       |
| col46           | decimal(15,2) | YES  |      | NULL    |       |
| col47           | decimal(15,2) | YES  |      | NULL    |       |
| col48           | decimal(15,2) | YES  |      | NULL    |       |
| col49           | decimal(15,2) | YES  |      | NULL    |       |
| col50           | decimal(15,2) | YES  |      | NULL    |       |
+-----------------+---------------+------+------+---------+-------+
66 rows in set (0.00 sec)

Table data volume and statement execution plan:

mysql> select count(*) from lineitem_bak;
+-----------+
| count(*)  |
+-----------+
| 179998372 |
+-----------+
1 row in set (16.14 sec)

mysql> explain select * from lineitem_bak;
+-----------------------+--------------+-----------+--------------------+----------------------+
| id                    | estRows      | task      | access object      | operator info        |
+-----------------------+--------------+-----------+--------------------+----------------------+
| TableReader_5         | 179998372.00 | root      |                    | data:TableFullScan_4 |
| └─TableFullScan_4     | 179998372.00 | cop[tikv] | table:lineitem_bak | keep order:false     |
+-----------------------+--------------+-----------+--------------------+----------------------+
2 rows in set (0.00 sec)
--Setting parameters:
mysql> set tidb_distsql_scan_concurrency=1;
Query OK, 0 rows affected (0.00 sec)

mysql> set tidb_max_paging_size=256;
Query OK, 0 rows affected (0.00 sec)


--Initiate statement execution:
mysql> explain analyze select * from lineitem_bak;

Wait at least 20 seconds to let the data returned from TiKV fully accumulate in tidb-server, then observe the processlist:

mysql> select * from information_schema.processlist\G
*************************** 1. row ***************************
      ID: 4050963675563950485
    USER: root
    HOST: 192.168.31.200:42902
      DB: tpch
 COMMAND: Query
    TIME: 120
   STATE: autocommit
    INFO: explain analyze select * from lineitem_bak
  DIGEST: 474e5e529400b26f8147ce06f488416d9e5f948c01deb9f0244de806fd5438bf
     MEM: 231428128
    DISK: 0
TxnStart: 06-11 10:03:22.584(442092487333380097)
*************************** 2. row ***************************
      ID: 4050963675563950487
    USER: root
    HOST: 192.168.31.200:49006
      DB: tpch
 COMMAND: Query
    TIME: 0
   STATE: autocommit
    INFO: select * from information_schema.processlist
  DIGEST: 4b5e7cdd5d3ed84d6c1a6d56403a3d512554b534313caf296268abdec1c9ea99
     MEM: 0
    DISK: 0
TxnStart: 
2 rows in set (0.00 sec)

Since the lineitem_bak table has 66 fields in total, and no field occupies more memory than L_COMMENT varchar(44), each row occupies at most 66 * 44 = 2904 bytes. Allowing for other per-row memory overhead, the average memory usage per row should not exceed 3000 bytes.
The expected overall memory usage is therefore:
tidb_distsql_scan_concurrency * tidb_max_paging_size * (bytes per row) = 1 * 256 * 3000 = 768,000 bytes, roughly 750 KB.
In reality, however, memory usage is 231,428,128 bytes, roughly 230 MB, far exceeding the estimate.
Heap information:

So why is memory usage still this large when tidb_enable_paging=ON?
Is there other concurrency at play besides tidb_distsql_scan_concurrency?

| username: tiancaiamao | Original post link

Many factors contribute to memory amplification:

  1. Chunk encoding can amplify data, and different types amplify by different ratios; the decimal type in particular has a very high amplification ratio.
  2. After an RPC message is read, it is stored as a single block of memory containing multiple chunks. The release mechanism is such that the entire block of memory is only released after all chunks within it are consumed. (For example, if there are 100 chunks and 99 of them are consumed but one chunk is still referenced, the entire block of memory cannot be released.)
  3. The parameter tidb_distsql_scan_concurrency controls concurrency, but it also involves channels. Data read concurrently is not immediately consumed but written to a channel, from which consumers read and consume the data. Due to the presence of channels, actual memory usage is influenced not only by concurrency but also by channel size.
  4. Besides the concurrency at the data reading layer, some operators have internal concurrency. Chunks being “consumed” within these operators also contribute to memory amplification.

Regarding the amplification ratio:

Make coprocessor paging as the default RPC protocol · Issue #35633 · pingcap/tidb · GitHub contains the development records for coprocessor paging.
The document https://pingcap.feishu.cn/docs/doccnPKxuP3LiOBI5HaSPO4Idtc includes some data.

Comparison of encoded sizes, non-chunk vs chunk (in bytes, non-chunk first):

  • Decimal shows the most severe amplification: 2,603,116 vs 10,410,316 (about 4x)
  • For bigint not null, chunk is better than non-chunk, producing a smaller package: 2,342,962 vs 2,084,880
  • For varchar(150), chunk occupies slightly more than non-chunk: 5,205,162 vs 6,770,230
  • For varchar(1), chunk is nearly 3 times the size of non-chunk: 782,036 vs 2,347,106
| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.