TiDB Unified Read Pool CPU Utilization Does Not Meet Expectations

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB Unified Read Pool Cpu利用率不符合预期

| username: residentevil

[TiDB Usage Environment] Production Environment
[TiDB Version] V7.1.2
[Reproduction Path] Execute analyze table XX on TiDB
[Encountered Problem: Phenomenon and Impact] The table holds around 20 million rows. When running analyze table on TiDB, the Unified Read Pool CPU utilization can exceed 300%, which is far higher than I expected. Why does this operation consume so many resources?
[Resource Configuration]
TiKV single-server specification: 96 cores, 512 GB memory, 4 × 8 TB disks; 4 TiKV instances deployed in total
TiKV custom configuration:
readpool.storage.use-unified-pool: true
readpool.unified.max-thread-count: 30
readpool.unified.min-thread-count: 5
[Attachments: Screenshots/Logs/Monitoring]

| username: Jellybean

ANALYZE reads the table data to compute statistics. On a relatively large table, this means reading a large amount of data from the storage layer, which can produce the load you are seeing.

So avoid running ANALYZE during peak business hours, and schedule it late at night or during off-peak periods instead.

| username: residentevil

What I really want to know is whether there are ways to tune the Unified Read Pool. I have adjusted several parameters offline many times, but the results are still quite poor.

| username: Jellybean

In theory, the results shouldn't be poor. For tuning, see the thread pool tuning section of the official documentation. Normally there shouldn't be any issues.

When you run ANALYZE, are the cluster's latency and QPS significantly affected?

| username: 有猫万事足

The Unified Read Pool addresses the QPS drop that occurs when long and short queries are mixed.

ANALYZE can be considered a long query. If QPS still drops sharply with the Unified Read Pool enabled, as in the graph in the article above, that is not the expected behavior.

Also, 300% means only 3 cores are in use. The concurrency of ANALYZE is controlled by other parameters.

Their values are generally between 1 and 4. If none of them has been specifically adjusted, 300% seems like a fairly normal value to me.
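
To put the 300% figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes the percentage is per TiKV instance and that 100% equals one fully busy core. The helper names are made up for illustration, and the inputs are the values quoted in the first post (max-thread-count of 30, a 96-core server).

```python
# Rough arithmetic only; helper names are hypothetical, not TiKV APIs.

def cores_used(cpu_percent: float) -> float:
    """Convert a thread-pool CPU percentage into busy cores (100% = 1 core)."""
    return cpu_percent / 100.0


def share_of(cores: float, capacity: float) -> float:
    """Fraction of a given capacity (threads or host cores) that is busy."""
    return cores / capacity


busy = cores_used(300)             # 300% reported by the monitoring panel
pool_share = share_of(busy, 30)    # readpool.unified.max-thread-count: 30
host_share = share_of(busy, 96)    # 96-core server from the first post

print(f"busy cores: {busy:.1f}")
print(f"share of unified pool threads: {pool_share:.1%}")
print(f"share of host cores: {host_share:.1%}")
```

Under these assumptions, ANALYZE occupies about a tenth of the pool's maximum threads and a few percent of the machine.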

| username: dba远航

Try reducing the value of readpool.unified.max-thread-count appropriately.

| username: 小龙虾爱大龙虾

This is normal. ANALYZE itself runs concurrently, so check the parameters that control its concurrency.
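
As a starting point for that check, the sketch below only builds the SQL statements one might run from a MySQL client connected to TiDB; it does not connect to anything. The three system variables are the ones the TiDB statistics documentation lists for ANALYZE concurrency. Whether they are the parameters meant in this thread is an assumption, and the helper function is hypothetical.

```python
# Builds SHOW VARIABLES statements; run the printed SQL in a client yourself.
# The variable list is an assumption about which settings are relevant here.
ANALYZE_CONCURRENCY_VARS = [
    "tidb_build_stats_concurrency",
    "tidb_distsql_scan_concurrency",
    "tidb_index_serial_scan_concurrency",
]


def show_variable_statements(names):
    """Return one SHOW VARIABLES statement per system variable name."""
    return [f"SHOW VARIABLES LIKE '{name}';" for name in names]


for stmt in show_variable_statements(ANALYZE_CONCURRENCY_VARS):
    print(stmt)
```

Comparing the returned values with the observed core usage may show whether the 300% simply reflects the configured concurrency.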

| username: residentevil

Let me read this article first. Besides ANALYZE, SQL statements that retrieve a large number of rows also put a very high load on the Unified Read Pool. The design in this area does feel somewhat hard to understand. :sweat_smile: