Parameters Affecting Coprocessor and Their Significance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 影响coprocessor的相关参数及意义

| username: db_user

[TiDB Usage Environment] Production Environment / Testing / PoC: Production
[TiDB Version] v4.0.13
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration] Enter TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]

| Type | Name | Value |
| --- | --- | --- |
| tikv | readpool.coprocessor.high-concurrency | 25 |
| tikv | readpool.coprocessor.low-concurrency | 25 |
| tikv | readpool.coprocessor.max-tasks-per-worker-high | 2000 |
| tikv | readpool.coprocessor.max-tasks-per-worker-low | 2000 |
| tikv | readpool.coprocessor.max-tasks-per-worker-normal | 2000 |
| tikv | readpool.coprocessor.normal-concurrency | 25 |
| tikv | readpool.coprocessor.stack-size | 10MiB |
| tikv | readpool.coprocessor.use-unified-pool | true |
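
A listing like the one above can be pulled from any TiDB node with `SHOW CONFIG` (available since v4.0); a minimal sketch, assuming you only want the coprocessor read-pool entries:

```sql
-- Sketch: list the TiKV coprocessor read-pool settings from any TiDB node.
-- The LIKE pattern is an assumption; widen it to see other read pools.
SHOW CONFIG WHERE TYPE = 'tikv' AND NAME LIKE 'readpool.coprocessor%';
```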

Currently, a scheduled task scans 300,000 keys with dozens of concurrent operations, causing the RPC time of another query to increase.

I would like to ask:

  1. Does readpool.coprocessor.low-concurrency mean that only 25 coprocessor threads can handle non-point queries simultaneously?
  2. How can I check the current number of coprocessor threads for non-point queries?
  3. When encountering slow coprocessor speeds, is it more useful to tune readpool.coprocessor.low-concurrency or to adjust parameters like tidb_index_lookup_concurrency, tidb_index_lookup_join_concurrency, tidb_index_lookup_size, tidb_index_serial_scan_concurrency?

Looking forward to the experts’ answers.

| username: 人如其名 | Original post link

My personal understanding:

  1. At any given moment, only 25 tasks can be running. From testing, it appears that tasks are not executed strictly one after another but are allocated time slices: even if a task is not finished, it yields the CPU so the next task can run. What is unclear is why max-tasks-per-worker-low builds up a task backlog at the level of a single thread rather than across the whole low-concurrency thread pool. If the backlog is per thread, some threads may hold “light” tasks (such as plain scans) that finish quickly while others hold heavier tasks, which could lead to uneven task execution across threads.
  2. When not using the unified read pool (for example, if you have set readpool.coprocessor.low-concurrency separately), you can refer to this: check the ratio against the configured number of threads.
  3. I believe that adjusting the concurrency at the TiDB level only reduces the number of copTasks sent to TiKV at the same time, so the overall job slows down; the focus is still on TiKV itself. Concurrent heavy tasks (such as scans over many keys or aggregations) can occupy too many time slices and make the overall response slow. Adjusting readpool.coprocessor.low-concurrency is more effective when CPU is still available. Fundamentally, though, I suspect slow data scanning is the root cause. I look forward to official optimizations in this area, such as shared scanning or pushing tasks further down into the RocksDB storage layer and below, but I expect this is very challenging because it requires designing the storage engine yourself. (A sketch of the TiDB-side knobs follows below.)
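
To make item 3 concrete, here is a minimal session-level sketch using the TiDB variables mentioned in the question; the values are illustrative assumptions, not recommendations, and lowering them trades scan speed for less pressure on the coprocessor pool (tidb_distsql_scan_concurrency is not mentioned above, but it is often the most relevant knob for plain scans):

```sql
-- Sketch: throttle the scheduled scan's session so it issues fewer concurrent
-- copTasks and competes less with latency-sensitive queries.
-- The values below are illustrative assumptions; tune them per workload.
SET SESSION tidb_distsql_scan_concurrency      = 5;
SET SESSION tidb_index_lookup_concurrency      = 2;
SET SESSION tidb_index_lookup_join_concurrency = 2;
SET SESSION tidb_index_lookup_size             = 10000;
SET SESSION tidb_index_serial_scan_concurrency = 1;
```

Applying these only in the session that runs the scheduled scan leaves other sessions unaffected.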

| username: 人如其名 | Original post link

I tested it myself, and it seems that without the unified read pool (readpool.unified), coprocessor requests are placed at the normal level, not the low level. However, I am using version 7.0.
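
Before reasoning about priority levels, it may help to confirm which pool coprocessor requests actually go through; a minimal sketch, assuming the standard read-pool configuration names:

```sql
-- Sketch: check whether coprocessor requests use the unified read pool
-- or the dedicated high/normal/low coprocessor pools.
SHOW CONFIG WHERE TYPE = 'tikv'
  AND (NAME LIKE 'readpool.unified%' OR NAME = 'readpool.coprocessor.use-unified-pool');
```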


| username: db_user | Original post link

Indeed, I feel the main impact is from cop tasks: although my CPU hasn’t peaked, things slow down at times under high concurrency, so it does look like cop concurrency. However, looking at the execution plan, the wait time isn’t that long; it might also be that execution plans in version 4 aren’t as detailed. I’ll adjust the concurrency on Monday and see.

| username: 人如其名 | Original post link

Yes, it lacks more detailed CPU time and wait time, etc.

| username: db_user | Original post link

After changing readpool.coprocessor.normal-concurrency from 6 to 7, there were indeed some changes. The overall latency decreased slightly, though not as much as shown here; there seem to be some issues with the Grafana display.

| username: db_user | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.