RocksDB CPU load remains consistently high

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: rocksdb CPU负载始终较高

| username: TiDBer_OzwiOwPc

The image shows the CPU monitoring graph of RocksDB.
I have a few questions that I don’t understand and would like to ask:

  1. If the RocksDB CPU usage is high, is adding threads through max-background-jobs the only way to deal with it?
  2. Is the upper limit of this graph always 100%, no matter how many threads there are?

Based on my graph, do I need to increase max-background-jobs?

| username: 有猫万事足 | Original post link

  • The RocksDB thread pool is used for compaction and flush tasks in RocksDB, and usually does not need to be configured.
    • If the machine has a small number of CPU cores, you can set rocksdb.max-background-jobs and raftdb.max-background-jobs to 4.
    • If you encounter a Write Stall, you can check the Write Stall Reason indicators in RocksDB-kv on Grafana monitoring to see which indicators are not zero.
      • If it is caused by pending compaction bytes, you can set rocksdb.max-sub-compactions to 2 or 3 (this configuration indicates the number of sub-threads allowed for a single compaction job, with a default value of 3 in TiKV 4.0 and 1 in version 3.0).
      • If the reason is related to memtable count, it is recommended to increase max-write-buffer-number for all column families (default is 5).
      • If the reason is related to level0 file limit, it is recommended to increase the following parameters to 64 or higher:

rocksdb.defaultcf.level0-slowdown-writes-trigger
rocksdb.writecf.level0-slowdown-writes-trigger
rocksdb.lockcf.level0-slowdown-writes-trigger
rocksdb.defaultcf.level0-stop-writes-trigger
rocksdb.writecf.level0-stop-writes-trigger
rocksdb.lockcf.level0-stop-writes-trigger
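
As a rough sketch of the checklist above (not an official tool), the mapping from Write Stall reason to suggested change could be written in Python like this. The reason keys and the function name suggest_tuning are made up for illustration; they are not the actual Grafana label names.

```python
from typing import List

# The six level0 trigger parameters listed above, built in the same order.
LEVEL0_PARAMS = [
    f"rocksdb.{cf}.level0-{kind}-writes-trigger"
    for kind in ("slowdown", "stop")
    for cf in ("defaultcf", "writecf", "lockcf")
]


def suggest_tuning(reason: str) -> List[str]:
    """Return the hint quoted in this thread for a Write Stall reason.

    The reason keys are hypothetical names chosen for this sketch.
    """
    if reason == "pending_compaction_bytes":
        return ["set rocksdb.max-sub-compactions to 2 or 3"]
    if reason == "memtable_count":
        return ["increase max-write-buffer-number for all column families"]
    if reason == "level0_file_limit":
        return [f"increase {param} to 64 or higher" for param in LEVEL0_PARAMS]
    return ["no hint is quoted in this thread for this reason"]


if __name__ == "__main__":
    for hint in suggest_tuning("level0_file_limit"):
        print(hint)
```

The point is only that each stall reason maps to a different parameter, so the reason should be identified before anything is changed.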

How many CPU cores does it have?
If it has 10 or more cores, adjusting max-background-jobs might make sense, because at that point the default formula yields its maximum of 9. With fewer than 10 cores, there is probably little need to adjust it.
Additionally, whether you have encountered a Write Stall is another consideration. There can be many reasons for this.
The RocksDB thread pool only handles flush (persisting memtables to disk) and compaction (merging and rewriting SST files, which may touch many files). Both are disk-heavy, so sustained high CPU in this pool suggests that your disk might be slow. If the RocksDB CPU usage is high, check whether disk I/O has headroom before adjusting max-background-jobs. Otherwise, adding more threads might have limited effect.

| username: TiDBer_OzwiOwPc | Original post link

The maximum CPU load is 100%, right? It won’t add up to several hundred percent across several threads, will it?

  • When the number of CPU cores is N, the default value is max(2, min(N - 1, 9)). I’ve always assumed it to be 9.

| username: TiDBer_OzwiOwPc | Original post link

So the disk is still the slow part? But utilIO is not reaching 100. Is it the network overhead of using FC SAN, in which case there is no fix?

| username: 有猫万事足 | Original post link

When n>=10, it is always 9.
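
To make the default quoted earlier in the thread, max(2, min(N - 1, 9)), concrete, here is a minimal Python sketch. The function name is hypothetical; only the formula comes from the thread.

```python
def default_max_background_jobs(cpu_cores: int) -> int:
    """Default from the formula quoted in the thread: max(2, min(N - 1, 9))."""
    return max(2, min(cpu_cores - 1, 9))


if __name__ == "__main__":
    # Print the default for a few core counts.
    for cores in (2, 4, 8, 10, 16, 32):
        print(cores, "cores ->", default_max_background_jobs(cores))
```

With 8 cores this gives 7, and from 10 cores upward it stays at 9, so the default is only 9 on machines with at least 10 cores.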

| username: 有猫万事足 | Original post link

utilIO is not even at 100

It seems like there is still some IO capacity left. Increasing the thread pool might be useful. But the question is, are there any specific issues appearing?

If there is no Write Stall, the pending compaction bytes graph is decreasing, and no slow batch inserts show up in the slow query monitoring,
then I would say the RocksDB CPU graph alone is not sufficient reason to adjust parameters.

Because there are no specific issues, any adjustments made might have negligible effects.
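
The reasoning above can be written down as a small check. This is only a sketch of the rule of thumb in this post: the function and parameter names are hypothetical, the inputs are things you read off the Grafana panels yourself, and the 100 cutoff for IO utilization is taken loosely from the discussion, not from any official guidance.

```python
def worth_tuning_thread_pool(
    write_stall: bool,
    pending_compaction_decreasing: bool,
    slow_batch_inserts: bool,
    io_util_percent: float,
) -> bool:
    """Return True only if there is a concrete symptom and spare disk IO."""
    # A concrete symptom: a stall, a compaction backlog that is not
    # shrinking, or slow batch inserts in the slow query log.
    has_specific_issue = (
        write_stall or not pending_compaction_decreasing or slow_batch_inserts
    )
    # More background threads can only help if the disk is not saturated.
    has_io_headroom = io_util_percent < 100.0
    return has_specific_issue and has_io_headroom


if __name__ == "__main__":
    # High RocksDB CPU alone, with no other symptom: not worth tuning.
    print(worth_tuning_thread_pool(False, True, False, 70.0))
```

A busy RocksDB CPU graph is deliberately not an input here, since on its own it does not indicate a problem.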

| username: TiDBer_OzwiOwPc | Original post link

Nothing specific, it’s just that the overall speed is not great. This thread pool’s CPU seems to be fully loaded all the time, and I want to bring it down. So there’s no need to adjust it? And if a write stall does occur, when should I adjust it?

| username: 有猫万事足 | Original post link

Overall optimization is a big topic. I suggest you watch this video. It ties many monitoring metrics together and gives you an approach and a direction for tuning.

Adjusting a parameter just because some graph looks maxed out does not make much sense to me.
As an exploratory or learning exercise, I’m fine with it.
But in a production environment, if the cause-and-effect relationship is uncertain, it can easily cause other problems.