High Memory Usage in TiKV

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv内存占用过高 (TiKV memory usage is too high)

| username: TiDBer_RQobNXGv

The memory usage of the TiKV node has been high recently and hasn't come back down. Normally it sits around 63%, but now it is consistently above 70%. What could be the reason for this?

| username: xfworld | Original post link

The default assumption is one service per node. If you want a mixed (co-located) deployment, you have to configure resource limits for each service on that node, which is quite complex.

It is generally not recommended, and maintenance will also be more difficult.

Refer to this:

| username: 啦啦啦啦啦 | Original post link

Is it the default configuration? For a mixed deployment, the memory-related parameters need to be adjusted; otherwise it is very easy to hit OOM (out of memory).

| username: TiDBer_pkQ5q1l0 | Original post link

For mixed deployment, set storage.block-cache.capacity to a smaller value, as it defaults to 45% of the machine’s memory.
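As a sketch of the suggestion above: the shared block cache capacity can be lowered in TiKV's own TOML config file (the `8GiB` value here is illustrative, not from the thread):

```toml
# TiKV config file sketch: cap the shared RocksDB block cache so that
# co-located services on the same machine have memory left over.
# Default is roughly 45% of total system memory.
[storage.block-cache]
capacity = "8GiB"  # illustrative value for a 64GB mixed-deployment host
```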

| username: tidb菜鸟一只 | Original post link

SHOW CONFIG WHERE NAME = 'storage.block-cache.capacity';
Check whether it's too large. For example, for a mixed deployment of 1 PD, 1 TiDB, and 2 TiKV on a machine with 64GB of memory, it's recommended to allocate around 8GB to each TiKV node.
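Besides checking the value, TiDB (v4.0 and later) can also adjust it online via SQL. A sketch, with an illustrative `8GiB` value:

```sql
-- Check the current value on every TiKV instance
SHOW CONFIG WHERE TYPE = 'tikv' AND NAME = 'storage.block-cache.capacity';

-- Shrink the shared block cache online; this changes the running
-- configuration but does not persist it to the TiKV config file.
SET CONFIG tikv `storage.block-cache.capacity` = '8GiB';
```

To keep the change across restarts, the same value would also need to be set in the cluster's configuration (e.g. via `tiup cluster edit-config`).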

| username: maokl | Original post link

These two parameters can be used to control TiKV memory usage:
rocksdb.defaultcf.block-cache-size:
rocksdb.writecf.block-cache-size:
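For context on the two parameters above: as far as I understand, these per-CF cache sizes only take effect when the shared block cache is disabled; with the shared cache (the default in recent TiKV versions), `storage.block-cache.capacity` governs instead. A sketch with illustrative values:

```toml
# Sketch: per-column-family block cache sizes. These apply only if the
# shared block cache is disabled; otherwise storage.block-cache.capacity
# is the single knob that matters.
[rocksdb.defaultcf]
block-cache-size = "4GiB"  # illustrative value

[rocksdb.writecf]
block-cache-size = "2GiB"  # illustrative value
```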

| username: TiDBer_RQobNXGv | Original post link

This value was set to 10GB during deployment, and it has remained stable at 10GB.

| username: TiDBer_RQobNXGv | Original post link

I set the storage.block-cache.capacity of each TiKV to 10GB.

| username: tidb菜鸟一只 | Original post link

So each of the two TiKV nodes has a 10GB block cache, and a TiKV process using around 18GB seems normal, since TiKV's total memory usage is always higher than the block cache alone. Are there any anomalies in the cluster right now?

| username: TiDBer_RQobNXGv | Original post link

The remaining memory is running low, and I'm worried that a burst of large queries might cause the node to restart. The memory usage of the TiKV node has now climbed to 22GB.

| username: tidb菜鸟一只 | Original post link

Generally that's expected behavior. For your mixed deployment, it is recommended to isolate resources with NUMA.
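The NUMA suggestion can be sketched in a tiup topology file: each instance on the shared host is pinned to its own NUMA node via `numa_node` (requires `numactl` installed on the host). The host IP, ports, and node IDs below are illustrative and assume a two-socket machine:

```yaml
# Sketch: pin two co-located TiKV instances to different NUMA nodes
# so they don't contend for the same memory controller.
tikv_servers:
  - host: 10.0.1.1        # illustrative host
    port: 20160
    status_port: 20180
    numa_node: "0"
  - host: 10.0.1.1
    port: 20161
    status_port: 20181
    numa_node: "1"
```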

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.