Questions about TiKV Memory Control Parameters

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于 tikv 内存控制参数的疑问

| username: MrSylar

Phenomenon
According to ethercflow’s reply in “What are the parameters to limit TiKV instance memory usage?” (TiDB Q&A Community, asktug.com), one way to control TiKV memory is to set storage.block-cache.capacity, from which memory-usage-limit is calculated automatically. With a 4k memory page, memory-usage-limit = block-cache * 0.75 / 0.45. In addition, because of how memory statistics are collected, TiKV’s actual memory usage can exceed memory-usage-limit.
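
As a rough illustration of the formula quoted above, the following Python sketch derives memory-usage-limit from a given block cache size. The function name and the GiB units are assumptions made for this example; only the 0.75 / 0.45 ratio comes from the referenced reply, and TiKV’s real calculation may differ between versions.

```python
def derived_memory_usage_limit(block_cache_gib: float) -> float:
    """Apply the ratio quoted in the thread: block-cache * 0.75 / 0.45 (about 5/3)."""
    return block_cache_gib * 0.75 / 0.45


if __name__ == "__main__":
    # Hypothetical block cache sizes, in GiB, chosen only for illustration.
    for cache_gib in (8, 16):
        limit_gib = derived_memory_usage_limit(cache_gib)
        print(f"block-cache {cache_gib} GiB -> memory-usage-limit ~{limit_gib:.2f} GiB")
```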

Questions

  1. If both storage.block-cache.capacity and memory-usage-limit are manually set, how should the upper limit of TiKV memory usage be determined?
  2. If the actual TiKV memory usage far exceeds the memory-usage-limit, is there any way to further analyze how TiKV memory is being used?
| username: tidb菜鸟一只 | Original post link

  1. In general, explicitly setting memory-usage-limit is not recommended. You only need to set storage.block-cache.capacity, and memory-usage-limit is then automatically set to storage.block-cache.capacity * 5/3. storage.block-cache.capacity is the size of TiKV’s shared block cache. memory-usage-limit covers not only storage.block-cache.capacity but also the memory used by other TiKV components, so the upper limit of TiKV memory usage should be judged by memory-usage-limit.
  2. First, please confirm whether PD or TiDB nodes are also deployed on this machine, or whether multiple TiKV nodes share it. Then share the total system memory size and the values of storage.block-cache.capacity and memory-usage-limit.
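
To collect the numbers requested above on a Linux host, a minimal Python sketch could read them from /proc. The helper names are hypothetical, and the sketch assumes the standard "Key: value kB" layout of /proc/meminfo and /proc/<pid>/status; it reports resident memory only and says nothing about which TiKV component holds it.

```python
import re
from pathlib import Path


def parse_kib(text: str, key: str) -> int:
    """Return the KiB value of a 'Key:   12345 kB' line as found in /proc files."""
    match = re.search(rf"^{re.escape(key)}:\s+(\d+)\s+kB", text, flags=re.MULTILINE)
    if match is None:
        raise KeyError(key)
    return int(match.group(1))


def rss_share(meminfo_text: str, status_text: str) -> float:
    """Fraction of total system memory that the process keeps resident."""
    return parse_kib(status_text, "VmRSS") / parse_kib(meminfo_text, "MemTotal")


def process_rss_share(pid: int) -> float:
    """Read the live /proc files for one process (Linux only)."""
    meminfo_text = Path("/proc/meminfo").read_text()
    status_text = Path(f"/proc/{pid}/status").read_text()
    return rss_share(meminfo_text, status_text)
```

Calling process_rss_share with the PID of the tikv-server process should give roughly the same figure as the %MEM column in top.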
| username: 我是咖啡哥 | Original post link

We recently encountered memory alert issues as well.
The TiKV node system memory is 32G, with the default memory-usage-limit set to 24G. We frequently received memory usage alerts during data synchronization.
Later, we set the memory-usage-limit to 16G, and the issue was resolved.

| username: 我是咖啡哥 | Original post link

memory-usage-limit
The memory usage limit for TiKV instances. When the memory usage of TiKV approaches this threshold, internal caches will be cleared to free up memory.
This is how the documentation describes it.

| username: zhanggame1 | Original post link

This is only a memory usage alert, not an OOM, right? Then there’s no need to adjust the parameters; unused memory is wasted memory.

| username: 像风一样的男子 | Original post link

Some time ago, my TiKV memory also triggered an alert. I resolved it by adding more memory. Will limiting TiKV memory reduce cluster performance?

| username: 我是咖啡哥 | Original post link

The alert fires when application memory usage exceeds 95%… The alert rules are configured uniformly and cannot be turned off, so I simply limited the memory directly.

| username: 我是咖啡哥 | Original post link

Being able to add memory is certainly better. My database holds historical data and is not important, so I can accept it being a bit slower. That’s why I just limited it.

| username: MrSylar | Original post link

  1. Haha, I just want to understand how storage.block-cache.capacity and memory-usage-limit interact when both are set manually. I can’t work it out, and it is frustrating to see memory usage not behave as expected.
  2. I checked TiKV’s memory usage (%mem) at the OS level with top, in the simplest scenario: a single TiKV instance.
| username: 有猫万事足 | Original post link

I resolved it through resource control.
My 4-core 8GB machine kept having memory alerts, and even after restarting, the alerts would reappear within 2 hours.
Before version 7.1, it wasn’t very stable.
In version 7.1, I set up resource control. The alerts are still there, but memory usage has become very stable and no longer spikes suddenly and crashes TiKV.

| username: MrSylar | Original post link

Do you mean the resource_control setting in tiup, or the new Resource Control feature?

| username: 有猫万事足 | Original post link

The new Resource Control feature. I have documented the process.

In my experience, this feature helps TiKV stability a lot and saves the trouble of tuning a bunch of parameters. TiDB on the same hardware configuration still crashes occasionally.

| username: tidb菜鸟一只 | Original post link

The two parameters should not be set so that memory-usage-limit is the smaller value and storage.block-cache.capacity is the larger one…

| username: MrSylar | Original post link

No, that’s not the case :joy:. My environment has 32G of system memory, with storage.block-cache.capacity set to 8G and memory-usage-limit set to 16G, yet TiKV’s memory usage is at 80%.
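
To make these figures concrete, here is a small Python sketch that converts the reported percentage into an absolute size and compares it with the configured limit. The function names are made up for this example, and the inputs are simply the numbers from this post.

```python
def usage_gib(total_gib: float, used_percent: float) -> float:
    """Convert a %mem reading into an absolute size in GiB."""
    return total_gib * used_percent / 100


def overshoot_gib(total_gib: float, used_percent: float, limit_gib: float) -> float:
    """How far the observed usage sits above memory-usage-limit (negative if below)."""
    return usage_gib(total_gib, used_percent) - limit_gib


if __name__ == "__main__":
    # Values from the post: 32G system memory, 80% usage, 16G memory-usage-limit.
    print(f"observed usage: {usage_gib(32, 80):.1f} GiB")
    print(f"above the limit by: {overshoot_gib(32, 80, 16):.1f} GiB")
```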

| username: tidb菜鸟一只 | Original post link

No way. In my experience, with storage.block-cache.capacity set to 8G, TiKV’s total memory usage generally won’t exceed 15G… How can yours take up 25G?

| username: MrSylar | Original post link

So I posted to see if I could understand why this is happening. :joy:

| username: residentevil | Original post link

I encountered the same issue and have been troubleshooting for a long time without any conclusion.

| username: TiDBer_yyy | Original post link

This is too difficult. I can’t tell which part of the memory was released.

| username: zhaokede | Original post link

The memory-usage-limit was not set, and an alert was received.

| username: TiDBer_yyy | Original post link

I have the same issue here.
Total memory is 64G, memory-usage-limit is 52G, and storage.block-cache is 38G. We are seeing OOM problems, with several TiKV instances reaching 59G of memory usage.
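
One way to sanity-check a pair of settings like this is to compare them against the ratio quoted at the top of the thread (memory-usage-limit = block-cache * 0.75 / 0.45). The Python sketch below does only that. The function is hypothetical and is not a TiKV feature, and a warning from it does not by itself explain an OOM.

```python
def check_memory_settings(block_cache_gib: float, memory_usage_limit_gib: float) -> list[str]:
    """Return warnings for a manually set block cache / memory-usage-limit pair."""
    warnings = []
    if memory_usage_limit_gib <= block_cache_gib:
        warnings.append("memory-usage-limit is not larger than the block cache")
    # Limit that the thread's formula would derive from this block cache size.
    derived_gib = block_cache_gib * 0.75 / 0.45
    if memory_usage_limit_gib < derived_gib:
        warnings.append(
            f"memory-usage-limit {memory_usage_limit_gib:g} GiB is below the "
            f"~{derived_gib:.1f} GiB derived from the block cache"
        )
    return warnings


if __name__ == "__main__":
    # Settings reported in this thread, as (block cache, memory-usage-limit) in GiB.
    for cache_gib, limit_gib in ((38, 52), (8, 16)):
        print(cache_gib, limit_gib, check_memory_settings(cache_gib, limit_gib))
```

With the values reported in the thread, the 38G / 52G pair triggers the second warning, while the 8G / 16G pair passes both checks even though that instance also exceeded its limit.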