TiKV Memory Slowly Increases in k8s Mode, Eventually OOM

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: k8s模式下tikv内存缓慢增长,最后oom

| username: TiDBer_NEw0xuKK

[TiDB Usage Environment] Production Environment
[TiDB Version] 7.1.0

The cluster has been deployed and in use for about two years. Over that time, TiKV memory usage has grown slowly (possibly because the data volume keeps growing). Each time memory exceeds the k8s limit, we update the configuration and the TiKV nodes go through a rolling restart, but usage gradually climbs back to the critical value. Is this normal? Is there any way to control it?

Configuration file:
tidb.yml (36.3 KB)

Memory screenshot

| username: WalterWj | Original post link

Try configuring storage.block-cache.capacity in TiKV.

| username: TiDBer_NEw0xuKK | Original post link

Okay, I’ll give it a try.

| username: WalterWj | Original post link

Consider setting it to somewhere between 20GB and 30GB and see how it behaves.

| username: Miracle | Original post link

How much memory does the node hosting TiKV have? Are other services running on that node?

| username: dba远航 | Original post link

You can try limiting concurrency, using smaller transactions, tuning the parameters that cap memory usage, and so on.

| username: TiDBer_NEw0xuKK | Original post link

I set it to 40GB yesterday, but after one night TiKV's memory usage had still gone past that value.

| username: TiDBer_NEw0xuKK | Original post link

The physical node has 96GB of memory, and other services are also deployed on it through k8s.

| username: TiDBer_jYQINSnf | Original post link

Try these settings:

memory-usage-limit = "40G"

grpc-memory-pool-quota = "1G"
capacity = "30GB"
max-total-wal-size = "2GB"
enable-statistics = false
write-buffer-size = "512MB"
max-write-buffer-number = 5
write-buffer-size = "512MB"
max-write-buffer-number = 5
write-buffer-size = "512MB"
max-write-buffer-number = 5
write-buffer-size = "128MB"
max-write-buffer-number = 5
memory-limit = "512MB"
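
As a rough sanity check on the settings above, the sketch below adds up the memory they would allow in the worst case. It assumes (none of this is stated in the post) that each write-buffer-size / max-write-buffer-number pair belongs to a separate column family, that one column family's memtables can grow to roughly write-buffer-size × max-write-buffer-number, and that sizes are binary (1GB = 1024MB). The function and variable names are illustrative only.

```python
# Rough worst-case memory budget for the settings listed above.
# Assumptions (not from the post): each write-buffer pair belongs to a
# different column family, and sizes use binary units (1GB = 1024MB).

MB_PER_GB = 1024


def memtable_budget_mb(pairs):
    """Sum write-buffer-size (MB) * max-write-buffer-number over all pairs."""
    return sum(size_mb * count for size_mb, count in pairs)


# The four (write-buffer-size, max-write-buffer-number) pairs, in listed order.
write_buffers = [(512, 5), (512, 5), (512, 5), (128, 5)]

block_cache_mb = 30 * MB_PER_GB  # capacity = "30GB"
grpc_pool_mb = 1 * MB_PER_GB     # grpc-memory-pool-quota = "1G"
memory_limit_mb = 512            # memory-limit = "512MB"

memtables_mb = memtable_budget_mb(write_buffers)
total_mb = block_cache_mb + memtables_mb + grpc_pool_mb + memory_limit_mb

print(f"memtables: {memtables_mb} MB")
print(f"total:     {total_mb / MB_PER_GB:.3f} GB")
```

Under these assumptions the listed components add up to about 39.6GB, just under the memory-usage-limit of 40G. Allocations not covered by these settings are not counted, so real usage can be higher.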

| username: Miracle | Original post link

Is the node’s memory usage normal?

| username: TiDBer_NEw0xuKK | Original post link

Okay, I’ll give it a try.

| username: TiDBer_NEw0xuKK | Original post link

The node itself looks normal, but this TiKV pod's memory keeps growing slowly and it eventually hits OOM.

| username: wangccsy | Original post link

Find the cause of the continuous growth. Some operation is probably holding memory that never gets released. If usage keeps growing, it will inevitably end in an OOM (out of memory), since memory is finite.

| username: tidb菜鸟一只 | Original post link

Set the value of storage.block-cache.capacity to 45% of the total memory you think TiKV can use. If you have a 96G physical machine running only TiKV, setting it to 40G should not be a problem. However, if there are many other pods running on it, it is recommended to set it smaller.

| username: WalterWj | Original post link

It's normal for usage to exceed that value. The recommended setting for storage.block-cache.capacity is around 45% of the memory limit you configured for the pod. If you set it to 40GB, total usage will definitely go above 40GB: the block cache is the main memory consumer, but TiKV also uses memory for other things, including other caches and gRPC communication.
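
The 45% rule of thumb can be applied in both directions. The sketch below is only an illustration of that arithmetic; the function names are made up, and the 64GB pod limit is a hypothetical value, not one taken from this thread.

```python
# The ~45% rule of thumb quoted in the thread, applied in both directions.
BLOCK_CACHE_RATIO = 0.45


def block_cache_for_limit(pod_limit_gb):
    """Suggested storage.block-cache.capacity for a given pod memory limit."""
    return pod_limit_gb * BLOCK_CACHE_RATIO


def limit_for_block_cache(block_cache_gb):
    """Pod memory limit that a given block cache size would correspond to."""
    return block_cache_gb / BLOCK_CACHE_RATIO


# A 40GB block cache, as configured earlier in the thread.
print(round(limit_for_block_cache(40), 1))
# 64GB is a hypothetical pod limit used only for illustration.
print(round(block_cache_for_limit(64), 1))
```

By this rule, a 40GB block cache corresponds to a pod limit of roughly 89GB, which is most of a 96GB node that also runs other services.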

| username: kkpeter | Original post link

If the TiKV cluster has a high write load and memory usage exceeds normal levels, set the memory-usage-limit to 75% of the total memory.

| username: yiduoyunQ | Original post link

Please share the output of kubectl -n default get tc advanced-tidb -oyaml. Additionally, refer to the official documentation and try the following (modifications will trigger a rolling restart of TiKV):

      cpu: "2000m"
      memory: "48Gi"
      storage: "100Gi"
      cpu: "4000m"
      memory: "48Gi"
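
To relate a pod memory value such as "48Gi" to the ratios suggested earlier in the thread (about 45% for storage.block-cache.capacity and 75% for memory-usage-limit), a small helper like the following could be used. This is a sketch under assumptions: it treats "total memory" as the pod's memory limit, treats TiKV's "MB" as 1024 × 1024 bytes, and uses hypothetical function names; it is not part of TiDB Operator or TiKV.

```python
# Hypothetical helper: derive candidate TiKV settings from a pod memory limit.
# Assumptions: ratios come from the thread (45% and 75%), "total memory" is
# taken to mean the pod limit, and "MB" is treated as 1024 * 1024 bytes.

BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_quantity(quantity):
    """Parse a Kubernetes memory quantity with a binary suffix into bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: plain byte count


def suggest_settings(pod_limit, cache_ratio=0.45, usage_ratio=0.75):
    """Return candidate values, rounded down to whole MB."""
    limit_mb = parse_quantity(pod_limit) // 1024**2
    return {
        "storage.block-cache.capacity": f"{int(limit_mb * cache_ratio)}MB",
        "memory-usage-limit": f"{int(limit_mb * usage_ratio)}MB",
    }


for key, value in suggest_settings("48Gi").items():
    print(f'{key} = "{value}"')
```

For a 48Gi limit this yields a block cache of about 21.6GB and a memory-usage-limit of 36GB, well below the 40GB block cache that was tried earlier.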

| username: TiDBer_NEw0xuKK | Original post link

Okay, okay, I’ll give it a try, thank you!

| username: TiDBer_NEw0xuKK | Original post link

Okay, thank you.