After TiKV component OOM, executing SHOW STATS_HEALTHY; does not return any data

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiKV组件OOM后,执行SHOW STATS_HEALTHY;查询不到任何数据

| username: TiDBer_KWW5sFtj

The issue encountered: After the TiKV component experiences an OOM (Out of Memory) event, executing SHOW STATS_HEALTHY; does not return any data.
Master|root@(none)>show stats_healthy where table_name = 'table_name';
Empty set (0.00 sec)

The query only returns data again after the TiKV component is restarted. Why is this happening?

| username: Jellybean | Original post link

Based on your description, there are two issues:

  1. TiKV OOM problem:
  • Check statement analysis and slow queries in the Dashboard to find large SQL statements.
  • Confirm that TiKV's block cache size is not configured too large. If multiple nodes are deployed on a single machine, take extra care when setting this parameter.
  2. Unable to view statistics health (STATS_HEALTHY):
  • Can this issue be reproduced?
  • Check the cluster logs from the troubleshooting period, including the tidb and tikv logs, for anything abnormal.

| username: Kongdom | Original post link

Check the cluster status with "display" to see if it is normal.

| username: dba远航 | Original post link

I suspect that after the TiKV component OOMs, the metadata information in memory is cleared, so it cannot be found. It will be reloaded after a restart.

| username: andone | Original post link

Try again after some time; it might still be loading metadata into memory.

| username: 路在何chu | Original post link

I'd guess he has already waited a long time.

| username: Sunward | Original post link

Loading can take a while. Check TiKV's CPU and memory usage trends and its health status.

| username: tidb菜鸟一只 | Original post link

Please send the cluster topology diagram. Additionally, check the TiKV configuration with the command SHOW config WHERE NAME LIKE '%storage.block-cache.capacity%'. If only one TiKV instance runs on the machine, set storage.block-cache.capacity to 45% of the total memory. If there are multiple instances on the same machine, divide that value by the number of instances.
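
To make the sizing rule above concrete, here is a rough sketch of the arithmetic in Python. The function name, the 64 GiB host, and the instance counts are made up for illustration; only the 45% ratio and the divide-by-instances rule come from this reply. Treat the result as a starting point, not a tuned value.

```python
def block_cache_capacity_gib(total_memory_gib: float,
                             tikv_instances: int = 1,
                             ratio: float = 0.45) -> float:
    """Suggested storage.block-cache.capacity per TiKV instance, in GiB.

    Hypothetical helper: takes `ratio` of the host's total memory
    (45% per the rule of thumb in this thread) and splits it evenly
    across the TiKV instances deployed on that host.
    """
    if total_memory_gib <= 0:
        raise ValueError("total_memory_gib must be positive")
    if tikv_instances < 1:
        raise ValueError("tikv_instances must be at least 1")
    return total_memory_gib * ratio / tikv_instances


if __name__ == "__main__":
    # Assumed example host with 64 GiB of RAM.
    for instances in (1, 2, 4):
        capacity = block_cache_capacity_gib(64, instances)
        print(f"{instances} TiKV instance(s): {capacity:.1f} GiB each")
```

Compare the computed value with what SHOW config reports for storage.block-cache.capacity on each TiKV instance. A configured value well above this estimate on a multi-instance host would be one possible explanation for the OOM.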