OOM When Reading KV Data with TiSpark

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tispark读取kv数据OOM

| username: TiDBer_Lee

When I used TiSpark to read a 200 GB table, TiKV ran out of memory (OOM) and kept failing. The cluster has 6 TiKV nodes, each with 16 cores and 64 GB of memory. During the read, resource consumption across the TiKV nodes was unbalanced: at first only 2 nodes were working, this gradually grew to 4 nodes, and then the OOM occurred. Disk usage was not high.

I ran the test against another TiDB cluster that had been upgraded from version 5.1.4 to 6.5.3, and it worked fine with normal memory consumption. The problematic cluster, however, was a fresh install of version 6.5.3. I am not sure whether changes to the default parameters caused this issue. Has anyone run into something similar? Any help would be appreciated.

| username: Jellybean

In the scenario you describe, the OOM at the TiKV storage layer is most likely caused either by reading too much data at once, which overwhelms the block cache, or by reading a large amount of data that is returned too slowly, so that it piles up in memory until the OOM occurs.

First, compare the configurations of the two clusters, especially the block cache size.
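
One way to do that comparison is to export each cluster's TiKV settings into a name-to-value mapping (for example from the output of `SHOW CONFIG WHERE type = 'tikv'`) and diff the two. The following is only a minimal sketch: the helper `diff_configs` is hypothetical, and the values below are made up for illustration, not taken from either cluster in this thread.

```python
def diff_configs(cluster_a, cluster_b):
    """Return {name: (value_a, value_b)} for every setting that differs.

    A setting missing from one cluster is reported with None on that side.
    """
    names = sorted(set(cluster_a) | set(cluster_b))
    return {
        name: (cluster_a.get(name), cluster_b.get(name))
        for name in names
        if cluster_a.get(name) != cluster_b.get(name)
    }


# Hypothetical values for illustration only; replace them with the real
# settings exported from each cluster.
upgraded_cluster = {
    "storage.block-cache.capacity": "16GiB",
    "memory-usage-limit": "48GiB",
    "readpool.unified.max-thread-count": "12",
}
fresh_cluster = {
    "storage.block-cache.capacity": "28GiB",
    "memory-usage-limit": "48GiB",
    "readpool.unified.max-thread-count": "12",
}

differences = diff_configs(upgraded_cluster, fresh_cluster)
for name, (upgraded_value, fresh_value) in differences.items():
    print(f"{name}: upgraded={upgraded_value} fresh={fresh_value}")
```

Only the settings that differ are printed, which makes it easier to spot a default that changed between an upgraded cluster and a fresh install.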

| username: redgame

Try adjusting the memory configuration of the TiKV nodes and raising the memory limit on each node.