Lightning Data Import OOM

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: lightning 导入数据 OOM

| username: TiDBer_Z93z8JhD

[TiDB Usage Environment] Test Environment
[TiDB Version] 6.1
[Encountered Problem] OOM when importing data with lightning
[Reproduction Path] What operations were performed to encounter the problem
Memory is sufficient, so why is it reporting OOM? Is it a problem with the Go language?

image

File size is also not an issue
image

| username: hey-hoho

Try limiting the number of concurrent connections.

| username: TiDBer_Z93z8JhD

Do you mean modifying these two parameters?
index-concurrency = 2
table-concurrency = 6

| username: buchuitoudegou

There is also io-concurrency. That said, OOM is not very common with Lightning. Could you share the machine configuration and the size of the dataset?

| username: buchuitoudegou

It seems that the concurrency might not be the cause of the OOM. You can try using the pprof tool to check the memory usage during the next import:

go tool pprof -inuse_space localhost:<status-port>/debug/pprof/heap

The status port is set in the Lightning configuration file (it is the port in status-addr). pprof is only enabled if it is set.
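If it is easier to grab snapshots while the import is running and look at them afterwards, something like the following might help. It is only a sketch: it downloads the same /debug/pprof/heap endpoint used in the command above and saves it to a file. The port 8289, the output file name, and both function names are assumptions for illustration; use whatever status port your own configuration sets.

```python
import urllib.request


def heap_profile_url(status_port: int, host: str = "localhost") -> str:
    """Build the heap profile URL for a running Lightning process."""
    return f"http://{host}:{status_port}/debug/pprof/heap"


def save_heap_profile(
    status_port: int,
    out_path: str = "lightning-heap.pb.gz",  # hypothetical file name
    host: str = "localhost",
    timeout: float = 30.0,
) -> str:
    """Download one heap snapshot and write it to out_path unchanged."""
    url = heap_profile_url(status_port, host)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return out_path


if __name__ == "__main__":
    # 8289 is an assumed status port; replace it with your own.
    print(save_heap_profile(8289))
```

The saved file can then be opened with go tool pprof -inuse_space lightning-heap.pb.gz. Taking a few snapshots at different points in the import makes it easier to see what is growing.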

| username: 近墨者zyl

Pay attention to the region-concurrency setting.

When importing a large amount of data, each concurrent task occupies about 2 GiB of memory, so the total memory usage can reach region-concurrency * 2 GiB. By default, region-concurrency is the same as the number of logical CPUs. If the memory size (GiB) is less than twice the number of logical CPUs, or if an OOM occurs at runtime, you need to manually lower region-concurrency to avoid a TiDB Lightning OOM.

Check how many logical CPUs you have. If this parameter is not limited, Lightning can use up to (number of logical CPUs) x 2 GiB of memory in total. If that exceeds 15 GB, it needs to be limited. It is best to keep that figure at about 70% of the total memory.
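To make the arithmetic concrete, here is a rough sketch of the rule of thumb above. The 2 GiB per concurrent task and the 70% target come from this post; the function names and the rounding down are my own assumptions, not anything Lightning does internally.

```python
import math

GIB_PER_TASK = 2.0  # rough per-task memory figure quoted above


def estimated_peak_gib(region_concurrency: int) -> float:
    """Upper-bound estimate: region-concurrency * 2 GiB."""
    return region_concurrency * GIB_PER_TASK


def suggest_region_concurrency(
    logical_cpus: int,
    total_mem_gib: float,
    mem_fraction: float = 0.7,  # keep usage at about 70% of total memory
) -> int:
    """Pick a region-concurrency that fits the memory budget.

    Never exceeds the default (number of logical CPUs) and never
    goes below 1.
    """
    budget_gib = total_mem_gib * mem_fraction
    by_memory = math.floor(budget_gib / GIB_PER_TASK)
    return max(1, min(logical_cpus, by_memory))


if __name__ == "__main__":
    # Example: 8 logical CPUs and 16 GiB of memory.
    print(estimated_peak_gib(8))              # default setting could reach 16 GiB
    print(suggest_region_concurrency(8, 16))  # value that stays within ~70%
```

For example, with 8 logical CPUs and 16 GiB of memory the default could reach about 16 GiB, while the 70% budget is 11.2 GiB, so region-concurrency = 5 would be a safer starting point.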