Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: lightning 导入数据 OOM
[TiDB Usage Environment] Test Environment
[TiDB Version] 6.1
[Encountered Problem] OOM when importing data with lightning
[Reproduction Path] What operations were performed to encounter the problem
The machine has enough memory, so why is Lightning reporting OOM? Is it a problem with the Go language?
The file size should not be an issue either.
Try limiting the number of concurrent connections.
Do you mean modifying these two parameters?
index-concurrency = 2
table-concurrency = 6
There is also an io-concurrency. OOM is indeed not very common. Could you share the machine configuration and the size of the dataset?
It seems that the concurrency might not be the cause of the OOM. You can try using the pprof tool to check the memory usage during the next import:
go tool pprof -inuse_space localhost:<status-port>/debug/pprof/heap
The status-port is set in the lightning configuration file (pprof will only be enabled if it is set).
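If it is more convenient to capture the heap profile to a file during the import and inspect it afterwards, something like the following sketch could work. It assumes the standard Go `/debug/pprof/heap` endpoint is reachable on the Lightning status port; the function names, host, and output path are hypothetical and only for illustration.

```python
import urllib.request


def heap_profile_url(host: str, status_port: int) -> str:
    """Build the heap profile URL for a Go process exposing net/http/pprof."""
    return f"http://{host}:{status_port}/debug/pprof/heap"


def save_heap_profile(host: str, status_port: int, out_path: str,
                      timeout: float = 30.0) -> int:
    """Download one heap profile snapshot and write it to out_path.

    Returns the number of bytes written. Raises urllib.error.URLError
    if the status port is not set or not reachable.
    """
    url = heap_profile_url(host, status_port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)
```

Calling `save_heap_profile("localhost", <status-port>, "heap.pprof")` while the import is running should produce a file that can then be opened with `go tool pprof -inuse_space heap.pprof`.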
Pay attention to the region-concurrency setting.
When importing a large amount of data, each concurrent thread occupies about 2 GiB of memory, so total memory usage can reach region-concurrency * 2 GiB. By default, region-concurrency equals the number of logical CPUs. If the memory size (in GiB) is less than twice the number of logical CPUs, or if an OOM occurs during the import, you need to manually lower region-concurrency to avoid a TiDB Lightning OOM.
Check how many logical CPUs the machine has. If region-concurrency is not limited, the total memory used by Lightning can reach (number of logical CPUs) x 2 GiB. If that exceeds 15 GB, it needs to be limited. It is best to keep region-concurrency x 2 GiB within about 70% of the total memory.
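The rule of thumb above can be turned into a small calculation. This is only a sketch: the 2 GiB per worker and the 70% budget are the figures quoted in this thread, not values read from Lightning itself, and the function name is made up for illustration.

```python
import math

GIB_PER_WORKER = 2     # approximate memory per concurrent worker, per the thread
MEMORY_BUDGET = 0.70   # keep Lightning within about 70% of total memory


def suggest_region_concurrency(logical_cpus: int, total_mem_gib: float) -> int:
    """Return a region-concurrency value that fits the memory budget.

    The result never exceeds the number of logical CPUs (the default)
    and is never lower than 1.
    """
    if logical_cpus < 1 or total_mem_gib <= 0:
        raise ValueError("logical_cpus must be >= 1 and total_mem_gib > 0")
    by_memory = math.floor(total_mem_gib * MEMORY_BUDGET / GIB_PER_WORKER)
    return max(1, min(logical_cpus, by_memory))


if __name__ == "__main__":
    # 16 logical CPUs with 16 GiB of RAM: the default of 16 would allow
    # up to 32 GiB, so the suggestion is capped by memory instead.
    print(suggest_region_concurrency(16, 16))  # 5
```

The resulting number would be set as region-concurrency in the Lightning configuration file. On a machine with plenty of memory (for example 8 logical CPUs and 64 GiB), the function simply returns the CPU count, which matches the default.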