Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: During a join, if the right table is very large, can TiDB run out of memory?
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Encountered Issues]
[Reproduction Path] What operations were performed that led to the issue
[Issue Phenomenon and Impact]
[Attachments]
Please provide the version information of each component, such as cdc/tikv, which can be obtained by executing cdc version/tikv-server --version.
Large table joins can easily cause OOM (Out of Memory); the specific situation needs to be examined in detail.
Thank you for your reply. Can TiDB handle large tables with disk-based processing, or is it limited to in-memory processing?
There are operators that write to disk, and each version has differences. Optimization is ongoing.
Is there a strategy for spilling operators to disk? I would like to know at what buffer size the system will start using disk-based processing. Also, can the buffer size be adjusted?
This is not documented in detail. The main mechanism is the per-query memory quota (the tidb_mem_quota_query system variable), which limits the memory a single SQL statement may use.
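For reference, the per-query memory quota mentioned above can be adjusted per session; a minimal sketch (the 2 GB value is only an illustration, not a recommendation):

```sql
-- Limit the memory a single SQL statement may use in this session.
-- The value is in bytes; 2 GB here is an example, tune it to your workload.
SET tidb_mem_quota_query = 2 << 30;

-- Verify the current setting.
SELECT @@tidb_mem_quota_query;
```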
You also need to enable a parameter for disk spill.
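The disk-spill parameter referred to above is, in the versions I'm aware of, the `oom-use-tmp-storage` entry in the TiDB config file; from v6.3 it is exposed as a system variable instead. A hedged sketch of enabling it via SQL:

```sql
-- On TiDB >= v6.3 the spill switch is a system variable (earlier versions
-- use the oom-use-tmp-storage entry in tidb.toml instead).
-- When enabled, supported operators spill intermediate data to disk once
-- the query exceeds its memory quota, rather than triggering OOM handling.
SET GLOBAL tidb_enable_tmp_storage_on_oom = ON;
```

The spill location and a size cap for spilled data are controlled by the `tmp-storage-path` and `tmp-storage-quota` config items.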
The main consideration for spilling to disk is which operators support it. Most operators currently do, but note that index nested-loop join (index NL join) does not. If the right table contains many duplicate records, the join can consume a lot of memory, and multiple nested joins can easily lead to OOM (Out of Memory).
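Since index NL join cannot spill, one practical workaround (a sketch; the table and column names are made up) is to check the plan with EXPLAIN and, if memory is a concern, steer the optimizer toward a hash join, which does support spilling:

```sql
-- Hypothetical tables: check which join operator the optimizer picked.
-- An "IndexJoin" in the plan cannot spill to disk; "HashJoin" can.
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.cust_id = c.id;

-- If the plan shows IndexJoin and the right side is large, hint a hash
-- join so the operator can spill under the memory quota.
SELECT /*+ HASH_JOIN(o, c) */ *
FROM orders o JOIN customers c ON o.cust_id = c.id;
```

The trade-off is that a hash join may be slower than an index join for selective lookups; the hint only makes sense when the memory risk outweighs that cost.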
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.