Suggestions for Handling Out of Memory Errors in Single SQL Statements During Bulk Insert and Delete Operations in TiDB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb大批量插入与删除数据报单sql内存溢出错误的建议

| username: wenyi

When replicating a table in TiDB, i.e., inserting a large batch of data into a new table, an out-of-memory error occurs for a single SQL statement. The suggestions are as follows:

  1. Why doesn’t TiDB learn from MySQL or Oracle and use temporary files instead of completing everything in memory? Reporting such errors is very unfriendly.
  2. Is there a parameter control? For example, if a single SQL statement uses a certain amount of memory, it writes to a temporary file. This would improve user experience. MySQL and Oracle seem to handle this well, and it is suggested that TiDB learn from them.
| username: Jasper | Original post link

The parameter tidb_enable_tmp_storage_on_oom can be controlled. For details, please refer to the documentation:

| username: zhanggame1 | Original post link

By default, it is enabled. Even if OOM is enabled for “insert into select,” it is useless.

| username: cassblanca | Original post link

Oracle is a monolithic architecture, where temporary tablespaces can be used for sorting, storing intermediate results, and even for flashback operations. TiDB, on the other hand, is a distributed architecture and has more considerations, such as the consistency of distributed transactions involving temporary files. Additionally, temporary files can lead to system performance degradation. If a performance issue on a single node affects the entire cluster, it might not be worth the trade-off.

Of course, starting from version 6.3, the official documentation provides the parameter tidb_enable_tmp_storage_on_oom System Variables | PingCAP Documentation Center which can enable temporary disk storage for certain operators, but not all operations support temporary disk storage.

| username: redgame | Original post link

Check tidb_enable_tmp_storage_on_oom.

| username: TiDBer_vfJBUcxl | Original post link

Add conditions to the query of the t_dch_patient_card table, so that the t_dch_patient_card table completes the import after multiple iterations.

| username: TiDBer_vfJBUcxl | Original post link

Use the system variable tidb_mem_quota_query to configure the memory usage threshold during the execution of a single SQL query.

| username: Jasper | Original post link

Currently, only some operators support spilling to disk.