TiDB OOM Query

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb oom 查询

| username: Chen1234

Parameter configuration:
mem-quota-query: 314572800
oom-action: cancel
tidb 5.4.0 takes 6 minutes, v4.0.8 takes 4 seconds

| username: h5n1 | Original post link

First, use EXPLAIN ANALYZE to check if the execution plans are consistent across the two versions.

| username: Chen1234 | Original post link

The execution plan is the same, and the parameter configuration is also the same.

| username: h5n1 | Original post link

Check the oom-use-tmp-storage and tmp-storage-path parameter settings, and whether temporary files are generated in the specified path during SQL execution.

| username: Chen1234 | Original post link

The default parameters are:

root@10.101.22.58 08:33:09>show config where name like '%tmp%';
+------+--------------------+---------------------+--------------------------------------------------------------+
| Type | Instance           | Name                | Value                                                        |
+------+--------------------+---------------------+--------------------------------------------------------------+
| tidb | 10.101.22.20:4000  | oom-use-tmp-storage | true                                                         |
| tidb | 10.101.22.20:4000  | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.101.22.20:4000  | tmp-storage-quota   | -1                                                           |
| tidb | 10.101.22.195:4000 | oom-use-tmp-storage | true                                                         |
| tidb | 10.101.22.195:4000 | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.101.22.195:4000 | tmp-storage-quota   | -1                                                           |
| tidb | 10.101.22.58:4000  | oom-use-tmp-storage | true                                                         |
| tidb | 10.101.22.58:4000  | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.101.22.58:4000  | tmp-storage-quota   | -1                                                           |
+------+--------------------+---------------------+--------------------------------------------------------------+
9 rows in set (0.02 sec)

Modified to:

tmp-storage-path: /tmp

Version 5.4.0 generates an empty “record” folder, while TiDB 4.0.8 does not.

[root@titb-test-1 tmp-storage]# ls
_dir.lock  record
[root@titb-test-1 tmp-storage]# du -sh *
0       _dir.lock
0       record
| username: h5n1 | Original post link

Is swap enabled in version 5.4? Check the relevant monitoring in node exporter.

| username: Chen1234 | Original post link

Neither of the two TiDB setups has swap enabled.


| username: Min_Chen | Original post link

It seems to be as expected. Directly querying a table with nearly 100 million records without a WHERE clause filter under the condition of mem-quota-query = 314572800 should result in an OOM (Out of Memory).

| username: h5n1 | Original post link

The issue he mentioned is that with different versions, the execution time varies significantly when running the same data volume and SQL, especially during OOM (Out of Memory) events.

| username: Min_Chen | Original post link

Hello, according to the screenshot, during the execution of version 5.4.0, the tidb-server did not run out of memory (OOM). Instead, the mysql CLI experienced an OOM while the server was sending data to it. In version 4.0.8, the tidb-server detected memory overuse during execution and proactively canceled the operation. The stages are different, hence the difference in execution time. In version 5.4.0, it seems that the mysql CLI only experienced an OOM after a period of data transmission, whereas in version 4.0.8, the operation was directly canceled upon detecting excessive memory usage.

| username: h5n1 | Original post link

How can you tell that it was OOM during the sending period?

| username: forever | Original post link

Is it because the OOM handling mechanism is different?