tidb-server out of memory (OOM)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb-server oom

| username: TiDBer_UNdkzKdD

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]

| username: WalterWj | Original post link

| username: TiDBer_UNdkzKdD | Original post link

When the OOM risk was detected, no SQL was recorded in the temporary files.

Not many SQL statements were executing during this time.

| username: WalterWj | Original post link

If nothing was recorded, this will be hard to track down…

Search tidb.log for the “expensive” keyword and see whether any related SQL turns up.
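As a rough sketch of that search, the Python below scans a log file for lines containing a keyword and returns them with their line numbers. The function names and the log path are hypothetical, chosen only for illustration; the match is a plain case-insensitive substring test, nothing TiDB-specific.

```python
from typing import Iterable, List, Tuple


def find_keyword_lines(lines: Iterable[str], keyword: str = "expensive") -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs whose text contains the keyword, case-insensitively."""
    needle = keyword.lower()
    hits = []
    for number, line in enumerate(lines, start=1):
        if needle in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path: str, keyword: str = "expensive") -> List[Tuple[int, str]]:
    """Scan a log file on disk; undecodable bytes are replaced instead of raising."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return find_keyword_lines(handle, keyword)


# Example (the path is an assumption; point it at the real tidb.log):
#     for number, line in scan_log("/path/to/tidb.log"):
#         print(number, line)
```

Narrowing the hits to the minutes just before the OOM is usually the next step, since the timestamps at the start of each log line make that easy to do by eye.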

| username: 这里介绍不了我 | Original post link

Is there any information about the machine configuration?

| username: TiDBer_UNdkzKdD | Original post link

There are no records either.

| username: TiDBer_UNdkzKdD | Original post link

Two virtual machines, each with 8 cores and 16 GB of memory, and each running only tidb-server.

| username: WalterWj | Original post link

It is probably not this SQL. A statement is logged as expensive under two conditions: it uses more than 1 GB of memory, or it runs for more than 1 minute. Check whether there are any other SQL statements.

| username: TiDBer_UNdkzKdD | Original post link

No, no other SQL was found before the OOM occurred.

| username: WalterWj | Original post link

It feels like it might be a bug… How about trying an upgrade?

| username: 随缘天空 | Original post link

Is TiDB Dashboard deployed? Open its log search page and look through the TiDB logs for SQL errors around the time the OOM occurred.

| username: TIDB-Learner | Original post link

This is generally caused by poorly written statements. Check the logs. What does EXPLAIN ANALYZE show for the statement?

| username: 小龙虾爱大龙虾 | Original post link

The monitoring shows that tidb-server memory did not spike suddenly. Is this a mixed deployment?

| username: GreenGuan | Original post link

I have seen TiDB memory spikes cause OOM. This feels more like a TiKV OOM, which happens in mixed deployments. The memory shown for the host in the dashboard is the memory of the whole host, not of TiDB specifically.

| username: TiDBer_UNdkzKdD | Original post link

There were no error logs before the OOM occurred. The error logs in the screenshot are all from the restart process.

| username: TiDBer_UNdkzKdD | Original post link

It is not a mixed deployment. tidb-server is deployed on its own.

| username: buddyyuan | Original post link

Check your tidb-server logs for a panic. The memory drops straight down, so a panic may have occurred.

| username: TiDBer_UNdkzKdD | Original post link

There was no panic; it was killed by the system OOM killer.
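For anyone wanting to confirm the same thing on their own host, here is a minimal Python sketch that looks for the kernel's "Killed process <pid> (<name>)" message in kernel log text. The function name is hypothetical, and it assumes the log lines have already been captured, for example from `dmesg -T` or the system log file.

```python
import re
from typing import Iterable, List, Tuple

# Matches the kernel's "Killed process <pid> (<name>)" message from the OOM killer.
_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(lines: Iterable[str], process_name: str = "tidb-server") -> List[Tuple[int, str]]:
    """Return (pid, raw_line) for each OOM-killer line that names the given process."""
    kills = []
    for line in lines:
        match = _KILLED.search(line)
        if match and match.group(2) == process_name:
            kills.append((int(match.group(1)), line.rstrip("\n")))
    return kills
```

An empty result only means that no matching line was found in the text supplied; the kernel ring buffer may already have rotated past the event.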

| username: DBAER | Original post link

You can check the traffic visualization chart (Key Visualizer) in TiDB Dashboard.

| username: buddyyuan | Original post link

It went down too fast for anything to be recorded. With version 6.1, you can enable Top SQL, so even if tidb-server goes down, Top SQL will still have the records.