After upgrading TiDB to version 6.1.1, memory is consumed rapidly: 5 servers with 128GB of memory frequently run out, and TiKV restarts automatically. How can this be resolved?

Do you want to know the reason? It’s slow SQL.

On 5.2.1, the servers had more than 10GB of free memory, some more than 20GB. After upgrading to 6.1.1, free memory suddenly dropped significantly and the system restarts frequently.

  1. Check /var/log/messages to confirm whether TiDB restarted due to OOM.
  2. If so, analyze the tidb.log and slow SQL around the OOM time point.
  3. With the information provided so far, this is as far as the diagnosis can go. A more detailed conclusion requires analysis of the monitoring panels.
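
Step 1 can be scripted. The following is only a minimal sketch, assuming the kernel writes its usual "Killed process <pid> (<name>)" record to /var/log/messages and that the processes are named tidb-server and tikv-server; the helper name find_oom_kills is made up for this example. The timestamps it returns give you the time window to look at in tidb.log and the slow log for step 2.

```python
import re
from typing import Iterable, List, Tuple

# Kernel kill record, e.g.
#   "... kernel: Killed process 4321 (tidb-server) total-vm:..."
# Newer kernels prefix it with "Out of memory: "; the pattern matches both.
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(
    lines: Iterable[str],
    names: Tuple[str, ...] = ("tidb-server", "tikv-server"),  # assumed process names
) -> List[Tuple[str, int, str]]:
    """Return (timestamp, pid, process name) for each OOM kill of the given processes."""
    hits = []
    for line in lines:
        m = OOM_KILL_RE.search(line)
        if m and m.group(2) in names:
            # Traditional syslog lines start with a 15-character
            # "Mon DD HH:MM:SS" timestamp.
            hits.append((line[:15], int(m.group(1)), m.group(2)))
    return hits


# Example use on a server (reading the log usually requires root):
# with open("/var/log/messages", errors="replace") as f:
#     for ts, pid, name in find_oom_kills(f):
#         print(ts, name, pid)
```

If nothing is returned for the restart times, the restarts were probably not OOM kills and the cause has to be looked for elsewhere.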

This is the general idea. OOM is usually caused by a large SQL statement or a deviating execution plan; in a few cases it is a bug in an internal mechanism.

In versions after 6.2, you can use SET @@global.tidb_enable_paging = 1; to solve this issue.
In older versions, you can try SET @@session.tidb_enable_chunk_rpc = 0;

Wasn’t this bug fixed as early as version 5.x?

