Memory Usage Issues of Single SQL Execution

translator_bot · June 22, 2024, 12:27am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 单条SQL执行的内存使用问题

| username: EricSong

[TiDB Usage Environment] Production Environment
[Encountered Problem: Problem Phenomenon and Impact]
There is a view in the database that performs a join operation, left joining a large table with a metadata table, resulting in a view named ‘user’. The ‘user’ view contains nearly five million rows of data, approximately 4GB in size.
When querying using SELECT id, count(1) as total from user ui group by id order by total desc limit 100, the TiDB node experiences a sudden increase in memory usage (nearly 5GB in a short period). However, upon checking the relevant logs, it was found that the monitored memory increment and the memory usage reported in the SQL logs do not match (the SQL only used about 250Mi).
I would like to ask what the calculation logic for SQL memory usage is?
[Attachments: Screenshots/Logs/Monitoring]
Screenshot2023-08-11 17.32.23

translator_bot · June 22, 2024, 12:27am

| username: redgame | Original post link

Roughly, each operator allocates memory space, and the memory space for intermediate results is added together.

translator_bot · June 22, 2024, 12:27am

| username: TiDBer_vfJBUcxl | Original post link

Check this out: 单条SQL执行限制内存2G，同事写SQL查询没有limit 这条SQL执行占用18G的内存,为啥SET tidb_mem_quota_query = 2 << 30；没有生效 - #11，来自 cy6301567 - TiDB 的问答社区

translator_bot · June 22, 2024, 12:27am

| username: 人如其名 | Original post link

Monitoring (Prometheus, Grafana) shows the memory usage of the entire instance, which includes the actual memory used and objects that have not yet been garbage collected. This data is provided at the Go language level and is an accurate representation of memory overhead.

The memory usage in slow logs and information_schema.processlist is estimated by TiDB based on the number of records, which can be “relatively accurate.” However, there are some issues: 1) The memory usage of some records (such as chunks) may not be very accurate, but it is generally reliable; 2) Some memory overheads might not be accounted for, although newer versions have more comprehensive memory statistics; 3) Objects that are “destroyed” cannot be accounted for, but they still occupy memory until garbage collection occurs.

The calculation logic for SQL memory usage (as seen in show processlist or slow logs) is quite complex and is mainly tracked through memory tracking:

What is Memory Tracking

TiDB-server introduces a memory tracking framework to monitor memory usage during the execution of each statement in a connection. This helps trigger OOM actions when a statement consumes too much memory, thereby protecting the overall resource availability of the instance.

The smallest unit of memory tracking is a tracker, which is inserted at various stages of statement execution. A hierarchical tree is formed based on statement → operator → data processing within the operator. Each memory usage during data processing is recorded in the tracker and accumulated in the parent tracker, eventually summing up to the root tracker. This forms a memory tracking tree.

The memory obtained by the root tracker here is the memory usage of the statement you see in the slow log.

Implementation of Memory Tracking

The core structure for memory tracking is the Tracker:

type Tracker struct {
   bytesLimit           atomic.Value
   actionMuForHardLimit actionMu
   actionMuForSoftLimit actionMu
   mu                   struct {
      children map[int][]*Tracker
      sync.Mutex
   }
   parMu struct {
      parent *Tracker
      sync.Mutex
   }
   label int
   bytesConsumed       int64
   bytesReleased       int64
   maxConsumed         atomicutil.Int64
   SessionID           uint64
   NeedKill            atomic.Bool
   NeedKillReceived    sync.Once
   IsRootTrackerOfSess bool
   isGlobal            bool
}

Important attributes include:

actionMuForHardLimit: Implements the ActionOnExceed interface for hard limit actions. When memory usage (bytesConsumed) reaches the hard limit, it triggers the OOM action. The hard limit is set by the tidb_mem_quota_query parameter.
actionMuForSoftLimit: Implements the ActionOnExceed interface for soft limit actions. When memory usage (bytesConsumed) reaches the soft limit, it triggers the OOM action. The soft limit is 0.8 * tidb_mem_quota_query. Currently, the only soft limit action is the hashAgg operator spilling to disk; others are hard limit actions.
bytesConsumed: The current memory tracked by this tracker, reported to the parent node. The root node contains the total memory used by the statement. If it exceeds the tidb_mem_quota_query, it triggers the OOM action.
maxConsumed: The maximum memory ever used, mainly for displaying the maximum memory consumption of the current operator in the processlist.
mu: Records the parent node’s tracker to form the tracking tree.

The most important method of this structure is Consume(bs int64), which adds the memory consumption bs to bytesConsumed. When bs is positive, it indicates memory consumption (e.g., reading chunk data from disk into memory). When bs is negative, it indicates memory release (e.g., writing temporary files from memory to disk). The core logic is as follows:

Recursively call the getParent() method through the current tracker, accumulating the current memory consumption value (tracker.bytesConsumed) to the parent node.
For each tracker in the hierarchy, if the accumulated tracker.bytesConsumed exceeds tracker.maxConsumed, set tracker.maxConsumed to tracker.bytesConsumed.
After the loop, find the rootTracker (the top-level tracker of the current session). If rootTracker’s bytesConsumed exceeds the hard limit, record rootExceed (=rootTracker). If rootTracker’s bytesConsumed exceeds the soft limit, record rootExceedForSoftLimit (=rootTracker).
If the instance’s memory usage exceeds tidb_server_memory_limit, kill the statement occupying the most memory. Check if the current rootTracker is already marked for execution; if so, trigger the Cancel action.
If rootExceed exists, it means the statement’s memory exceeds the limit, triggering the hard limit OOM action.
If rootExceedForSoftLimit exists, it means the statement’s memory exceeds 0.8 of the limit, triggering the soft limit OOM action.

Thus, by continuously calling the Consume() method to record current memory consumption, memory usage for operators and statements is tracked, and OOM actions are triggered accordingly.

translator_bot · June 22, 2024, 12:27am

| username: cy6301567 | Original post link

How do we control this? We also encountered a situation where a single SQL query crashed TiKV before.

translator_bot · June 22, 2024, 12:27am

| username: zhanggame1 | Original post link

I’ve also observed that some SQL queries consume a large amount of TiDB memory, while from the SQL statistics, it appears to be much less.

translator_bot · June 22, 2024, 12:27am

| username: EricSong | Original post link

Thank you, but I am still quite confused. According to your explanation, the monitoring results of Grafana and the records on the Dashboard should be similar, but the current discrepancy is significantly large. Is this because the memory space of certain operators has not been accounted for?

translator_bot · June 22, 2024, 12:27am

| username: 有猫万事足 | Original post link

The version is too old. I tested a 4 million row table on version 7.1.1 and couldn’t reproduce the issue.

In the dashboard, TiDB Memory Usage shows almost no fluctuation. There should be no calculation error.
The difference between us lies in the size of this table. The row count differs by 1 million, but your table is several times larger than mine.

I have added TiFlash to other large tables at hand, so I can’t replicate the test.

translator_bot · June 22, 2024, 12:27am

| username: 人如其名 | Original post link

Yes. Mainly, in lower versions, some memory consumption within HashAgg was not being tracked. You can try using a higher version like 7.1 with the same data and see if it improves. I estimate it will be much better.