What occupies heap memory, and why is my memory usage high?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: heap memory 都是什么占用,为什么我的内存占用高

| username: wluckdog

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] tidb v6.5.0
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact]
The monitoring shows heap memory and node memory.
The TiDB-Runtime page gives a memory breakdown for each TiDB instance.

Why does this gc-threadhold occupy such high memory?

Monitoring divides memory into heap memory, process, and analyze usage. Heap memory and process account for most of it, at 20-30 GB each. To judge actual memory usage, should we look at heap memory or process? And how can we tell whether memory usage this high is reasonable?

curl -G http://ip:10080/debug/pprof/heap > pd.heap.prof
Then I ran go tool pprof pd.heap.prof on the exported profile to look for the high memory usage, but the heap of a single instance is not high.
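One possible reason a heap profile looks small next to the process figure is that the Go runtime can hold idle heap spans it has not yet returned to the OS. The sketch below is a standalone Go program, not TiDB code; it only illustrates which `runtime.MemStats` fields correspond to "in use" versus "retained but idle". The helper names `retainedIdle` and `toMiB` are made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// retainedIdle estimates heap memory the Go runtime is holding but not
// using: idle spans that have not been returned to the OS.
// (Hypothetical helper for illustration.)
func retainedIdle(heapIdle, heapReleased uint64) uint64 {
	if heapReleased > heapIdle {
		return 0
	}
	return heapIdle - heapReleased
}

// toMiB converts a byte count to mebibytes for printing.
func toMiB(b uint64) float64 { return float64(b) / (1 << 20) }

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// HeapInuse: spans holding at least one object (close to what a heap profile reflects).
	// Sys: total bytes obtained from the OS by the Go runtime.
	fmt.Printf("HeapInuse:      %.1f MiB\n", toMiB(m.HeapInuse))
	fmt.Printf("HeapIdle:       %.1f MiB\n", toMiB(m.HeapIdle))
	fmt.Printf("HeapReleased:   %.1f MiB\n", toMiB(m.HeapReleased))
	fmt.Printf("Retained idle:  %.1f MiB\n", toMiB(retainedIdle(m.HeapIdle, m.HeapReleased)))
	fmt.Printf("Sys:            %.1f MiB\n", toMiB(m.Sys))
}
```

If the retained idle figure is large, the gap between the heap profile and the process memory is mostly memory the runtime is keeping for reuse rather than live objects.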


| username: changpeng75 | Original post link

The GC on TiDB here is Go's garbage collection, right? Check whether the Go runtime returns freed memory to the OS with MADV_DONTNEED or MADV_FREE.

| username: 麻烦是朋友 | Original post link

Leave garbage collection to the Go runtime itself.

| username: wluckdog | Original post link

How can I check whether Go’s garbage collection mechanism is MADV_DONTNEED or MADV_FREE?

| username: changpeng75 | Original post link

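The Go runtime defaulted to MADV_FREE on Linux from Go 1.12 through 1.15; Go 1.16 and later default to MADV_DONTNEED again. To force MADV_DONTNEED, set GODEBUG=madvdontneed=1 in the environment before starting the TiDB Server.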
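To see whether the setting is actually present, you can inspect the GODEBUG value, which is a comma-separated list of key=value pairs. The sketch below is a small standalone Go program, not part of TiDB; `madvdontneedSetting` is a hypothetical helper written for this example, and it only reads the environment of its own process.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// madvdontneedSetting returns the value of the madvdontneed key in a
// GODEBUG string and whether the key was present at all. If the key
// appears more than once, the last occurrence is returned.
// (Hypothetical helper for illustration.)
func madvdontneedSetting(godebug string) (value string, present bool) {
	for _, kv := range strings.Split(godebug, ",") {
		key, val, ok := strings.Cut(strings.TrimSpace(kv), "=")
		if ok && key == "madvdontneed" {
			value, present = val, true
		}
	}
	return value, present
}

func main() {
	v, ok := madvdontneedSetting(os.Getenv("GODEBUG"))
	if !ok {
		fmt.Println("madvdontneed not set; the Go runtime default applies")
		return
	}
	fmt.Printf("madvdontneed=%s\n", v)
}
```

For an already running tidb-server, the value to check is the one in that process's environment (on Linux, for example, the contents of /proc/<pid>/environ), not the environment of your current shell.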

| username: wangccsy | Original post link

Heap memory.

| username: dba远航 | Original post link

GC reclaims data that is no longer needed, and it consumes a lot of resources when large amounts of data change.

| username: wluckdog | Original post link

GC is run by the GC leader. There are a dozen or so TiDB instances here: those serving external traffic use around 20 GB of memory each, while those not serving external traffic use around 10 GB. That is a large difference; normally the leader should be the one using more memory, right?

| username: 江湖故人 | Original post link

"Threadhold" is a misspelling of threshold. It is not memory in use; it is the heap size at which the Go runtime starts the next garbage collection. Separately, MADV_FREE (the Go runtime's default on Linux in Go 1.12 through 1.15) is lazier than MADV_DONTNEED about returning freed memory to the OS, which can make it easier for TiDB to hit OOM (Out of Memory). For that reason, some people set the environment variable GODEBUG=madvdontneed=1 to use MADV_DONTNEED mode.
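As a rough illustration of why the threshold line sits well above the live heap: with the default GOGC=100, the Go runtime aims to start the next collection when the heap has roughly doubled since the last one. The sketch below assumes the gc-threshold line corresponds to the runtime's next-GC target (`NextGC` in `runtime.MemStats`). The function `gcTarget` is a hypothetical, simplified model that ignores GC roots (stacks, globals) and any memory limit.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcTarget approximates the heap size at which the next GC cycle starts:
// the live heap after the previous cycle, grown by gogc percent.
// Simplified model for illustration; the real runtime also accounts for
// GC roots and an optional memory limit.
func gcTarget(liveHeap, gogc uint64) uint64 {
	return liveHeap + liveHeap*gogc/100
}

func main() {
	// 10 GiB of live heap with GOGC=100 gives a target of about 20 GiB.
	fmt.Println(gcTarget(10<<30, 100) >> 30) // 20

	// The runtime reports its own current target as NextGC.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapAlloc=%d bytes, NextGC=%d bytes\n", m.HeapAlloc, m.NextGC)
}
```

Under this model, a threshold of 20-30 GB would point to a live heap of roughly half that size, so a high threshold by itself does not mean that much memory is occupied.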

| username: redgame | Original post link

Go's garbage collector uses a concurrent mark-and-sweep algorithm to reclaim memory automatically, which is meant to guard against memory leaks and memory overflow during program execution.

| username: TiDBer_aaO4sU46 | Original post link

The garbage collection mechanism is responsible for cleaning up data that is no longer in use. When there are a large number of data changes in the program, garbage collection may consume a significant amount of resources.