Alarm for High Memory Usage in tidb-server

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb-server 内存占用过高时的报警

| username: 每天必打卡

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.3.4
[Reproduction Path] None
[Encountered Problem: Phenomenon and Impact] There are only 4 physical machines, each with 300G of memory, running a mixed deployment of tidb, tikv, and pd. The memory allocated to tidb-server is 64G. When the issue occurred, 3 tidb-servers crashed almost simultaneously. /var/log/messages contains oom-killer entries, but there is no oom-killer log in tidb.log. How can I change the default settings so that an alert is triggered when tidb-server's memory usage reaches 80% of its limit, and information such as goroutine, heap, and running_sql is recorded?

| username: 像风一样的男子

Haven't you already found the parameters that need to be modified? Is there some other problem?

| username: 每天必打卡

By default, the tidb-server instance prints alarm logs and records the related files when the machine's memory usage reaches 80% of total machine memory (300G here). I want the tidb-server instance to print alarm logs and record the related files when usage of the "instance memory" (the 64G set for tidb-server) reaches 80%. Is there a way to do this?

| username: tidb菜鸟一只

When the memory threshold alarm function is enabled and the configuration item server-memory-quota is not set, the memory alarm threshold is memory-usage-alarm-ratio * system memory size. If server-memory-quota is set and greater than 0, the memory alarm threshold is memory-usage-alarm-ratio * server-memory-quota. You should set server-memory-quota for tidb-server first.
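The rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this reply, not TiDB's actual implementation; the function name `memory_alarm_threshold` and its parameters are hypothetical, and the 300G / 64G / 0.8 values are taken from this thread.

```python
GIB = 1024 ** 3  # bytes per GiB


def memory_alarm_threshold(alarm_ratio, system_memory_bytes, server_memory_quota_bytes=0):
    """Return the alarm threshold in bytes (hypothetical helper).

    Mirrors the rule described in the post:
    - quota set and > 0  -> ratio * server-memory-quota
    - otherwise          -> ratio * system memory size
    """
    if server_memory_quota_bytes > 0:
        base = server_memory_quota_bytes
    else:
        base = system_memory_bytes
    return alarm_ratio * base


# Without a quota, the threshold follows the 300G machine memory.
no_quota = memory_alarm_threshold(0.8, 300 * GIB)
# With a 64G quota, the threshold follows the quota instead.
with_quota = memory_alarm_threshold(0.8, 300 * GIB, 64 * GIB)

print(f"no quota:   {no_quota / GIB:.1f} GiB")
print(f"with quota: {with_quota / GIB:.1f} GiB")
```

With the numbers from this thread, the threshold drops from roughly 240G to roughly 51.2G once the quota is set.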

| username: tidb菜鸟一只

If server-memory-quota is set and greater than 0, the memory alarm threshold is memory-usage-alarm-ratio * server-memory-quota.
So that is how it should behave now. Is the alarm not being triggered?

| username: 像风一样的男子

64G * 0.8 = 51.2G of memory. Why not set the alarm ratio to 17% of the 300G machine memory instead?
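The arithmetic behind that suggestion, as a small Python sketch. The helper name `equivalent_alarm_ratio` is hypothetical; it only converts "80% of the 64G instance limit" into a fraction of the 300G machine memory.

```python
def equivalent_alarm_ratio(instance_limit_gb, instance_ratio, machine_memory_gb):
    """Fraction of machine memory that equals instance_ratio of the instance limit."""
    return instance_limit_gb * instance_ratio / machine_memory_gb


ratio = equivalent_alarm_ratio(64, 0.8, 300)
print(f"{ratio:.4f}")  # 0.1707, i.e. about 17%
```

Note that a ratio expressed against machine memory is still measured machine-wide, so in a mixed deployment it may not isolate tidb-server's own usage.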

| username: 每天必打卡

Is it safe to set the server-memory-quota parameter in a v5.3.4 production environment? Are there any known issues?

| username: 每天必打卡

Because it is a mixed deployment, I am worried that memory used by TiKV or other components could push the machine over the threshold and make TiDB log alarms continuously, in which case the recorded goroutine, heap, and running_sql information would not be very useful either.

| username: 像风一样的男子

The official documentation explains this in detail. Take a look:

| username: 每天必打卡

Has anyone run into problems with server-memory-quota in production? I would like to hear about your experience.

| username: 像风一样的男子

If you upgrade to version 6.4, you can use the system variable tidb_server_memory_limit to set the maximum memory usage of a TiDB instance. It accepts either a percentage or a specific size.

| username: redgame

The mem-quota-query parameter limits the memory usage of a single query. You can adjust its value to suit your workload and keep any one query's memory consumption under control.