TiDB Service Restart Causes Most Business Operations to Report Errors and Alarms

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 服务重启 导致大部分业务都报错告警

| username: leoones

[Overview] Scenario + Problem Overview
The TiDB service restarted today, and most business services reported errors and raised alerts.

[2022/09/03 10:35:03.925 +08:00] [INFO] [printer.go:34] ["Welcome to TiDB."] ["Release Version"=v5.4.0] [Edition=Community] ["Git Commit Hash"=55f3b24c1c9f506bd652ef1d162283541e428872] ["Git Branch"=heads/refs/tags/v5.4.0] ["UTC Build Time"="2022-01-25 08:39:26"] [GoVersion=go1.16.4] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2022/09/03 10:35:03.925 +08:00] [INFO] [printer.go:48] ["loaded config"]

Detailed log attachment
tidb-2022-09-03T10-59-51.933.zip (20.3 MB)
[TiDB Version]
v5.4.0

| username: h5n1 | Original post link

Please share the memory monitoring trend of the TiDB server.

| username: leoones | Original post link

The output of dmesg -T | grep tidb shows no record of the TiDB service being killed for running out of memory (OOM). However, monitoring shows abnormal memory usage on one of the nodes.
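For anyone repeating this check, here is a minimal Python sketch of the same idea as the dmesg grep above. It assumes the kernel reports OOM kills with a line containing "Killed process <pid> (<name>)", which is the usual shape but varies by kernel version. The function name find_oom_kills and the name_part parameter are made up for this illustration.

```python
import re

# Kernel OOM-killer lines usually contain "Killed process <pid> (<name>)".
# The exact wording differs between kernel versions, so treat this as a heuristic.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(dmesg_text, name_part="tidb"):
    """Return (pid, process_name, line) for each OOM kill whose process name contains name_part."""
    hits = []
    for line in dmesg_text.splitlines():
        match = OOM_KILL.search(line)
        if match and name_part in match.group(2):
            hits.append((int(match.group(1)), match.group(2), line.strip()))
    return hits
```

The text passed in can be the captured output of dmesg -T. An empty result only means no matching line is still in the kernel ring buffer; older messages may already have been overwritten, so the system journal or /var/log/messages is worth checking as well.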

| username: h5n1 | Original post link

Is the problematic TiDB server the one with IP 145? There is an error 19 seconds before the restart. The IP 10.54.185.37 should be a TiKV node, right? Check its logs to see whether anything was recorded. You may also need to upgrade; the latest 5.4 release is v5.4.2.
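Finding what was logged shortly before a restart can be scripted. The sketch below assumes the log format shown in the first post (a bracketed timestamp, a bracketed level, then the rest of the line) and treats each "Welcome to TiDB." banner as the start of a new process. The function names and the 30-second default window are choices made for this illustration.

```python
import re
from datetime import datetime, timedelta

# "[2022/09/03 10:35:03.925 +08:00] [INFO] [printer.go:34] [...]"
LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<rest>.*)$")
TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f %z"


def parse_line(line):
    """Return (timestamp, level, rest) or None if the line does not match the format."""
    match = LINE.match(line.strip())
    if not match:
        return None
    try:
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
    except ValueError:
        return None
    return ts, match.group("level"), match.group("rest")


def problems_before_restarts(lines, window_seconds=30):
    """Map each startup banner time to the WARN/ERROR/FATAL entries just before it."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    restarts = [ts for ts, _, rest in entries if "Welcome to TiDB." in rest]
    window = timedelta(seconds=window_seconds)
    report = {}
    for restart in restarts:
        report[restart] = [
            (ts, level, rest)
            for ts, level, rest in entries
            if level in ("WARN", "ERROR", "FATAL") and restart - window <= ts < restart
        ]
    return report
```

Feeding it the lines of the TiDB server log gives one entry per startup banner, each with the warnings and errors from the seconds leading up to it. Multi-line entries such as stack traces are skipped by this simple parser.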

| username: leoones | Original post link

Yes, it's this node. 10.54.185.37 is not a TiKV node; it's a client address, right?

| username: xfworld | Original post link

First, enable resource consumption statistics to locate the problem.
Then set a maximum execution time and a maximum memory usage for SQL statements. This limits runaway queries, reduces pressure on the server, and helps avoid crashes.
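As a rough illustration of the second step, the sketch below builds the SET statements for the two limits. It assumes the system variables max_execution_time (in milliseconds) and tidb_mem_quota_query (in bytes). Which scopes each variable accepts depends on the TiDB version, so check the documentation for your release before running the output. The helper name limit_statements is made up for this example.

```python
def limit_statements(max_exec_seconds, mem_quota_bytes):
    """Build SET statements for a per-statement time limit and a per-query memory quota."""
    if max_exec_seconds <= 0 or mem_quota_bytes <= 0:
        raise ValueError("both limits must be positive")
    return [
        # max_execution_time is expressed in milliseconds.
        f"SET GLOBAL max_execution_time = {int(max_exec_seconds * 1000)};",
        # tidb_mem_quota_query is expressed in bytes; SESSION scope is an assumption here.
        f"SET SESSION tidb_mem_quota_query = {int(mem_quota_bytes)};",
    ]
```

For example, limit_statements(300, 1 << 30) yields a 300-second time limit and a 1 GiB memory quota; the 1 GiB figure is only an illustrative value, not a recommendation from this thread.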

| username: leoones | Original post link

The maximum memory usage is at its default value.
The maximum execution time is 300 s.
