One of our three TiDB servers keeps restarting. Could someone please help take a look?
The logs are as follows:
Aug 10 21:12:37 TIDB-PD1 bash: [2022/08/10 21:12:37.001 +08:00] [WARN] [config.go:1004] ["Some configuration options should be moved to [instance] section. Please use the latter config options in [instance] instead: (slow-threshold, tidb_slow_log_threshold)."]
Aug 10 21:18:31 TIDB-PD1 kernel: audit: audit_lost=5113587 audit_rate_limit=512 audit_backlog_limit=16384
Aug 10 21:18:31 TIDB-PD1 kernel: audit: rate limit exceeded
Aug 10 21:21:51 TIDB-PD1 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 10 21:21:51 TIDB-PD1 systemd: Unit tidb-4000.service entered failed state.
Aug 10 21:21:51 TIDB-PD1 systemd: tidb-4000.service failed.
Aug 10 21:22:06 TIDB-PD1 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 10 21:22:06 TIDB-PD1 systemd: Stopped tidb service.
Aug 10 21:22:06 TIDB-PD1 systemd: Started tidb service.
Aug 10 21:22:06 TIDB-PD1 bash: [2022/08/10 21:22:06.749 +08:00] [WARN] [config.go:1004] ["Some configuration options should be moved to [instance] section. Please use the latter config options in [instance] instead: (slow-threshold, tidb_slow_log_threshold)."]
Aug 10 21:24:00 TIDB-PD1 systemd-logind: New session 5981 of user root.
Aug 10 21:24:00 TIDB-PD1 systemd: Started Session 5981 of user root.
Aug 10 21:24:01 TIDB-PD1 systemd-logind: New session 5982 of user root.
Aug 10 21:24:01 TIDB-PD1 systemd: Started Session 5982 of user root.
Aug 10 21:26:54 TIDB-PD1 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 10 21:26:54 TIDB-PD1 systemd: Unit tidb-4000.service entered failed state.
Aug 10 21:26:54 TIDB-PD1 systemd: tidb-4000.service failed.
Aug 10 21:27:09 TIDB-PD1 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 10 21:27:09 TIDB-PD1 systemd: Stopped tidb service.
Aug 10 21:27:09 TIDB-PD1 systemd: Started tidb service.
Aug 10 21:27:09 TIDB-PD1 bash: [2022/08/10 21:27:09.751 +08:00] [WARN] [config.go:1004] ["Some configuration options should be moved to [instance] section. Please use the latter config options in [instance] instead: (slow-threshold, tidb_slow_log_threshold)."]
Are the connections to the three TiDB nodes balanced?
Is the resource usage of the three TiDB nodes consistent, or is there a significant difference?
Check the top 10 slow SQL queries through the Dashboard to investigate whether any SQL queries require a large amount of memory, which could lead to OOM (Out of Memory) issues.
Enable diagnostics by setting execution-time and memory-usage limits for SQL queries, so that problematic statements are captured.
You can try enabling a global maximum execution time for each SQL query. If a query exceeds this time, it will be killed to mitigate TiDB OOM issues.
Additionally, please provide specific cluster configuration and version information to help with the assessment.
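As a rough sketch of the two limits mentioned above, the snippet below builds the SQL statements one might run through any MySQL-compatible client. It assumes the TiDB system variables `max_execution_time` (milliseconds) and `tidb_mem_quota_query` (bytes), and assumes a TiDB version (v6.1 or later) where the memory quota can be set with global scope. The helper name and the example values are hypothetical and should be tuned to the workload.

```python
from typing import List


def build_limit_statements(max_exec_ms: int, mem_quota_bytes: int) -> List[str]:
    """Return SET GLOBAL statements for the per-query time and memory limits.

    Hypothetical helper: it only builds the SQL text and does not connect
    to the cluster.
    """
    if max_exec_ms < 0 or mem_quota_bytes < 0:
        raise ValueError("limits must be non-negative")
    return [
        # Maximum execution time of a statement, in milliseconds (0 = unlimited).
        f"SET GLOBAL max_execution_time = {max_exec_ms};",
        # Memory quota for a single query, in bytes.
        f"SET GLOBAL tidb_mem_quota_query = {mem_quota_bytes};",
    ]


if __name__ == "__main__":
    # Example values only: 60 seconds and 1 GiB.
    for stmt in build_limit_statements(60_000, 1 << 30):
        print(stmt)
```

Global values normally apply to new sessions, so existing connections may need to reconnect before the limits take effect.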
Oh, then the restart has nothing to do with this configuration. I tested it: although this parameter is not configured correctly, it should not cause TiDB to restart. I recommend working through the points xfworld listed first. If that does not resolve the issue, please provide the tidb.log covering the 10 minutes before and after the restart time.
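To cut that 10-minute window out of a large tidb.log, something like the following sketch may help. It assumes every log entry starts with a bracketed timestamp in the format seen in the excerpts above (`[2022/08/10 21:12:37.001 +08:00]`) and treats lines without a timestamp, such as stack traces, as belonging to the preceding entry. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

# Matches the leading timestamp of a TiDB log entry, e.g.
# [2022/08/10 21:12:37.001 +08:00] [WARN] [config.go:1004] [...]
TS_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2})\]"
)
TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f %z"


def parse_ts(line: str) -> Optional[datetime]:
    """Return the timezone-aware timestamp of a log line, or None."""
    m = TS_RE.match(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), TS_FORMAT)


def lines_around(lines: Iterable[str], restart: datetime,
                 minutes: int = 10) -> List[str]:
    """Keep entries logged within +/- `minutes` of `restart`."""
    window = timedelta(minutes=minutes)
    selected = []
    keep = False  # continuation lines inherit the decision of the last entry
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            keep = abs(ts - restart) <= window
        if keep:
            selected.append(line)
    return selected


if __name__ == "__main__":
    # Restart time taken from the systemd messages; log bodies abbreviated.
    restart = datetime.strptime("2022/08/10 21:21:51.000 +08:00", TS_FORMAT)
    sample = [
        "[2022/08/10 21:12:37.001 +08:00] [WARN] [config.go:1004] [...]",
        "[2022/08/10 21:22:06.749 +08:00] [WARN] [config.go:1004] [...]",
        "[2022/08/10 21:27:09.751 +08:00] [WARN] [config.go:1004] [...]",
    ]
    for line in lines_around(sample, restart, minutes=5):
        print(line)
```

For a real file, pass the open file object as `lines`; the restart time must carry a UTC offset so it can be compared with the log timestamps.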
The system log is as follows:
Aug 16 17:48:15 TIDB-PD1 systemd: Created slice User Slice of tidb.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: New session 6512 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd: Started Session 6512 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: Removed session 6512.
Aug 16 17:48:15 TIDB-PD1 systemd: Removed slice User Slice of tidb.
Aug 16 17:48:15 TIDB-PD1 systemd: Created slice User Slice of tidb.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: New session 6513 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd: Started Session 6513 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd: Reloading.
Aug 16 17:48:15 TIDB-PD1 systemd: Started blackbox_exporter service.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: Removed session 6513.
Aug 16 17:48:15 TIDB-PD1 systemd: Removed slice User Slice of tidb.
Aug 16 17:48:15 TIDB-PD1 bash: level=info ts=2022-08-16T09:48:15.669186844Z caller=main.go:213 msg="Starting blackbox_exporter" version="(version=0.12.0, branch=HEAD, revision=4a22506cf0cf139d9b2f9cde099f0012d9fcabde)"
Aug 16 17:48:15 TIDB-PD1 bash: level=info ts=2022-08-16T09:48:15.669802622Z caller=main.go:220 msg="Loaded config file"
Aug 16 17:48:15 TIDB-PD1 bash: level=info ts=2022-08-16T09:48:15.669927186Z caller=main.go:324 msg="Listening on address" address=:9115
Aug 16 17:48:15 TIDB-PD1 systemd: Created slice User Slice of tidb.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: New session 6514 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd: Started Session 6514 of user tidb.
Aug 16 17:48:15 TIDB-PD1 systemd-logind: Removed session 6514.
Aug 16 17:48:15 TIDB-PD1 systemd: Removed slice User Slice of tidb.
Aug 16 17:51:25 TIDB-PD1 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 16 17:51:25 TIDB-PD1 systemd: Unit tidb-4000.service entered failed state.
Aug 16 17:51:25 TIDB-PD1 systemd: tidb-4000.service failed.
Aug 16 17:51:40 TIDB-PD1 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 16 17:51:40 TIDB-PD1 systemd: Stopped tidb service.
Aug 16 17:51:40 TIDB-PD1 systemd: Started tidb service.
Aug 16 17:51:40 TIDB-PD1 bash: [2022/08/16 17:51:40.508 +08:00] [WARN] [config.go:1004] ["Some configuration options should be moved to [instance] section. Please use the latter config options in [instance] instead: (slow-threshold, tidb_slow_log_threshold)."]
Aug 16 17:51:46 TIDB-PD1 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 16 17:51:46 TIDB-PD1 systemd: Unit tidb-4000.service entered failed state.
Aug 16 17:51:46 TIDB-PD1 systemd: tidb-4000.service failed.
Aug 16 17:52:01 TIDB-PD1 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 16 17:52:01 TIDB-PD1 systemd: Stopped tidb service.
Aug 16 17:52:01 TIDB-PD1 systemd: Started tidb service.
Aug 16 17:52:01 TIDB-PD1 bash: [2022/08/16 17:52:01.256 +08:00] [WARN] [config.go:1004] ["Some configuration options should be moved to [instance] section. Please use the latter config options in [instance] instead: (slow-threshold, tidb_slow_log_threshold)."]
Aug 16 18:00:01 TIDB-PD1 systemd: Started Session 6515 of user root.
Has it not been resolved yet? After restarting the application, is the error in tidb.log still the same?
If it is still the same, enable debug-level logging for TiDB and post the new tidb.log.
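Before collecting debug logs, it may also help to measure how long the process survives after each start, using the system log excerpts posted earlier in the thread. The sketch below pairs each "Started tidb service." message with the next "main process exited" message for tidb-4000.service. It assumes the classic syslog line format shown in this thread, an English locale for month names, and that the year is supplied by the caller because syslog lines omit it. The function name is hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

# e.g. "Aug 16 17:51:40 TIDB-PD1 systemd: Started tidb service."
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ systemd: (.*)$")
STARTED = "Started tidb service."
EXITED = "tidb-4000.service: main process exited"


def uptimes(lines: Iterable[str], year: int) -> List[Tuple[datetime, float]]:
    """Return (start time, seconds until exit) for each start/exit pair."""
    results = []
    started_at = None
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a systemd message (e.g. kernel, systemd-logind)
        stamp, msg = m.groups()
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if msg == STARTED:
            started_at = ts
        elif msg.startswith(EXITED) and started_at is not None:
            results.append((started_at, (ts - started_at).total_seconds()))
            started_at = None
    return results


if __name__ == "__main__":
    sample = [
        "Aug 16 17:51:40 TIDB-PD1 systemd: Started tidb service.",
        "Aug 16 17:51:46 TIDB-PD1 systemd: tidb-4000.service: main process "
        "exited, code=exited, status=2/INVALIDARGUMENT",
        "Aug 16 17:52:01 TIDB-PD1 systemd: Started tidb service.",
    ]
    for start, seconds in uptimes(sample, 2022):
        print(start, seconds)
```

A process that exits within seconds of starting, as in the 17:51:40 to 17:51:46 pair, may point to a different cause than one that runs for several minutes first, though the tidb.log around the exit is still needed to confirm either way.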