Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: pd 在夜间down了,不知道什么原因 (PD went down during the night, cause unknown)

[Test Environment for TiDB] Testing
[TiDB Version] 7.5.0
[Reproduction Path] PD crashed around 1 AM for unknown reasons. The PD logs are as follows:
[2024/05/08 01:27:00.398 +08:00] [INFO] [grpc_service.go:1948] ["update service GC safe point"] [service-id=gc_worker] [expire-at=-9223372035139672989] [safepoint=449603756461391872]
[2024/05/08 01:28:40.520 +08:00] [INFO] [grpc_service.go:1893] ["updated gc safe point"] [safe-point=449603756461391872]
[2024/05/08 01:37:00.396 +08:00] [INFO] [grpc_service.go:1948] ["update service GC safe point"] [service-id=gc_worker] [expire-at=-9223372035139672389] [safepoint=449603913747791872]
[2024/05/08 01:38:37.465 +08:00] [INFO] [lease.go:187] ["stop lease keep alive worker"] [purpose="leader election"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [allocator_manager.go:772] ["exit allocator daemon"] []
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:160] ["patrol regions has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:344] ["drive slow node scheduler is stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:326] ["drive push operator has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [allocator_manager.go:316] ["exit allocator loop"] []
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=balance-hot-region-scheduler] [error="context canceled"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:374] ["coordinator is stopping"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=balance-leader-scheduler] [error="context canceled"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [main.go:284] ["got signal to exit"] [signal=hangup]
[2024/05/08 01:38:37.466 +08:00] [INFO] [server.go:127] ["region syncer has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=transfer-witness-leader-scheduler] [error="context canceled"]
The subsequent logs are all about stopping various modules. Could the shutdown be related to these log messages?
stop lease keep alive worker
drive slow node scheduler is stopped
drive push operator has been stopped
[Encountered Problem: Phenomenon and Impact] PD went down during the night (the shutdown messages in the log start at 01:38:37). How should I further investigate the cause? This doesn't seem to be the first time it has happened.
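
One line in the log above may be worth isolating: ["got signal to exit"] [signal=hangup]. It suggests that PD shut down because the process received a signal (hangup is how Go prints SIGHUP), not because it crashed on its own, although the log does not say what sent the signal. Below is a minimal sketch for pulling such entries out of a PD log so the timestamps can be compared across incidents. The function name find_exit_signals and both regular expressions are hypothetical helpers written for this post, and the sample lines are copied from the log above.

```python
import re

# Matches the leading fields of a PD log line:
# [timestamp] [LEVEL] [file.go:line] ["message"] followed by optional [key=value] fields.
LOG_LINE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] \[(?P<src>[^\]]+)\] '
    r'\["(?P<msg>[^"]*)"\](?P<rest>.*)$'
)
# Matches one trailing [key=value] field; the value may be quoted.
FIELD = re.compile(r'\[([\w-]+)=("[^"]*"|[^\]]*)\]')


def find_exit_signals(lines):
    """Return a (timestamp, signal) pair for every "got signal to exit" entry."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m or m.group("msg") != "got signal to exit":
            continue
        # Collect the trailing fields and strip surrounding quotes from values.
        fields = {k: v.strip('"') for k, v in FIELD.findall(m.group("rest"))}
        hits.append((m.group("ts"), fields.get("signal")))
    return hits


# Two lines copied from the log in this post.
sample = [
    '[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:374] ["coordinator is stopping"]',
    '[2024/05/08 01:38:37.466 +08:00] [INFO] [main.go:284] ["got signal to exit"] [signal=hangup]',
]
print(find_exit_signals(sample))
```

To run it against a real log, pass an open file object in place of sample. If every incident shows the same signal at a similar time, the next step is probably to check what on the host could be sending it at that hour.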