PD Dashboard reports "Prometheus component not deployed in the cluster, monitoring unavailable."

translator_bot · June 22, 2024, 9:21pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd dashboard报“集群中未部署 Prometheus 组件，监控不可用。”

| username: wfxxh

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.4.2
[Reproduction Path] TiDB deployed on bare metal and in use for three years; the current version has been running for six months. This is the first time this issue has occurred.
[Encountered Problem: Symptoms and Impact] PD dashboard suddenly reports an error: Prometheus component not deployed in the cluster, monitoring unavailable.

Is there any fix other than manually setting the Prometheus address? What causes this issue, and what does the “failed to reload persist options” warning in the PD logs mean? There are many warnings of this kind.
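To see when the warnings started and whether they line up with the dashboard error, the warnings can be counted per hour. This is only a minimal sketch: the log path in the usage comment is hypothetical, and it assumes each PD log line begins with a timestamp such as "[2024/06/22 09:21:00.123 +08:00]".

```shell
# Count "failed to reload persist options" warnings per hour in a PD log.
# Assumption: every line starts with "[YYYY/MM/DD HH:MM:SS.mmm +ZZ:ZZ]".
warnings_per_hour() {
  grep -F 'failed to reload persist options' "$1" |
    awk '{ hour = substr($0, 2, 13); count[hour]++ }
         END { for (h in count) print h, count[h] }' |
    sort
}

# Usage (hypothetical path, adjust to the actual PD deploy directory):
#   warnings_per_hour /tidb-deploy/pd-2379/log/pd.log
```

Each output line is the date and hour followed by the number of warnings in that hour.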

Error Screenshot

PD Logs

Cluster Status

| username: tidb菜鸟一只

Are there any errors reported in the Prometheus logs?

| username: wfxxh

No, there aren’t any.

| username: 我是咖啡哥

Have the monitoring components been restarted?

| username: wfxxh

This morning, when I saw the anomaly in the PD dashboard, I assumed Prometheus was at fault and restarted it once. The problem persisted.

| username: wfxxh

I manually changed this, and it worked fine. The problem is that the cluster topology hasn’t changed, so why did PD suddenly stop recognizing Prometheus?

| username: wfxxh

Also, Grafana monitoring works normally, and Grafana’s data source is the same Prometheus instance that PD uses.
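One way to cross-check this from the PD host is to probe Prometheus directly. If it answers there as well, the error more likely comes from how the dashboard discovers the address than from Prometheus itself. This is a rough sketch: the host and port in the usage comment are hypothetical, and it relies on the Prometheus `/-/healthy` management endpoint.

```shell
# Build the URL of the Prometheus health endpoint.
prom_health_url() {
  printf 'http://%s:%s/-/healthy\n' "$1" "$2"
}

# Report whether Prometheus answers on the given host and port.
check_prometheus() {
  if curl -fsS --max-time 5 "$(prom_health_url "$1" "$2")" >/dev/null 2>&1; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# Usage (hypothetical address, run on the PD host):
#   check_prometheus 10.0.1.5 9090
```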

| username: Billmay表妹

You can try deploying it manually.

| username: 会飞的土拨鼠

You can check the configuration file. The PD dashboard is currently not recognizing the cluster’s Prometheus component.

| username: wfxxh

Deploy what? All monitoring services are running normally.

| username: wfxxh

The configuration file is also normal, and so is the cluster topology. Initially PD could recognize Prometheus, but after six months it suddenly stopped recognizing the address.
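One thing that may be worth checking is which Prometheus address is actually registered for the dashboard. The sketch below assumes that TiDB Dashboard discovers Prometheus through the `/topology/prometheus` key in PD's embedded etcd and that the value is JSON with `ip` and `port` fields; both are assumptions to verify for v5.4.2. The PD endpoint in the usage comment is hypothetical, as are the helper names.

```shell
# Read the registered Prometheus topology entry from PD's etcd.
# Assumption: the key is /topology/prometheus.
fetch_prom_topology() {
  ETCDCTL_API=3 etcdctl --endpoints="$1" get /topology/prometheus --print-value-only
}

# Extract "ip:port" from the JSON value without requiring jq.
# Returns non-zero if either field is missing (for example, an empty value).
parse_prom_addr() {
  prom_ip=$(printf '%s' "$1" |
    sed -n 's/.*"ip"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  prom_port=$(printf '%s' "$1" |
    sed -n 's/.*"port"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  [ -n "$prom_ip" ] && [ -n "$prom_port" ] || return 1
  printf '%s:%s\n' "$prom_ip" "$prom_port"
}

# Usage (hypothetical PD endpoint):
#   parse_prom_addr "$(fetch_prom_topology http://10.0.1.1:2379)" \
#     || echo "no Prometheus address registered in etcd"
```

If the key turns out to be empty or missing while the topology file still lists Prometheus, that would be consistent with the dashboard reporting the component as not deployed.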

| username: Min_Chen

Hello,
Can you try reloading the monitoring components? tiup cluster reload <cluster-name> -R prometheus,grafana,alertmanager
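For a production cluster it can help to preview the commands before running them. The wrapper below is only a sketch: `run`, `reload_monitoring`, and the `DRY_RUN` variable are hypothetical helpers, and `tidb-prod` stands in for the real cluster name. It first shows the cluster status, then reloads only the monitoring roles.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Show cluster status, then reload only the monitoring components.
reload_monitoring() {
  run tiup cluster display "$1" &&
    run tiup cluster reload "$1" -R prometheus,grafana,alertmanager
}

# Usage (hypothetical cluster name):
#   DRY_RUN=1; reload_monitoring tidb-prod   # preview only
#   DRY_RUN=0; reload_monitoring tidb-prod   # actually run
```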

| username: wfxxh

I just tried it, but the problem persists.