V6.5.1 Dashboard Anomalies

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: v6.5.1 dashboard 异常

| username: 是我的海

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.1
After upgrading to v6.5.1, two clusters ran into TiDB Dashboard issues: Top SQL was automatically turned off and could not be turned back on. The dashboard error screenshot is as follows:

After logging into the database, I found that the system variable tidb_enable_top_sql was OFF. Setting it to ON still did not enable Top SQL.

After the upgrade, ng.log showed Top SQL being actively turned off. The log is as follows:

[2023/04/07 15:56:22.504 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 15:57:22.504 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 15:58:22.504 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 15:58:52.517 +08:00] [WARN] [client.go:107] ["Request failed"] [kindTag=PD] [url=http://10.105.129.19:2429/pd/api/v1/members] [responseStatus="503 Service Unavailable"] [responseBody="no leader"] [error="http_client.server_error: GET http://10.105.129.19:2429/pd/api/v1/members (PD): Response status 503"] [errorVerbose="http_client.server_error: GET http://10.105.129.19:2429/pd/api/v1/members (PD): Response status 503\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Client).handleAfterResponseHook()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/client.go:81\n at github.com/go-resty/resty/v2.(*Client).execute()\n\t/go/pkg/mod/github.com/go-resty/resty/v2@v2.6.0/client.go:947\n at github.com/go-resty/resty/v2.(*Request).Execute()\n\t/go/pkg/mod/github.com/go-resty/resty/v2@v2.6.0/request.go:729\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Request).Execute()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/request.go:102\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Request).Get()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/request.go:76\n at github.com/pingcap/tidb-dashboard/util/client/pdclient.(*APIClient).GetMembers()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/pdclient/pd_api.go:43\n at github.com/pingcap/tidb-dashboard/util/topo.GetPDInstances()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/topo/pd.go:28\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).getPDComponents()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:168\n at 
github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).fetchAllScrapeTargets()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:130\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).fetchTopology()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:95\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).loadTopologyLoop()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:81\n at github.com/pingcap/ng-monitoring/utils.GoWithRecovery()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/utils/misc.go:26\n at runtime.goexit()\n\t/usr/local/go/src/runtime/asm_amd64.s:1571"]
[2023/04/07 15:58:52.517 +08:00] [ERROR] [discovery.go:83] ["load topology failed"] [error="http_client.server_error: GET http://10.105.129.19:2429/pd/api/v1/members (PD): Response status 503"] [errorVerbose="http_client.server_error: GET http://10.105.129.19:2429/pd/api/v1/members (PD): Response status 503\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Client).handleAfterResponseHook()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/client.go:81\n at github.com/go-resty/resty/v2.(*Client).execute()\n\t/go/pkg/mod/github.com/go-resty/resty/v2@v2.6.0/client.go:947\n at github.com/go-resty/resty/v2.(*Request).Execute()\n\t/go/pkg/mod/github.com/go-resty/resty/v2@v2.6.0/request.go:729\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Request).Execute()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/request.go:102\n at github.com/pingcap/tidb-dashboard/util/client/httpclient.(*Request).Get()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/httpclient/request.go:76\n at github.com/pingcap/tidb-dashboard/util/client/pdclient.(*APIClient).GetMembers()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/client/pdclient/pd_api.go:43\n at github.com/pingcap/tidb-dashboard/util/topo.GetPDInstances()\n\t/go/pkg/mod/github.com/pingcap/tidb-dashboard/util@v0.0.0-20211014081729-82f8b809f5ae/topo/pd.go:28\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).getPDComponents()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:168\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).fetchAllScrapeTargets()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:130\n at 
github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).fetchTopology()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:95\n at github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).loadTopologyLoop()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:81\n at github.com/pingcap/ng-monitoring/utils.GoWithRecovery()\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/utils/misc.go:26\n at runtime.goexit()\n\t/usr/local/go/src/runtime/asm_amd64.s:1571"] [stack="github.com/pingcap/ng-monitoring/component/topology.(*TopologyDiscoverer).loadTopologyLoop\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/component/topology/discovery.go:83\ngithub.com/pingcap/ng-monitoring/utils.GoWithRecovery\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/ng-monitoring/utils/misc.go:26"]
[2023/04/07 15:58:56.038 +08:00] [INFO] [pdvariable.go:116] ["global config watch channel closed"]
[2023/04/07 15:59:22.505 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 16:00:11.063 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:00:13.064 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"] [retried=1]
[2023/04/07 16:00:18.065 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:00:22.066 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"] [retried=2]
[2023/04/07 16:00:22.504 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 16:00:22.530 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"] [error="context canceled"]
[2023/04/07 16:00:22.530 +08:00] [INFO] [scraper.go:71] ["stop scraping Top SQL from the component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"]
[2023/04/07 16:00:52.531 +08:00] [INFO] [scraper.go:68] ["starting to scrape Top SQL from the component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",\"port\":20213,\"status_port\":20232}"]
[2023/04/07 16:00:57.241 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.128.77\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:00:59.242 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.128.77\",\"port\":20213,\"status_port\":20232}"] [retried=1]
[2023/04/07 16:01:04.242 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.128.77\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:01:08.242 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.128.77\",\"port\":20213,\"status_port\":20232}"] [retried=2]
[2023/04/07 16:01:22.505 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 16:01:48.965 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.9\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:01:50.966 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.9\",\"port\":20213,\"status_port\":20232}"] [retried=1]
[2023/04/07 16:01:55.966 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.9\",\"port\":20213,\"status_port\":20232}"] [error="context deadline exceeded"]
[2023/04/07 16:01:59.967 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tikv\",\"ip\":\"10.105.129.9\",\"port\":20213,\"status_port\":20232}"] [retried=2]
[2023/04/07 16:02:22.505 +08:00] [INFO] [pdvariable.go:110] ["load global config"] [cfg="{\"EnableTopSQL\":true}"]
[2023/04/07 16:02:22.976 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [error="context deadline exceeded"]
[2023/04/07 16:02:24.977 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [retried=1]
[2023/04/07 16:02:29.977 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [error="context deadline exceeded"]
[2023/04/07 16:02:33.978 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [retried=2]
[2023/04/07 16:02:38.979 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [error="context deadline exceeded"]
[2023/04/07 16:02:46.979 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.128.164\",\"port\":5740,\"status_port\":10132}"] [retried=3]
[2023/04/07 16:02:51.101 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.129.127\",\"port\":5740,\"status_port\":10132}"] [error="context deadline exceeded"]
[2023/04/07 16:02:53.102 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.129.127\",\"port\":5740,\"status_port\":10132}"] [retried=1]
[2023/04/07 16:02:58.103 +08:00] [WARN] [scraper.go:248] ["failed to dial scrape target"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.129.127\",\"port\":5740,\"status_port\":10132}"] [error="context deadline exceeded"]
[2023/04/07 16:03:02.103 +08:00] [WARN] [scraper.go:236] ["retry to scrape component"] [component="{\"name\":\"tidb\",\"ip\":\"10.105.129.127\",\"port\":5740,\"status_port\":10132}"] [retried=2]
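
A log like the one above is easier to read once the failing scrape targets are counted per component. Here is a minimal Python sketch of that idea. The function name failed_targets and both regular expressions are assumptions derived only from the line format shown in this post; this is not an official ng-monitoring tool.

```python
import json
import re
from collections import Counter

# Header of one log entry: [time] [LEVEL] [file:line] ["message"]
ENTRY = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] '
    r'\[(?P<src>[^\]]+)\] \["(?P<msg>[^"]*)"\]'
)
# The component field carries a JSON object with escaped quotes.
COMPONENT = re.compile(r'\[component="(?P<json>\{.*?\})"\]')


def failed_targets(lines):
    """Count "failed to dial scrape target" entries per (name, ip, status_port)."""
    counts = Counter()
    for line in lines:
        entry = ENTRY.match(line)
        if not entry or entry["msg"] != "failed to dial scrape target":
            continue
        comp = COMPONENT.search(line)
        if not comp:
            continue
        # Undo the \" escaping used inside the log field before parsing.
        info = json.loads(comp["json"].replace('\\"', '"'))
        counts[(info["name"], info["ip"], info["status_port"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        r'[2023/04/07 16:00:11.063 +08:00] [WARN] [scraper.go:248] '
        r'["failed to dial scrape target"] '
        r'[component="{\"name\":\"tikv\",\"ip\":\"10.105.129.94\",'
        r'\"port\":20213,\"status_port\":20232}"] '
        r'[error="context deadline exceeded"]',
    ]
    for target, count in failed_targets(sample).items():
        print(target, count)
```

Applied to the whole ng.log, this would show whether the dial failures are limited to a few instances or hit every TiKV and TiDB node, which is what the excerpt above suggests.
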
| username: 大鱼海棠 | Original post link

Refer to this for deployment: TiDB Dashboard FAQ (TiDB Dashboard 常见问题) | PingCAP Documentation Center

| username: 是我的海 | Original post link

I went through it and tried everything, but it still doesn’t work. The ng_port and the other settings are configured correctly. The key point is that everything worked fine in the previous version, so why did it stop working after the upgrade?

| username: 大鱼海棠 | Original post link

Is NgMonitoring installed, and is the ng process running? Check the ng log for errors. Normally it works right after deployment. It might be a problem specific to upgrading from an older version, but I’m not sure about that.

| username: caiyfc | Original post link

Take a look at this: NgMonitoring Unable to Start Issue - TiDB Technical Issues / Deployment & Operations Management - TiDB Q&A Community (asktug.com)

| username: 是我的海 | Original post link

It looks like the problem is solved. On every cluster upgraded to 6.5.1, the configuration file generated for ng-monitoring is malformed: the entries in the endpoint list are not separated by commas. Add the missing commas and restart Prometheus, and it should work again.
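
The missing-comma problem described above can be spotted with a small check like the following Python sketch. The thread does not show the generated file, so the endpoints = [...] line layout and the addresses used here are assumed purely for illustration, and find_missing_commas is a hypothetical helper name.

```python
import re

# Any double-quoted item, e.g. "10.0.0.1:2379"
QUOTED = re.compile(r'"[^"]*"')


def find_missing_commas(array_text):
    """Return pairs of neighbouring quoted items with no comma between them."""
    items = list(QUOTED.finditer(array_text))
    problems = []
    for left, right in zip(items, items[1:]):
        between = array_text[left.end():right.start()]
        if "," not in between:
            problems.append((left.group().strip('"'), right.group().strip('"')))
    return problems


if __name__ == "__main__":
    # Hypothetical line; the real file and addresses may look different.
    broken = 'endpoints = ["10.0.0.1:2379" "10.0.0.2:2379","10.0.0.3:2379"]'
    for left, right in find_missing_commas(broken):
        print(f"missing comma between {left} and {right}")
```

An empty result means every pair of neighbouring entries on that line is separated by a comma.
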

| username: nexustar | Original post link

TiUP v1.12.1 has been released and fixes this issue. After upgrading TiUP, run tiup cluster reload xxx -R prometheus (where xxx is the cluster name) to resolve it.
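
As a rough illustration of that advice, the Python sketch below compares a TiUP version string with v1.12.1 and builds the reload command only when the fix is present. The helper names are hypothetical, the version string is assumed to be plain numeric such as v1.12.1, and the code only builds the argument list without running anything.

```python
# First TiUP release that contains the fix, according to this thread.
FIXED_IN = (1, 12, 1)


def parse_version(text):
    """Turn 'v1.12.1' or '1.12.1' into the tuple (1, 12, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def reload_command_if_fixed(tiup_version, cluster_name):
    """Return the argv for reloading Prometheus, or None if TiUP is too old."""
    if parse_version(tiup_version) < FIXED_IN:
        return None
    return ["tiup", "cluster", "reload", cluster_name, "-R", "prometheus"]


if __name__ == "__main__":
    # "my-cluster" is a placeholder cluster name.
    print(reload_command_if_fixed("v1.12.0", "my-cluster"))
    print(reload_command_if_fixed("v1.12.1", "my-cluster"))
```

The returned list can be passed to subprocess.run once it has been confirmed that the command is what you want to run on the cluster.
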

| username: 是我的海 | Original post link

Okay, thank you.
