TiDB Monitoring Migration: Some Data Not Displayed

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 监控迁移部分数据不显示了

| username: xingzhenxiang

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] Migrate the monitoring component from one machine to another
[Encountered Problem: The monitoring display is abnormal; the PD that was originally the leader now shows as a follower, and many data points are not displayed]
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
Content before migration (screenshot)

Content after migration (screenshot)

| username: xfworld | Original post link

Could it be that the integration information for the previously deployed node instances has not been updated yet?


It is recommended to directly scale down the original Prometheus and then expand a new set. This way is simpler.

| username: dba-kit | Original post link

Check the instance selector at the top: are there two or more PD instances listed there? Some PD-related monitoring panels are tied to a specific PD instance, and after the switch each instance only covers part of the time range. The TiDB and TiKV panels should not be affected, though.

| username: xingzhenxiang | Original post link

This is how it was done. Some information is displayed correctly, some is missing, and some is displayed incorrectly. The operation process is as follows:

tiup cluster display tidb-test
tiup cluster show-config tidb-test
tiup cluster display tidb-test
tiup cluster scale-in tidb-test -N 10.26.109.98:9093,10.26.109.98:3000
tiup cluster display tidb-test
tiup cluster check ./single-monitor.yaml --user root -p
tiup cluster check --help
tiup cluster check --cluster tidb-test ./single-monitor.yaml --user
tiup cluster scale-out tidb-test single-monitor.yaml --user root -p

The operation shows success, but the displayed content is missing, and the PD, which was originally the leader, is displayed as a follower.
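
The single-monitor.yaml used above was not posted. For readers following along, a monitoring-only scale-out topology for TiUP usually looks roughly like the sketch below; the host IP here is a hypothetical placeholder, not the actual value from this cluster:

monitoring_servers:
  - host: 10.26.109.100   # new monitoring host (hypothetical IP, replace with the real one)
    port: 9090
grafana_servers:
  - host: 10.26.109.100
    port: 3000
alertmanager_servers:
  - host: 10.26.109.100
    web_port: 9093
    cluster_port: 9094

Note that scaling in the old Prometheus removes its data directory, so historical metrics are lost unless the Prometheus data is copied to the new host before the scale-in.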

| username: dba-kit | Original post link

Uh, you directly scaled in Prometheus, so the previous data must have been deleted…

| username: xingzhenxiang | Original post link

I tried everything; even when I selected the option corresponding to the leader PD's port, it still showed as follower.

| username: 裤衩儿飞上天 | Original post link

How did you migrate the monitoring components? Did you migrate the Prometheus data?

| username: xingzhenxiang | Original post link

Yes, as shown in the initial diagram, some information is not displayed, and the PD leader is shown as a follower.

| username: xingzhenxiang | Original post link

No data migration; I just scaled in and then scaled out. See the commands above for the process. The historical data was not backed up.

| username: dba-kit | Original post link

Use pd-ctl to check who the current PD leader is:

tiup ctl:v6.5.0 pd member

| username: xingzhenxiang | Original post link

Starting component ctl: /home/tidb/.tiup/components/ctl/v6.5.0/ctl pd --pd 10.26.109.99:2379 member
{
  "header": {
    "cluster_id": 7204711972043972961
  },
  "members": [
    {
      "name": "pd-10.26.109.99-2379",
      "member_id": 3748938598193133207,
      "peer_urls": [
        "http://10.26.109.99:2380"
      ],
      "client_urls": [
        "http://10.26.109.99:2379"
      ],
      "deploy_path": "/export/tidb-deploy/pd-2379/bin",
      "binary_version": "v6.5.0",
      "git_hash": "d1a4433c3126c77fb2d5bb5720eefa0f2e05c166"
    },
    {
      "name": "pd-10.26.109.98-2379",
      "member_id": 7744517254921949623,
      "peer_urls": [
        "http://10.26.109.98:2380"
      ],
      "client_urls": [
        "http://10.26.109.98:2379"
      ],
      "deploy_path": "/export/tidb-deploy/pd-2379/bin",
      "binary_version": "v6.5.0",
      "git_hash": "d1a4433c3126c77fb2d5bb5720eefa0f2e05c166"
    },
    {
      "name": "pd-10.26.109.97-2379",
      "member_id": 12410221973370163989,
      "peer_urls": [
        "http://10.26.109.97:2380"
      ],
      "client_urls": [
        "http://10.26.109.97:2379"
      ],
      "deploy_path": "/export/tidb-deploy/pd-2379/bin",
      "binary_version": "v6.5.0",
      "git_hash": "d1a4433c3126c77fb2d5bb5720eefa0f2e05c166"
    }
  ],
  "leader": {
    "name": "pd-10.26.109.99-2379",
    "member_id": 3748938598193133207,
    "peer_urls": [
      "http://10.26.109.99:2380"
    ],
    "client_urls": [
      "http://10.26.109.99:2379"
    ],
    "deploy_path": "/export/tidb-deploy/pd-2379/bin",
    "binary_version": "v6.5.0",
    "git_hash": "d1a4433c3126c77fb2d5bb5720eefa0f2e05c166"
  },
  "etcd_leader": {
    "name": "pd-10.26.109.99-2379",
    "member_id": 3748938598193133207,
    "peer_urls": [
      "http://10.26.109.99:2380"
    ],
    "client_urls": [
      "http://10.26.109.99:2379"
    ],
    "deploy_path": "/export/tidb-deploy/pd-2379/bin",
    "binary_version": "v6.5.0",
    "git_hash": "d1a4433c3126c77fb2d5bb5720eefa0f2e05c166"
  }
}

| username: xingzhenxiang | Original post link

The leader shown here matches what tiup cluster display shows.

| username: ffeenn | Original post link

Please also post a screenshot of the display. 1. Did you restart the cluster after the change? 2. All of the old Prometheus monitoring data has been deleted; check the UI monitoring to see whether new data is being generated.

| username: magic | Original post link

Has the monitoring data been cleared? Check Prometheus and try refreshing Grafana's data source.
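
To confirm whether Prometheus itself is scraping and returning data (independently of Grafana), you can also query its HTTP API directly. The host below is a placeholder for the new monitoring server:

# list scrape targets and their health
curl -s http://<prometheus-host>:9090/api/v1/targets

# run a simple instant query; every healthy target should report up == 1
curl -s 'http://<prometheus-host>:9090/api/v1/query?query=up'

If the targets look healthy but Grafana still shows gaps, re-check the Prometheus data source URL in Grafana and use its Save & Test button.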

| username: xingzhenxiang | Original post link

Still not working.

| username: Soysauce520 | Original post link

Try deleting the configuration file under Prometheus and restarting it?
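
If the goal is just to get Prometheus back to a known-good configuration, TiUP can regenerate and push it instead of hand-editing files on the host. A minimal sketch, using the cluster name from this thread; the -R/--role filter limits the reload to Prometheus:

tiup cluster reload tidb-test -R prometheus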

| username: xingzhenxiang | Original post link

The issue has been resolved. The server hosting the new monitoring component was about 5 minutes behind the correct time. After synchronizing the time, everything is displaying correctly. Thank you, everyone.
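
For anyone who hits the same symptom: if the clock on the Prometheus/Grafana host lags behind, recent samples are timestamped in the past, so dashboards querying the latest time range look empty or inconsistent. A quick way to check and correct the skew; the exact tools depend on the OS, and these commands assume a systemd-based Linux with chrony or ntpdate installed:

# check whether the system clock is synchronized
timedatectl status

# with chrony: show the current offset and apply an immediate correction
chronyc tracking
chronyc makestep

# or do a one-off sync against an NTP server (pool.ntp.org is just an example)
ntpdate -u pool.ntp.org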

| username: ffeenn | Original post link

This shows the importance of using a time (NTP) server.

| username: Soysauce520 | Original post link

It is also related to the time on the host where the browser runs; that needs to be consistent as well.

| username: liuis | Original post link

Does it have to do with the browser’s time? I don’t think so.