Why does the blackbox component keep running after tiup stops the cluster, when all other components shut down successfully?

translator_bot · June 20, 2024, 1:22pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 使用tiup关闭集群，其它组件关闭完成，blackbox未关闭，是何原因，请大佬指教

| username: hanyj

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] 6.1.4
[Reproduction Path] tiup cluster stop ${cluster-name}
[Encountered Problem: Problem Phenomenon and Impact] When stopping the cluster with tiup, the other components shut down successfully but blackbox did not; I later had to stop blackbox manually as a separate step.
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]
Error: failed to stop: 10.10.0.1 node_exporter-9100.service, please check the instance's log() for more detail.: timed out waiting for port 9100 to be stopped after 2m0s

2024-06-12T16:57:17.026+0800 INFO Execute command finished {"code": 1, "error": "failed to stop: 10.10.0.1 node_exporter-9100.service, please check the instance's log() for more detail.: timed out waiting for port 9100 to be stopped after 2m0s", "errorVerbose": "timed out waiting for port 9100 to be stopped after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\n\tgithub.com/pingcap/tiup/pkg/cluster/module/wait_for.go:91\ngithub.com/pingcap/tiup/pkg/cluster/spec.PortStopped\n\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:130\ngithub.com/pingcap/tiup/pkg/cluster/operation.systemctlMonitor.func1\n\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:338\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20220819030929-7fc1605a5dde/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1594\nfailed to stop: 10.10.0.1 node_exporter-9100.service, please check the instance's log() for more detail."}
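
The stack trace above passes through module.(*WaitFor).Execute and spec.PortStopped, which suggests that tiup stops the systemd unit and then waits for the port to close, giving up after 2m0s. The following is only a minimal shell sketch of that kind of wait loop, not tiup's actual implementation. The helper name wait_port_stopped and the CHECK_CMD callback are hypothetical.

```shell
#!/bin/sh
# wait_port_stopped PORT TIMEOUT_SECONDS CHECK_CMD
# Hypothetical helper (not part of tiup): polls once per second until
# "CHECK_CMD PORT" fails, meaning nothing listens on PORT any more,
# or gives up once TIMEOUT_SECONDS have elapsed.
wait_port_stopped() {
    port=$1
    timeout=$2
    check=$3
    elapsed=0
    while "$check" "$port"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out waiting for port $port to be stopped after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# CHECK_CMD is any command that exits 0 while the port is still listening.
# Usage mirroring the 2m0s limit in the error (my_check is a placeholder):
#   wait_port_stopped 9100 120 my_check
```

If a loop like this times out, the service was asked to stop but something kept listening on port 9100 for the whole waiting period.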

| username: lemonade010

"please check the instance's log() for more detail." Check the blackbox log.

| username: zhaokede

That looks like an error; check the logs for the details.

| username: hanyj

The blackbox_exporter.log only contains the logs from the time it was started.

| username: jiayou64

Is it deployed on a single machine? Try running tiup cluster display <cluster-name> to check.

| username: hanyj

No, it is not a single-machine deployment:
Role          Host       Ports        OS/Arch       Status
alertmanager  10.10.0.1  9093/9094    linux/x86_64  Down
grafana       10.10.0.1  3000         linux/x86_64  Down
pd            10.10.0.2  2379/2380    linux/x86_64  Down
pd            10.10.0.1  2379/2380    linux/x86_64  Down
pd            10.10.0.3  2379/2380    linux/x86_64  Down
prometheus    10.10.0.1  9091/12020   linux/x86_64  Down
tidb          10.10.0.2  13306/10080  linux/x86_64  Down
tidb          10.10.0.1  13306/10080  linux/x86_64  Down
tidb          10.10.0.3  13306/10080  linux/x86_64  Down
tikv          10.10.0.2  20160/2010   linux/x86_64  N/A
tikv          10.10.0.1  20160/2010   linux/x86_64  N/A
tikv          10.10.0.3  20160/2010   linux/x86_64  N/A

| username: jiayou64

The monitored section of the topology configures the monitoring services, i.e., blackbox exporter and node exporter. One node exporter and one blackbox exporter are deployed on every machine.
Log in to 10.10.0.1 and check whether port 9100 is still listening:
ss -ntl | grep 9100
To configure the monitoring node separately, refer to the official topology configuration:
Minimal Topology Architecture | PingCAP Documentation Center
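
The ss check above can be made a little stricter, since a plain grep for 9100 also matches ports such as 19100. A minimal sketch, assuming the usual `ss -ntl` column layout (state in column 1, local address and port in column 4); the helper name has_listener is hypothetical, and the service name in the usage comment is taken from the error message in this thread.

```shell
#!/bin/sh
# has_listener PORT: reads `ss -ntl` output on stdin and exits 0 only if
# some socket is in LISTEN state on exactly PORT. Hypothetical helper.
has_listener() {
    awk -v port="$1" '
        # Column 4 is Local Address:Port, e.g. 0.0.0.0:9100 or [::]:9100.
        $1 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on the affected host (10.10.0.1 in this thread):
#   if ss -ntl | has_listener 9100; then
#       ss -ntlp | grep ':9100 '    # as root, -p shows the owning process
#       systemctl status node_exporter-9100.service
#   fi
```

If a listener is still present after the stop command, the process shown by `ss -ntlp` tells you whether it is the node exporter that tiup manages or something else holding the port.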