Where can I find the documentation for the DM monitoring metrics defined in the Prometheus alert rule file?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Prometheus定义的告警rule文件,定义的DM监控指标是在哪有说明?

| username: 超7成网友

As the title suggests, what does the expression [dm_relay_space] mean and where is it defined? Where is this data maintained?

rules:
  - alert: DM_remain_storage_of_relay_log
    expr: dm_relay_space{type="available"} < 10 * 1024 * 1024 * 1024
    labels:
      env: dm-cluster-prod
      level: critical
      expr: dm_relay_space{type="available"} < 10 * 1024 * 1024 * 1024
    annotations:
      description: 'cluster: dm-cluster-prod, instance: {{ $labels.instance }}, values: {{ $value }}'
      value: '{{ $value }}'
      summary: DM remain storage of relay log

| username: CuteRay | Original post link

Refer to this document and compare it with Grafana monitoring.

| username: buchuitoudegou | Original post link

The data is stored in Prometheus, which periodically pulls (scrapes) it from DM.
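To see which metric names a scrape actually returns, one option is to list the names in the scraped text. Below is a minimal Go sketch, assuming the DM processes expose their metrics in the standard Prometheus text exposition format. The function name `metricNames` and the sample values are made up for illustration and are not taken from DM.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// metricNames returns the sorted, de-duplicated metric names found in text
// written in the Prometheus text exposition format.
func metricNames(exposition string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The name ends at the first '{' (start of labels) or space (before the value).
		end := strings.IndexAny(line, "{ ")
		if end <= 0 {
			continue // not a well-formed sample line
		}
		seen[line[:end]] = true
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Hypothetical sample of scraped text; the numbers are invented.
	sample := `# HELP dm_relay_space the space of storage for relay component
# TYPE dm_relay_space gauge
dm_relay_space{type="available"} 5.36870912e+10
dm_relay_space{type="capacity"} 1.073741824e+11
`
	for _, n := range metricNames(sample) {
		fmt.Println(n)
	}
}
```

Every name that appears this way can be used in an alert expression in the same way as dm_relay_space.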

| username: 超7成网友 | Original post link

I don’t understand where names like [dm_relay_space] in the configuration are defined. What other names like this are there?

| username: buchuitoudegou | Original post link

Are you referring to this code?

relayLogSpaceGauge = metricsproxy.NewGaugeVec(
	&promutil.PromFactory{},
	prometheus.GaugeOpts{
		Namespace: "dm",
		Subsystem: "relay",
		Name:      "space",
		Help:      "the space of storage for relay component",
	}, []string{"type"}) // type can be 'capacity' and 'available'.

The “dm” and “relay” in front are the namespace and subsystem prefixes. Joined with the name “space” by underscores, they give the full metric name dm_relay_space. That is how this metric is defined.
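The naming rule can be sketched in a few lines of Go. This is a standard-library-only illustration of the convention, not DM's code. The helper `fullMetricName` is hypothetical; in the Prometheus Go client the equivalent job is done by `prometheus.BuildFQName`.

```go
package main

import (
	"fmt"
	"strings"
)

// fullMetricName mirrors the naming convention of the Prometheus Go client:
// the non-empty namespace, subsystem, and name parts are joined with
// underscores. An empty name yields an empty result.
func fullMetricName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Values taken from the GaugeOpts in the snippet above.
	fmt.Println(fullMetricName("dm", "relay", "space")) // dm_relay_space
}
```

Following the same rule, searching the DM source for Namespace: "dm" should turn up the other metrics, each named dm_<subsystem>_<name>.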

| username: 超7成网友 | Original post link

What code is this? Is it for DM?

| username: buchuitoudegou | Original post link

Correct.

| username: buchuitoudegou | Original post link

When using it, you shouldn’t need to consider where it is defined, right? Are you going to change the Grafana dashboard configuration or something like that?

| username: 超7成网友 | Original post link

I want to modify the alert configuration. The alert condition is determined by [expr: dm_relay_space{type="available"} < 10 * 1024 * 1024 * 1024]. If I don’t know what dm_relay_space means and what other metrics like it exist, I can’t make changes.
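For reference, the arithmetic behind that threshold can be sketched in Go. This assumes the gauge reports bytes, which the 10 * 1024 * 1024 * 1024 on the right-hand side suggests. The helper names `thresholdBytes` and `shouldAlert` are made up for illustration.

```go
package main

import "fmt"

// gib is the number of bytes in one GiB.
const gib = 1024 * 1024 * 1024

// thresholdBytes converts a limit in GiB to bytes, the form used on the
// right-hand side of the alert expression.
func thresholdBytes(limitGiB uint64) uint64 {
	return limitGiB * gib
}

// shouldAlert reports whether the available space is below the limit,
// i.e. the same comparison as dm_relay_space{type="available"} < limit.
func shouldAlert(availableBytes, limitGiB uint64) bool {
	return availableBytes < thresholdBytes(limitGiB)
}

func main() {
	fmt.Println(thresholdBytes(10))     // 10737418240
	fmt.Println(shouldAlert(8*gib, 10)) // true: 8 GiB free is below a 10 GiB limit
	fmt.Println(shouldAlert(8*gib, 5))  // false: 8 GiB free is above a 5 GiB limit
}
```

Under that assumption, changing the first factor in the expression (for example 5 * 1024 * 1024 * 1024) moves the alert threshold to 5 GiB.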

| username: buchuitoudegou | Original post link

Understood, it seems there is indeed no relevant documentation. We will consider adding it in the future. :cry:
