Is there logging when Prometheus scrapes monitoring data?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: prometheus 抓取监控数据的时候 有日志记录吗?

| username: 逍遥_猫

Version v6.5.1
On the cluster dashboard, the memory and IO metrics for some machines look abnormal. How can I view the logs of Prometheus's scrapes?

| username: 逍遥_猫 | Original post link

I checked the logs under tidb-deploy/prometheus-9091/log, specifically prometheus.log. The latest entries are from 7 AM this morning, and nothing has been logged since then. The log level in the configuration file is set to info.

| username: 芮芮是产品 | Original post link

Wouldn’t that be bad?

| username: 逍遥_猫 | Original post link

The monitoring data for the cluster's PD, TiDB, and TiKV has always been retrieved via HTTP GET requests.

| username: Inkjade | Original post link

The logs can be found in tidb-deploy/prometheus-9091/log.

  1. Check whether the node's status in your Prometheus is UP.
  2. If it is not, check the agent status (network status).
  3. Restart the Prometheus service and see whether it returns to normal.

| username: 有猫万事足 | Original post link

Visit this address

http://{prometheus_ip}:9091/targets

The page shows when each target was last scraped and how long that scrape took.
Some targets are probably no longer responding properly.
You can try opening the endpoint URL at the start of each row to see whether it returns monitoring values normally.
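
The same check can be scripted against the Prometheus HTTP API, where `/api/v1/targets` is the JSON counterpart of the targets page. The following is a minimal sketch, not a definitive tool: the function names `fetch_targets` and `unhealthy_targets` are made up for illustration, and the base URL you pass in is an assumption about your deployment.

```python
import json
import urllib.request


def fetch_targets(base_url):
    """Fetch the targets document from the Prometheus HTTP API.

    base_url is an assumption, for example "http://{prometheus_ip}:9091".
    """
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return one tuple per active target whose health is not "up".

    Each tuple is (scrape URL, health, last scrape time,
    last scrape duration in seconds, last error message).
    """
    rows = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            rows.append((
                target.get("scrapeUrl"),
                target.get("health"),
                target.get("lastScrape"),
                target.get("lastScrapeDuration"),
                target.get("lastError"),
            ))
    return rows
```

Calling `unhealthy_targets(fetch_targets(...))` with your own Prometheus address lists only the targets that need attention, and the last error message usually says why the scrape failed (a timeout or a refused connection, for instance).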

| username: dba远航 | Original post link

Check the network connectivity between this server and the other servers. Could it be a problem with the log collection itself?

| username: 逍遥_猫 | Original post link

  1. The node that Prometheus failed to scrape is healthy and its services have not restarted. Memory usage is up to 80%, the CPU has headroom, and there are no IO anomalies.
  2. I am waiting to change a parameter and then try a restart.

| username: 逍遥_猫 | Original post link

Prometheus is successfully scraping monitoring data from the other machines. This machine communicates normally with other clusters, shows no anomalies or alerts, and its IO, CPU, and memory usage is not extreme. However, the SQL running on this machine is taking an exceptionally long time.

| username: 逍遥_猫 | Original post link

This shows only the most recent connection record. How can I see the history?

| username: 像风一样的男子 | Original post link

Prometheus itself has a time-series database, and the data will be stored in the database.
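
Scrape history can be read back from that database, because Prometheus records an `up` series for every target (1 when the scrape succeeded, 0 when it failed) alongside `scrape_duration_seconds`. Below is a minimal sketch, assuming the standard `/api/v1/query_range` endpoint is reachable; the helper names `query_range` and `failed_scrape_windows` are hypothetical, and the base URL depends on your deployment.

```python
import json
import urllib.parse
import urllib.request


def query_range(base_url, expr, start, end, step):
    """Run a range query; start and end are Unix timestamps in seconds."""
    params = urllib.parse.urlencode(
        {"query": expr, "start": start, "end": end, "step": step}
    )
    url = f"{base_url}/api/v1/query_range?{params}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def failed_scrape_windows(payload):
    """Map each instance to a list of (first_ts, last_ts) windows where up == 0."""
    windows = {}
    for series in payload["data"]["result"]:
        instance = series["metric"].get("instance", "unknown")
        current = None  # window of consecutive failed scrapes being built
        found = []
        for ts, value in series["values"]:
            if float(value) == 0.0:
                current = (current[0], ts) if current else (ts, ts)
            elif current:
                found.append(current)
                current = None
        if current:
            found.append(current)
        if found:
            windows[instance] = found
    return windows
```

For example, passing the result of `query_range(base_url, "up", start, end, 15)` to `failed_scrape_windows` gives the periods during which each target could not be scraped. This shows when scrapes failed, not the connection-level details of each attempt.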

| username: 像风一样的男子 | Original post link

The data is stored in the TSDB directory:
/data/tidb-data/prometheus-8249

| username: 逍遥_猫 | Original post link

That is the collected monitoring data.
What I want to see is the connection log, the kind that records each handshake.

| username: 逍遥_猫 | Original post link

Why is TiDB’s memory usage 0 here?

| username: 逍遥_猫 | Original post link

Something like the log under /data/tidb/tidb-deploy/prometheus-9091/log.

But that log does not record the IP of the monitored machine.

| username: 有猫万事足 | Original post link

Past issues are indeed difficult to analyze.
If there is no data for a period, it means nothing was collected then.
The scrape of that target must have failed at that moment.

| username: 像风一样的男子 | Original post link

Prometheus does not keep records that detailed; the overhead would be too high.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.