Dear experts, I have a few questions about monitoring

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 各位大佬,问几个监控上的问题

| username: 气死人的萌新

There are too many monitoring metrics, and it’s a bit confusing. I have referred to the official documentation for monitoring metrics.

Here are a few commonly used indicators. Could the experts check whether I am looking at the right monitoring metrics?
For example, to check CPU usage, I just look at:
image
To check the memory utilization of the application server, should I look at TiDB’s Memory Usage? I’m not quite sure which metric to look at here.
image

For file system IO, I should look at the TiKV panel, but I’m not sure which part. Is it the Threads section?
Which metric should I look at for the network? Is it Network Traffic in the overview?
image

| username: 气死人的萌新

TPS and QPS can be directly viewed on the dashboard.
image

IOPS can be directly viewed on the Overview.
image

Is that correct?

| username: Miracle

The Overview dashboard gives the overall picture by selecting some of the more important metrics from the TiDB, PD, and TiKV panels. To check CPU or memory usage for a specific component, go to the corresponding TiDB, PD, or TiKV panel. For example, the CPU usage of TiDB is on the TiDB panel. Metrics like TPS and QPS can also be found there.
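
If you would rather pull these numbers straight from Prometheus than click through the Grafana panels, something like the following might serve as a starting point. It is only a sketch: the Prometheus address is whatever your deployment uses, and the metric and label names (`process_cpu_seconds_total`, `process_resident_memory_bytes`, `tidb_server_query_total`, `job="tidb"`) are assumptions that should be checked against your own Prometheus before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed PromQL expressions; verify metric and label names in your own cluster.
QUERIES = {
    "tidb_cpu_cores": 'rate(process_cpu_seconds_total{job="tidb"}[1m])',
    "tidb_memory_bytes": 'process_resident_memory_bytes{job="tidb"}',
    "tidb_qps": "sum(rate(tidb_server_query_total[1m]))",
}


def build_query_url(base_url, promql):
    """Build an instant-query URL for Prometheus' /api/v1/query endpoint."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_instant_vector(payload):
    """Map each series' 'instance' label to its value from a query response."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % (payload,))
    values = {}
    for series in payload["data"]["result"]:
        instance = series["metric"].get("instance", "")
        # Each instant-vector sample is [timestamp, "value-as-string"].
        values[instance] = float(series["value"][1])
    return values


def fetch(base_url, promql, timeout=5.0):
    """Run one instant query against a live Prometheus and return its values."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))
```

For example, `fetch(prometheus_url, QUERIES["tidb_qps"])` would return one value per series, where `prometheus_url` is the address of the Prometheus instance in your monitoring deployment.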

| username: 气死人的萌新

Boss, may I ask which monitoring metrics you usually look at when reviewing these modules?

| username: Miracle

These items look like node-level (host) metrics, don't they?
You can see the nodes' CPU, memory, and IO usage on the Overview panel. Swap seems to be off by default, right?
As far as I know, node monitoring is limited to what the Overview shows, though I may have missed something…
In our environment we have dedicated monitoring for the nodes, and I usually check there. In the TiDB monitoring itself, I only focus on the individual components…

| username: Fly-bird

The dashboard shows memory, CPU, and IO. You don't need to worry about swap for a database system; if swap is actually being used, your database is basically unusable. For the network, just look at QPS. If you really want to see network traffic, check the Overview dashboard in Grafana.

| username: 气死人的萌新

Thank you, master. I remember that I turned off the swap when configuring the database.

| username: 气死人的萌新

Thank you, master. I think I found it. I’ll check again during the stress test later.

| username: Fly-bird

Databases should not use swap, because it is disk-backed virtual memory. No matter how fast the disk is, it cannot compare to actual memory. I don't think any database would want to use it.
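
To double-check that swap really is off on a node, one option is to read `/proc/meminfo` on that host. The snippet below is a rough sketch that assumes a Linux host; the helper names are made up for illustration, and the `SwapTotal` and `SwapFree` fields are the ones the Linux kernel reports in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value" pairs
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def swap_status(info):
    """Report whether any swap is configured and how much of it is in use (kB)."""
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {"enabled": total > 0, "used_kb": total - free}


def read_swap_status(path="/proc/meminfo"):
    """Read the given meminfo file (Linux only) and summarize swap usage."""
    with open(path) as f:
        return swap_status(parse_meminfo(f.read()))
```

Calling `read_swap_status()` on each node should report `enabled` as `False` when swap has been turned off; a non-zero `used_kb` would mean the host is actually swapping.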
