Understanding 999/99/9 in Grafana

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Grafana中的关于999/99/9的理解

| username: 我是咖啡哥

Grafana has many monitoring panels that show latency at different percentiles, and I find them a bit confusing. In the figure, 999 reads 101ms, which I take to mean 99.9% of the time, and 99 reads 7.67ms. Does the 99 mean 99% excluding what is already covered by the 99.9%, or something else?

| username: jaybing926 | Original post link

This panel shows the time TiDB spends waiting to get a TSO from PD.
The values mean the following:
90% of PD TSO waits take 3.81ms or less
99% of PD TSO waits take 7.67ms or less
99.9% of PD TSO waits take 101ms or less
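
To make the reading above concrete, here is a minimal sketch of how a percentile is derived from raw latencies. It assumes the simple nearest-rank definition; the sample data and the `percentile` helper are made up for illustration and are not taken from the panel. Grafana panels backed by Prometheus usually estimate percentiles from histogram buckets instead, so real values are approximations. Each percentile is a cumulative threshold, so the 99% figure also covers the requests counted in the 90% figure.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that
    at least p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(math.ceil(p * len(ordered) / 100), 1)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds for 10,000 requests (made-up data).
latencies_ms = [2.0] * 9200 + [6.0] * 750 + [50.0] * 45 + [2000.0] * 5

p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)
p999 = percentile(latencies_ms, 99.9)

# Share of requests at or below the p99 threshold: at least 99%,
# and it includes every request that is also at or below p90.
share_within_p99 = sum(1 for x in latencies_ms if x <= p99) / len(latencies_ms)

print(f"p90={p90}ms p99={p99}ms p99.9={p999}ms")
print(f"share of requests <= p99: {share_within_p99:.2%}")
```

In this sketch a handful of 2-second requests do not move the 90% or 99% values at all; they only show up above the 99.9% threshold.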

| username: Minorli-PingCAP | Original post link


| username: 我是咖啡哥 | Original post link

Now I understand. 90% of requests finish within 3.81ms. The next 9% have noticeably higher latency, which pushes the 99th percentile from 3.81ms up to 7.67ms. A very small portion is extremely slow, possibly up to seconds, which pushes the 99.9th percentile to 101ms. If the system is relatively stable, these three values should be quite close.
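
The point about stability can be sketched with two made-up workloads: one where latencies are tightly clustered and one with a slow tail. The data, the `percentile` helper (nearest-rank), and the `summarize` function are all hypothetical, chosen only to illustrate the idea. The mean is included to show that a percentile is a threshold, not an average.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(math.ceil(p * len(ordered) / 100), 1)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples):
    """Return the mean and the three percentiles shown on the panel."""
    return {
        "mean": mean(samples),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
        "p999": percentile(samples, 99.9),
    }


# Hypothetical workloads of 10,000 requests each, in milliseconds.
stable = [3.0] * 9200 + [3.5] * 750 + [4.0] * 50
long_tail = [3.0] * 9200 + [7.0] * 750 + [100.0] * 45 + [2000.0] * 5

stable_stats = summarize(stable)
tail_stats = summarize(long_tail)

for name, stats in (("stable", stable_stats), ("long_tail", tail_stats)):
    print(name, stats)
```

In the stable workload the three percentiles sit within 1ms of each other. In the long-tail workload they spread far apart, while the mean lands between the 90% and 99% values and matches none of them.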
