Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 事务两阶段提交中第一阶段(prewrite 阶段)的耗时 (Time spent in the first phase, prewrite, of a transaction’s two-phase commit)
The first phase (prewrite) of the transaction’s two-phase commit is relatively slow, and most of the time is spent in the following stage:
I have a picture below:
This image shows the storage async write duration, which includes the time spent in both the raft store pool and the apply pool stages. According to the chart, the maximum duration is only 2.5ms, nowhere near the 1.5s I am seeing, so I’m a bit confused.
There have been many slow queries recently, but they do not show up in this graph.
The monitoring graphs for the scheduler latch wait duration and storage async snapshot duration (lease read) are shown above, and they don’t appear to be slow. I don’t even know how to investigate this further.
The above is the execution time breakdown from the execution plan. It shows that most of the time is spent in tikv_wall_time: 1.48s. Which monitoring graph should I look at for this part?
Isn’t it showing that the delay is in commit_log?
Isn’t the commit log phase already included in the storage async write phase? The highest storage async write duration for that entire phase is 2.50ms.
This is a bit much, I’ll take my time to understand it, thank you.
This graph shows the 99th percentile (P99) line. You can change the expression and take a look.
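To illustrate why a percentile line can look healthy while individual requests are slow, here is a minimal sketch. The latency values are hypothetical, and the nearest-rank quantile used here is a simplification: Prometheus's `histogram_quantile` interpolates inside histogram buckets, so real panel values will differ somewhat.

```python
import math

def quantile(samples, q):
    """Nearest-rank quantile: the smallest sample with at least a fraction q
    of all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latencies in seconds: 995 fast writes and 5 slow ones.
latencies = [0.002] * 995 + [1.48] * 5

p99 = quantile(latencies, 0.99)     # the slow writes are under 1% of samples
p999 = quantile(latencies, 0.999)   # a higher quantile starts to expose them
worst = max(latencies)

print(f"P99={p99 * 1000:.1f}ms  P99.9={p999 * 1000:.1f}ms  max={worst * 1000:.1f}ms")
```

In this made-up data set the P99 stays at the fast value because the slow writes make up only 0.5% of the samples. Raising the quantile in the panel expression (for example from 0.99 to 0.999) is one way to check whether the same thing is happening on the real graph.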
Grafana monitoring: TiKV details → raft propose → propose wait duration per server
Grafana monitoring: TiKV details → raft IO → append log duration
Grafana monitoring: TiKV details → raft IO → commit log duration
Grafana monitoring: TiKV details → raft propose → apply wait duration
Grafana monitoring: TiKV details → raft IO → apply log duration
There are monitoring items for each corresponding stage in Grafana. Let’s check the commit log first.
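If you would rather query Prometheus directly than click through each panel, the stages listed above could be checked with expressions like the ones this sketch builds. The metric names are assumptions recalled from the TiKV dashboards and may differ between versions, so confirm them against the panel definitions in your own Grafana. The 0.999 quantile and the 1m window are arbitrary illustrative choices.

```python
# Assumed metric names: verify each one against your dashboard's panel query.
STAGE_METRICS = {
    "propose wait": "tikv_raftstore_request_wait_time_duration_secs_bucket",
    "append log": "tikv_raftstore_append_log_duration_seconds_bucket",
    "commit log": "tikv_raftstore_commit_log_duration_seconds_bucket",
    "apply wait": "tikv_raftstore_apply_wait_time_duration_secs_bucket",
    "apply log": "tikv_raftstore_apply_log_duration_seconds_bucket",
}

def stage_query(metric, quantile=0.999, window="1m"):
    """Build a per-instance histogram_quantile expression for one stage."""
    return (
        f"histogram_quantile({quantile}, "
        f"sum(rate({metric}[{window}])) by (le, instance))"
    )

# Print one PromQL expression per stage, in write-path order.
for stage, metric in STAGE_METRICS.items():
    print(f"{stage}: {stage_query(metric)}")
```

Running the printed expressions over the time range of the slow queries should show which stage accounts for most of the 1.48s.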
With inserts this slow, it is most likely a hardware issue, an I/O bottleneck, or too few TiKV nodes.
The main tasks of the prewrite phase are version checking and conflict detection.
If it takes a long time, you can focus on checking whether there are too many data versions or if there are many transaction conflicts.
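As a rough illustration of those two checks, here is a simplified Percolator-style sketch. `KeyState` and `prewrite` are hypothetical names for this example and do not reflect TiKV's actual storage layout or code. The linear scan over versions is only a stand-in for the extra work that many data versions can cause.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class KeyState:
    """Simplified per-key MVCC state (hypothetical model)."""
    commit_versions: List[int] = field(default_factory=list)  # commit_ts values
    lock_start_ts: Optional[int] = None  # start_ts of the lock holder, if any

def prewrite(state: KeyState, start_ts: int) -> str:
    # Conflict detection: another transaction already holds a lock on this key.
    if state.lock_start_ts is not None and state.lock_start_ts != start_ts:
        return "key is locked"
    # Version check: a write committed at or after our start_ts is a conflict.
    # More versions to look through means more work per prewrite.
    for commit_ts in state.commit_versions:
        if commit_ts >= start_ts:
            return "write conflict"
    # No conflict found: place our lock so that commit can proceed later.
    state.lock_start_ts = start_ts
    return "ok"

key = KeyState(commit_versions=[5])
print(prewrite(key, 10))  # first transaction takes the lock
print(prewrite(key, 11))  # second transaction finds the key locked
```

In this model, a long prewrite comes either from having many versions to look through or from repeatedly hitting locks and conflicts, which are the two things suggested above to check.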
Go to Grafana, find the TiKV-Trouble-Shooting panel → Pending compaction bytes graph.
I want to see the content of the Pending compaction bytes graph at 10:20.