The Time Consumption of the First Phase (Prewrite Phase) in Two-Phase Commit Transactions

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 事务两阶段提交中第一阶段(prewrite 阶段)的耗时

| username: yulei7633

The first phase (prewrite) of the two-phase commit in my transactions is relatively slow. Most of the time is spent in the following stage:


I have a picture below:

| username: yulei7633

This graph shows the storage async write duration, which covers the time spent in both the raftstore pool and the apply pool. According to the chart, the maximum duration is only 2.5ms, far below 1.5s. I’m a bit confused.

| username: yulei7633

There have been many slow queries recently, but they do not show up in the graph.

| username: yulei7633

The graphs above show the scheduler latch wait duration and the storage async snapshot duration (lease read), and neither looks slow. I don’t know how to investigate this further.

| username: yulei7633

Above is the detailed execution information from the execution plan. Most of the time is spent in tikv_wall_time: 1.48s. Which monitoring graph covers this part?

| username: tidb菜鸟一只

Doesn’t that show the delay is in commit_log?

| username: yulei7633

Isn’t the commit log stage already part of the async write phase? The storage async write duration for that entire phase peaks at 2.50ms.

| username: TiDBer_小阿飞

| username: yulei7633

That’s a lot to take in. I’ll take my time to work through it, thank you.

| username: 小龙虾爱大龙虾

That graph shows the 99th-percentile (P99) line. You can change the panel’s expression and take another look.
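
Reading this as the 99th-percentile series, a percentile line can stay flat while a handful of requests are very slow. A minimal sketch with made-up latencies (the sample values and the `percentile` helper are illustrative assumptions, not taken from the cluster in this thread):

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: the value at rank ceil(q% * n) in sorted order."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical workload: 995 fast writes and 5 slow ones (values in milliseconds).
latencies_ms = [2.5] * 995 + [1480.0] * 5

p99 = percentile(latencies_ms, 99)
worst = max(latencies_ms)

# The P99 value stays at the fast latency; only the maximum exposes the outliers.
print(f"p99 = {p99} ms, max = {worst} ms")
```

In Grafana the percentile is estimated from Prometheus histogram buckets rather than from raw samples, so real numbers will differ. The point is only that a slow query in the rarest 1% of requests does not move a P99 series.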

| username: tidb菜鸟一只

Grafana → TiKV-Details → Raft propose → Propose wait duration per server
Grafana → TiKV-Details → Raft IO → Append log duration
Grafana → TiKV-Details → Raft IO → Commit log duration
Grafana → TiKV-Details → Raft propose → Apply wait duration
Grafana → TiKV-Details → Raft IO → Apply log duration
Grafana has a panel for each of these stages. Let’s check the commit log duration first.
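
One way to work through the panels listed above is to note the duration read from each one for the slow time window and sort them. The sketch below does only that; the readings are made-up placeholders, and `rank_stages` is a hypothetical helper, not part of any TiDB tooling.

```python
# Panel locations as listed in the post above.
RAFT_WRITE_PANELS = {
    "propose wait": "Raft propose / Propose wait duration per server",
    "append log": "Raft IO / Append log duration",
    "commit log": "Raft IO / Commit log duration",
    "apply wait": "Raft propose / Apply wait duration",
    "apply log": "Raft IO / Apply log duration",
}


def rank_stages(observed_ms):
    """Return (stage, duration_ms) pairs sorted from slowest to fastest."""
    unknown = set(observed_ms) - set(RAFT_WRITE_PANELS)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    return sorted(observed_ms.items(), key=lambda item: item[1], reverse=True)


# Placeholder readings in milliseconds, for illustration only.
readings = {
    "propose wait": 0.4,
    "append log": 1.2,
    "commit log": 1400.0,
    "apply wait": 0.3,
    "apply log": 0.9,
}

for stage, ms in rank_stages(readings):
    print(f"{stage:>12}: {ms:8.1f} ms  ({RAFT_WRITE_PANELS[stage]})")
```

Whichever stage ends up at the top of the list is the one to look at first.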

| username: heiwandou

Following along to learn.

| username: 路在何chu

With inserts this slow, it is most likely a hardware issue, an I/O bottleneck, or too few TiKV nodes.

| username: andone

Is the disk an SSD?

| username: yulei7633

It’s an NVMe drive.

| username: heiwandou

tikv_wall_time

| username: Jellybean

The main tasks of the prewrite phase are version checking and conflict detection.

If prewrite takes a long time, check whether the keys have accumulated too many data versions or whether there are many transaction conflicts.
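
To make those two checks concrete, here is a toy in-memory model in the style of Percolator. It is a sketch under simplifying assumptions, not TiKV's implementation: `ToyStore`, its fields, and the exception names are all invented for illustration.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """A version of the key was committed at or after our start timestamp."""


class KeyIsLocked(Exception):
    """Another transaction currently holds a lock on the key."""


@dataclass
class ToyStore:
    # key -> commit timestamps of committed versions (hypothetical layout)
    commits: dict = field(default_factory=dict)
    # key -> start_ts of the transaction holding the lock
    locks: dict = field(default_factory=dict)

    def prewrite(self, key, start_ts):
        # Version check: in this toy model every committed version is scanned,
        # so a key with many versions costs more to check.
        for commit_ts in self.commits.get(key, []):
            if commit_ts >= start_ts:
                raise WriteConflict(f"{key!r} committed at {commit_ts}")
        # Conflict detection: refuse if a different transaction locked the key.
        holder = self.locks.get(key)
        if holder is not None and holder != start_ts:
            raise KeyIsLocked(f"{key!r} locked by txn {holder}")
        self.locks[key] = start_ts


store = ToyStore(commits={"a": [5, 12]}, locks={"b": 7})
store.prewrite("c", start_ts=10)  # no newer version and no lock: succeeds
```

In this model, prewriting "a" at start_ts 10 fails the version check because of the commit at 12, and prewriting "b" fails conflict detection because transaction 7 holds the lock.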

| username: 有猫万事足

In Grafana, open the TiKV-Trouble-Shooting dashboard → Pending compaction bytes graph.

I’d like to see what the Pending compaction bytes graph shows at 10:20.