Drainer reports "meet causality.DetectConflict exec now"

translator_bot · June 23, 2024, 10:53am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: drainer报meet causality.DetectConflict exec now

| username: TiDBer_SHQw04Jz

[TiDB Usage Environment]
Production Environment

[TiDB Version]
5.1.x

[Encountered Problem]
The Drainer log frequently reports the following message. Does it have any impact, and what does it mean?
meet causality.DetectConflict exec now

translator_bot · June 23, 2024, 10:53am

| username: xfworld

There are conflicts. It is best to narrow down the scope and identify which tables are causing them.

translator_bot · June 23, 2024, 10:53am

| username: TiDBer_SHQw04Jz

The upstream TiDB cluster does have write-write conflicts, but why would that affect Drainer's synchronization to the downstream cluster? Moreover, the downstream cluster's monitoring shows no write-write conflicts. The main effect of this issue is on Drainer's synchronization speed, which becomes very slow. Thank you.

translator_bot · June 23, 2024, 10:53am

| username: neilshen

If convenient, please provide the Drainer logs.

"There are indeed write conflicts in the upstream TiDB cluster, but why does it affect Drainer’s synchronization to the downstream cluster?"

Background: To improve synchronization efficiency, Drainer uses multiple “threads” (i.e., goroutines) to write to the downstream MySQL concurrently.

Because it writes concurrently, Drainer needs to check upstream transactions for conflicts before applying them; otherwise, the conflicts would occur downstream and reduce synchronization performance. When it detects a conflict, Drainer logs "meet causality.DetectConflict exec now".
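
The check described above can be pictured with a small Go sketch. This is an illustration under assumptions, not Drainer's actual code: the `Causality` type, its methods, and the key strings are hypothetical names chosen for the example. The idea is that each transaction's row keys are placed into groups, and a transaction whose keys already belong to two different groups cannot safely be handed to a single worker, so the pending work has to be executed first.

```go
package main

import "fmt"

// Causality is a simplified, hypothetical stand-in for a conflict detector.
// It maps each row key to the group (representative key) it belongs to.
type Causality struct {
	relations map[string]string
}

func NewCausality() *Causality {
	return &Causality{relations: make(map[string]string)}
}

// DetectConflict reports whether the keys of one transaction are already
// spread over more than one group.
func (c *Causality) DetectConflict(keys []string) bool {
	seen := false
	group := ""
	for _, k := range keys {
		g, ok := c.relations[k]
		if !ok {
			continue
		}
		if seen && g != group {
			return true
		}
		seen, group = true, g
	}
	return false
}

// Add records the keys, joining an existing group if one of them is known.
func (c *Causality) Add(keys []string) {
	if len(keys) == 0 {
		return
	}
	group := keys[0]
	for _, k := range keys {
		if g, ok := c.relations[k]; ok {
			group = g
			break
		}
	}
	for _, k := range keys {
		c.relations[k] = group
	}
}

// Reset forgets all groups, as would happen after pending work is executed.
func (c *Causality) Reset() {
	c.relations = make(map[string]string)
}

func main() {
	c := NewCausality()
	txns := [][]string{{"a"}, {"b"}, {"a", "b"}}
	for i, keys := range txns {
		if c.DetectConflict(keys) {
			// Stand-in for executing the pending batch before continuing.
			fmt.Printf("txn %d: conflict on %v, flush pending work first\n", i, keys)
			c.Reset()
		}
		c.Add(keys)
	}
}
```

In this sketch only the third transaction triggers the flush, because it touches keys that were previously assigned to two separate groups.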

"Moreover, there are no write conflicts in the downstream cluster monitoring. Actually, this issue mainly affects the synchronization speed of Drainer, making it very slow. Thank you."

Because Drainer checks for and avoids conflicts in advance, no write conflicts appear downstream. Drainer’s conflict detection is much cheaper than conflict detection in the downstream, but if the upstream write pattern involves hotspot updates, synchronization will still slow down: the overall synchronization speed will be close to the upper limit of the downstream’s hotspot update speed.
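
A rough way to see why hotspot updates cap the speed is the Go sketch below. It assumes, purely for illustration, that updates are routed to workers by hashing the row key so that changes to the same row stay in order; `workerFor`, `maxLoad`, and the `row-N` keys are hypothetical names, not Drainer internals. Under that assumption, repeated updates to one hot row all land on the same worker, so extra workers add no parallelism.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// workerFor maps a row key to one of n workers; the same key always
// maps to the same worker so its updates are applied in order.
func workerFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// maxLoad returns the number of jobs queued on the busiest worker.
func maxLoad(keys []string, n int) int {
	load := make([]int, n)
	busiest := 0
	for _, k := range keys {
		w := workerFor(k, n)
		load[w]++
		if load[w] > busiest {
			busiest = load[w]
		}
	}
	return busiest
}

func main() {
	const workers, jobs = 4, 100
	hot := make([]string, jobs)
	spread := make([]string, jobs)
	for i := range hot {
		hot[i] = "row-1"                     // every update hits the same row
		spread[i] = fmt.Sprintf("row-%d", i) // every update hits a different row
	}
	fmt.Println("hot key, busiest worker:", maxLoad(hot, workers))
	fmt.Println("distinct keys, busiest worker:", maxLoad(spread, workers))
}
```

With the hot key, the busiest worker receives all 100 jobs, so throughput is bounded by how fast a single connection can update that row; with distinct keys the jobs are shared among the workers.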

translator_bot · June 23, 2024, 10:53am

| username: TiDBer_SHQw04Jz

The Drainer logs do not contain much other information; they are mostly just "meet causality.DetectConflict exec now".
Based on the replies above and further analysis of the cluster monitoring, I can conclude that a large number of write conflicts in the upstream TiDB cluster slowed down Drainer's synchronization, which increased the synchronization delay. Thank you.

I have another question:
Hotspot writes in a TiDB cluster are easy for us to troubleshoot. But how do we generally analyze the hotspot updates mentioned here? Are there any corresponding monitoring panels or logs to assist in the analysis?
The TiDB monitoring shows that the number of REPLACE statements has increased significantly, which led to the issue above.

translator_bot · June 23, 2024, 10:53am

| username: neilshen

"For hotspot write issues in a TiDB cluster, we can easily troubleshoot them. But for the hotspot updates mentioned here, how do we generally analyze them? Are there any corresponding monitoring tools or logs to assist in the analysis?"

The TiDB documentation provides best practices related to hotspots, which are recommended reading.

translator_bot · June 23, 2024, 10:53am

| username: TiDBer_SHQw04Jz

Thank you.