Severe Lock Contention

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 锁冲突严重

| username: hygame

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.0
[Reproduction Path] None
[Encountered Problem: Problem Phenomenon and Impact]
Recently, we have noticed severe lock conflicts in TiDB, which are affecting the performance of the entire cluster, but the hardware load is very low. We are unsure how to resolve this issue.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]




| username: 我是咖啡哥 | Original post link

This can only be fixed in the application logic. The database itself cannot resolve it.

| username: hygame | Original post link

I want to know which specific object is causing the conflict and how to troubleshoot it. In the TiDB server logs, whenever latency increased, the logs showed that TiKV was slow at the same time. However, the log level on the TiKV nodes is set to error, and no errors are reported there. I have already raised this with the R&D team, and they responded that their code cannot be causing lock conflicts.

| username: Jiawei | Original post link

You need to go by the TiKV lock logs, which contain the specific data rows involved. Find the corresponding conflicts and take them to the developers. Most lock conflicts are caused by unreasonable business design.

| username: Jiawei | Original post link

Lock contention still has to be avoided in the business design. The official documentation describes a lock troubleshooting process that you can refer to.

| username: hygame | Original post link

I will first change the TiKV log level to info online, and then look for the conflicting rows.

| username: tidb菜鸟一只 | Original post link

You can first check the system tables to troubleshoot.

| username: dbaspace | Original post link

Check the TiDB server logs for error 9007 ([kv:9007] Write conflict). If it appears, that confirms a write-write conflict.
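
As a rough sketch of that log check, the Python below counts 9007 lines per table ID. It assumes the error text looks like `[kv:9007]Write conflict, ... key={tableID=N, ...}`; that shape and the function name `count_write_conflicts` are assumptions for illustration, so compare the pattern against a real line from your own tidb.log first.

```python
import re
from collections import Counter

# Assumed message shape (verify against your own tidb.log):
#   [kv:9007]Write conflict, txnStartTS=..., key={tableID=1277, ...} primary={...}
CONFLICT_RE = re.compile(r"\[kv:9007\]\s*Write conflict.*?key=\{tableID=(\d+)")


def count_write_conflicts(lines):
    """Return a Counter mapping table ID -> number of 9007 write-conflict lines."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing an open log file to `count_write_conflicts` and printing `most_common(10)` on the result gives the tables with the most conflicts, which is a starting point for asking which statements touch those tables.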

| username: hygame | Original post link

For table 1277, the logs of one TiDB server show several thousand Write conflict errors per day. I reported this to the R&D team earlier, and they responded that they need the specific statements or the specific rows.

| username: hygame | Original post link

I suspect this issue is causing it.

| username: hygame | Original post link

Why does the lock conflict in TiKV point to a unique index?

| username: 胡杨树旁 | Original post link

The table INFORMATION_SCHEMA.CLUSTER_TIDB_TRX can be used to view lock information, but it only shows transactions that are currently executing.
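
A minimal sketch of querying that table from Python, assuming a DB-API 2.0 cursor that returns tuples (for example from a MySQL-compatible driver connected to TiDB). The helper name `fetch_lock_waiting` is hypothetical. The query selects only transactions whose STATE is `LockWaiting`, which TiDB documents as waiting to acquire a pessimistic lock.

```python
# Columns used here are documented for INFORMATION_SCHEMA.CLUSTER_TIDB_TRX.
LOCK_WAITING_SQL = """
SELECT INSTANCE, ID, START_TIME, STATE, WAITING_START_TIME,
       CURRENT_SQL_DIGEST_TEXT, SESSION_ID, USER, DB
FROM INFORMATION_SCHEMA.CLUSTER_TIDB_TRX
WHERE STATE = 'LockWaiting'
ORDER BY WAITING_START_TIME
"""


def fetch_lock_waiting(cursor):
    """Run the query on a tuple-returning DB-API cursor; return rows as dicts."""
    cursor.execute(LOCK_WAITING_SQL)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

Because the table only reflects transactions that are running at that moment, a single call can easily come back empty. Calling it repeatedly during a latency spike is more likely to catch the waiting sessions.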

| username: hygame | Original post link

If the STATE field is LockWaiting, does that always indicate a write-write conflict?

| username: xfworld | Original post link

Enable the resource location feature and set a threshold so that any query slower than a given number of seconds is logged as a slow query. This way, the SQL will be recorded, and you can review it later.

Any statement that waits on a lock will be among those slow queries.

Reference documentation related to lock handling:

| username: hygame | Original post link

Okay, I’ll give it a try, thank you.

| username: zhimadi | Original post link

We have a similar issue. How can we find the specific statement or the specific row, and the time it happened?

| username: hygame | Original post link

| username: Jellybean | Original post link

You can follow the steps from the post above.