Write Conflict Log Issues

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 写写冲突日志问题

| username: GreenGuan

TiDB hit a write-write conflict, but I cannot tell which table or which index caused it. I followed this article, but the log output does not match what the document describes. The document shows:

[2020/05/12 15:17:01.568 +08:00] [WARN] [session.go:446] ["commit failed"] [conn=3] ["finished txn"="Txn{state=invalid}"] [error="[kv:9007]Write conflict, txnStartTS=416617006551793665, conflictStartTS=416617018650001409, conflictCommitTS=416617023093080065, key={tableID=47, indexID=1, indexValues={string, }} primary={tableID=47, indexID=1, indexValues={string, }} [try again later]"]

In production, it shows:

[txn.go:83] [RunInNewTxn] ["retry txn"=442462672668854215] ["original txn"=442462672668854215] [error="[kv:9007]Write conflict, txnStartTS=442462672668854215, conflictStartTS=442462672668854858, conflictCommitTS=442462672668855267, key=[]byte{0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, 0x6f, 0x62, 0xff, 0x61, 0x6c, 0x49, 0x44, 0x0, 0x0, 0x0, 0x0, 0xfb, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x73} primary=[]byte{0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, 0x6f, 0x62, 0xff, 0x61, 0x6c, 0x49, 0x44, 0x0, 0x0, 0x0, 0x0, 0xfb, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x73} [try again later]"]

I read the code, and execution seems to reach the line mKey, mField, err := tablecodec.DecodeMetaKey(key) in the prettyWriteKey function. What do these hexadecimal bytes mean?
[]byte{0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, 0x6f, 0x62, 0xff, 0x61, 0x6c, 0x49, 0x44, 0x0, 0x0, 0x0, 0x0, 0xfb, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x73}

func newWriteConflictError(conflict *kvrpcpb.WriteConflict) error {
	if conflict == nil {
		return kv.ErrWriteConflict
	}
	var buf bytes.Buffer
	prettyWriteKey(&buf, conflict.Key)
	buf.WriteString(" primary=")
	prettyWriteKey(&buf, conflict.Primary)
	return kv.ErrWriteConflict.FastGenByArgs(conflict.StartTs, conflict.ConflictTs, conflict.ConflictCommitTs, buf.String())
}

func prettyWriteKey(buf *bytes.Buffer, key []byte) {
	tableID, indexID, indexValues, err := tablecodec.DecodeIndexKey(key)
	if err == nil {
		_, err1 := fmt.Fprintf(buf, "{tableID=%d, indexID=%d, indexValues={", tableID, indexID)
		if err1 != nil {
			logutil.BgLogger().Error("error", zap.Error(err1))
		}
		for _, v := range indexValues {
			_, err2 := fmt.Fprintf(buf, "%s, ", v)
			if err2 != nil {
				logutil.BgLogger().Error("error", zap.Error(err2))
			}
		}
		buf.WriteString("}}")
		return
	}

	tableID, handle, err := tablecodec.DecodeRecordKey(key)
	if err == nil {
		_, err3 := fmt.Fprintf(buf, "{tableID=%d, handle=%d}", tableID, handle)
		if err3 != nil {
			logutil.BgLogger().Error("error", zap.Error(err3))
		}
		return
	}

	mKey, mField, err := tablecodec.DecodeMetaKey(key)
	if err == nil {
		_, err3 := fmt.Fprintf(buf, "{metaKey=true, key=%s, field=%s}", string(mKey), string(mField))
		if err3 != nil {
			logutil.Logger(context.Background()).Error("error", zap.Error(err3))
		}
		return
	}

	_, err4 := fmt.Fprintf(buf, "%#v", key)
	if err4 != nil {
		logutil.BgLogger().Error("error", zap.Error(err4))
	}
}
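
One way to read these bytes, offered as a sketch rather than a confirmed diagnosis: assume the key uses TiDB's memcomparable byte encoding for meta keys. That is, an 'm' prefix, then the name in 8-byte groups, each followed by a marker byte equal to 0xff minus the number of zero padding bytes, then an 8-byte big-endian type flag. Under that assumption the key can be decoded by hand. The helper names decodeMemcomparableBytes and decodeMetaKey are hypothetical and are not part of tablecodec.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// sampleKey is the key printed in the production log above.
var sampleKey = []byte{
	0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, 0x6f, 0x62, 0xff,
	0x61, 0x6c, 0x49, 0x44, 0x00, 0x00, 0x00, 0x00, 0xfb,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x73,
}

// decodeMemcomparableBytes reverses the assumed group encoding: 8 data bytes
// followed by one marker byte (0xff minus the padding count). The first
// padded group ends the value. It returns the decoded bytes and the remainder.
func decodeMemcomparableBytes(b []byte) (data, rest []byte, err error) {
	const groupSize = 8
	for {
		if len(b) < groupSize+1 {
			return nil, nil, errors.New("truncated group")
		}
		group, marker := b[:groupSize], b[groupSize]
		b = b[groupSize+1:]
		pad := int(0xff - marker)
		if pad > groupSize {
			return nil, nil, errors.New("invalid marker byte")
		}
		data = append(data, group[:groupSize-pad]...)
		if pad > 0 {
			return data, b, nil
		}
	}
}

// decodeMetaKey splits a key into its name and trailing type flag,
// assuming the layout 'm' + encoded name + 8-byte big-endian flag.
func decodeMetaKey(key []byte) (name string, flag byte, err error) {
	if len(key) == 0 || key[0] != 'm' {
		return "", 0, errors.New("not a meta key")
	}
	raw, rest, err := decodeMemcomparableBytes(key[1:])
	if err != nil {
		return "", 0, err
	}
	if len(rest) < 8 {
		return "", 0, errors.New("missing type flag")
	}
	return string(raw), byte(binary.BigEndian.Uint64(rest[:8])), nil
}

func main() {
	name, flag, err := decodeMetaKey(sampleKey)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Printf("name=%s flag=%c\n", name, flag)
}
```

Under those assumptions the program prints name=NextGlobalID flag=s. That would suggest the conflict is on a metadata key rather than on a table row or index entry, which would explain why the log carries no tableID.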
| username: xfworld | Original post link

Refer to the documentation. First identify the table from tableID=47, then see whether there is a suitable fix on the application side.

| username: GreenGuan | Original post link

The problem is that the tableID cannot be identified from the log.

In the TiDB documentation:
key={tableID=47, indexID=1, indexValues={string, }}

Actual log:
key=[]byte{0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, xxx

| username: 有猫万事足 | Original post link

I haven't read the code carefully, but I have a guess.
I concatenated the bytes of the key above into a single hex string, writing each 0x0 as 00.
The result is:

6d4e657874476c6f62ff616c494400000000fb0000000000000073

I then compared it with the keys in the INFORMATION_SCHEMA.TIKV_REGION_STATUS table, and the length matched.

So can we use this key to find the region whose start_key and end_key range contains it, and from that region's region_id and table_id work out which table it belongs to?

The log may well be reporting the exact key that hit the write-write conflict; it just failed to translate that key into a table ID.

If the guess is correct, then executing the following SQL should find something:

select * from INFORMATION_SCHEMA.TIKV_REGION_STATUS trs where trs.START_KEY like '6D4E657874476C6F62FF616C494400000000%'

If nothing is found, narrow down the range. Of course, if the first few letters don’t match, then the guess is probably unreliable. :sweat_smile:
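
The hex assembly described above can also be done in code. This is a minimal sketch using only the Go standard library. The function name keyToHex is made up for illustration, and the uppercase output is chosen only to match the LIKE pattern in the query above.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// sampleKey is the key printed in the production log.
var sampleKey = []byte{
	0x6d, 0x4e, 0x65, 0x78, 0x74, 0x47, 0x6c, 0x6f, 0x62, 0xff,
	0x61, 0x6c, 0x49, 0x44, 0x00, 0x00, 0x00, 0x00, 0xfb,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x73,
}

// keyToHex renders raw key bytes as an uppercase hex string,
// two hex digits per byte, so 0x0 becomes "00".
func keyToHex(key []byte) string {
	return strings.ToUpper(hex.EncodeToString(key))
}

func main() {
	full := keyToHex(sampleKey)
	fmt.Println(full)

	// The first 18 bytes (36 hex digits) form the prefix used in the LIKE pattern.
	fmt.Println(full[:36] + "%")
}
```

Shortening the slice bound (in steps of two hex digits) gives a shorter prefix to search with.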

| username: redgame | Original post link

The logs in production are in the context. Look for them.

| username: xfworld | Original post link

I converted the bytes to a hex string for you (using Java):
6D4E657874476C6F62FF616C494400000000FB0000000000000073

@有猫万事足 You can verify your guess…

You can give it a try…

Execution result: [screenshot]

| username: xfworld | Original post link

The AI assistant provided some replies that you can refer to:


In TiDB, the table ID is a 64-bit integer. If you have an ID in hexadecimal form and need it as a decimal number, you can convert it with a function in your programming language or with an online tool.

For example, in Go, you can use the strconv.ParseUint() function to convert a hexadecimal string to a uint64. The sample code is as follows:

package main

import (
	"fmt"
	"strconv"
)

func main() {
	hexStr := "3807"
	tableID, err := strconv.ParseUint(hexStr, 16, 64)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(tableID)
}

The output is:

14343

Here, the first parameter of strconv.ParseUint() is the string to convert, the second is the base of that string (16 for hexadecimal), and the third is the bit size the result must fit into (64, so the value fits in a uint64).


In TiDB, the primary key of a transaction is a byte slice ([]byte). If you need to convert such a byte slice to a JSON value, or a JSON value back to a byte slice, you can use the JSON functions of your programming language.

For example, in Go, json.Marshal() encodes a []byte as a base64 JSON string, not as an array of numbers, and json.Unmarshal() can decode a JSON array of numbers into a []byte. The sample code is as follows:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Encode a byte slice as JSON; encoding/json writes []byte as a base64 string
	primaryKey := []byte{0x6d, 0x79, 0x50, 0x72, 0x69, 0x6d, 0x61, 0x72, 0x79}
	primaryKeyJSON, err := json.Marshal(primaryKey)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(string(primaryKeyJSON)) // Output: "bXlQcmltYXJ5"

	// Decode a JSON array of numbers into a byte slice
	primaryKeyJSON = []byte(`[109,121,80,114,105,109,97,114,121]`)
	var primaryKey2 []byte
	err = json.Unmarshal(primaryKeyJSON, &primaryKey2)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(primaryKey2) // Output: [109 121 80 114 105 109 97 114 121]
}

Here, json.Marshal() turns the byte slice into a base64-encoded JSON string, and json.Unmarshal() parses the JSON array back into a byte slice. The second parameter of json.Unmarshal() is a pointer to the variable that receives the parsed result.

| username: Anna | Original post link