Cluster Encountered ERROR 1105

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 集群出现ERROR 1105

| username: dbaspace

[TiDB Usage Environment] Testing
[TiDB Version] V4.0.8

[Encountered Problem: Symptoms and Impact]
The cluster cannot execute DDL statements, and DML writes also report errors. Executing admin show ddl also results in an exception.
TiDB log:


TiKV log from the node where the exception occurred:

After one TiDB node was shut down, it could not be started again. Exception log:

Exception reported when DM writes to the TiDB cluster:
“msg”: "[code=10006:class=database:scope=not-set:level=high] execute statement failed: REPLACE INTO db_message_sync.tbl_message_queue (id,messageKey,signMD5,groupID,shopID,brandID,cardTypeID,channelSignID,accountNo,chargeNum,toMobile,messageContent,serviceCode,serviceSubCode,bizSrc,messageType,sendType,startSendTime,priorityLevel,effectiveTimeLen,sendCount,lastSendTime,messageStatus,transStatus,remark,properties,action,actionStamp,createTime,intTelCode,thirdCode,thirdMessageKey,thirdAccount,sysCode,costPrice,salesPrice) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?): Error 1105: tikv aborts txn: Txn(Mvcc(DefaultNotFound { key: [109, 68, 66, 58, 50, 57, 50, 52, 51, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247, 0, 0, 0, 0, 0, 0, 0, 104, 84, 97, 98, 108, 101, 58, 50, 57, 255, 50, 54, 56, 0, 0, 0, 0, 0, 250] }))

| username: 胡杨树旁 | Original post link

Has the GC time already passed? The error seems to indicate that the key cannot be found.

| username: dbaspace | Original post link

Now any operation results in an error. It should be fine to run GC. The error seems to be an MVCC issue. I’m just not sure if using recover-mvcc will be effective.

| username: 胡杨树旁 | Original post link

I seem to have misunderstood. Is this DM synchronizing data? From MySQL to TiDB?

| username: dbaspace | Original post link

Yes, the GC time is 10 minutes.

| username: CuteRay | Original post link

Which version of DM is it?
You can refer to this:
[FAQ] tikv aborts txn: Txn(Mvcc(PessimisticLockNotFound (Operations Guide / TiDB Common FAQ, TiDB Q&A Community, asktug.com)

| username: CuteRay | Original post link

Also, refer to this issue.

| username: dbaspace | Original post link

Currently, DDL on the cluster is stuck, so create/drop operations cannot be performed. The DDL maintenance command (admin show ddl jobs;) also reports an error. I have already followed those instructions, but it didn’t help.

| username: tidb狂热爱好者 | Original post link

Try my method: shut down one TiDB instance, then shut down the others one by one. Once all of them are down, start TiDB again. The problem is that DDL cannot elect an owner, and restarting all TiDB instances should resolve it.

| username: dbaspace | Original post link

Now, once I shut one down, it cannot be started again. Two nodes are left, so I will back up the cluster first.

| username: 胡杨树旁 | Original post link

Does admin show ddl allow you to see which node is the owner?

| username: TiDBer_jYQINSnf | Original post link

./target/debug/tikv-ctl ldb --column_family=default --db=./db/ get 0xXXXXXXXXXXXXXXXX --hex
Replace 0xXXXXXXXXXXXXXXXX with the key printed in your log.
Replace the --db path with the db directory under the TiKV data directory.
Check every TiKV node to see whether this key exists.
If it really doesn’t exist on any of them, the data may have been lost.
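
The log prints the key as a list of decimal byte values, while the command above expects hex. A minimal Python sketch of that conversion follows; the helper names to_hex and to_escaped are made up for illustration, and the byte list is copied from the DefaultNotFound error in the first post. Whether ldb get accepts this raw form, or needs an extra prefix or encoding on top of it, is not established in this thread, so treat the output as a starting point only.

```python
# Key bytes copied verbatim from the DefaultNotFound error in the DM log above.
KEY_BYTES = [
    109, 68, 66, 58, 50, 57, 50, 52, 51, 255,
    0, 0, 0, 0, 0, 0, 0, 0, 247,
    0, 0, 0, 0, 0, 0, 0, 104,
    84, 97, 98, 108, 101, 58, 50, 57, 255,
    50, 54, 56, 0, 0, 0, 0, 0, 250,
]


def to_hex(key_bytes):
    """Return the key as a lowercase hex string (no 0x prefix)."""
    return bytes(key_bytes).hex()


def to_escaped(key_bytes):
    """Show printable ASCII as-is and every other byte as \\xNN."""
    parts = []
    for b in key_bytes:
        # Keep printable ASCII, but escape the backslash itself (0x5c).
        if 0x20 <= b < 0x7F and b != 0x5C:
            parts.append(chr(b))
        else:
            parts.append("\\x%02x" % b)
    return "".join(parts)


if __name__ == "__main__":
    print("0x" + to_hex(KEY_BYTES))   # hex form of the logged key
    print(to_escaped(KEY_BYTES))      # human-readable form of the same bytes
```

In the escaped form, the printable fragments read mDB:29243 and Table:29 followed by 268 (split by a \xff byte). That may suggest the missing key belongs to TiDB's schema metadata rather than to user table data, which would be consistent with DDL and tidb-server startup failing, but that reading is an assumption and is not confirmed anywhere in this thread.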

| username: dbaspace | Original post link

All the TiDB servers are down and can’t be started now. :grinning:

| username: dbaspace | Original post link

The tidb-server on every cluster node is down. Log message:

| username: dbaspace | Original post link

Executing this command results in an error, and the same issue occurs when operating on the owner node.