TiKV Component Restarts During Dumpling Backup

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: dumpling备份时候导致TiKV组件发生重启

| username: 小生不才

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] V5.2.4
[Reproduction Path] Operations performed that led to the issue
Scheduled backup jobs executed daily
[Encountered Issue: Problem Phenomenon and Impact]
While the scheduled Dumpling backup is running, the TiKV component restarts, so the backup job fails to complete.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: TiDBer_小阿飞

Could this be a lock conflict, or MVCC failing to find the corresponding key-value pair in TiKV?
Check for pessimistic lock conflicts and see which operations were running at the time.
select * from information_schema.deadlocks;

| username: tidb菜鸟一只

SHOW config WHERE NAME LIKE '%pessimistic-txn.pipelined%';
Check whether this parameter is still at its default value of true. If it is, change it to false and try again.
SET config tikv pessimistic-txn.pipelined='false';

| username: h5n1

The restart happened around 23:31, and the TiKV lock logs above are from 23:24. What about the logs before the restart?

| username: 小生不才

The TiKV restart turned out to be caused by insufficient memory. I then changed the concurrency setting for the Dumpling backup. With that change, all components stayed normal during the backup, but the backup still failed. As for the pessimistic-txn.pipelined parameter, the official documentation says that asynchronous writes of pessimistic locks may fail only when TiKV experiences network isolation or a node failure.
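
Since the restart was traced to insufficient memory, it may help to look at how much memory TiKV is configured to use and how much it currently reports. The following is a minimal sketch, assuming `SHOW CONFIG` and the `information_schema.cluster_load` table are available on this cluster version (verify on your own deployment); how to interpret the values depends on the hardware and on which components share a host.

```sql
-- Block cache size configured on each TiKV instance; the block cache is
-- usually one of the largest consumers of TiKV memory.
SHOW CONFIG WHERE type = 'tikv' AND name = 'storage.block-cache.capacity';

-- Memory figures currently reported by each TiKV node.
SELECT instance, name, value
FROM information_schema.cluster_load
WHERE type = 'tikv' AND device_type = 'memory';
```

If the configured capacity leaves little headroom on the host, heavy scans during a backup are one plausible way for TiKV to run out of memory, though this thread does not confirm that this is what happened here.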

| username: 小生不才

Do I need to run this check while the backup is in progress? At the moment there is no lock conflict information.

| username: 小生不才

The logs before the restart are consistent with the logs above, so I only captured an arbitrary portion of them.

| username: TiDBer_小阿飞

When a TiKV restart occurs during a backup, check the Dashboard and look for any lock information at that time so you can make a better judgment.
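
One way to look for lock information while the backup is running is sketched below. It assumes the `information_schema.data_lock_waits` table exists on this cluster; it was introduced as an experimental feature around v5.1, so confirm it is present on your version before relying on it.

```sql
-- Pessimistic lock waits currently in progress on TiKV.
SELECT * FROM information_schema.data_lock_waits;

-- Recent deadlocks recorded on the TiDB node you are connected to.
SELECT * FROM information_schema.deadlocks;
```

Both queries only reflect the state at the moment they run, so an empty result after the backup has finished does not rule out lock waits during it.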