Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 备份百万级表
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] insert into A select * from B
[Encountered Problem: Problem Phenomenon and Impact] Time-consuming and prone to backup failures
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring] High resource usage
This will result in large transactions, causing a significant increase in tidb-server memory usage, which may eventually lead to failure. It is recommended to use BR for data backup. If you do not have permission to use BR for backup, it is recommended to perform the backup in batches.
Is the failure due to TiDB OOM or just this SQL being terminated? Check your txn-total-size-limit
(transaction size) and tidb_mem_quota_query
(memory size allowed for a single SQL) to see if they meet the requirements. If resources are insufficient and the transaction is too large, you can enable batch DML before the insert and set how many rows to commit at a time.
It is best to copy table B to table A in batches based on conditions.
Would using the dumpling+lightning route be better?
The purpose is to back up, and the data volume is only in the millions. Dumpling might be more suitable for your needs.
Export and import. Direct insertion can easily exceed large transactions.
Large SQL queries require a lot of memory, but this conflicts with TiDB’s default configuration. TiDB’s underlying layer needs a replication protocol, which seems to default to no more than 1GB. If this is the case, backups are very likely to report errors. Dumpling is more suitable for your needs.
Wouldn’t exporting it to another place be a better strategy?
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.