Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: How can I incrementally back up a specific time range of a table's data to TiDB, and then clean up the data that has been backed up?
【TiDB Usage Environment】Production Environment
【TiDB Version】7.5.0
【Reproduction Path】/
【Encountered Problem: We need scheduled backups for archiving purposes. For example, every day at midnight we want to back up the previous month's data from a specific table, and then clean up the data that has been backed up.】
【Resource Configuration】/
【Attachments: Screenshots/Logs/Monitoring】
I did not find such a fine-grained backup tool in the documentation.
              
I think Dumpling is a better fit. See "Using Dumpling to Export Data" in the PingCAP Documentation Center.
You can use Dumpling's `--where` option to filter the rows you want to export, and the output can be CSV or SQL (`--filetype`). Dumpling only exports, so deleting the exported rows has to be done manually or with a simple script.
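As a rough illustration of the `--where` approach, here is a minimal Python sketch that builds a Dumpling command line for the previous calendar month. The database `app`, table `orders`, column `created_at`, host, port, user, and output directory are all made-up placeholders, and the password option is left out; adjust them to your environment before running anything.

```python
from datetime import date, timedelta


def previous_month_range(today):
    """Return (first day of previous month, first day of current month)."""
    end = today.replace(day=1)
    start = (end - timedelta(days=1)).replace(day=1)
    return start, end


def dumpling_argv(today, outdir):
    """Build a Dumpling argv that exports last month's rows of one table."""
    start, end = previous_month_range(today)
    # Hypothetical schema: table app.orders with a DATETIME column created_at.
    where = f"created_at >= '{start}' AND created_at < '{end}'"
    return [
        "dumpling",
        "-h", "127.0.0.1", "-P", "4000", "-u", "root",  # placeholder connection
        "-T", "app.orders",    # export only this table
        "--filetype", "sql",   # or "csv"
        "--where", where,      # row filter for the export
        "-o", outdir,          # output directory
    ]


argv = dumpling_argv(date(2024, 3, 15), "/backup/orders-2024-02")
print(" ".join(argv))
```

The list can be passed to `subprocess.run(argv, check=True)` from a cron job. Using a half-open range (`>=` start, `<` end) avoids missing or double-counting rows at the month boundary.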
              
Use Dumpling with a date condition to export the data, then use `find` with `-mtime` in the script to pick out the files that are old enough to delete.
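Assuming the suggestion above is about pruning old dump files, the following Python sketch does roughly what `find <dir> -type f -mtime +N -delete` would do. The function name and the retention period are illustrative only, and `find` rounds file ages to whole days, so the two are not exactly equivalent.

```python
import time
from pathlib import Path


def prune_old_dumps(root, max_age_days, now=None):
    """Delete regular files under root older than max_age_days; return their paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    deleted = []
    # Materialize the listing first so deletions do not disturb the walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

For example, `prune_old_dumps("/backup", 90)` would remove dump files last modified more than 90 days ago. It is worth trying it on a scratch directory first, since the deletion is not reversible.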
              
Dumpling can filter rows with a WHERE condition (`--where`), and it can also export the data as of a specified MVCC point in time (`--snapshot`).
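To make the point-in-time option concrete, here is a small Python sketch that appends `--snapshot` to an existing Dumpling argument list. The connection details, table name, and timestamp are placeholders. Note that the snapshot has to be recent enough that TiDB has not yet garbage-collected the old versions.

```python
def with_snapshot(argv, snapshot):
    """Return a copy of a Dumpling argv that reads data as of `snapshot`.

    `snapshot` may be a TSO or a datetime string such as "2024-03-01 00:00:00".
    """
    return [*argv, "--snapshot", snapshot]


# Placeholder base command; host, port, user, table, and path are assumptions.
base = ["dumpling", "-h", "127.0.0.1", "-P", "4000", "-u", "root",
        "-T", "app.orders", "-o", "/backup/orders"]
snapshot_argv = with_snapshot(base, "2024-03-01 00:00:00")
print(" ".join(snapshot_argv))
```

Pinning the export to midnight means a job that starts a few minutes late still sees the same data, which keeps the later cleanup step consistent with what was actually exported.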
              
Dumpling is the correct solution, right?
              
Isn't this exactly what partitioned tables are for? Partition the table by month, back up the corresponding partition to TiDB by whatever means you like (Dumpling, DM, DataX, CloudCanal, etc.), and then clear that partition's data directly in the data source.
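A sketch of what the monthly partition maintenance could look like, written as Python helpers that generate the SQL. It assumes a table `app.orders` that is RANGE COLUMNS partitioned on a date column, with partitions named `pYYYYMM`; both the table and the naming scheme are invented for illustration.

```python
from datetime import date


def next_month(d):
    """Return the first day of the month after d (d is a first-of-month date)."""
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)


def partition_name(month_start):
    """Hypothetical naming scheme: p202402 for February 2024."""
    return f"p{month_start:%Y%m}"


def add_partition_sql(table, month_start):
    """SQL that adds the partition holding rows for the given month."""
    upper = next_month(month_start)
    return (f"ALTER TABLE {table} ADD PARTITION "
            f"(PARTITION {partition_name(month_start)} "
            f"VALUES LESS THAN ('{upper}'))")


def truncate_partition_sql(table, month_start):
    """SQL that clears one month's partition after it has been backed up."""
    return f"ALTER TABLE {table} TRUNCATE PARTITION {partition_name(month_start)}"


print(add_partition_sql("app.orders", date(2024, 3, 1)))
print(truncate_partition_sql("app.orders", date(2024, 2, 1)))
```

Clearing a partition this way is a metadata-level operation rather than a row-by-row `DELETE`, which is the main reason it suits archiving. Run the truncate only after verifying that the backup of that month completed successfully.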
             
Partitioned tables are indeed the best solution. I didn’t read the Dumpling documentation carefully. Many thanks to the community experts for their answers.
              