Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb能用otter工具来抽取同步到另外一个tidb吗?
I want to synchronize several large tables from a cloud TiDB to a local TiDB. Can I use the Otter synchronization tool for this operation? I see that Otter works by mimicking itself as a slave to fetch binlogs from MySQL. TiDB also has binlogs, but I’m not sure if it can be used.
It should not be possible, the binlog formats are different. For direct synchronization between two TiDB instances, you can use TiCDC.
I want to synchronize some tables and preferably filter out delete operations. The best version for CDC is 4.0.6. My version is 4.0.2, which is too old. I’m afraid there might be issues.
Using tidb binlog synchronization with the old version should also work.
Maintaining consistency with the TiDB version (MySQL version) is worth a try; it should work.
Based on binlog, it is not possible, but logically it can be.
Are there any tools or methods for logical synchronization? Tools for synchronizing and transferring data across servers?
How large is the table? I usually perform logical backups.
How to perform a logical backup? Using Dumpling and Lightning? Tables with more than a terabyte. There are more than a dozen of them. Planning to use BR to back up the entire database and import it locally first, then only synchronize tables larger than 200GB. For these tables, query and insert data older than 60 days into the local database on a daily basis by date. Delete historical data from the cloud tables.
Up T? It’s almost the size of my entire database here. Better go with BR physical backup.
Migration using BR, what I’m thinking now is, after migrating with BR, regularly back up one day’s data from 60 days ago for a specific table, and then synchronize, synchronizing one day’s data each day.
BR full backup + TiCDC incremental backup should be considered the standard solution.
Although they are both called binlog, the storage formats may not be the same. However, you can give it a try.
In version 4.0, CDC real-time synchronization cannot filter deletes. It seems that only version 6.5 and later can filter them. There is related documentation on this. I need to delete historical data from the source end every day, but the target end should not delete it and should always retain it.
TiDB to TiDB is already supported. For version 4, use TiDB Binlog with Pump + Drainer.
I indeed did not notice this and do not have practical experience in this area.
I read the tidbbinlog documentation, and there are parameters to specify syncing certain tables, but I didn’t see any option to filter delete operations.
This is a standard T+N. You can use DataX; there’s no need to use this real-time method, especially since there’s no delete operation.
Yes, exactly. I just want to set one up. If it can be real-time with delete filtering, that would be great. If not, then it doesn’t need to be real-time; periodic synchronization every day would be fine. Ideally, it should be automatic synchronization.