Is it feasible to rely on TiDB for cross-border data synchronization?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 依托 TiDB 来支持跨国数据同步是否可行?

| username: ealam_小羽

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Encountered Problem]
Background: A large amount of underlying data lives in a domestic PostgreSQL (Pgsql) database and needs to be synchronized overseas.
Current Solution: The domestic Pgsql is synchronized to an overseas Pgsql, and the data is then synchronized to an overseas TiDB through DataX.
Problem: The current solution only covers the domestic-to-overseas link, and there is a follow-up requirement to synchronize the overseas TiDB data back to the domestic side.

I would like to discuss whether the following is feasible: use TiDB’s multi-center (multi-data-center) deployment to keep data synchronized between the domestic and overseas sides, and use TiCDC to push changes to an MQ so that heterogeneous databases and business systems on both sides can consume them.
Or should we consider Debezium instead, where the domestic-overseas synchronization itself is done through CDC, and the databases and business parties then connect to that CDC pipeline?

| username: xfworld | Original post link

Whether it’s cross-country or cross-data center, the most troublesome issues are usually the network and latency. Beyond that, regardless of the CDC mechanism, the following need to be considered:

  1. Data consistency issues when synchronization fails

  2. Whether the business logic can support data validation

  3. In a multi-center setup, whether each row carries a marker for the center that wrote it, to reduce the pressure of merging data

This is just for your reference.
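Points 2 and 3 above can be illustrated with a minimal sketch. It assumes each row has a primary key `id` and a hypothetical `origin` column naming the center that wrote it; neither column name comes from this thread, and real validation would read rows from the two databases rather than from in-memory lists. The idea is to compare only the rows owned by one center, so each side validates its own writes and nothing has to be merged across origins.

```python
import hashlib
from collections import defaultdict


def row_digest(row):
    """Stable digest of one row (a dict), independent of key order."""
    canonical = "|".join(f"{k}={row[k]!r}" for k in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def digests_by_origin(rows, key="id", origin="origin"):
    """Group rows by their (hypothetical) origin marker: origin -> {key -> digest}."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row[origin]][row[key]] = row_digest(row)
    return grouped


def diff_replicas(source_rows, target_rows, origin_value):
    """Compare only the rows written by one center.

    Returns (missing, mismatched): keys absent from the target, and keys
    whose content differs between source and target.
    """
    src = digests_by_origin(source_rows).get(origin_value, {})
    dst = digests_by_origin(target_rows).get(origin_value, {})
    missing = sorted(k for k in src if k not in dst)
    mismatched = sorted(k for k in src if k in dst and src[k] != dst[k])
    return missing, mismatched


# Illustrative data only: "cn" and "us" are made-up origin markers.
domestic = [
    {"id": 1, "origin": "cn", "v": "a"},
    {"id": 2, "origin": "cn", "v": "b"},
    {"id": 3, "origin": "cn", "v": "c"},
]
overseas = [
    {"id": 1, "origin": "cn", "v": "a"},
    {"id": 2, "origin": "cn", "v": "stale"},
    {"id": 9, "origin": "us", "v": "x"},
]

missing, mismatched = diff_replicas(domestic, overseas, "cn")
print("missing on target:", missing)
print("content differs:", mismatched)
```

In this example, row 3 has not yet arrived overseas and row 2 arrived with stale content, while row 9 is ignored because it belongs to the other center. A periodic check of this kind is one way to detect the inconsistency described in point 1 after a failed synchronization.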

| username: cs58_dba | Original post link

Asynchronous replication is not that reliable.

| username: ealam_小羽 | Original post link

Given the network and latency issues, relying on the database itself for cross-border synchronization feels more stable than relying on a CDC mechanism. With a multi-center deployment, Placement Rules can be used to separate the two write sources (overseas and domestic), which ensures that at least one node has the latest data.

| username: ealam_小羽 | Original post link

Does this comment about the reliability of asynchronous replication refer to TiDB?
CDC doesn’t seem very reliable either, unless every change is acknowledged. But generally speaking, wouldn’t the achievable concurrency be quite low in that case because of the network latency?
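A back-of-the-envelope sketch of the concern above. It assumes a 200 ms cross-border round trip (an illustrative figure, not a measurement from this thread) and ignores bandwidth and processing time, so the results are upper bounds only. The function name and its parameters are made up for this illustration.

```python
def max_events_per_second(rtt_seconds, batch_size=1, in_flight=1):
    """Upper bound on acknowledged events per second over one link.

    Every round trip acknowledges one batch of `batch_size` events, and up
    to `in_flight` batches may be awaiting acknowledgment at the same time.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return batch_size * in_flight / rtt_seconds


rtt = 0.2  # assumed 200 ms round trip between the two regions

# One event per acknowledgment, strictly one at a time.
one_by_one = max_events_per_second(rtt)

# Acknowledge 500 events per round trip instead of one.
batched = max_events_per_second(rtt, batch_size=500)

# Additionally keep 8 batches in flight before waiting for acknowledgments.
pipelined = max_events_per_second(rtt, batch_size=500, in_flight=8)

print(one_by_one, batched, pipelined)
```

Under these assumptions, acknowledging every single change caps the link at roughly 5 events per second, which matches the worry about low concurrency. Acknowledging per batch and keeping several batches in flight raises the bound by the batch size and the window size, at the cost of having to replay a whole batch when an acknowledgment is lost.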

| username: HACK | Original post link

Got it!
