Why is it recommended to deploy TiCDC downstream when there is network latency? What is the principle behind this?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 当有网络延迟时,ticdc 建议在部署在下游,这是什么原理 ?

| username: Raymond

Excuse me, teachers: when there is network latency, it is recommended to deploy TiCDC downstream. What is the principle behind this?

| username: tidb菜鸟一只 | Original post link

You can look at it this way: if a remote TiDB cluster synchronizes data through TiCDC, deploying TiCDC at the source end means low latency for the "select" and high latency for the "update", while deploying it at the target end means high latency for the "select" and low latency for the "update". Of course, it is not literally a select: the "select" here is reading the data changes from the source end, and the "update" is applying those changes at the target end.

| username: TiDBer_jYQINSnf | Original post link

My understanding is that TiCDC captures changes in batches but executes SQL statements one by one on the downstream. If the latency is high, deploying it downstream means that a batch of changes going from upstream to downstream is slower, but it only pays that latency once; executing the SQL statements afterwards is fast.

If it is deployed upstream, capturing changes is fast, but executing the SQL statements becomes slow, because a transaction is parsed into many SQL statements that are executed one by one, and each of those back-and-forth interactions adds network latency, which significantly increases the total time.

This is just a rough idea, and I haven’t looked into TiCDC’s source code.
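
To make that back-of-envelope reasoning concrete, here is a rough sketch of the arithmetic (the round-trip times and the statement count are made-up assumptions, and this is only an illustration of the round-trip argument, not how TiCDC actually works internally):

```go
// Rough model of the round-trip argument above, not TiCDC's real behavior:
// the change feed is assumed to be pulled in one large batch (one round trip
// per batch), while the sink replays the batch as many small SQL statements
// (one round trip each). All numbers below are made up.
package main

import "fmt"

func main() {
	const (
		wanRTT  = 30.0 // ms, assumed round trip between the two data centers
		lanRTT  = 0.5  // ms, assumed round trip inside one data center
		numStmt = 1000 // SQL statements replayed for one captured batch
	)

	// TiCDC next to the upstream: the batched pull stays on the LAN, but
	// every replayed statement crosses the WAN.
	nearUpstream := lanRTT + float64(numStmt)*wanRTT

	// TiCDC next to the downstream: only the batched pull crosses the WAN,
	// and every replayed statement stays on the LAN.
	nearDownstream := wanRTT + float64(numStmt)*lanRTT

	fmt.Printf("TiCDC near upstream:   ~%.0f ms per batch\n", nearUpstream)
	fmt.Printf("TiCDC near downstream: ~%.0f ms per batch\n", nearDownstream)
}
```

With those assumed numbers, keeping TiCDC near the downstream turns a thousand WAN round trips into a single one, which is exactly the "it only pays that latency once" point above.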

| username: tomsence | Original post link

This only matters when the latency between upstream and downstream is relatively high. Deploying TiCDC downstream reduces the network latency between TiCDC and the target system it synchronizes to, so the target system can apply the changes faster.

| username: Raymond | Original post link

I think it should be understood this way: if TiCDC is deployed downstream, it only needs to read the log changes from the upstream, so only logs have to be transmitted across the network; TiCDC then converts those logs into data locally, without having to transmit the data itself. If TiCDC is deployed upstream, it has to transmit data to the downstream, which consumes more network bandwidth. Therefore, deploying TiCDC downstream saves network bandwidth, and when network latency is also taken into account, deploying it downstream is the more recommended option.

| username: yiduoyunQ | Original post link

1. Upstream executes SQL.
2. TiCDC fetches change events from TiKV over gRPC.
3. TiCDC, acting as a MySQL client, writes SQL to the downstream.
4. Downstream executes SQL.

Step 2 can be considered to have its own batching, so it is less affected by network latency.

Step 3 is an ordinary MySQL client interacting with the downstream over TCP. For example, if you opened a MySQL command line on the upstream and connected it to the downstream to execute SQL, every statement would go through a complete network round trip, so a large transaction is affected by network latency in proportion to its size.

| username: Raymond | Original post link

It can be considered to have built-in batch processing, which is less affected by network latency.

| username: yiduoyunQ | Original post link

I’m not familiar with TiCDC either. My personal understanding is that on the upstream TiDB → TiCDC leg, what is transmitted are TiKV change events, similar to MySQL binlog events; at this stage there is no need to worry about the ordering of concurrent changes. TiCDC then parses those events, converts them into MySQL SQL statements or Kafka events, and writes them to the downstream in order (per table). So on the TiCDC → downstream leg, what is transmitted is logical SQL (you can observe this by enabling the general log on the downstream).
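
If you want to see that logical SQL on the downstream yourself, something like the following sketch works when the downstream is MySQL (the DSN is a placeholder; the general_log / log_output variables and the mysql.general_log table are standard MySQL features, while a TiDB downstream has its own general-log switch instead):

```go
// Sketch: enable the downstream MySQL general log and look at the logical SQL
// that a replication client (for example TiCDC's MySQL sink) sends to it.
// Assumes a MySQL downstream reachable at the placeholder DSN below.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "root:password@tcp(downstream-host:3306)/")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Write the general log to the mysql.general_log table so it can be queried.
	if _, err := db.Exec("SET GLOBAL log_output = 'TABLE'"); err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec("SET GLOBAL general_log = 'ON'"); err != nil {
		log.Fatal(err)
	}

	// Show the most recent statements the downstream has received.
	rows, err := db.Query(
		"SELECT event_time, argument FROM mysql.general_log ORDER BY event_time DESC LIMIT 20")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var eventTime, argument string
		if err := rows.Scan(&eventTime, &argument); err != nil {
			log.Fatal(err)
		}
		fmt.Println(eventTime, argument)
	}
}
```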

| username: TiDBer_jYQINSnf | Original post link

I understand it roughly like this: TiCDC captures a transaction change from the upstream. If that transaction is several megabytes, those several megabytes are sent to the downstream in one go. After they arrive at the downstream, the several-megabyte transaction is parsed into hundreds or thousands of SQL statements, which are then executed one by one. Conversely, if those hundreds or thousands of SQL statements were sent one by one from the upstream, the round-trip acknowledgments would also make it slow.
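
As an illustration of the "executed one by one" part (this is only a sketch from a generic client's point of view, not TiCDC's sink code; the table name, DSN, and replayBatch helper are made up):

```go
// Sketch: replay captured row changes as individual statements inside one
// downstream transaction. Every Exec call blocks until the server responds,
// so each statement costs a full client-to-server round trip.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

// replayBatch applies a batch of (id, val) changes one statement at a time.
// With the client across a WAN, these per-statement round trips dominate the
// total time; with the client next to the downstream, they are negligible.
func replayBatch(db *sql.DB, changes [][2]interface{}) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	for _, c := range changes {
		// One synchronous round trip per statement.
		if _, err := tx.Exec("REPLACE INTO t (id, val) VALUES (?, ?)", c[0], c[1]); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit()
}

func main() {
	db, err := sql.Open("mysql", "user:password@tcp(downstream-host:3306)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	changes := [][2]interface{}{{1, "a"}, {2, "b"}, {3, "c"}}
	if err := replayBatch(db, changes); err != nil {
		log.Fatal(err)
	}
}
```

This is why the several-megabyte batch that crosses the WAN only pays the latency once, while the hundreds or thousands of statements that follow should ideally stay inside the downstream's local network.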

| username: Raymond | Original post link

Yes, my understanding is the same as yours; the main point is to reduce the impact of network latency and bandwidth.

| username: cs58_dba | Original post link

It feels suitable for situations where latency is not a concern.

| username: TiDBer_jYQINSnf | Original post link

Well, the network latency is there either way; we can only try to minimize its impact as much as possible.