How does TiCDC perceive region split and region merge?

translator_bot · June 23, 2024, 1:45am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiCDC 怎么感知到 region split、region merge 的呀？

| username: mxd-321

A question for the experts: when TiCDC subscribes to a table (test_table), it presumably subscribes to the regions that hold test_table's data. Those regions may later split or merge. How does TiCDC find out that it needs to add or remove regions for test_table?

| username: h5n1

Roughly speaking, a split or merge generates Raft messages, and CDC watches the Raft messages on the regions it listens to.

| username: OnTheRoad

Could this be determined from the Region's epoch information?
A Region's epoch contains conf_ver and version, which together identify the state of that Region. When the Region's membership changes, for example a peer is added or removed, conf_ver increases. When the Region splits or merges, version increases. So the epoch information can be used to tell that a Region has had peers added or removed, or has been split or merged.
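
To make the epoch idea concrete, here is a minimal sketch of comparing two observed epochs. The `RegionEpoch` class and `classify_epoch_change` function are hypothetical names for illustration only, not TiCDC or TiKV code; only the `conf_ver` and `version` fields come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionEpoch:
    conf_ver: int  # grows when peers are added or removed
    version: int   # grows when the region splits or merges


def classify_epoch_change(old: RegionEpoch, new: RegionEpoch) -> set:
    """Return which kinds of change happened between two observed epochs."""
    changes = set()
    if new.conf_ver > old.conf_ver:
        changes.add("membership")
    if new.version > old.version:
        changes.add("split_or_merge")
    return changes


before = RegionEpoch(conf_ver=5, version=12)
print(classify_epoch_change(before, RegionEpoch(conf_ver=6, version=12)))
print(classify_epoch_change(before, RegionEpoch(conf_ver=5, version=13)))
```

Note that the epoch alone only says that a split or merge happened. It does not say which of the two it was, or what the new key ranges are.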

| username: mxd-321

After CDC detects a split or merge, it has to start listening to the new regions. How is the start time of the new region client determined? Does the Raft message explicitly carry the point in time at which the region split or merged? Or am I misunderstanding something? Any clarification would be appreciated.

| username: neilshen

…, but subsequent regions may split and merge, so how does TiCDC perceive that test_table needs to add or delete regions?

TiCDC keeps the key range it is listening to in memory and listens to the regions that cover that range. When a region splits or merges, the region reports the relevant information to TiCDC, and TiCDC starts listening to the new regions based on that information.
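
The range-based approach described above can be sketched as follows. This is an illustration under assumptions, not TiCDC's implementation: the `Region` type, the helper names, and the `t_100...` keys are hypothetical, and real TiDB keys are encoded differently. Regions are modelled as half-open byte ranges, and the regions to listen to are recomputed from the watched range whenever the region layout changes.

```python
from typing import NamedTuple


class Region(NamedTuple):
    region_id: int
    start_key: bytes  # inclusive
    end_key: bytes    # exclusive; b"" means "no upper bound" in this sketch


def overlaps(region: Region, span_start: bytes, span_end: bytes) -> bool:
    """True if [start_key, end_key) intersects the watched span [span_start, span_end)."""
    ends_after_span_start = region.end_key == b"" or region.end_key > span_start
    return region.start_key < span_end and ends_after_span_start


def regions_to_watch(regions, span_start, span_end):
    """IDs of the regions that cover any part of the watched span."""
    return sorted(r.region_id for r in regions if overlaps(r, span_start, span_end))


# Hypothetical watched span for one table.
span = (b"t_100", b"t_101")

layout_before = [
    Region(1, b"", b"t_100_r5"),
    Region(2, b"t_100_r5", b"t_101"),
    Region(4, b"t_101", b""),
]
# Region 2 splits at b"t_100_r8"; region 3 is the new right half.
layout_after_split = [
    Region(1, b"", b"t_100_r5"),
    Region(2, b"t_100_r5", b"t_100_r8"),
    Region(3, b"t_100_r8", b"t_101"),
    Region(4, b"t_101", b""),
]

print(regions_to_watch(layout_before, *span))       # [1, 2]
print(regions_to_watch(layout_after_split, *span))  # [1, 2, 3]
```

Because the watched range is the source of truth, a split or merge only changes which regions currently cover it. The range itself stays the same.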

…, so how is the startup time of this region client determined? Does the raft message explicitly indicate when the region splits or merges? Or am I misunderstanding something? I hope the expert can clarify.

When TiCDC sends a listen request for a region, the request carries the current checkpoint ts. On receiving the request, TiKV runs an incremental scan and sends the historical changes in (checkpoint ts, current ts] to TiCDC, while also streaming real-time changes to TiCDC.
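
The half-open window (checkpoint ts, current ts] is easy to get wrong at the boundaries, so here is a small sketch of what the incremental scan selects. The `Change` type and `incremental_scan` function are hypothetical and only model the window. They do not reflect how TiKV actually scans its storage.

```python
from typing import NamedTuple


class Change(NamedTuple):
    commit_ts: int
    key: str


def incremental_scan(history, checkpoint_ts, current_ts):
    """Changes with checkpoint_ts < commit_ts <= current_ts, in commit order."""
    window = [c for c in history if checkpoint_ts < c.commit_ts <= current_ts]
    return sorted(window, key=lambda c: c.commit_ts)


history = [
    Change(90, "a"),
    Change(100, "b"),  # at the checkpoint: already replicated, so excluded
    Change(105, "c"),
    Change(110, "d"),  # at current ts: included
    Change(120, "e"),  # after current ts: arrives via the real-time stream
]

for change in incremental_scan(history, checkpoint_ts=100, current_ts=110):
    print(change)
```

With a checkpoint ts of 100 and a current ts of 110, only the changes committed at 105 and 110 are returned. This is why a listener started after a split or merge does not need the exact split time: it resumes from the checkpoint ts.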

| username: mxd-321

When TiCDC initiates a region listening request, it includes the current checkpoint ts.

Where is this checkpoint ts maintained?

| username: neilshen

Checkpoint ts is maintained in TiCDC memory and also persisted to PD.

| username: mxd-321

Does TiCDC advance the checkpoint ts every time it retrieves data, or is there a specific ts event for that?

| username: neilshen

The checkpoint ts advances according to how far TiCDC has replicated to the downstream. TiCDC then persists the checkpoint ts periodically.
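
A minimal sketch of that behaviour, assuming the checkpoint moves forward in memory as the downstream confirms progress and is written out at a fixed interval. The `CheckpointTracker` class, its method names, the interval, and the plain dict standing in for the persistent store are all hypothetical and are not TiCDC's actual code.

```python
class CheckpointTracker:
    def __init__(self, initial_ts, persist_interval, store):
        self.checkpoint_ts = initial_ts
        self.persist_interval = persist_interval  # seconds between persists (assumed)
        self.store = store                        # stand-in for the persistent store
        self.last_persist_at = None

    def on_downstream_flushed(self, flushed_ts):
        """Advance the in-memory checkpoint; it never moves backwards."""
        self.checkpoint_ts = max(self.checkpoint_ts, flushed_ts)

    def maybe_persist(self, now):
        """Persist if at least persist_interval has passed since the last persist."""
        due = (
            self.last_persist_at is None
            or now - self.last_persist_at >= self.persist_interval
        )
        if due:
            self.store["checkpoint_ts"] = self.checkpoint_ts
            self.last_persist_at = now
        return due


store = {}
tracker = CheckpointTracker(initial_ts=100, persist_interval=10.0, store=store)

tracker.on_downstream_flushed(120)
tracker.maybe_persist(now=0.0)   # first call persists 120
tracker.on_downstream_flushed(150)
tracker.maybe_persist(now=5.0)   # too early: the store still holds 120
tracker.maybe_persist(now=10.0)  # interval elapsed: the store now holds 150
print(store)
```

In this model the persisted value can lag behind the in-memory one, so after a restart replication would resume from a slightly older checkpoint ts and re-read some changes instead of skipping any.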
