Some Questions About TiCDC, Kindly Requesting Expert Answers~

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于 TiCDC 的一些问题,恳请大佬解答一下~

| username: mxd-321

  1. If multiple changefeeds in TiCDC subscribe to the same table, does each task need to fetch data from the regions in TiKV? For example, if Task 1 fetches Table A and Task 2 also fetches Table A, do both tasks need to fetch data from TiKV?
  2. Based on Question 1, if each task needs to fetch data from the regions in TiKV, the number of gRPC connections will increase. Will this have a significant impact on the TiKV cluster?

Since I am a beginner in Go and cannot read the source code yet, I would sincerely appreciate it if the experts could answer these questions.

| username: neilshen | Original post link

  1. If multiple changefeeds in TiCDC subscribe to the same table, does each task need to fetch data from the region in TiKV? For example, Task 1 fetches Table A, and Task 2 also fetches Table A. Do both tasks need to fetch data from TiKV?

Yes, the data of Table A will be fetched twice.

  2. Based on question 1, if each task needs to fetch data from the region in TiKV, will the increase in gRPC connections significantly impact the TiKV cluster?

Different changefeeds on TiCDC reuse gRPC connections. In the current implementation, there is only one gRPC connection between a single TiCDC node and a single TiKV node, no matter how many changefeeds are running. The impact of TiCDC on TiKV mainly depends on the business load. If the workload is read-heavy and write-light, the impact is minimal; if it is write-heavy and read-light and the CPU load of the gRPC module in TiKV is already high, there might be some impact.
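To make the two answers above concrete, here is a rough toy model in Python (not TiCDC's actual Go code). It assumes each changefeed opens its own logical stream on a shared connection, which is my guess at how "fetched twice" and "one connection" fit together. The names `Connection`, `ConnectionPool`, and the address `tikv-1:20160` are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Stand-in for one gRPC connection to a TiKV store (hypothetical)."""
    store_addr: str
    streams: list = field(default_factory=list)

    def open_stream(self, changefeed: str, table: str) -> tuple:
        # Assumption: every subscription is its own logical stream,
        # so the same table data travels once per changefeed.
        stream = (changefeed, table)
        self.streams.append(stream)
        return stream


class ConnectionPool:
    """Keeps at most one connection per TiKV store address."""

    def __init__(self) -> None:
        self._conns = {}

    def get(self, store_addr: str) -> Connection:
        # Reuse the existing connection if there is one; otherwise create it.
        if store_addr not in self._conns:
            self._conns[store_addr] = Connection(store_addr)
        return self._conns[store_addr]

    def connection_count(self) -> int:
        return len(self._conns)


pool = ConnectionPool()
# Two changefeeds subscribe to the same table on the same store.
pool.get("tikv-1:20160").open_stream("task-1", "table_a")
pool.get("tikv-1:20160").open_stream("task-2", "table_a")

print(pool.connection_count())                # 1 connection
print(len(pool.get("tikv-1:20160").streams))  # 2 streams, so the data is sent twice
```

In this model, adding a changefeed adds a stream (more data for TiKV to send) but not a connection, which matches the point that the cost comes from the write load rather than from the connection count.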

| username: mxd-321 | Original post link

Does the write-heavy, read-light scenario refer to business writes? If so, TiKV not only needs to store the data but also send it to TiCDC, which indeed consumes some CPU. By the way, does TiCDC actively pull the changelog from TiKV, or does TiKV actively push it to TiCDC? I remember there is a TiKV CDC component. Is there any documentation for this component?

| username: neilshen | Original post link

TiCDC initiates the connection to TiKV. Once the connection is established, TiKV actively pushes data to TiCDC. Sending is flow-controlled: TiKV can send a message only when TiCDC is able to process it. Currently, there is no publicly available documentation on the design of the CDC component in TiKV.
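The push-with-flow-control behavior described above can be imitated with a bounded queue. This is only an analogy in Python, not how TiKV or gRPC implements it: the queue capacity, the event names, and the function names `tikv_sender` and `ticdc_receiver` are assumptions chosen for the example.

```python
import queue
import threading

CAPACITY = 2  # assumed buffer size, chosen only for illustration
EVENTS = 6    # number of change events the sender wants to push

channel = queue.Queue(maxsize=CAPACITY)
received = []


def tikv_sender() -> None:
    """Pushes events; blocks whenever the receiver has fallen behind."""
    for i in range(EVENTS):
        # put() blocks while the buffer is full, so the sender
        # can never run ahead of the receiver by more than CAPACITY.
        channel.put(f"change-{i}")
    channel.put(None)  # sentinel: no more events


def ticdc_receiver() -> None:
    """Consumes events; each get() frees a slot for the sender."""
    while True:
        event = channel.get()
        if event is None:
            break
        received.append(event)


receiver = threading.Thread(target=ticdc_receiver)
sender = threading.Thread(target=tikv_sender)
receiver.start()
sender.start()
sender.join()
receiver.join()

print(received)
```

The receiver never asks for individual events, so this is still a push model; the bounded buffer is what keeps a slow receiver from being flooded.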

| username: mxd-321 | Original post link

Okay, thank you, expert.
