TiDB Binlog Synchronization is a Bit Confusing, Seeking Clarification

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidbbinlog同步有点绕,求解惑

| username: LBX流鼻血

There are two TiDB clusters:
10.18.xx.xx - Production
10.16.xx.xx - Disaster Recovery
Pump has been deployed on both sides.
The slave has also been restored using TiDB Lightning.
The position (commit TS) is: 444969304030707734
Binlog is enabled on both sides (set to true).
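
As a sanity check on the position above, here is a minimal sketch that decodes it into wall-clock time so it can be compared with when the restore was taken. It assumes the value is a PD-issued TSO, where the low 18 bits are a logical counter and the remaining high bits are milliseconds since the Unix epoch. The function name `decode_tso` is hypothetical and is not part of any TiDB tooling.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

LOGICAL_BITS = 18  # low 18 bits of a TSO hold the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_tso(tso: int) -> Tuple[datetime, int]:
    """Split a TSO into (physical time in UTC, logical counter)."""
    physical_ms = tso >> LOGICAL_BITS
    logical = tso & ((1 << LOGICAL_BITS) - 1)
    return EPOCH + timedelta(milliseconds=physical_ms), logical


# The position quoted in this post.
when, logical = decode_tso(444969304030707734)
print(when.isoformat(), logical)
```

If the decoded time does not match the point at which the downstream data was exported, the position is probably not the right value for initial-commit-ts.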

If I want to start syncing from the master to the slave, should I configure drainer on the master node or the slave node? (My understanding is that it should be configured on the master with the position written in, and the data will then be sent downstream, but I have seen many people say to configure it on the slave node.)
If it is configured on the slave node, how is it associated with the master?

drainer_servers:
  - host: 10.16.xx.xx
    ssh_port: 22
    port: 8249
    deploy_dir: /tidbdata/binlog/drainer-8249/data
    data_dir: /tidbdata/binlog/drainer-8249
    log_dir: log
    config:
      initial-commit-ts: 444969304030707734
      syncer.db-type: "tidb"
      syncer.to.host: "10.16.xx.xx"
      syncer.to.user: "drainer"
      syncer.to.password: "xxx"
      syncer.to.port: 4000
    arch: amd64
    os: linux

| username: Jasper | Original post link

Drainer is configured as part of the primary (upstream) cluster. You only need to fill in the downstream connection details: IP, port, username, password, and so on. For the specific configuration steps, refer to the official documentation.

| username: LBX流鼻血 | Original post link

Is that all once the scale-out is done? There's no need to start any other programs or jobs afterward, right?

| username: hey-hoho | Original post link

The drainer node is scaled out into the primary cluster. What you have seen mentioned probably refers to placing the drainer machine in the secondary cluster's data center. To summarize: the drainer machine sits alongside the secondary database, but the drainer instance is scaled out into, and belongs to, the primary cluster.

| username: Fly-bird | Original post link

Once the cluster has both pump and drainer, the downstream simply acts as a replica.

| username: Jasper | Original post link

Yes, once pump and drainer are configured and scaled out, synchronization starts. You can check the synchronization status with the SHOW DRAINER STATUS statement.

| username: Soysauce520 | Original post link

If pump is already in place, adding a drainer to the primary cluster and giving it the downstream connection information can be understood as establishing a synchronization channel from the primary cluster to the downstream. My understanding is that pump receives the binlogs from each TiDB server, and drainer collects and processes the logs from all pumps. Drainer can write either to files or to a downstream database.
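
The data flow described above can be sketched as a toy model. This is not how Drainer is implemented. It only illustrates the idea of merging per-pump logs in commit-ts order and replaying those after the starting position. The names `Binlog` and `drain` are hypothetical, and treating the starting position as exclusive is an assumption made for the example.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Binlog(NamedTuple):
    commit_ts: int  # commit timestamp of the transaction
    payload: str    # stand-in for the actual row changes


def drain(pump_streams: Iterable[Iterable[Binlog]],
          initial_commit_ts: int) -> Iterator[Binlog]:
    """Merge per-pump streams (each already sorted by commit_ts) into one
    ordered stream and skip everything at or before the starting position."""
    merged = heapq.merge(*pump_streams, key=lambda b: b.commit_ts)
    for binlog in merged:
        # Assumption: the boundary is exclusive in this toy model.
        if binlog.commit_ts > initial_commit_ts:
            yield binlog


# Two pumps, each holding the binlogs it received from TiDB servers.
pump_a = [Binlog(100, "a"), Binlog(130, "c")]
pump_b = [Binlog(110, "b"), Binlog(140, "d")]

replayed = [b.payload for b in drain([pump_a, pump_b], initial_commit_ts=105)]
print(replayed)
```

In this model, the downstream only ever sees a single, ordered stream, which is why one drainer attached to the primary cluster is enough to feed the disaster recovery cluster.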