Switching the IP of the source data source in a TiDB-DM cluster

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tidb-DM集群的source数据源切换ip

| username: baofengyu

[TiDB Usage Environment] Production Environment
[TiDB Version]
TiDB Cluster: v6.1.6
TiDB-DM Cluster: v6.5.0
[Current Usage]
Using the TiDB-DM cluster to complete real-time synchronization from upstream MySQL to TiDB, using IP and port when creating the source.
Task status: [image]
[Encountered Problem: Problem Phenomenon and Impact]
The upstream MySQL database is now being switched over, so its IP will change. Can the binlog synchronization position of the existing task be modified so that the task keeps running normally after the IP switch?

| username: 连连看db | Original post link

You need to stop the task, make changes, and then restart the task. For details, refer to Data Source Management.

| username: Zhang_Zhi | Original post link

After stopping the original task, record the binlog position (pos) it has synchronized to, then start a new incremental synchronization task from that position. If only the IP has changed, just switch the data source.

| username: tidb狂热爱好者 | Original post link

I feel it’s still simple.

| username: baofengyu | Original post link

How do I configure DM to specify the sync pos point for incremental synchronization?

| username: mono | Original post link

Sure. Pause the task, add the new data source, specify the binlog position to synchronize from in the task configuration file, and start the task.

meta:
  binlog-name:
  binlog-pos:

| username: tidb菜鸟一只 | Original post link

There is no need to add a new task. Just stop the task with stop-task, then use the operate-source stop command to remove the source configuration corresponding to the original MySQL instance address from the DM cluster. Update the MySQL instance address in the source configuration file and use the operate-source create command to reload the new source configuration into the DM cluster. Then restart the data migration task with the start-task command.
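The sequence described above can be sketched as a small dry-run helper that only prints the dmctl commands in order instead of executing them. The master address, task name, and file names (`127.0.0.1:8261`, `my-task`, `source.yaml`, `task.yaml`) are placeholders assumed for illustration, not values from this thread.

```python
import shlex


def switch_source_commands(master_addr, task_name, source_config, task_config):
    """Return the dmctl invocations, in order, for pointing a source at a new IP.

    Nothing is executed here; the caller decides whether to run or just print them.
    """
    base = ["tiup", "dmctl", "--master-addr", master_addr]
    return [
        base + ["stop-task", task_name],
        base + ["operate-source", "stop", source_config],
        # Edit the MySQL host in source_config between the stop above
        # and the create below; keep source-id unchanged.
        base + ["operate-source", "create", source_config],
        base + ["start-task", task_config],
    ]


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    for argv in switch_source_commands(
        "127.0.0.1:8261", "my-task", "source.yaml", "task.yaml"
    ):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review the order before running anything against a production DM cluster.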

| username: mono | Original post link

No one above mentioned adding a new task. If you follow the steps you describe, you really do end up adding a new task, because all the tasks get deleted. If you’re not careful, you’ll fall into that trap. :joy:

| username: tidb菜鸟一只 | Original post link

Is there an operation to add a new task here? Isn’t it to stop the task first, then recreate the source, and then start the task?

| username: 考试没答案 | Original post link

Try to keep these two versions consistent. I remember version 6 has a bug.

| username: baofengyu | Original post link

Okay, thank you.

| username: baofengyu | Original post link

This requires the two instances to be in a master-slave relationship with GTID enabled.

| username: 有猫万事足 | Original post link

There is no need to enable GTID. He is correct.

Because the data source configuration already includes a unique ID for the data source.

source-id: t1

In the source configuration file, there will be a configuration item similar to the one above.

So if you run operate-source stop, change the MySQL instance IP in the source configuration, and then run operate-source create without changing the source-id, the result is an update to the existing data source configuration. In your case, that means an update to the IP.
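As a rough illustration of "change the IP, keep the source-id", the sketch below rewrites only the `host:` line of a source configuration and leaves every other line untouched. The sample configuration, the `from:` block layout, and both IP addresses are assumptions made for the example; check them against your own source file.

```python
import re

# Sample source configuration, assumed for illustration only.
SAMPLE_SOURCE = '''source-id: "t1"
from:
  host: "10.0.0.1"
  port: 3306
  user: "dm_user"
'''


def replace_host(source_yaml: str, new_host: str) -> str:
    """Replace the value of the first `host:` key and keep all other lines as-is."""
    updated, count = re.subn(
        r'^([ \t]*host:[ \t]*).*$',
        lambda m: f'{m.group(1)}"{new_host}"',
        source_yaml,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no 'host:' line found in source configuration")
    return updated


if __name__ == "__main__":
    print(replace_host(SAMPLE_SOURCE, "10.0.0.2"))
```

Because the source-id line is never touched, the task that references `t1` keeps pointing at the same logical data source.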

It has nothing to do with whether GTID is enabled upstream.

The suggestion to update the data source IP and create a new incremental task is also correct, but it is a bit roundabout. It can solve the problem, but it is not the most convenient way.

| username: baofengyu | Original post link

How do you handle the synchronization position after switching to the new IP? The binlog position of the new instance is unrelated to the old instance's position saved in the dm_meta database. After the task starts, will it begin synchronizing from the current position of the new instance? That would lose data, right?

| username: 有猫万事足 | Original post link

The issue with the synchronization position is as follows.

Each task has its own binlog position recorded in the dm_meta database.

There you can find the table <task-name>_syncer_checkpoint, which records the current binlog position.

Therefore, as long as the data source ID and task name remain unchanged, there is no need to worry about losing the binlog position.
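To look at the saved position before touching the source, something like the following could build the query against the checkpoint table. The column names (`binlog_name`, `binlog_pos`, `binlog_gtid`, `is_global`) are assumptions about the checkpoint table layout and should be verified with `SHOW CREATE TABLE` on your own cluster; `my_task` is a placeholder task name.

```python
def checkpoint_query(task_name: str) -> str:
    """Build a SELECT for the task's global checkpoint row in dm_meta.

    Column names are assumed, not confirmed by this thread; verify them
    against the actual table definition before relying on the result.
    """
    # Reject names that would not be safe inside a backtick-quoted identifier.
    if not task_name or not task_name.replace("_", "").replace("-", "").isalnum():
        raise ValueError(f"unexpected task name: {task_name!r}")
    return (
        "SELECT binlog_name, binlog_pos, binlog_gtid "
        f"FROM `dm_meta`.`{task_name}_syncer_checkpoint` "
        "WHERE is_global = 1"
    )


if __name__ == "__main__":
    # Placeholder task name, assumed for illustration only.
    print(checkpoint_query("my_task"))
```

Run the printed statement against the downstream TiDB with any MySQL client and note the result before stopping the task.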

| username: baofengyu | Original post link

Understood, I found the table where the binlog position is recorded. My worry is this: suppose the old instance's consumption position recorded in this table is mysql-bin|000045.log, but the latest binlog file of the new instance is mysql-bin|000005.log. If I follow the process above and start the task, it reads mysql-bin|000045.log from the database as its synchronization position, but the new instance has no such file or position at all. Will the task report an error, or will it start synchronizing from the latest binlog of the new instance? If it starts from the latest, couldn't data be lost?

| username: 有猫万事足 | Original post link

In this situation, simply changing the IP address will not solve the problem.

The task also needs to be rebuilt. In the task configuration, you can specify the starting binlog position for synchronization. Moreover, if the task name is the same, you need to add --remove-meta when starting the task. This will delete the four tables corresponding to the original task in dm_meta and completely rebuild the task.

| username: baofengyu | Original post link

Okay, thank you for the reply.

| username: Zhang_Zhi | Original post link

Incremental synchronization can be achieved by setting task-mode to incremental in the task configuration and specifying the starting binlog position for synchronization.
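A minimal sketch of what such a task configuration fragment might look like, rendered from Python so the position can be filled in from whatever was recorded on the new instance. The task name, source-id, and binlog coordinates below are placeholders, and the fragment omits the sections a real task file also needs (for example the target database settings).

```python
def incremental_task_yaml(task_name, source_id, binlog_name, binlog_pos):
    """Render the part of a DM task config that pins the incremental start point."""
    return (
        f'name: "{task_name}"\n'
        "task-mode: incremental\n"
        "mysql-instances:\n"
        f'  - source-id: "{source_id}"\n'
        "    meta:\n"
        f'      binlog-name: "{binlog_name}"\n'
        f"      binlog-pos: {int(binlog_pos)}\n"
    )


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    print(incremental_task_yaml("my-task", "t1", "mysql-bin.000005", 4))
```

As noted earlier in the thread, if the task name is reused, the task has to be started with --remove-meta so that the old checkpoint in dm_meta does not override the position given in meta.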