DM: Why does the preparation phase of a full export with Dumpling take so long?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: DM:dumpling全量导出时, 全量阶段准备开始的时间非常长,是怎么回事?

| username: 超7成网友

DM 2.0, using the mydumper configuration for the full export stage.
Description:
After the task starts, no data appears in the dump directory for a long time.
Normally, shortly after the task starts, the log line [All the dumping transactions have started] is printed, and files then appear in the dump directory. In this case, however, it took nearly two hours for [All the dumping transactions have started] to show up.

Relevant log excerpts:
[2023/01/18 01:44:31.490 +08:00] [INFO] [checkpoint.go:660] ["create checkpoint schema"] [task=ph_kefu-10.100.73.42] [unit="binlog replication"] [component="remote checkpoint"] [statement="CREATE SCHEMA IF NOT EXISTS dm_meta"]
[2023/01/18 01:44:33.068 +08:00] [WARN] [db.go:264] ["execute transaction"] [task=kefu] [unit="binlog replication"] [query="[CREATE TABLE IF NOT EXISTS dm_meta.kefu_syncer_checkpoint (\n\t\t\tid VARCHAR(64) NOT NULL,\n\t\t\tcp_schema VARCHAR(128) NOT NULL,\n\t\t\tcp_table VARCHAR(128) NOT NULL,\n\t\t\tbinlog_name VARCHAR(128),\n\t\t\tbinlog_pos INT UNSIGNED,\n\t\t\tbinlog_gtid TEXT,\n\t\t\texit_safe_binlog_name VARCHAR(128) DEFAULT '',\n\t\t\texit_safe_binlog_pos INT UNSIGNED DEFAULT 0,\n\t\t\texit_safe_binlog_gtid TEXT,\n\t\t\ttable_info JSON NOT NULL,\n\t\t\tis_global BOOLEAN,\n\t\t\tcreate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,\n\t\t\tupdate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n\t\t\tUNIQUE KEY uk_id_schema_table (id, cp_schema, cp_table)\n\t\t)]"] [argument=""] ["cost time"=1.577981416s]
[2023/01/18 01:44:33.068 +08:00] [INFO] [checkpoint.go:685] ["create checkpoint table"] [task=kefu] [unit="binlog replication"] [component="remote checkpoint"] [statements="[\"CREATE TABLE IF NOT EXISTS dm_meta.kefu_syncer_checkpoint (\n\t\t\tid VARCHAR(64) NOT NULL,\n\t\t\tcp_schema VARCHAR(128) NOT NULL,\n\t\t\tcp_table VARCHAR(128) NOT NULL,\n\t\t\tbinlog_name VARCHAR(128),\n\t\t\tbinlog_pos INT UNSIGNED,\n\t\t\tbinlog_gtid TEXT,\n\t\t\texit_safe_binlog_name VARCHAR(128) DEFAULT '',\n\t\t\texit_safe_binlog_pos INT UNSIGNED DEFAULT 0,\n\t\t\texit_safe_binlog_gtid TEXT,\n\t\t\ttable_info JSON NOT NULL,\n\t\t\tis_global BOOLEAN,\n\t\t\tcreate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,\n\t\t\tupdate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n\t\t\tUNIQUE KEY uk_id_schema_table (id, cp_schema, cp_table)\n\t\t)\"]"]
[2023/01/18 01:44:33.077 +08:00] [INFO] [subtask.go:223] ["start to run"] [subtask=kefu] [unit=Dump]
[2023/01/18 01:44:33.079 +08:00] [INFO] [log.go:77] ["detect server type"] [task=kefu] [unit=dump] [type=MySQL]
[2023/01/18 01:44:33.079 +08:00] [INFO] [log.go:77] ["detect server version"] [task=kefu] [unit=dump] [version=8.0.21]

[2023/01/18 03:07:14.680 +08:00] [INFO] [log.go:77] ["All the dumping transactions have started. Start to unlock tables"] [task=kefu] [unit=dump]
[2023/01/18 04:29:15.717 +08:00] [INFO] [log.go:77] ["backup Success summary: total backup ranges: 69, total success: 69, total failed: 0, total take(backup time): 1h22m1.007362539s, total take(real time): 1h22m1.007400392s, total size(MB): 13671.35, avg speed(MB/s): 2.78, total rows: 52397765"] [task=kefu] [unit=dump]
[2023/01/18 04:29:15.804 +08:00] [INFO] [dumpling.go:135] ["dump data finished"] [task=kefu] [unit=dump] ["cost time"=2h44m42.726804241s]

| username: Billmay表妹 | Original post link

Is the version of DM too outdated? Are there any plans to upgrade it soon?

| username: 超7成网友 | Original post link

There are no upgrade plans in the near future.

The issue might be due to other MySQL processes locking the table, preventing the dumper from obtaining a read lock.

| username: 裤衩儿飞上天 | Original post link

Are you importing into MySQL or TiDB?

| username: 超7成网友 | Original post link

Syncing data from MySQL to TiDB.

| username: 裤衩儿飞上天 | Original post link

When Dumpling exports from MySQL, the consistency guarantee it uses is flush, which means it executes FLUSH TABLES WITH READ LOCK before starting the dump transactions.
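Matching the log line "All the dumping transactions have started. Start to unlock tables" above, the flush consistency mode roughly corresponds to this SQL sequence (a sketch; the exact statements Dumpling issues may differ in detail):

```sql
-- Acquire a global read lock. This statement cannot complete until
-- every in-flight statement on the server has finished, and it
-- blocks new writes while it is held.
FLUSH TABLES WITH READ LOCK;

-- While the lock is held, each dump thread opens a transaction at
-- the same snapshot, so all threads export a consistent view.
START TRANSACTION WITH CONSISTENT SNAPSHOT;

-- Once every dump transaction has started, the lock is released
-- ("Start to unlock tables" in the log) and the actual export runs.
UNLOCK TABLES;
```

If anything on the server prevents the FLUSH from completing, the whole dump sits in the preparation phase, which matches the long gap seen in the logs above.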

| username: 超7成网友 | Original post link

So if other processes are running at that time, for example Canal replicating data over a long-lived connection, they could block the FLUSH and prevent the read lock from being acquired, and that would cause this problem, right?

| username: 裤衩儿飞上天 | Original post link

A long-lived connection by itself does not necessarily cause blocking.
Active transactions, running DDL, and flushing dirty pages to disk all have to finish before the lock can be acquired.
You can search for "mysql flush tables with read lock" for more details.
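One way to see what FLUSH TABLES WITH READ LOCK is waiting on is to list the long-running sessions while the flush is stuck. A sketch, assuming access to information_schema on the source MySQL:

```sql
-- Sessions currently executing a statement, longest-running first.
-- FLUSH TABLES WITH READ LOCK cannot complete until these finish,
-- and other sessions may pile up in state 'Waiting for table flush'.
SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC;

-- After confirming a blocking session is safe to stop:
-- KILL <id>;
```

Running this at the moment the dump is stuck should show which transaction or query (for example, a long Canal read) is holding things up.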

| username: 超7成网友 | Original post link

Hmm, let me take a look first.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.