Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: During DM replication there is no way to set SetConnMaxLifetime for the downstream, so occasional downstream connection interruptions produce spurious errors
Phenomenon
The online DM task occasionally triggers the DM_sync_process_exists_with_error alert. Upon investigation, the errors fall into basically two types:
Message: database driver, RawCause: driver: bad connection
and execute statement failed: begin\" RawCause:\"invalid connection\"
Cause
The downstream TiDB is set with wait_timeout=600. If the upstream MySQL is updated very infrequently, for example only once every half hour, DM's syncer connection gets killed by the downstream TiDB after sitting in the Sleep state for too long. The next update then triggers this error.
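For illustration only (this is not DM code), a minimal sketch of how the idle kill shows up from Go's database/sql, assuming a downstream TiDB at the placeholder address 127.0.0.1:4000 with wait_timeout=600:

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Hypothetical DSN: a downstream TiDB at 127.0.0.1:4000 with wait_timeout=600.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hold a single pooled connection so the server-side kill is observable.
	db.SetMaxOpenConns(1)
	db.SetMaxIdleConns(1)
	if err := db.Ping(); err != nil {
		log.Fatal(err)
	}

	// Stay idle longer than wait_timeout; TiDB closes the sleeping session.
	time.Sleep(11 * time.Minute)

	// The next statement lands on the dead connection. Depending on the driver
	// version and whether database/sql can safely retry, this either succeeds
	// on a fresh connection or fails with "driver: bad connection" /
	// "invalid connection" -- the two errors in the alert above.
	tx, err := db.Begin()
	if err != nil {
		log.Printf("begin after idle period: %v", err)
		return
	}
	_ = tx.Rollback()
}
```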
Suggested Fix
Add parameters to the DM task configuration so that users can set Golang's connection parameters themselves (exposing only ConnMaxLifetime would be enough). Alternatively, retry internally instead of returning an error. The main issue is that the occasional DM_sync_process_exists_with_error alert is really annoying.
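For reference, a minimal sketch of what such a setting would do if DM exposed it, using Go's standard database/sql API (the DSN and values are placeholders, not DM configuration):

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Hypothetical DSN for the downstream TiDB; values are placeholders.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Recycle connections well before the downstream wait_timeout (600s here),
	// so a statement never lands on a connection the server already killed.
	db.SetConnMaxLifetime(5 * time.Minute)

	// On Go 1.15+ the idle time can also be bounded directly.
	db.SetMaxIdleConns(4)
	db.SetConnMaxIdleTime(5 * time.Minute)
}
```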
Additionally, the name DM_sync_process_exists_with_error is itself a typo; it should be DM_sync_process_exits_with_error.
Will this cause any data consistency issues? It is acceptable as long as data synchronization is not affected after the warning is raised.
Thank you for the feedback. We will track the progress of the fix at SQL connection should tolerate being killed by idle too long · Issue #7376 · pingcap/tiflow · GitHub.
Setting a fixed value cannot handle idle periods of uncertain duration, so we will address this issue from another perspective.
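Purely as an illustration of the "tolerate the kill" direction (an assumption on our part, not the actual fix tracked in the tiflow issue above), a wrapper that retries once when the error looks like an idle connection killed by the server might look like this:

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"log"
	"strings"

	"github.com/go-sql-driver/mysql"
)

// retryOnBadConn retries fn once when the error looks like a connection that
// the server killed while it sat idle. This only sketches the general idea;
// it is not how the fix tracked in the tiflow issue is implemented.
func retryOnBadConn(fn func() error) error {
	err := fn()
	if err == nil {
		return nil
	}
	if errors.Is(err, driver.ErrBadConn) ||
		errors.Is(err, mysql.ErrInvalidConn) ||
		strings.Contains(err.Error(), "invalid connection") {
		// A second attempt takes a fresh connection from the pool.
		return fn()
	}
	return err
}

func main() {
	// Hypothetical DSN for the downstream TiDB.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := retryOnBadConn(func() error {
		_, execErr := db.Exec("SELECT 1")
		return execErr
	}); err != nil {
		log.Printf("still failing after retry: %v", err)
	}
}
```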
During the dump phase, if the backup includes large tables, some connections may go unused for a long time, and a driver: bad connection error can occur, causing the task to fail.
The new version should have fixed the issue during the dump phase. Which version of DM are you using?
That indeed shouldn’t happen. Please upload the logs.
Similar issues occur periodically, causing task failures. Later, after modifying the MySQL wait_timeout parameter, the export was successful.
[2022/10/17 14:56:59.918 +08:00] [INFO] [conn.go:70] ["cannot execute query"] [task=clear_task] [unit=dump] [retryTime=1] [sql="SHOW COLUMNS FROM `ying99_fundtxn`.`campain_discount`"] [args=null] [error="driver: bad connection"]
[2022/10/17 14:57:00.603 +08:00] [ERROR] [dumpling.go:152] ["dump data exits with error"] [task=clear_task] [unit=dump] ["cost time"=16m55.166129053s] [error="ErrCode:32001 ErrClass:\"dump-unit\" ErrScope:\"internal\" ErrLevel:\"high\" Message:\"mydumper/dumpling runs with error, with output (may empty): \" RawCause:\"sql: SHOW COLUMNS FROM `ying99_fundtxn`.`campain_discount`: driver: bad connection\" "]
Please provide the complete log.