Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: DM 7.1.1 reports batch write rows reach max retry 3 and still failed: invalid connection during the load phase
During the load phase of DM 7.1.1, the error batch write rows reach max retry 3 and still failed: invalid connection occurred. After executing resume-task, the task did not automatically recover.
The detailed error log is:
[2023/08/28 19:13:47.391 +08:00] [ERROR] [chunk_process.go:476] ["write to data engine failed"] [task=test_mysql_task] [unit=lightning-load] [table=`test`.`serviced_page`] [engineNumber=0] [fileIndex=0] [path=test.serviced_page.0000000000000.sql:0] [task=deliver] [error="[`test`.`serviced_page`] batch write rows reach max retry 3 and still failed: invalid connection"]
From the log it seems the SQL execution failed, and the failing SQL includes a field in which the business stores BASE64-encoded images.
In my experience, if the SQL execution fails there should be an error code and the specific SQL. At least on my side, when DM reports this invalid connection error it is usually because it cannot connect to the downstream TiDB; it might be an issue with the TiDB connection settings in the task file. Try connecting to TiDB with a MySQL client to check.
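For that check, a minimal sketch run with the same credentials DM uses for the downstream in the task file (host, port, and user are placeholders, not values from this thread):

-- connect with the mysql client first, e.g. mysql -h <downstream-host> -P <port> -u <user> -p
SELECT VERSION();                -- confirms the connection actually works
SHOW GRANTS FOR CURRENT_USER();  -- confirms the task user has the privileges it needs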
How large is the field stored after BASE64 encoding?
I guess you encountered a bug again.
I checked, and there are two JSON fields, the longest being about 11 MB. It seems the import logic was changed to use tidb-lightning after version 6.5, which caused this issue; in version 6.1, importing this table had no problems.
After manually deleting these two records from the dump file in the dm-worker directory, the synchronization was normal. Could it be that the length somewhere is stored as a uint16, which cannot represent values greater than 65535?
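As a rough way to confirm the sizes of those two records upstream, something like the following sketch could be used; id, json_col_1, and json_col_2 are hypothetical column names, only the table name comes from the error log:

-- LENGTH() returns bytes, which is what matters when comparing against packet limits
SELECT id,
       LENGTH(json_col_1) / 1024 / 1024 AS json_col_1_mb,
       LENGTH(json_col_2) / 1024 / 1024 AS json_col_2_mb
FROM test.serviced_page
ORDER BY GREATEST(LENGTH(json_col_1), LENGTH(json_col_2)) DESC
LIMIT 5;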
Uh, it’s not a BUG, it’s an issue with the downstream database’s parameter configuration. When testing DM to see whether MySQL → MySQL synchronization is possible, I temporarily used a Docker image. Its default max_allowed_packet is 4194304 (4 MB), and both of these records exceed 4 MB, so the inserts failed. After raising max_allowed_packet to 1 GB, the table imported successfully.
However, the strange thing is that the returned error does not include an error code; it just says the retries failed.
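For reference, a sketch of checking and raising the parameter on the downstream; 1073741824 bytes is the 1 GB value mentioned above, and since SET GLOBAL only affects new connections, DM has to reconnect (e.g. via resume-task) and the value should also be persisted in the server config to survive a restart:

SHOW VARIABLES LIKE 'max_allowed_packet';    -- 4194304 (4 MB) by default on the Docker image used here
SET GLOBAL max_allowed_packet = 1073741824;  -- 1 GB; requires SUPER or SYSTEM_VARIABLES_ADMIN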
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.