TiDB fails to synchronize large tables when using flink-sql-connector

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 使用flink-sql-connector同步时,大表不能成功同步 (TiDB: large tables cannot be synchronized successfully when using flink-sql-connector)

| username: TiDBer_CnmB0Ekn

Describe the bug (please use English)
When using Flink SQL to synchronize data from TiDB, tables with large amounts of data fail to synchronize, while tables with the same structure but less data synchronize successfully.
Environment:

  • Flink version: 1.15.1
  • Flink CDC version: flink-sql-connector-tidb-cdc-2.2.1.jar
  • Database and version: TiDB 6.1

To Reproduce
Steps to reproduce the behavior:

  1. The test data:
  2. The test code:
  3. The error:
    2022-08-18 11:19:34,325 WARN org.tikv.common.region.AbstractRegionStoreClient - no followers of region[4052] available, retry
    2022-08-18 11:19:34,326 WARN org.tikv.common.operation.RegionErrorHandler - request failed because of: UNKNOWN
    2022-08-18 11:19:34,333 INFO org.tikv.cdc.CDCClient - remove regions:
    2022-08-18 11:19:34,333 WARN org.apache.flink.runtime.taskmanager.Task - Source: cnft_item_backup[1] (1/1)ClassCastException: java.lang.Integer cannot be cast to java.lang.Long when using oracle connector #121 (c40c3c7e9651b3343fa3c8f33eadd159) switched from RUNNING to FAILED with failure cause: org.tikv.common.exception.TiClientInternalException: Error scanning data from region.
    at org.tikv.common.operation.iterator.ScanIterator.cacheLoadFails(ScanIterator.java:115)
    at org.tikv.common.operation.iterator.ConcreteScanIterator.hasNext(ConcreteScanIterator.java:105)
    at java.base/java.util.Iterator.forEachRemaining(Unknown Source)
    at org.tikv.txn.KVClient.scan(KVClient.java:117)
    at com.ververica.cdc.connectors.tidb.TiKVRichParallelSourceFunction.readSnapshotEvents(TiKVRichParallelSourceFunction.java:172)
    at com.ververica.cdc.connectors.tidb.TiKVRichParallelSourceFunction.run(TiKVRichParallelSourceFunction.java:127)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:332)
    Caused by: org.tikv.common.exception.TiClientInternalException: ScanResponse failed without a cause
    at org.tikv.common.region.RegionStoreClient.isScanSuccess(RegionStoreClient.java:315)
    at org.tikv.common.region.RegionStoreClient.scan(RegionStoreClient.java:306)
    at org.tikv.common.region.RegionStoreClient.scan(RegionStoreClient.java:346)
    at org.tikv.common.operation.iterator.ConcreteScanIterator.loadCurrentRegionToCache(ConcreteScanIterator.java:80)
    at org.tikv.common.operation.iterator.ScanIterator.cacheLoadFails(ScanIterator.java:80)
    … 8 more
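The snapshot phase fails with `Error scanning data from region` only on large tables, which suggests the initial full-table scan may be exceeding the TiKV client's gRPC limits. As a hedged sketch (table and column names below are hypothetical, and the exact option names should be verified against the Flink CDC 2.2.x documentation for the `tidb-cdc` connector), raising the gRPC timeouts in the source table's `WITH` clause is one thing worth trying:

```sql
-- Hypothetical Flink SQL DDL; column names and addresses are placeholders.
CREATE TABLE cnft_item_backup (
  id        BIGINT,
  item_name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'     = 'tidb-cdc',
  'pd-addresses'  = '127.0.0.1:2379',
  'database-name' = 'test',
  'table-name'    = 'cnft_item_backup',
  -- Larger timeouts for the snapshot scan of a big table
  -- (option names assumed from the connector docs; verify for your version):
  'tikv.grpc.timeout_in_ms'      = '60000',
  'tikv.grpc.scan_timeout_in_ms' = '60000'
);
```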


| username: xfworld | Original post link

Could this be caused by an incorrect data format? The log describes a type-conversion error:

2022-08-18 11:19:34,333 WARN org.apache.flink.runtime.taskmanager.Task - Source: cnft_item_backup[1] (1/1)ClassCastException: java.lang.Integer cannot be cast to java.lang.Long when using oracle connector #121 (c40c3c7e9651b3343fa3c8f33eadd159) switched from RUNNING to FAILED with failure cause: org.tikv.common.exception.TiClientInternalException: Error scanning data from region.

| username: TiDBer_CnmB0Ekn | Original post link

I am using the TiDB connector, not the Oracle connector. There is another question: why does the problem not occur when the table has only a small amount of data?

| username: TiDBer_CnmB0Ekn | Original post link

Does "data type mismatch" mean that a column type in TiDB is inconsistent with the corresponding column type declared in the Flink table?
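One common way such a mismatch arises (an illustration only; the table and column names below are hypothetical): TiDB's unsigned integer types are wider than their signed Java counterparts, so a TiDB `INT UNSIGNED` column generally needs to be declared as `BIGINT` in the Flink DDL. Declaring it as `INT` can surface as a cast error like `java.lang.Integer cannot be cast to java.lang.Long`:

```sql
-- Suppose the TiDB side is:
--   CREATE TABLE t (id INT UNSIGNED PRIMARY KEY, cnt BIGINT);
-- Then the Flink side should widen the unsigned column:
CREATE TABLE t (
  id  BIGINT,   -- TiDB INT UNSIGNED -> Flink BIGINT (not INT)
  cnt BIGINT,   -- TiDB BIGINT       -> Flink BIGINT
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'     = 'tidb-cdc',
  'pd-addresses'  = '127.0.0.1:2379',
  'database-name' = 'test',
  'table-name'    = 't'
);
```

Comparing every column in the TiDB schema against the connector's documented type-mapping table is a reasonable first check.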

| username: xfworld | Original post link

Check it yourself.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.