The tikv.sink.buffer-size does not take effect when using TiBigData's Flink-TiDB-Connector to write to TiDB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 使用TiBigData的Flink-TiDB-Connector写tidb的时tikv.sink.buffer-size不生效 (tikv.sink.buffer-size does not take effect when writing to TiDB with TiBigData's Flink-TiDB-Connector)

| username: TiDBer_o0MXVuK4

I encountered an issue while using TiBigData’s Flink-TiDB-Connector:

  1. Environment:

    • flink-tidb-connector-1.14
    • tidb 6.5.0
    • flink 1.14.3
  2. SQL:

    SET 'execution.runtime-mode' = 'batch';
    SET 'pipeline.name' = 'test';
    SET 'yarn.application.name' = 'test';

    CREATE CATALOG `mytidb`
    WITH (
      'type' = 'tidb',
      'tidb.database.url' = 'jdbc:tidb://10.0.44.83:3390/its_sjzt_jg?useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=1024&prepStmtCacheSize=128&rewriteBatchedStatements=true&allowMultiQueries=true&useConfigs=maxPerformance',
      'tidb.username' = 'xxxx',
      'tidb.password' = 'xxxx',
      'sink.max-retries' = '8',
      'tidb.sink.impl' = 'tikv',
      'tikv.sink.transaction' = 'minibatch',
      'tikv.sink.buffer-size' = '4096',
      'tikv.sink.deduplicate' = 'false'
    );
    
  3. Issue:

    Writing data is extremely slow, taking several hours. Upon investigation, I found that tikv.sink.buffer-size does not take effect in the PRE_WRITE phase: the connector still commits one transaction per row. See the images below for details:


| username: Billmay表妹 | Original post link

When using TiBigData’s Flink-TiDB-Connector to write to TiDB, if the tikv.sink.buffer-size parameter does not take effect, it may be due to the following reasons:

  1. Incorrect parameter name: in the Flink-TiDB-Connector, tikv.sink.buffer-size is reportedly an alias for the sink.buffer-flush.max-rows parameter. Using the wrong parameter name causes the setting to be silently ignored.

  2. Invalid parameter value: the value of sink.buffer-flush.max-rows must be a positive integer, giving the number of rows buffered before each commit. An invalid value is also ignored.

  3. Parameter not set correctly: in the Flink-TiDB-Connector, sink.buffer-flush.max-rows can be set programmatically, for example:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    // Buffer up to 1000 rows before flushing a batch to the sink.
    tableEnv.getConfig().getConfiguration().setInteger("sink.buffer-flush.max-rows", 1000);

    If the parameter is not set here or in the catalog's WITH clause, the connector's default batching behavior applies.

To resolve this issue, first check that the parameter name is correct, then that its value is valid, and finally that the parameter is actually being applied. If the problem persists, please provide more information, such as code snippets and log output, so we can better help you troubleshoot.
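Tying the checks above back to the catalog in the question, a hedged sketch is to set the batch size under both option names and see which one takes effect. Note that sink.buffer-flush.max-rows as a catalog-level option is an assumption here; the option names actually supported depend on your TiBigData release, so verify them against the connector's documentation:

```sql
-- Sketch only: same catalog as in the question, with the batch size set
-- under both names. Verify supported option names for your TiBigData version.
CREATE CATALOG `mytidb`
WITH (
  'type' = 'tidb',
  'tidb.database.url' = 'jdbc:tidb://10.0.44.83:3390/its_sjzt_jg?rewriteBatchedStatements=true',
  'tidb.username' = 'xxxx',
  'tidb.password' = 'xxxx',
  'tidb.sink.impl' = 'tikv',
  'tikv.sink.transaction' = 'minibatch',
  -- batch size must be a positive integer
  'tikv.sink.buffer-size' = '4096',
  'sink.buffer-flush.max-rows' = '4096'
);
```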

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.