Sync_diff_inspector Error

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: sync_diff_inspector报错

| username: zhanggame1

[TiDB Usage Environment] Test
[TiDB Version] 7.2
[Reproduction Path] What operations were performed to encounter the issue
[Encountered Issue: Problem Phenomenon and Impact]
sync_diff_inspector reports an error (error screenshot not preserved).

Version: (screenshot not preserved)

Configuration file, adapted from the official documentation:

[root@tidb tmp]# cat config.toml
# Diff Configuration.

######################### Global config #########################

# Number of threads to check data, the number of connections to the upstream and downstream databases will be slightly larger than this value
check-thread-count = 4

# If enabled, SQL statements for fixing inconsistencies will be output if tables are inconsistent.
export-fix-sql = true

# Only compare table structures without comparing data
check-struct-only = false

# If enabled, tables that do not exist in the upstream or downstream will be skipped.
skip-non-existing-table = false

######################### Datasource config #########################
[data-sources]
[data-sources.mysql] # mysql is a custom id that uniquely identifies this database instance, used in task.source-instances/task.target-instance below
    host = "10.0.0.26"
    port = 3306
    user = "root"
    password = "XX" # Password for connecting to this database instance; can be plaintext or Base64-encoded.

    # (Optional) Use mapping rules to match multiple upstream sharded tables, where rule1 and rule2 are defined in the Routes configuration section below
    # route-rules = ["rule1", "rule2"]

[data-sources.tidb]
    host = "10.0.0.26"
    port = 4000
    user = "root"
    password = "XX" # Password for connecting to this database instance; can be plaintext or Base64-encoded.

    # (Optional) Use TLS to connect to TiDB
    # security.ca-path = ".../ca.crt"
    # security.cert-path = ".../cert.crt"
    # security.key-path = ".../key.crt"

    # (Optional) Use TiDB's snapshot feature, if enabled, historical data will be used for comparison
    # snapshot = "386902609362944000"
    # When snapshot is set to "auto", use the synchronization time point of TiCDC in the upstream and downstream, refer to <https://github.com/pingcap/tidb-tools/issues/663>
    # snapshot = "auto"

########################### Routes ###########################
# If you need to compare data from a large number of tables with different schema or table names, or to verify data from multiple upstream sharded tables against a downstream consolidated table, you can set up mapping relationships through table-rule
# You can configure only the schema or table mapping relationship, or both
# [routes]
# [routes.rule1] # rule1 is a custom id that uniquely identifies this configuration, used in data-sources.route-rules above
# schema-pattern = "test_*"      # Match the schema name of the data source, supports wildcards "*" and "?"
# table-pattern = "t_*"          # Match the table name of the data source, supports wildcards "*" and "?"
# target-schema = "test"         # Target schema name
# target-table = "t" # Target table name

# [routes.rule2]
# schema-pattern = "test2_*"      # Match the schema name of the data source, supports wildcards "*" and "?"
# table-pattern = "t2_*"          # Match the table name of the data source, supports wildcards "*" and "?"
# target-schema = "test2"         # Target schema name
# target-table = "t2" # Target table name

######################### Task config #########################
# Configure the tables in the *target database* that need to be compared
[task]
    # output-dir will save the following information
    # 1 sql: SQL files generated after errors are detected, with one file per chunk
    # 2 log: sync-diff.log saves log information
    # 3 summary: summary.txt saves the summary
    # 4 checkpoint: a dir saves checkpoint information for resuming

    output-dir = "./output"

    # Upstream database, the content is the unique identifier id declared in data-sources
    source-instances = ["tidb"]

    # Downstream database, the content is the unique identifier id declared in data-sources
    target-instance = "mysql"

    # Tables in the downstream database that need to be compared, each table needs to include the schema name and table name, separated by `.`
    # Use ? to match any single character and * to match any sequence of characters; for detailed matching rules, see the Go regexp syntax: https://github.com/google/re2/wiki/Syntax
    target-check-tables = ["test.*"]

    # (Optional) Additional configuration for some tables, where config1 is defined in the Table config section below
    #target-configs = ["config1"]

######################### Table config #########################
# Special configuration for some tables, the configured tables must be included in task.target-check-tables
# [table-configs.config1] # config1 is a custom id that uniquely identifies this configuration, used in task.target-configs above
# Target table name, can use regex to match multiple tables, but a table cannot be matched by multiple special configurations.
# target-tables = ["schema*.test*", "test2.t2"]
# (Optional) Specify the range of data to be checked, needs to conform to the syntax of the where clause in SQL
# range = "age > 10 AND age < 20"
# (Optional) Specify the columns used to divide chunks, if not configured, sync-diff-inspector will select some appropriate columns (primary key/unique key/index)
# index-fields = ["col1","col2"]
# (Optional) Ignore the check of certain columns, for example, some types that sync-diff-inspector currently does not support (json, bit, blob, etc.),
# or floating-point data may differ between TiDB and MySQL, you can use ignore-columns to ignore checking these columns
# ignore-columns = ["",""]
# (Optional) Specify the chunk size for dividing this table, if not specified, it can be deleted or set to 0.
# chunk-size = 0
# (Optional) Specify the collation for this table, if not specified, it can be deleted or set to an empty string.
# collation = ""

| username: chenhanneu

Set range = "true" under the table config.

| username: zhanggame1

Is this the setting that needs to be changed?

# (Optional) Specify the range of data to be checked, which needs to conform to the syntax of the WHERE clause in SQL
range = "age > 10 AND age < 20"

| username: chenhanneu

# Target table name, can use regex to match multiple tables, but a table cannot be matched by multiple special configurations simultaneously.
target-tables = ["table.xxxx"]
# (Optional) Specify the range of data to be checked, needs to conform to the syntax of the where clause in SQL
range = "true and true"
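
Putting the replies together, the relevant parts of config.toml might look roughly like the fragment below. This is only a sketch: it assumes the special configuration should cover the same tables as target-check-tables ("test.*") and reuses the id config1 from the commented-out template in the original post. Neither choice is confirmed in the thread. Going by the comments in the posted config, target-configs under [task] presumably also has to be uncommented so that the table-configs section is actually referenced.

```toml
[task]
    output-dir = "./output"
    source-instances = ["tidb"]
    target-instance = "mysql"
    target-check-tables = ["test.*"]
    # Reference the special table configuration defined below
    # (config1 is the illustrative id taken from the template).
    target-configs = ["config1"]

[table-configs.config1]
# Assumption: apply this configuration to the same tables as target-check-tables.
target-tables = ["test.*"]
# Suggested in the replies: a WHERE condition that is always true.
range = "true"
```

The rest of the file (global options and [data-sources]) stays as posted.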