Error in binlog replication: Field 'user_id' doesn't have a default value

【TiDB Usage Environment】Production\Test Environment\POC
Production Environment

【TiDB Version】

【Encountered Problem】
Cluster A replicates to Cluster B through Drainer.
Replication was interrupted, and the Drainer log shows the following error:
[2022/09/05 14:52:50.640 +08:00] [ERROR] [executor.go:111] ["Exec fail, will rollback"] [error="Error 1364: Field 'user_id' doesn't have a default value"]

I then queried this row manually in Cluster B, and the query fails with the same error:
select user_id from xxx_tb where row_id=9223521735016135040;
ERROR 1364 (HY000): Field 'user_id' doesn't have a default value

Querying other columns of the same row does not produce an error, for example:
select opt from xxx_tb where row_id=9223521735016135040;

The table has a unique index on (xxx_id, user_id).

The following SQL shows that this data is already corrupted.
select * from xxx_tb where user_id='82946666';
ERROR 1105 (HY000): [components/tidb_query_executors/src/]: Data is corrupted, missing data for NOT NULL column (offset = 1)

I checked some posts on AskTUG, and this appears to be caused by a bug.

Now the questions are:

  1. What caused this bug, and in which version was it fixed?
  2. Is there a way to delete this one corrupted row, so that we can avoid rebuilding the downstream Cluster B?

According to the description on Git, it was already fixed in version 4.0.

Binlog is no longer recommended in 5.x; TiCDC is recommended instead…

For incorrect data, try this and check if there are any issues with the index and data:
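
The statement this reply pointed to did not survive in the thread. As a rough sketch of such a check, assuming it is run on the downstream Cluster B, that `xxx_tb` is the placeholder table name from the question, and that TiDB's `ADMIN CHECK TABLE` statement is what was meant, it might look like this:

```sql
-- Assumption: executed on Cluster B; xxx_tb is the placeholder name used in the question.
-- ADMIN CHECK TABLE compares the table's row data against all of its indexes.
ADMIN CHECK TABLE xxx_tb;
```

If the check passes, the statement returns an empty result; otherwise it returns an error describing the inconsistency between the data and the index.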

Is it not possible to delete it now?

Thank you, yes, there’s no way to delete this data.

Version 5.1 is already out, :smile:

Okay, okay.

Is the data on cluster A normal?

The data on cluster A is normal. There should be only one problematic piece of data on cluster B.

