Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: Under high concurrency [batch ON DUPLICATE KEY UPDATE]: AUTO_RANDOM(5) generates duplicates
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.5.3 (3 TiKV machines)
[Reproduction Path] Not yet reproducible
[Encountered Problem: Problem Phenomenon and Impact]:
I have a table in TiDB 4 (the same table structure exists in TiDB 6.5.3):
CREATE TABLE `ads_test` (
`id` bigint(20) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */ COMMENT 'Auto-increment Id',
`index_dt` datetime(3) DEFAULT '1970-01-01 00:00:00.000',
`store_id` varchar(64) DEFAULT NULL,
`visit_date_hour` datetime(3) DEFAULT NULL,
`visit_date` varchar(64) DEFAULT NULL,
`visit_hour` varchar(64) DEFAULT NULL,
`add_payment_info_cnt` bigint(20) DEFAULT NULL,
`etl_datetime` datetime(3) DEFAULT NULL,
`dt` varchar(64) DEFAULT '1970-01-01',
`hr` varchar(64) DEFAULT '00',
PRIMARY KEY (`id`),
KEY `dws_index_dt` (`index_dt`,`dt`,`hr`),
UNIQUE KEY `uk_store_id_visit_date_visit_hr` (`store_id`,`visit_date_hour`,`index_dt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![auto_rand_base] AUTO_RANDOM_BASE=2011348235 */ COMMENT='Test Table'
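For context, AUTO_RANDOM(5) on a signed BIGINT reserves the 5 bits after the sign bit as shard bits and fills the remaining 58 bits from an incremental allocator, so duplicates within a single cluster are not expected by design. The allocator state can be inspected like this (a sketch; to my knowledge this statement exists since TiDB 4.0, but treat the exact output columns as an assumption):

SHOW TABLE ads_test NEXT_ROW_ID;
-- The row with ID_TYPE = AUTO_RANDOM reports the next value of the incremental
-- part, which helps check whether the base was ever rewound or reused.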
I use TiCDC on TiDB 4 to synchronize data to TiDB 6.5.3 through the following pipeline:
TiDB 4 -> TiCDC -> Kafka -> Java code (JDBC write) -> TiDB 6.5.3
When writing to TiDB 6 via JDBC, I use INSERT INTO ... ON DUPLICATE KEY UPDATE syntax and insert the data in batches (each batch varies from 50 to 100 rows). The id column is not included in the statement, with the expectation that TiDB 6 will automatically generate a new id via AUTO_RANDOM.
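For reference, the statements emitted by the Java layer look roughly like this (a sketch with made-up values and a reduced column list; the real batches carry all non-id columns):

INSERT INTO ads_test
  (store_id, visit_date_hour, index_dt, add_payment_info_cnt, etl_datetime, dt, hr)
VALUES
  ('store_001', '2023-07-01 10:00:00.000', '2023-07-01 00:00:00.000', 3, NOW(3), '2023-07-01', '10'),
  ('store_002', '2023-07-01 10:00:00.000', '2023-07-01 00:00:00.000', 5, NOW(3), '2023-07-01', '10')
ON DUPLICATE KEY UPDATE
  add_payment_info_cnt = VALUES(add_payment_info_cnt),
  etl_datetime = VALUES(etl_datetime);
-- id is omitted, so TiDB fills it in via AUTO_RANDOM(5).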
The data volume is over 90 million rows. After synchronization, I found that some data was missing from TiDB 6. However, I log every synchronized row at the Java layer, and those logs confirm the rows were sent, yet the data is not in TiDB 6.
My current suspicion is that the data was written to TiDB 6, but a later insert generated a duplicate id via AUTO_RANDOM, and the earlier row was then overwritten through the ON DUPLICATE KEY UPDATE mechanism.
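If that suspicion is right, the failure mode would look like the following (the explicit ids stand in for two AUTO_RANDOM allocations that happened to collide; all values are hypothetical):

-- Earlier batch: the row for store_A gets id 123 from AUTO_RANDOM.
INSERT INTO ads_test (id, store_id, visit_date_hour, index_dt)
VALUES (123, 'store_A', '2023-07-01 10:00:00.000', '2023-07-01 00:00:00.000');

-- Later batch: a different business row, but the generated id collides.
-- The conflict is on the PRIMARY KEY (id), not on uk_store_id_visit_date_visit_hr,
-- so the statement silently updates store_A's row instead of inserting a new one.
INSERT INTO ads_test (id, store_id, visit_date_hour, index_dt)
VALUES (123, 'store_B', '2023-07-01 11:00:00.000', '2023-07-01 00:00:00.000')
ON DUPLICATE KEY UPDATE
  store_id = VALUES(store_id),
  visit_date_hour = VALUES(visit_date_hour);
-- Afterwards only one row with id = 123 exists, now holding store_B's values,
-- which matches the symptom of store_A's data being "lost".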