Issue with INSERT INTO ON DUPLICATE KEY UPDATE not working

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: INSERT INTO ON DUPLICATE KEY UPDATE失效问题

| username: xxxxxxxx

TiDB version: 4.0.13

sql_mode: STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION

Business table: it has an auto-increment primary key and a composite unique index uk_name(a, b, c); columns a, b, and c are bigint(20), varchar(100), and varchar(20) respectively.

Character set: utf8mb4

Collation: utf8mb4_bin

Phenomenon: Some INSERT INTO ... ON DUPLICATE KEY UPDATE statements report Query OK, 2 rows affected but do not actually take effect. The insert does not include the primary key column, only the composite unique index columns and other fields, and the update part touches non-unique-index fields. In other words, when a duplicate key is hit, the update part is not applied (the business data is sensitive, so screenshots cannot be shared).
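
A minimal sketch of the table and statement shape being described; since the real schema and data are sensitive, the table name, the columns a/b/c/col1, and all values below are placeholders:

CREATE TABLE business_table (
  id   bigint(20) NOT NULL AUTO_INCREMENT,
  a    bigint(20) NOT NULL,
  b    varchar(100) NOT NULL,
  c    varchar(20) NOT NULL,
  col1 varchar(100),               -- stands in for the non-unique-index fields that get updated
  PRIMARY KEY (id),
  UNIQUE KEY uk_name (a, b, c)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

-- The primary key is omitted from the insert; on a duplicate of uk_name,
-- only non-unique-index fields are updated.
INSERT INTO business_table (a, b, c, col1)
VALUES (1, 'b_value', 'c_value', 'new_value')
ON DUPLICATE KEY UPDATE col1 = VALUES(col1);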

However, after creating a new database in the same instance, creating the business table there with CREATE TABLE ... LIKE, and inserting the affected row into the new table, the same INSERT INTO ... ON DUPLICATE KEY UPDATE produces the expected result.
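
Sketched with the same placeholder names (prod_db and test_db are also placeholders):

CREATE DATABASE test_db;
CREATE TABLE test_db.business_table LIKE prod_db.business_table;

-- Copy the affected row into the new table
INSERT INTO test_db.business_table
SELECT * FROM prod_db.business_table
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';

-- Running the same INSERT ... ON DUPLICATE KEY UPDATE against test_db.business_table
-- then updates the row as expected.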

Moreover, this issue only affects some data in the business table.

Replacing INSERT INTO ... ON DUPLICATE KEY UPDATE with REPLACE shows the same phenomenon. Additionally, during testing it was found that executing and querying within a transaction works fine. The specific process is as follows (a SQL sketch of these steps is given after the list):

Test:

  1. Start a new transaction and record the current data through select.
  2. Execute INSERT INTO ON DUPLICATE KEY UPDATE (indicating Query OK, 2 rows affected).
  3. Query again and find that it matches the expectation (the update part is executed, and the data is updated).
  4. Commit the transaction (Query OK, 0 rows affected).
  5. Query again and find that the data has not been persisted, i.e., it is still the original data.
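
In SQL, with the same placeholders as above:

BEGIN;

SELECT col1 FROM business_table
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';      -- 1. record the current data

INSERT INTO business_table (a, b, c, col1)
VALUES (1, 'b_value', 'c_value', 'new_value')
ON DUPLICATE KEY UPDATE col1 = VALUES(col1);          -- 2. reports Query OK, 2 rows affected

SELECT col1 FROM business_table
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';      -- 3. the updated value is visible here

COMMIT;                                               -- 4. Query OK, 0 rows affected

SELECT col1 FROM business_table
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';      -- 5. still shows the original data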

Another test removed the ON DUPLICATE KEY UPDATE clause, leaving a plain INSERT INTO; it directly reported a unique key conflict, indicating that the unique index constraint itself is fine.
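
That check, with the same placeholders:

-- With the ON DUPLICATE KEY UPDATE clause removed, the bare insert is rejected,
-- so the unique constraint itself is enforced.
INSERT INTO business_table (a, b, c, col1)
VALUES (1, 'b_value', 'c_value', 'new_value');
-- ERROR 1062 (23000): Duplicate entry '...' for key 'uk_name'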

Tests on UPDATE and DELETE: a direct UPDATE (update table set col1='xxx', col2='yyy' where a = '' and b = '' and c = '') also did not take effect, and a DELETE (delete from table where a = '' and b = '' and c = '') had the same issue. The difference is that after the DELETE executes, the auto-increment primary key changes, but the other field data remains unchanged.
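
The same tests in sketch form (placeholder names and values):

UPDATE business_table SET col1 = 'xxx', col2 = 'yyy'
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';      -- reported as successful, but the row is unchanged

DELETE FROM business_table
WHERE a = 1 AND b = 'b_value' AND c = 'c_value';      -- the auto-increment primary key changes afterwards,
                                                      -- but the other field data remains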

Therefore, I would like to ask for troubleshooting ideas for this issue.

Additional Information
Table structure

Update test: updating the time field from 2023-01-11 to 2023-01-12; a preliminary query finds 25 matching rows.

The update statement reports 25 rows updated, but in reality only three rows were changed.
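
A sketch of this test; the actual column name is not shown in the post, so update_time is a placeholder:

SELECT count(*) FROM business_table WHERE update_time = '2023-01-11';   -- 25 rows

UPDATE business_table SET update_time = '2023-01-12'
WHERE update_time = '2023-01-11';                                       -- reports 25 rows matched and changed

SELECT count(*) FROM business_table WHERE update_time = '2023-01-12';   -- only 3 rows were actually changed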

Operation within a transaction


It can be seen that within the transaction, the operation meets expectations, but after committing and querying again, only some data is updated.

Execution plan

Summary of the issue:
When performing DML operations on data, some data exhibits DML anomalies:

  1. INSERT INTO ... ON DUPLICATE KEY UPDATE ---- matched on the unique key, the update part is not applied
  2. Direct UPDATE ---- filtering on the first field of the unique key, only some of the rows are actually changed
  3. DELETE ---- matched on the unique key; the auto-increment primary key changes after the delete, but the other data remains

Only some of the data is affected, not all of it, and the problem cannot be reproduced consistently in other environments.

| username: db_user | Original post link

Could you provide the complete ON DUPLICATE KEY UPDATE statement for us to take a look? It seems there might be an issue with the way the statement is written.

| username: 特雷西-迈克-格雷迪 | Original post link

Steps 1/2/3 are examples of optimistic locking, right?

| username: xxxxxxxx | Original post link

The data is quite sensitive, so it is not convenient to share. The statements themselves are fine and the SQL format is the same; other data works, but a portion of the data has issues.

| username: xxxxxxxx | Original post link

The general log shows it is a pessimistic transaction
txn_mode=PESSIMISTIC
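
Besides the general log, the transaction mode can also be confirmed via the tidb_txn_mode system variable:

SHOW VARIABLES LIKE 'tidb_txn_mode';   -- returns "pessimistic" here, matching txn_mode=PESSIMISTIC in the general log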

| username: db_user | Original post link

Could you provide the TiDB logs and a table structure that reproduces the issue?

When operating on the same table, are some data entries successful while others are not?

What is the data type and length of the unique key? Additionally, it would be best if you could provide the table structure, especially the information on primary keys and unique index fields.

| username: xxxxxxxx | Original post link

The data length is far from the limit, and the tables with the same structure in other databases under the same instance have no issues. Moreover, only some data has problems. I just tested a new issue: direct updates do not update, and direct deletes do not delete.

Update returns: Rows matched: 1 Changed: 1 Warnings: 0
Delete returns: Query OK, 1 row affected

| username: db_user | Original post link

Is the unique key char, varchar, bigint, or something else? Is the unique index a single field or a composite of multiple fields? Has the cluster enabled the new collation framework? Is the table collation _bin or _ci? What do the TiDB logs show for this operation? If you can't share this information, you can check these points yourself and see whether there is a corresponding issue on GitHub, or work out a minimal reproduction and post that.
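
Most of these points can be read directly from the table definition (the table name below is a placeholder):

SHOW CREATE TABLE business_table;   -- shows the unique key column types, whether the index is composite,
                                    -- and the table charset/collation (_bin vs _ci)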

| username: 脚本小王子 | Original post link

In TiDB v5.3.0, no issues were found following the steps provided by the original poster. Could the original poster reproduce the issue using the test data?

Here is my test:

###################################
#     Create table and prepare test data #
###################################
mysql> create table t3(id int not null auto_increment primary key, name varchar(30) not null, sex tinyint);
Query OK, 0 rows affected (0.14 sec)

mysql> create unique index uniq_name on t3(name);
Query OK, 0 rows affected (2.92 sec)

mysql> insert into t3 values(1, 'a1', 0), (2,'a2', 1), (3,'a3',0);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t3;
+----+------+------+
| id | name | sex  |
+----+------+------+
|  1 | a1   |    0 |
|  2 | a2   |    1 |
|  3 | a3   |    0 |
+----+------+------+
3 rows in set (0.01 sec)


###################################
#     Test according to the original poster's steps #
###################################

# 1. Start a new transaction and record the current data through select
mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> select * from t3;
+----+------+------+
| id | name | sex  |
+----+------+------+
|  1 | a1   |    0 |
|  2 | a2   |    0 |
|  3 | a3   |    0 |
+----+------+------+
3 rows in set (0.00 sec)


# 2. Execute INSERT INTO ON DUPLICATE KEY UPDATE (indicated Query OK, 2 rows affected)
mysql> insert into t3(name, sex) values('a2', 1) ON DUPLICATE KEY UPDATE name=values(name), sex=values(sex); 
Query OK, 2 rows affected (0.01 sec)


# 3. Query again, and it matches the expectation (the update part was executed, and the data was updated)
mysql> select * from t3;
+----+------+------+
| id | name | sex  |
+----+------+------+
|  1 | a1   |    0 |
|  2 | a2   |    1 |
|  3 | a3   |    0 |
+----+------+------+
3 rows in set (0.01 sec)

# 4. Commit the transaction (Query OK, 0 rows affected)
mysql> commit;
Query OK, 0 rows affected (0.01 sec)


# 5. Query again, the data is persisted, which is different from the original poster's result (the original poster's conclusion was: 5. Query again, and the data was not persisted, i.e., it was still the original data.)
mysql> select * from t3;
+----+------+------+
| id | name | sex  |
+----+------+------+
|  1 | a1   |    0 |
|  2 | a2   |    1 |
|  3 | a3   |    0 |
+----+------+------+
3 rows in set (0.00 sec)


# Test environment
mysql> select version();
+--------------------+
| version()          |
+--------------------+
| 5.7.25-TiDB-v5.3.0 |
+--------------------+
1 row in set (0.00 sec)

| username: xxxxxxxx | Original post link

Updated the content, please take another look at the post.

| username: db_user | Original post link

Please check if this parameter is enabled:
show config where name like '%new_collations_enabled_on_first_bootstrap%';

Also, check whether the data that does not work is affected by case, by comparing lookups with lower() and upper(). It seems long varchar values may be handled differently under different collation rules. I'll search for related issues.
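
A sketch of those checks (business_table, column b, and the value are placeholders):

show config where name like '%new_collations_enabled_on_first_bootstrap%';

-- Compare an exact-match lookup with a case-folded one on an affected unique-key value
SELECT * FROM business_table WHERE b = 'b_value';
SELECT * FROM business_table WHERE lower(b) = lower('b_value');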

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.