Which has better performance: an update statement that uses an index to update data or one that doesn't?

translator_bot · July 10, 2024, 12:31pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: update语句走索引更新到数据和没更新到，哪个性能更好

| username: 学无止境

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] 6.5.3
[Reproduction Path] Found during testing that update did not update the data, and the performance is worse than updating
[Encountered Problem: Problem Phenomenon and Impact] Unclear reason
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]

translator_bot · July 11, 2024, 11:30am

| username: zhanggame1 | Original post link

I guess not finding the data makes the retrieval slower. An ordered search is generally a binary search, and not finding the data might result in more search attempts.

translator_bot · July 11, 2024, 11:31am

| username: Kongdom | Original post link

It should not be slower if not updated.
Not updated means that all data was searched but the update target was not found.
Updated means that the amount of data searched was less than or equal to all data, and the update target was found.
In terms of data volume alone, not updated should be slower.

translator_bot · July 11, 2024, 11:31am

| username: tidb菜鸟一只 | Original post link

Check the execution plan and compare them. The one that scans more rows will be slower.

translator_bot · July 11, 2024, 11:31am

| username: zhaokede | Original post link

“Practice is the only criterion for testing truth. It’s better to prepare some data yourself and look at the execution plan. Sometimes experience alone can’t guarantee accuracy.”

translator_bot · July 11, 2024, 11:31am

| username: 像风一样的男子 | Original post link

I think update also first queries the data and then performs the insert operation based on the found data. You can change update to select and see how the performance is.

translator_bot · July 11, 2024, 11:31am

| username: 连连看db | Original post link

It is not reasonable to infer speed solely based on the amount of data.

translator_bot · July 11, 2024, 11:31am

| username: h5n1 | Original post link

Please provide the explain analyze for both.

translator_bot · July 11, 2024, 11:31am

| username: 学无止境 | Original post link

It seems to be not rigorous enough. If the value matched by the where condition does not have an index, it may be slower to match. In the case of having an index, it is faster to match. But it is not absolute and may also be related to the amount of data. Currently, no information has been found in this regard, and it is unclear if there are any professional results.

translator_bot · July 11, 2024, 11:31am

| username: Kongdom | Original post link

Now that you mention it, isn’t the update mechanism equivalent to delete + add?

translator_bot · July 11, 2024, 11:31am

| username: 有猫万事足 | Original post link

The same issue of not being able to match. So, there is no subsequent delete + update operation. It’s actually just the time for a pure select.

The difference between having an index and not having an index for a select. Isn’t it obvious?

translator_bot · July 11, 2024, 11:31am

| username: TIDB-Learner | Original post link

What data is being tested? Under what circumstances is it updated or not updated? I feel the original poster should differentiate between different situations. For example, using an index to update one piece of data: update t set c='1' where a='2'. The condition is 2 and 10,000,000, where the latter has no data and the update fails. Which case is more time-consuming, with the index order being ASC or DESC?

translator_bot · July 15, 2024, 3:06am

| username: 濱崎悟空 | Original post link

Take a look at the execution plan.