Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: update语句走索引更新到数据和没更新到,哪个性能更好
[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] 6.5.3
[Reproduction Path] Found during testing that update did not update the data, and the performance is worse than updating
[Encountered Problem: Problem Phenomenon and Impact] Unclear reason
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]
1 Like
I guess not finding the data makes the retrieval slower. An ordered search is generally a binary search, and not finding the data might result in more search attempts.
It should not be slower if not updated.
Not updated means that all data was searched but the update target was not found.
Updated means that the amount of data searched was less than or equal to all data, and the update target was found.
In terms of data volume alone, not updated should be slower.
1 Like
Check the execution plan and compare them. The one that scans more rows will be slower.
“Practice is the only criterion for testing truth. It’s better to prepare some data yourself and look at the execution plan. Sometimes experience alone can’t guarantee accuracy.”
I think update also first queries the data and then performs the insert operation based on the found data. You can change update to select and see how the performance is.
It is not reasonable to infer speed solely based on the amount of data.
Please provide the explain analyze for both.
It seems to be not rigorous enough. If the value matched by the where condition does not have an index, it may be slower to match. In the case of having an index, it is faster to match. But it is not absolute and may also be related to the amount of data. Currently, no information has been found in this regard, and it is unclear if there are any professional results.
Now that you mention it, isn’t the update mechanism equivalent to delete + add?
The same issue of not being able to match. So, there is no subsequent delete + update operation. It’s actually just the time for a pure select.
The difference between having an index and not having an index for a select. Isn’t it obvious?
What data is being tested? Under what circumstances is it updated or not updated? I feel the original poster should differentiate between different situations. For example, using an index to update one piece of data: update t set c='1' where a='2'
. The condition is 2 and 10,000,000, where the latter has no data and the update fails. Which case is more time-consuming, with the index order being ASC or DESC?
Take a look at the execution plan.