Is there a performance difference between updating a single field value and updating an entire row value in TiDB?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 在TIDB中指定更新一个字段的值和更新整个一行的值性能有差异吗

| username: TiDBer_G64jJ9u8

[Test Environment for TiDB]
[TiDB Version] 6.5.1
[Reproduction Path] The business specifies updating the corresponding field, but lazily updates the entire row.
[Encountered Problem: Problem Phenomenon and Impact]

According to the data read/write principles of TiKV, whether modifying or adding, the new version is written to overwrite the old version of the KV. Is it correct to assume that updating one field and updating the entire row have no significant difference in overhead? Please clarify this doubt, TiDB experts.

| username: TiDBer_jYQINSnf | Original post link

If there are no indexes, the difference is not significant. If there are indexes, then it does have an impact. Updating one column only requires adjusting the index for that column, but updating multiple columns requires adjusting the indexes for those columns. That’s what I think.

| username: 濱崎悟空 | Original post link

There are differences.

| username: 小龙虾爱大龙虾 | Original post link

I feel it’s almost there. :joy_cat:

| username: 像风一样的男子 | Original post link

I think you understood correctly, updating one field and multiple fields in a row should have the same overhead.

| username: Kongdom | Original post link

I think it should be the same. Key-value, whether it’s a single field or multiple fields, should all be in the value.

| username: TiDBer_jYQINSnf | Original post link

If there is an index, it will combine the indexes of other columns into a key-value submission, which will add several rows. If there is a unique index, it also needs to check for conflicts. Without an index, the cost is the same, but with an index, there is still a significant difference, right?

| username: tidb菜鸟一只 | Original post link

There is no difference. In TiDB, when you update a field, you also need to pull out all the original fields and set a new value for them.

| username: zhanggame1 | Original post link

There is no difference; in TiDB, inserts, updates, and deletes all involve inserting a new key-value pair.

| username: TIDB-Learner | Original post link

Each has its own reasoning. It depends on the situation, and I believe there are differences. If a row has 2 fields versus 1000 (or more, at the extreme) fields, just thinking about it is worrisome.

| username: Kongdom | Original post link

:thinking: Yes, you are talking about the difference between updating indexed columns and non-indexed columns.

| username: FutureDB | Original post link

Theoretically, there shouldn’t be much difference, but you can test whether there will be a significant impact on actual network transmission and memory consumption.

| username: TiDBer_jYQINSnf | Original post link

Updating a row will naturally update the indexed columns if there are any indexes. :grinning:

| username: 小于同学 | Original post link

There are differences, right?

| username: zhaokede | Original post link

Students with resources can test the differences.

| username: forever | Original post link

In theory, if there are no indexes, the performance should be the same.

| username: lemonade010 | Original post link

Key-value values should not have much difference. After all, an update also requires updating the value and writing it to TiKV.

| username: ziptoam | Original post link

It should be about the same, both are additions.

| username: ziptoam | Original post link

However, there may be some additional network overhead, differences in indexing, etc., which might have a slight impact.

| username: zhaokede | Original post link

The network overhead during submission? If the fields are not particularly numerous or large, this should be negligible.