Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 聚簇表的row id 选择
Excuse me, teachers. I saw the following sentence on the official website. My question is, if the primary key of a clustered table is of type char or varchar, how is the row id of this clustered table formed?
I also looked through the documentation, and it seems that there is no mention of how the rowid is generated for a non-integer clustered table primary key.
For non-clustered tables, a _tidb_rowid is automatically generated. I guess it’s also automatically generated for varchar, but just not queryable.
All RowID
in a table are within the range [0, MaxInt64)
From this, it appears that rowid is of int type.
I am also curious about this issue.
Hide the handle id, you can think of it as an auto-increment bigint type primary key.
You can check out the materials related to clustered and non-clustered tables. 聚簇索引 | PingCAP 文档中心
Isn’t that essentially a non-clustered table?
My understanding is this: if the primary key of a clustered table is char or varchar, it is actually the same as int. Its key (RowID) is tablePrefix{TableID}_recordPrefixSep{Col1}. The only difference is that if the primary key is int and auto-incremented, it will generate a write hotspot, whereas a character type will not. For non-clustered tables, their key is tablePrefix{TableID}_recordPrefixSep{_Tidb_RowID}, which means the key is concatenated by _Tidb_RowID (automatically generated only for non-clustered tables). Since _Tidb_RowID is also an auto-incremented int type, it will generate a write hotspot by default. You can use the SHARD_ROW_ID_BITS parameter to scatter it, thereby alleviating the write hotspot issue.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.