Choosing the Row ID for Clustered Tables

translator_bot · June 22, 2024, 9:18pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 聚簇表的row id 选择

| username: Raymond

Excuse me, teachers. I saw the following sentence on the official website. My question is, if the primary key of a clustered table is of type char or varchar, how is the row id of this clustered table formed?

translator_bot · June 22, 2024, 9:18pm

| username: 我是咖啡哥 | Original post link

I also looked through the documentation, and it seems that there is no mention of how the rowid is generated for a non-integer clustered table primary key.
For non-clustered tables, a _tidb_rowid is automatically generated. I guess it’s also automatically generated for varchar, but just not queryable.

All RowID in a table are within the range [0, MaxInt64)

From this, it appears that rowid is of int type.

translator_bot · June 22, 2024, 9:18pm

| username: Raymond | Original post link

I am also curious about this issue.

translator_bot · June 22, 2024, 9:18pm

| username: WalterWj | Original post link

Hide the handle id, you can think of it as an auto-increment bigint type primary key.

You can check out the materials related to clustered and non-clustered tables. 聚簇索引 | PingCAP 文档中心

translator_bot · June 22, 2024, 9:18pm

| username: Raymond | Original post link

Isn’t that essentially a non-clustered table?

translator_bot · June 22, 2024, 9:18pm

| username: tidb菜鸟一只 | Original post link

My understanding is this: if the primary key of a clustered table is char or varchar, it is actually the same as int. Its key (RowID) is tablePrefix{TableID}_recordPrefixSep{Col1}. The only difference is that if the primary key is int and auto-incremented, it will generate a write hotspot, whereas a character type will not. For non-clustered tables, their key is tablePrefix{TableID}_recordPrefixSep{_Tidb_RowID}, which means the key is concatenated by _Tidb_RowID (automatically generated only for non-clustered tables). Since _Tidb_RowID is also an auto-incremented int type, it will generate a write hotspot by default. You can use the SHARD_ROW_ID_BITS parameter to scatter it, thereby alleviating the write hotspot issue.

translator_bot · June 22, 2024, 9:18pm

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.