[TiDB Usage Environment] Production Environment
[TiDB Version] v7.1.0
Should every table have an AUTO RANDOM primary key?
In my business, the business primary key is a UUID formatted string. Should I use the business primary key or a bigint AUTO RANDOM primary key?
I saw in the official documentation that a hotspot issue can occur with a large number of writes, so I created an AUTO RANDOM ID for each table. What constitutes a large number of writes?
I tested writing 50,000 records per second using auto-increment on a regular consumer SSD without any issues. I guess you don’t have such a requirement either.
In my tests with TiDB, the SSD performance is not too bad, but single table writes consume a lot of CPU and memory resources. To improve speed, it’s more effective to increase the performance of TiDB nodes, while adding TiKV nodes doesn’t have much impact.
Can you please explain in what situations it is not suitable? I have many tables with composite primary keys, how should I handle this situation better?
The composite primary key uses a hidden auto-increment bigint column. Whether there is a bottleneck, I think it still needs to be stress-tested to see.
If you don’t understand, just ask. What does the model refer to here?
The reason I want to use meaningless IDs is due to two considerations:
Many of my tables have primary keys that are composite keys, and the scale is very large, such as tens of billions. I am worried that a large number of writes will cause hotspots.
Similarly, for a table with a data scale of tens of billions, I might create three to four indexes. If I use business primary keys like UUID, the index size will take up too much space.
There is no issue with writing to the composite primary key, and clustered tables write much faster than non-clustered tables.
You can test the write issue and you’ll see that it is greatly related to the number of indexes on the table. Adding an index significantly reduces the speed.
If you care about write speed, don’t add extra indexes.
If there are business requirements and dependencies on auto-increment for deep pagination, you can use auto-increment. Otherwise, it is generally recommended to use random.