Are there any best practices for TiDB scaling?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB扩容有没有什么最佳实践

| username: TiDBer_Di6thIjv

[TiDB Usage Environment] Production

[TiDB Version]
v6.1.0

[Encountered Problem]
We need to set up a new cluster with a final capacity of around 10TB, but initially, we don’t want to deploy too many machines. Are there any best practices for scaling out? For example, if I want to start with a 3-node setup and then gradually scale out, is there anything I need to pay attention to? How much impact will scaling out have on performance?

| username: 我是咖啡哥 | Original post link

Scaling out is very convenient, and starting small and growing later is the normal way to do it; that is the whole point of scalability. Adding storage nodes (TiKV, TiFlash) involves data migration, so do it during off-peak periods.

| username: 张雨齐0720 | Original post link

The official documentation contains specific steps. You can try it out in a test environment first.

Official documentation:

| username: 边城元元 | Original post link

  • TiKV disk size configuration recommendation: PCI-E SSD should not exceed 2 TB, regular SSD should not exceed 1.5 TB.
  • TiFlash supports multi-disk deployment.

Scale out and in a TiDB cluster using TiUP | PingCAP Documentation Center

Keep in mind that the data rebalancing scheduled during scale-out affects overall performance.

| username: cs58_dba | Original post link

So a single disk should not exceed 2 TB, right? If the whole data directory sits on a RAID 5 array for redundancy, can the directory itself exceed 2 TB?

| username: 啦啦啦啦啦 | Original post link

The limit is not per disk. The recommendation is that the data stored on a single TiKV node should not exceed 2 TB; otherwise a single store ends up with too many Regions.
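As a rough illustration of how store size drives Region count, here is a minimal sketch. The 96 MiB average Region size is an assumption (check `coprocessor.region-split-size` on your own cluster), real Regions vary in size, and the function name is hypothetical.

```python
import math

MIB = 1024 ** 2
TIB = 1024 ** 4

def estimate_regions_per_store(store_data_bytes, avg_region_bytes=96 * MIB):
    """Rough Region count for one TiKV store.

    Assumption: every Region is about avg_region_bytes in size.
    """
    return math.ceil(store_data_bytes / avg_region_bytes)

# Compare a few store sizes to see how the Region count grows.
for size_tib in (1, 2, 4):
    regions = estimate_regions_per_store(size_tib * TIB)
    print(f"{size_tib} TiB per store -> about {regions} Regions")
```

Under that assumption the count grows linearly with the data on the store, which is why the advice is to cap the node size and add nodes instead.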

  • TiKV disk size configuration recommendation: PCI-E SSD should not exceed 2 TB, and regular SSD should not exceed 1.5 TB.

| username: cs58_dba | Original post link

So with my standard 3-TiKV-node setup, where each node holds a full replica, the database itself cannot exceed 2 TB. If I want to go beyond 2 TB, I have to use 4 or more nodes?

| username: 啦啦啦啦啦 | Original post link

:rofl: I didn’t quite follow. The 2 TB limit applies to the data on a single TiKV instance, so even with 4 or more nodes, no single instance should exceed 2 TB, right?

| username: cs58_dba | Original post link

What I mean is: a 3 TB database with 3 replicas comes to 9 TB of data in total. If that is spread across 5 TiKV nodes, each node stores less than 2 TB, right? :sob:
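The arithmetic in this question can be sketched as follows. It assumes data is balanced evenly across stores and ignores compression and free-space headroom; the function name is hypothetical.

```python
def data_per_node_tb(logical_tb, replicas, tikv_nodes):
    """Average data per TiKV node, assuming perfectly even balancing."""
    if tikv_nodes < replicas:
        raise ValueError("need at least as many TiKV nodes as replicas")
    return logical_tb * replicas / tikv_nodes

# 3 TB of data, 3 replicas, 5 TiKV nodes: 9 TB total spread over 5 nodes.
print(data_per_node_tb(3, 3, 5))  # 1.8
```

With 3 replicas on exactly 3 nodes, every node holds a full copy, so the per-node figure equals the logical data size.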

| username: 啦啦啦啦啦 | Original post link

Right. During planning, the disk is typically fixed at around 2 TB per node. When a single instance's disk is almost full, just add more nodes.
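Turning that rule of thumb around gives a rough minimum node count for a target data size. This is only a sketch: it treats the 10 TB from the original question as logical (pre-replication) data, which is an assumption, ignores compression and headroom, and uses a hypothetical function name.

```python
import math

def min_tikv_nodes(logical_tb, replicas=3, per_node_limit_tb=2.0):
    """Smallest TiKV node count that keeps each node at or under the limit."""
    nodes = math.ceil(logical_tb * replicas / per_node_limit_tb)
    # A cluster needs at least one node per replica.
    return max(nodes, replicas)

# Node counts as the data set grows from 1 TB toward 10 TB.
for tb in (1, 3, 10):
    print(f"{tb} TB of data -> at least {min_tikv_nodes(tb)} TiKV nodes")
```

Starting with 3 nodes and adding more as the per-node usage approaches the limit follows the same calculation step by step.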

| username: cs58_dba | Original post link

The more you debate, the clearer the truth becomes. Learned a lot, bro.
