Can a standalone TiDB be converted to a cluster, or can a standalone TiDB be scaled?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 单机部署的Tidb能否转化集群,或者说单机tidb能扩容吗

| username: TiDBer_微风轻吟

[TiDB Usage Environment] Production Environment
[TiDB Version] 7.3.0
[Encountered Issues: Problem Description and Impact]
I couldn’t find anything in the documentation about scaling out a single-machine deployment or converting it to a cluster. Another question: at what data volume does a cluster generally become necessary?

| username: zhanggame1 | Original post link

TiDB is suitable for data volumes over 500GB. If you are running general tests on a single machine, there is no need to use TiDB.

Expanding into a cluster is also very simple; just scale out the PD, TiDB, and TiKV components.
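As a rough sketch of what that scale-out involves, assuming the cluster is managed with TiUP (the thread does not say how it was deployed): the new instances are described in a small topology file. The keys `tidb_servers`, `pd_servers`, `tikv_servers`, and `host` are TiUP topology keys; the helper name and the host addresses below are hypothetical.

```python
def render_scale_out_topology(tidb_hosts, pd_hosts, tikv_hosts):
    """Render a TiUP scale-out topology as YAML text (illustrative helper)."""
    sections = [
        ("tidb_servers", tidb_hosts),
        ("pd_servers", pd_hosts),
        ("tikv_servers", tikv_hosts),
    ]
    lines = []
    for key, hosts in sections:
        if not hosts:
            continue  # leave out components that are not being scaled out
        lines.append(f"{key}:")
        for host in hosts:
            lines.append(f"  - host: {host}")
    return "\n".join(lines) + "\n"


# Hypothetical addresses for two new machines joining the existing one.
topology = render_scale_out_topology(
    tidb_hosts=["10.0.1.2"],
    pd_hosts=["10.0.1.2", "10.0.1.3"],
    tikv_hosts=["10.0.1.2", "10.0.1.3"],
)
print(topology)
```

The printed text could be saved as, for example, `scale-out.yml` and applied with `tiup cluster scale-out <cluster-name> scale-out.yml`, where the cluster name is whatever was chosen at deployment time.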

| username: 双开门变频冰箱 | Original post link

Clusters are for high availability, and not necessarily related to capacity.

| username: Miracle | Original post link

A single-node TiDB deployment is generally meant for testing. If TiDB is used in production, deploy at least 3 nodes in a hybrid (co-located) layout. TiDB can also be used when the data volume is small, but its performance may be slightly lower than that of other single-node databases. The biggest difference between cluster mode and single-node mode is high availability.
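The three-node minimum follows from majority-based replication: TiKV replicates data with Raft, using three replicas by default, and a Raft group stays available only while a majority of its replicas are reachable. A minimal sketch of that arithmetic (the function name is illustrative, not part of any TiDB tooling):

```python
def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    majority = replicas // 2 + 1
    return replicas - majority


for n in (1, 2, 3, 5):
    print(f"{n} replica(s): tolerates {tolerated_failures(n)} failure(s)")
```

With a single copy of the data, losing that one machine makes the data unavailable, which is why a single-machine deployment cannot be highly available regardless of data volume.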

| username: TiDBer_微风轻吟 | Original post link

Thank you. Currently, PD, TiDB, and TiKV are all on one server. So we just need to scale out PD, TiDB, and TiKV to other servers, right? There’s no need to redeploy the cluster?

| username: zhanggame1 | Original post link

No need to redeploy.

| username: Kongdom | Original post link

There is no need to redeploy. This is one of TiDB’s best features: it supports horizontal scaling. You can also think of TiDB as having no standalone version, only a cluster version.

| username: TiDBer_微风轻吟 | Original post link

Isn’t scaling out TiKV meant to increase capacity? For high availability, should we add more TiDB instances instead? (If I’m wrong, please correct me, as I’m a newbie.)

| username: dba远航 | Original post link

Scaling out from a single machine doesn’t seem to gain much, because multiple components are still concentrated on that one machine, unless you then scale in the components on it.

| username: TiDBer_微风轻吟 | Original post link

However, my TiDB, PD, and TiKV are all on one server. Do I need to migrate them? :sob:

| username: TiDBer_微风轻吟 | Original post link

Do you mean scaling out to other machines first, and then scaling in the TiKV on this machine?

| username: Kongdom | Original post link

For example, initially:
Machine A: TiDB, PD, TiKV
You can expand TiDB to Machine B, and it will become:
Machine A: TiDB, PD, TiKV
Machine B: TiDB
The TiDB instances on both Machine A and Machine B can accept client connections, and the data is still written to the TiKV on Machine A.
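The layout change can be modeled in a few lines of Python. This is only a toy model of which component runs where, with hypothetical helper names; it does not talk to a real cluster.

```python
def scale_out(layout, machine, component):
    """Return a new layout with `component` added on `machine`."""
    new_layout = {m: set(comps) for m, comps in layout.items()}
    new_layout.setdefault(machine, set()).add(component)
    return new_layout


def hosts_of(layout, component):
    """List the machines that run `component`, in sorted order."""
    return sorted(m for m, comps in layout.items() if component in comps)


# Initial state: everything on Machine A.
layout = {"A": {"tidb", "pd", "tikv"}}

# Scale out TiDB to Machine B.
layout = scale_out(layout, "B", "tidb")

print(hosts_of(layout, "tidb"))  # machines that accept client connections
print(hosts_of(layout, "tikv"))  # machines that store the data
```

After the scale-out, TiDB runs on A and B while TiKV still runs only on A, which matches the description above: more entry points for connections, but the same single storage node.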

| username: Kongdom | Original post link

If there are sufficient machines, it is best to deploy according to the standard deployment, which requires at least 8 servers, with only one component deployed on each server.

| username: TiDBer_微风轻吟 | Original post link

At this point, can we scale in the TiDB on Machine A and use only Machine B? Is that also possible? :grin:

| username: TiDBer_微风轻吟 | Original post link

Okay, okay, thank you, I understand.

| username: Kongdom | Original post link

Yes, full marks for understanding :tada:

| username: TiDBer_微风轻吟 | Original post link

By the way, does this have to be configured exactly like this? If I don’t have 48GB, can 32GB handle it? The traffic isn’t very high. :sob:

| username: Kongdom | Original post link

Yes. Online scaling is supported, so if resources turn out to be insufficient, you can scale out online later. You can also co-locate certain components on one server.

In a production environment, TiDB and PD can be deployed and run on the same server. However, if there are higher requirements for performance and reliability, they should be deployed separately as much as possible.

In an extreme case, you can deploy only three servers, with TiDB, PD, and TiKV on each server. This is referred to as a three-node hybrid deployment as mentioned in the following document:

| username: TiDBer_微风轻吟 | Original post link

Alright, alright, thank you so much.

| username: TiDBer_小阿飞 | Original post link

Your current single-machine environment is essentially already a cluster, and its deployment does not differ much in nature from a production environment. By “cluster,” you probably mean the storage nodes, which can be scaled out at any time. Compute nodes can likewise be scaled out at any time without affecting the existing cluster. TiDB is very flexible and convenient in this regard.