Building a Minimal Cluster

username: TiDBer_PHgBQFC6

【TiDB Usage Environment】Production Environment
【Resource Configuration】5 machines, each with 64 GB RAM and 32 cores
How should I lay out the cluster across these machines so that it is safe and sensible?
TiFlash is not needed. What is the best layout without it?
Can TiDB and PD be deployed on only 2 machines?

username: 像风一样的男子

Ideally, run three PD nodes and three TiKV nodes; two TiDB nodes are enough.

username: TiDBer_PHgBQFC6

There are only 5 machines, so how about this layout:
tidb_server: 2 machines 101, 102
pd: 3 machines 101, 102, 103
tikv: 3 machines 103, 104, 105
monitoring: 1 machine 105
Is this suitable?

username: 像风一样的男子

You can co-locate the three PD nodes with the three TiKV nodes, and put the two TiDB nodes and monitoring together on the other machines. The two TiDB nodes need a load balancer in front of them.

username: zhanggame1

If the workload is heavy, run more TiDB nodes, since the TiDB server is resource-hungry; go up to 4 nodes. PD uses few resources, so deploy 3 or 5 nodes, because the documentation calls for an odd number. For TiKV, 4 or 5 nodes are recommended, and I think 4 is enough. Use the remaining machine for TiUP and monitoring.

username: TiDBer_PHgBQFC6

There are only 5 machines in total. Following your suggestion, the layout would be:
tidb 101, 102, 103, 104
pd 101, 102, 103
tikv 101, 102, 103, 104
monitoring 105
Will this layout work without problems?
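
A plan like the one above can be sanity-checked by inverting it into a per-host view and flagging the points raised in this thread. This is only a sketch: the hosts are written as the last octets used in the post, and the three checks (odd PD count, at least three TiKV nodes, PD sharing a host with TiKV) are assumptions drawn from the replies here, not an official validation tool.

```python
from collections import defaultdict

# Proposed layout from the post above: component -> hosts (last octet only).
layout = {
    "tidb": [101, 102, 103, 104],
    "pd": [101, 102, 103],
    "tikv": [101, 102, 103, 104],
    "monitoring": [105],
}


def components_per_host(layout):
    """Invert the component -> hosts mapping into host -> components."""
    hosts = defaultdict(list)
    for component, members in layout.items():
        for host in members:
            hosts[host].append(component)
    return dict(hosts)


def review(layout):
    """Return human-readable notes about the layout (assumed rules of thumb)."""
    notes = []
    if len(layout["pd"]) % 2 == 0:
        notes.append("PD count is even; an odd number is recommended")
    if len(layout["tikv"]) < 3:
        notes.append("fewer than 3 TiKV nodes")
    for host, comps in sorted(components_per_host(layout).items()):
        if "pd" in comps and "tikv" in comps:
            notes.append(f"host {host} runs PD and TiKV together; isolate CPU")
    return notes


for host, comps in sorted(components_per_host(layout).items()):
    print(host, comps)
for note in review(layout):
    print("note:", note)
```

For this layout the only notes are about hosts 101 to 103, where PD shares a machine with TiKV.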

username: redgame

It looks feasible, let’s give it a try.

username: 有猫万事足

TiDB and PD can be configured on just 2 machines.

However, your machines are quite powerful, so you shouldn’t have to compromise. By binding components to NUMA nodes to isolate CPU resources, you can run a hybrid deployment.

Only machines with a single NUMA node need to consider putting PD and TiDB on one machine and TiKV on a separate one. Without NUMA-based CPU isolation, TiKV can saturate the CPU under heavy write load, leaving a PD instance on the same machine unresponsive, which can bring down the entire cluster.

As long as CPU resources can be isolated, there is no such concern, and they can be deployed together.

There are templates in the documentation, but you’ll need to calculate and adjust the parameters yourself. Given how powerful your machines are, I would try hybrid deployment first and switch to separate machines only if it doesn’t work out.
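
As a rough illustration of the kind of calculation meant here, the sketch below splits one machine's CPU and memory budget for TiKV. The `tikv_budget` helper is hypothetical, and the default ratios (0.8 of the cores, 0.5 of the memory) are assumptions modeled on the rule of thumb in the hybrid deployment documentation; check them against the current docs and lower them when TiDB or PD share the machine.

```python
def tikv_budget(total_cores, total_mem_gb, tikv_instances=1,
                cpu_ratio=0.8, mem_ratio=0.5):
    """Split a machine's CPU and memory budget across TiKV instances.

    cpu_ratio and mem_ratio are assumed rule-of-thumb values, not
    measured ones; lower them when TiDB or PD share the machine.
    """
    threads = int(total_cores * cpu_ratio / tikv_instances)
    cache_gb = int(total_mem_gb * mem_ratio / tikv_instances)
    return {
        "readpool.unified.max-thread-count": max(threads, 1),
        "storage.block-cache.capacity": f"{cache_gb}GB",
    }


# 32 cores and 64 GB RAM, as in the original question.
print(tikv_budget(32, 64))
# A more conservative split for a machine that also hosts TiDB and PD.
print(tikv_budget(32, 64, cpu_ratio=0.5, mem_ratio=0.3))
```

The two keys are TiKV configuration items; the values are starting points to tune under real load, not recommendations.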

username: Kongdom

Classmate, be ruthless: 1 TiDB, 1 PD, and 3 TiKV :thinking:

Or one server running 3 TiDB instances, one server running 3 PD instances, and 3 TiKV servers. Actually, I don’t think it’s necessary to go that far.

username: zhanggame1

No problem. Adjusting the components of a TiDB cluster is very simple: you can scale out or in with just one or two commands.
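
For reference, the commands in question are `tiup cluster scale-out` and `tiup cluster scale-in`. The sketch below only assembles and prints the command lines; the cluster name `demo-cluster`, the file name `scale-out.yaml`, and the node address are hypothetical placeholders.

```python
import shlex


def scale_out_cmd(cluster, topology_file):
    """Command line that adds the nodes described in a topology file."""
    return ["tiup", "cluster", "scale-out", cluster, topology_file]


def scale_in_cmd(cluster, node):
    """Command line that removes one node, given as host:port."""
    return ["tiup", "cluster", "scale-in", cluster, "--node", node]


# Hypothetical names; replace with your own cluster name and addresses.
print(shlex.join(scale_out_cmd("demo-cluster", "scale-out.yaml")))
print(shlex.join(scale_in_cmd("demo-cluster", "10.0.1.104:20160")))
```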

username: tidb狂热爱好者

tidb_server: 2 units (101, 102)
pd: 3 units (101, 102, 103)
tikv: 3 units (103, 104, 105)
monitoring: 1 unit (102)

username: Kongdom

An odd number of PD nodes is recommended.
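
The reason comes down to majority arithmetic: PD needs more than half of its members alive to elect a leader. A minimal sketch of that arithmetic, assuming a plain majority quorum:

```python
def quorum(n):
    """Smallest number of members that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many members can fail while a majority still survives."""
    return n - quorum(n)


for n in range(1, 6):
    print(f"{n} PD nodes: quorum {quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Two nodes tolerate no failures, the same as one, and four tolerate only one, the same as three, so an even count adds a machine without adding fault tolerance.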

username: 有猫万事足

I only have 2 PDs purely due to lack of resources, running a small operation. :joy:
I don’t feel secure with just 1, and I can’t fit 3. Even so, the monitoring is still sharing a machine with other services.

username: Kongdom

:joy: I’ve been burned by even-numbered node split-brain issues~~~
