Issues in TiDB Deployment Planning

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 部署规划问题

| username: TiDBer_xgwIsUrp

[TiDB Usage Environment] Production Environment
[TiDB Version] V7.5
[Reproduction Path] None
[Encountered Problem: Problem Phenomenon and Impact]

Cluster environment: as shown above, this is the minimum cluster configuration from the TiDB documentation.
Question 1:
The figure shows the minimum cluster configuration described in the TiDB documentation. I plan to use 3 machines: Host A, Host B, and Host C. The TiDB and PD instances in the first column of the figure will go on Host A, and the rest on Host B and Host C respectively. Host A combines the first row (one host) and the second row (one host) of the figure into a single machine (the documentation says these two instances can be placed together).
So does Host A need 2 IPs for TiDB and 3 IPs for PD? In other words, does Host A need 5 IPs in total?
My environment is as follows. The downloaded package should be correct, but the cluster check reports an error. What is causing this?

Question 2: I want to configure the TiFlash deployment topology.
The environment is as follows:
Host 1: 8 cores, 16GB
Host 2: 8 cores, 16GB
Host 3: 8 cores, 16GB
Host 4: 16 cores, 48GB
Host 5: 32 cores, 64GB
The hard disks are around 500GB.
These specifications differ slightly from the documentation. Is this feasible?

Question 3:
I plan to use LVM volume groups (VG) and logical volumes (LV) to manage storage. Is this feasible? I am concerned that it might be inconvenient for data expansion later.

| username: zhanggame1 | Original post link

Are you looking to deploy a single-machine simulated cluster? I didn’t quite understand what you meant.

| username: TiDBer_xgwIsUrp | Original post link

I have added more details. It is a cluster environment.

| username: zhanggame1 | Original post link

When you don't have many machines, you can use a mixed deployment. Just make sure there is enough memory. With three machines, deploy one PD, one TiDB, and one TiKV on each. If more machines are available, co-locate PD and TiDB on each of three machines and use another three machines for TiKV. TiFlash should be deployed separately. I suggest trying a deployment of 3 TiDB+PD, 3 TiKV, and 1 TiFlash.

| username: onlyacat | Original post link

If it’s just for functional verification and not stress testing, it should generally be fine.

No need to configure five IPs, just use different ports.

This essentially means you only have one machine, so you can just put everything on it. If you want to test high availability, you need at least three machines.

| username: Aaronz | Original post link

Mixed deployment works as long as the instances use separate ports. The main things to get right are separating storage, setting memory limits, and configuring NUMA CPU binding properly.
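
As a rough illustration of those three points, a TiUP topology fragment for one co-located TiKV instance might look like the sketch below. The address, mount point, and sizes are hypothetical values chosen for the example, not recommendations from this thread.

```yaml
# Sketch only: one TiKV instance on a machine shared with other components.
# The address, paths, and sizes below are assumed placeholders.
server_configs:
  tikv:
    # Cap the block cache so TiKV does not size it as if it owned the whole machine.
    storage.block-cache.capacity: "4GB"

tikv_servers:
  - host: 10.0.1.1                  # placeholder address
    data_dir: "/data1/tikv-20160"   # assumed mount point on a disk not shared with PD or TiDB
    numa_node: "0"                  # bind to NUMA node 0 (needs numactl on the target host)
    resource_control:
      memory_limit: "8G"            # per-instance memory limit applied through systemd
```

The sizes would need to be adjusted to the memory actually available on each host.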

| username: DBAER | Original post link

Mixed deployment is fine for testing or non-critical environments.

| username: 饭光小团 | Original post link

You don’t have to. If you put all the PDs on one machine, just use different ports.
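
A minimal sketch of what that could look like in a TiUP topology file, assuming a hypothetical host address of 10.0.1.1 and arbitrarily chosen non-default ports:

```yaml
# Sketch only: three PD instances on one machine, told apart by port.
# 10.0.1.1 and the ports other than 2379/2380 are placeholders.
pd_servers:
  - host: 10.0.1.1
    name: "pd-1"
    client_port: 2379   # default PD client port
    peer_port: 2380     # default PD peer port
  - host: 10.0.1.1
    name: "pd-2"
    client_port: 2381
    peer_port: 2382
  - host: 10.0.1.1
    name: "pd-3"
    client_port: 2383
    peer_port: 2384
```

All three entries share one IP. If the default per-instance directories are not wanted, `deploy_dir` and `data_dir` can also be set on each entry.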

| username: YuchongXU | Original post link

Virtualization can be used.

| username: TIDB-Learner | Original post link

IP + port. If multiple nodes are deployed on one machine, different ports can be used. The disk capacity is related to your actual data volume. For small clusters, it’s okay if the CPU is not very powerful, but it’s recommended to have more memory.

| username: tidb菜鸟一只 | Original post link

If you deploy tidb-server, tikv, and pd on the same machine, you only need to specify one IP since they use different ports.

| username: TiDBer_xgwIsUrp | Original post link

It’s not a mixed deployment. I plan to deploy the “minimum cluster configuration.” The IPs are not reachable, and there are errors after the pre-deployment tests.

| username: TiDBer_xgwIsUrp | Original post link

It’s not a single machine. My description may have been unclear. I have updated the environment details above, and there are errors when checking the nodes. What I actually want to ask is: if we don’t configure the IPs, how are the instances associated with each node machine?

| username: 托马斯滑板鞋 | Original post link

Remove host 4 and allocate the budget to hosts 1, 2, and 3. Is that okay?
Typically, we configure TiDB + PD + TiKV on hosts 1, 2, and 3, and use host 5 exclusively for TiFlash (this is how we do it in production).
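
The TiFlash part of such a layout could be sketched as the topology fragment below. The 10.0.1.5 address standing in for host 5 and the data directory are assumptions, and the `pd_servers`, `tidb_servers`, and `tikv_servers` sections (not shown) would each list hosts 1, 2, and 3.

```yaml
# Sketch only: TiFlash on its own machine. The address and path are placeholders.
server_configs:
  pd:
    # The TiFlash topology template in the TiDB documentation enables placement rules.
    replication.enable-placement-rules: true

tiflash_servers:
  - host: 10.0.1.5                    # stands in for host 5
    data_dir: "/u01/tiflash-data"     # assumed path
```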

| username: tidb菜鸟一只 | Original post link

If you want to deploy on three machines, just use this and change the IPs to your three machines’ IPs.

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/u01/tidb-deploy"
  data_dir: "/u01/tidb-data"
server_configs: {}
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
  - host:
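
The host values in the snippet above are blank and the section names are not shown. One plausible reading of the twelve entries is 3 PD, 3 TiDB, 3 TiKV, plus one each for monitoring, Grafana, and Alertmanager. Under that assumption, and with 10.0.1.1 to 10.0.1.3 as placeholder addresses for the three machines, a filled-in file might look like this sketch:

```yaml
# Sketch only. Addresses are placeholders, and the section layout is assumed
# from the standard TiUP template rather than taken from the original file.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/u01/tidb-deploy"
  data_dir: "/u01/tidb-data"

server_configs: {}

pd_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3

tidb_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3

tikv_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3

monitoring_servers:
  - host: 10.0.1.1

grafana_servers:
  - host: 10.0.1.1

alertmanager_servers:
  - host: 10.0.1.1
```

Each instance is identified by its `host` plus its port, so every machine needs only one IP even when it runs several components.
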
| username: TiDBer_rvITcue9 | Original post link

You can reuse the server, just don’t stress test it.

| username: zhang_2023 | Original post link

Mixed deployment is possible.

| username: Soysauce520 | Original post link

Send the YAML file.

| username: TiDBer_小阿飞 | Original post link

SSH mutual trust has not been established between the machines.
When deploying a cluster using TiUP, you can use either a key or an interactive password for security authentication:

  • If using a key, you can specify the key path with -i or --identity_file.
  • If using a password, you can enter the password in the interactive window with -p.
  • If passwordless login to the target machine is already configured, no authentication is needed.

Generally, TiUP will create the user and group specified in topology.yaml on the target machine, except in the following cases:

  • The username specified in topology.yaml already exists on the target machine.
  • The --skip-create-user parameter is used on the command line to explicitly skip the user creation step.

| username: kelvin | Original post link

You can deploy everything on 3 machines. That’s how I’m currently testing it.