Which of the Two TiDB Hybrid Deployment Methods Offers Better Performance?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 2种tidb混合部署方式,那种性能好

| username: wenyi

We have a few hundred gigabytes of data and plan to run a TiDB cluster. Because of budget constraints, the cluster will typically be deployed on 3 physical machines, so no machine can be dedicated to PD, TiDB, or TiKV alone. There are 2 ways to deploy it. Which one performs better?

First deployment method:
Using TiDB’s built-in resource control, such as:
tidb_servers:
  - host: 10.201.14.1
    resource_control:
      memory_limit: 128GB
      cpu_quota: 1600%

Second deployment method:
Creating virtual machines on the physical machines. For the scenario above, that means creating a virtual machine with 16 cores, 128 GB of memory, and 500 GB of storage, and deploying TiDB on it independently.

Which deployment method has better performance?

| username: 芮芮是产品 | Original post link

Neither. Here is a third option: deploy TiKV on the physical machines.
Then ask your boss for a few virtual machines to host 3 PD nodes and 1 TiDB node.

| username: wenyi | Original post link

That would require 6 physical machines, and we currently have only 3. What I want to discuss is the best-performing deployment method for a cluster with minimal resources.

| username: h5n1 | Original post link

First, tell us the machine configuration. Are you deploying just one cluster?

| username: 像风一样的男子 | Original post link

The first option is good: it performs well and makes future migration and expansion easier. The official documentation has some material on resource isolation in hybrid deployments that you can refer to.

| username: wenyi | Original post link

Each machine has the same configuration. Detailed specifications:
2 × Intel Xeon Gold 6330 CPUs (2.0 GHz, 28C/56T, 11.2 GT/s, 42 MB cache, Turbo, HT, 205 W, DDR4-2933), 1024 GB memory, RAID card (8 GB cache), 10 × 1.92 TB 2.5-inch SAS SSDs, 2 × 10GbE ports (including 10GbE modules), 2 redundant power supplies.

| username: wzf0072 | Original post link

Refer to the best practices: Best Practices for Three-Node Hybrid Deployment | PingCAP Docs

| username: Jolyne | Original post link

The first option performs better. I have been in a similar situation: with limited resources, deploying directly on physical machines, even in a mixed setup, performs better than using virtual machines.

| username: TiDBer_小阿飞 | Original post link

A cluster on physical machines is definitely stronger than one on virtual machines. Both CPU and memory performance are significantly better. After all, a virtual machine has to run its own environment on top of the host!

| username: zhanggame1 | Original post link

Using physical machines offers better performance. I have confirmed this with official sources.

| username: yulei7633 | Original post link

  1. Creating a virtual machine on a physical machine is also an option, but the performance loss is approximately 20%.
  2. It is recommended to deploy directly on the three physical machines in a mixed manner, using NUMA to isolate memory and CPU usage. Of course, you need at least two CPUs for this.

| username: wenyi | Original post link

I also consulted with the official engineers, and they recommended the first method.

| username: rebelsre | Original post link

Three machines, each with 56C/112T, 1 TB of memory, and 10 1.92 TB SSDs. I have a feeling that your CPU resources are seriously insufficient relative to the rest. Frankly speaking, you could remove some of the memory and disks.

| username: h5n1 | Original post link

Your machines' memory and disk configuration is quite high, and a physical machine definitely performs better than a virtual machine. Use numactl -H to check the number of NUMA nodes. Compared to the memory and disks, the CPU is relatively weak. You can configure 6 TiKV instances, and the TiKV instances on the same machine should share the same label. Use 2 disks for each TiKV to create RAID 10, and allocate the remaining disks for other purposes. Then decide how many TiDB servers to set up based on your concurrency.
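
A minimal sketch of what the labeling described above might look like in a TiUP topology file, showing only the two TiKV instances on one machine. The ports, the `data_dir` paths, and the label value `machine1` are illustrative assumptions, not values from this thread; the other two machines would repeat the pattern with their own host addresses and label values.

```yaml
server_configs:
  pd:
    # Ask PD to place replicas on TiKV instances with different "host" labels,
    # so two replicas of one Region do not land on the same physical machine.
    replication.location-labels: ["host"]

tikv_servers:
  # First TiKV instance on the first physical machine.
  - host: 10.201.14.1
    port: 20160                    # assumed port
    status_port: 20180             # assumed port
    data_dir: /data1/tikv-20160    # hypothetical mount point
    config:
      server.labels: { host: "machine1" }
  # Second TiKV instance on the same machine: different ports and disk,
  # same label value.
  - host: 10.201.14.1
    port: 20161                    # assumed port
    status_port: 20181             # assumed port
    data_dir: /data2/tikv-20161    # hypothetical mount point
    config:
      server.labels: { host: "machine1" }
```

Sharing the label value per machine is what lets PD treat both instances as one failure domain.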

| username: yulei7633 | Original post link

I have tested this on my side. A physical machine performs better than a virtual machine created on that physical machine. Of course, when deploying in a mixed environment, remember to spread the instances across separate disks to reduce I/O contention. Use NUMA to isolate CPU and memory.
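
For reference, NUMA binding of this kind can be expressed with the `numa_node` field in a TiUP topology file, which relies on numactl being installed on the target machine. A minimal sketch, assuming two NUMA nodes per machine; which component goes on which node is an assumption for illustration only, and the TiKV data directory is hypothetical:

```yaml
tidb_servers:
  - host: 10.201.14.1
    # Bind this TiDB instance's CPU and memory to NUMA node 0 (assumed layout).
    numa_node: "0"

tikv_servers:
  - host: 10.201.14.1
    # Bind this TiKV instance to NUMA node 1 so it does not compete with TiDB.
    numa_node: "1"
    data_dir: /data1/tikv-20160    # hypothetical path on a dedicated disk
```

The node IDs should match what numactl -H reports on the actual machines.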

| username: TiDBer_vfJBUcxl | Original post link

The first option is better; a physical machine simply performs better than a virtual machine.

| username: wenyi | Original post link

Also, regarding the first method:
resource_control:
  memory_limit: 128GB
  cpu_quota: 1600%
I couldn't find these settings in the official documentation. Where can I find a detailed description?

| username: wenyi | Original post link

I tested it, and the IO performance of the physical machine is the same as that of the virtual machine, with no performance loss.

| username: TiDB_C罗 | Original post link

performance.max-procs: the number of CPUs used by TiDB
tidb_server_memory_limit: the memory limit for a TiDB instance
Check here: System Variables | PingCAP Docs
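
A minimal sketch of how the CPU setting could be applied through the `server_configs` section of a TiUP topology file. The value 16 simply mirrors the 16-core figure from the original question and is not a tuned recommendation.

```yaml
server_configs:
  tidb:
    # Cap the number of CPU threads each TiDB server uses.
    # 16 is taken from the question's example, not a recommendation.
    performance.max-procs: 16
```

`tidb_server_memory_limit` is a system variable rather than a topology field, so on versions that support it, it would be set from a SQL client instead, for example `SET GLOBAL tidb_server_memory_limit = "128GB";`.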

| username: yulei7633 | Original post link

With ordinary virtualization, there is some I/O performance loss because of the way the disks are attached. Even with passthrough, there is still some loss.