Sysbench Stress Test Performance Falls Short of the Official Figures

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: sysbench 压测 性能达不到官方数据

| username: TiDBer_ai1s9Gz4

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] Ran the sysbench stress test following the official case: imported 16 tables with 10 million rows each, using the Point Select test type.
[Encountered Problem: Problem Phenomenon and Impact] TPS falls far short of the official figures.
Official Data:


My Test Data:
Single TiDB Node:

Three Nodes Tested Simultaneously:

Please help me see where the problem is.

[Resource Configuration]
A total of three nodes, each deploying two TiDB instances, two TiKV instances, and one PD instance.
Server Configuration:
[image: server configuration]

[Attachments: Screenshots/Logs/Monitoring]
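
For readers who want to reproduce the setup in this post, here is a minimal sketch of how the Point Select run could be assembled. The workload name and table dimensions (16 tables, 10 million rows each) come from the post. The host address, thread count, run duration, and the `sbtest` database name are assumptions chosen for illustration, not values from the original poster, and `build_point_select_cmd` is a hypothetical helper.

```python
import shlex


def build_point_select_cmd(host, port=4000, user="root", db="sbtest",
                           tables=16, table_size=10_000_000,
                           threads=64, duration_s=300):
    """Return the argv list for a sysbench oltp_point_select run.

    tables/table_size mirror the post; threads and duration_s are
    assumed values and should be tuned for the environment under test.
    """
    return [
        "sysbench", "oltp_point_select",
        "--db-driver=mysql",
        f"--mysql-host={host}",      # address of the TiDB instance (placeholder)
        f"--mysql-port={port}",      # 4000 is TiDB's default SQL port
        f"--mysql-user={user}",
        f"--mysql-db={db}",
        f"--tables={tables}",
        f"--table-size={table_size}",
        f"--threads={threads}",
        f"--time={duration_s}",
        "--report-interval=10",      # print intermediate results every 10 s
        "run",
    ]


def as_shell_line(argv):
    """Render the argv list as a single copy-pasteable shell command."""
    return shlex.join(argv)
```

Calling `as_shell_line(build_point_select_cmd("10.0.0.1"))` yields a single command line. Pointing each run at a different TiDB address is one way to drive several TiDB instances at once.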

| username: Jellybean | Original post link

Your cluster uses a mixed (co-located) deployment, so instances on the same machine compete heavily for resources during high-load stress testing. I recommend deploying the TiDB servers and TiKV servers on separate machines. Co-locating TiDB and PD is fine.

Moreover, the official test data is based on different components being deployed on separate machines:

Official stress test configuration, link: TiDB Sysbench Performance Test Report - v6.1.0 vs. v6.0.0 | PingCAP Docs

| username: CuteRay | Original post link

The TiKV storage in the official TiDB benchmark environment is storage optimized for the AWS cloud, not ordinary SSDs, and the network bandwidth between all machines can reach 10 Gbps. Test environments vary, so the official figures are for reference only.

| username: tidb菜鸟一只 | Original post link

The most critical point in stress testing is to ensure resource isolation, so that no nodes compete for resources…

| username: 裤衩儿飞上天 | Original post link

To reach the official figures, your hardware configuration needs to at least match the official one.

| username: TiDBer_ai1s9Gz4 | Original post link

I would like to ask how to check if my configuration is reasonable and if the cluster has achieved the best possible performance.

| username: maokl | Original post link

Hardware configuration needs to keep up.

| username: Jellybean | Original post link

Compare your machines' memory, CPU, and disk specifications with the official setup, as well as the number of machines allotted to each type of instance.

| username: TiDBer_jYQINSnf | Original post link

If you have multiple TiDB nodes, is the load only directed to one TiDB node? Is there a proxy in front?

| username: TiDBer_ai1s9Gz4 | Original post link

Yes, I first tested against a single node. The TPS was over 70,000, and I assumed that multiplying that result by three would come close to the official result. However, when I stress-tested all three nodes at the same time, performance dropped sharply: each node reached only a little over 30,000 TPS. These three nodes are independent and do not share CPU or memory, so I don't understand why the impact is so large.
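
The gap described here can be put into numbers with a short calculation. This is only a sketch: 70,000 and 30,000 are rounded from "over seventy thousand" and "over thirty thousand" in the post, so the result is approximate, and `scaling_efficiency` is a hypothetical helper name.

```python
def scaling_efficiency(single_node_tps, per_node_tps_concurrent, nodes):
    """Compare ideal linear scaling with what was observed.

    Returns (ideal_total, observed_total, efficiency), where efficiency
    is observed_total / ideal_total.
    """
    ideal_total = single_node_tps * nodes
    observed_total = per_node_tps_concurrent * nodes
    return ideal_total, observed_total, observed_total / ideal_total


# Rounded figures from the post: about 70k TPS alone, about 30k TPS per node
# when all three nodes are loaded at once.
ideal, observed, efficiency = scaling_efficiency(70_000, 30_000, 3)
print(f"ideal={ideal} observed={observed} efficiency={efficiency:.0%}")
```

With these rounded inputs the cluster delivers about 90,000 TPS against a linear expectation of 210,000, roughly 43%. Scaling that far below linear suggests the three runs are contending for something shared, rather than each node simply being slow.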

| username: TiDBer_jYQINSnf | Original post link

You have 3 independent TiDB nodes, but PD and TiKV are running there as well, right?
In that case, performance will be affected. You can use Docker to limit the resources of each instance so that they do not affect one another.
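
One way to plan such limits is to give every co-located instance its own non-overlapping range of CPU cores, which can then be passed to Docker's `--cpuset-cpus` option. The sketch below only computes the ranges. The 32-core host and the instance names are assumptions (the thread's server configuration screenshot is not available), the equal split is a simplification, and `partition_cores` is a hypothetical helper.

```python
def partition_cores(total_cores, instances):
    """Split core IDs 0..total_cores-1 into contiguous, non-overlapping ranges.

    Returns a dict mapping each instance name to a cpuset string such as
    "0-6". Leftover cores go to the first instances in the list.
    """
    if total_cores < len(instances):
        raise ValueError("need at least one core per instance")
    base, extra = divmod(total_cores, len(instances))
    plan, start = {}, 0
    for i, name in enumerate(instances):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        plan[name] = f"{start}-{end}" if end > start else str(start)
        start = end + 1
    return plan


# Layout from the thread: two TiDB, two TiKV, and one PD per machine.
# The core count of 32 is an assumed value.
instances = ["tidb-1", "tidb-2", "tikv-1", "tikv-2", "pd-1"]
plan = partition_cores(32, instances)
for name, cpuset in plan.items():
    print(f"{name}: --cpuset-cpus={cpuset}")
```

In practice PD usually needs far fewer cores than TiDB or TiKV, so a weighted split would be more realistic. The equal split is kept here only to keep the sketch short.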

| username: TiDBer_ai1s9Gz4 | Original post link

The PD and TiKV nodes are also independent, aren't they?

| username: TiDBer_jYQINSnf | Original post link

What does this mean?