TiDB Stress Testing

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB 压力测试

| username: 快乐的非鱼

[TiDB Usage Environment] Test
[TiDB Version]
[Reproduction Path] Three machines (8C 12G each) running a TiDB cluster of 3 PDs, 3 TiKVs, and 3 TiDBs on mechanical hard drives. The production environment may switch to SSDs later.
[Encountered Issues: Problem Phenomenon and Impact]
So far I have run performance tests with 1 million and 10 million rows across 20 tables, comparing against MySQL on 8C 64G. The performance gap is quite noticeable with 1 million rows, but less so with 10 million rows. In addition, performance dropped when testing 10 million rows on the 4C 8G and 8C 12G configurations. I don't have much experience with performance tuning and would appreciate advice from the experts here.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

Also performed dashboard diagnostics

| username: tidb菜鸟一只 | Original post link

If each machine can only be given 12GB of memory, I think you should stress test with 5 nodes instead: 1 PD, 1 TiDB, and 3 TiKV. Putting all three components on every machine, as in your current deployment, causes resource contention.

| username: BraveChen | Original post link

Makes sense.

| username: 快乐的非鱼 | Original post link

I will find 5 machines and run the performance test again. The plan is to use 2 machines for PD and TiDB and 3 machines for TiKV. I will test with both 1 million and 10 million rows and report the results tomorrow.

| username: 快乐的非鱼 | Original post link

I ran the test two more times, with two layouts:
5 nodes: 2 machines (8C 12G) running 2 PDs and 2 TiDBs, plus 3 machines (8C 32G) running 3 TiKVs.
3 nodes: 3 machines (8C 36G), each running 1 PD, 1 TiDB, and 1 TiKV, for 3 TiKVs, 3 TiDBs, and 3 PDs in total.
Using sysbench with 1 million rows of data across 20 tables, the 5-node setup actually performed significantly worse than the 3-node setup. The only explanation I can think of is that the interaction between PD, TiKV, and TiDB is affecting performance.
Below is the test report; the 10 million rows test has not been done yet.
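
For anyone who wants to repeat a run like the one described above, here is a minimal sketch that only builds the sysbench command lines; it does not execute them. Assumptions that are not from the post: the `oltp_read_write` workload, 1 million and 10 million rows per table (the post does not say whether the row counts are per table or in total), the host, port, user, database name, thread count, and run time. Adjust all of these to your own cluster.

```python
import shlex


def sysbench_cmd(action, table_size, host="127.0.0.1", port=4000,
                 tables=20, threads=64, seconds=300):
    """Build one sysbench command line as a list of arguments.

    host, port, threads and seconds are placeholder values (assumptions),
    not settings taken from the original test.
    """
    return [
        "sysbench", "oltp_read_write",   # workload choice is an assumption
        "--db-driver=mysql",             # TiDB speaks the MySQL protocol
        f"--mysql-host={host}",
        f"--mysql-port={port}",
        "--mysql-user=root",             # placeholder user
        "--mysql-db=sbtest",             # placeholder database
        f"--tables={tables}",            # 20 tables, as in the post
        f"--table-size={table_size}",    # rows per table
        f"--threads={threads}",
        f"--time={seconds}",
        "--report-interval=10",
        action,                          # prepare, run or cleanup
    ]


def plan(table_sizes=(1_000_000, 10_000_000)):
    """Return shell-ready command strings for every size and phase."""
    commands = []
    for size in table_sizes:
        for action in ("prepare", "run", "cleanup"):
            commands.append(shlex.join(sysbench_cmd(action, size)))
    return commands


if __name__ == "__main__":
    for line in plan():
        print(line)
```

Running the same generated commands against both layouts keeps the workload identical, so any difference in the report comes from the deployment rather than from the test parameters.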

| username: yilong | Original post link

You can refer to the documentation to configure the parameters and try again after warming up.

| username: 孤君888 | Original post link

This kind of testing won’t work, right? Each component still needs to be bound to a CPU core.
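
To illustrate what binding could look like when several components share one 8C machine, here is a rough sketch that splits the cores into fixed ranges and builds a `taskset -c` prefix for each component. The 4/3/1 split between TiKV, TiDB, and PD is purely hypothetical, and the helper names are made up for this example; how the processes are actually started depends on your deployment tooling.

```python
def core_ranges(total_cores, shares):
    """Split cores 0..total_cores-1 into contiguous ranges, one per component.

    shares maps a component name to the number of cores it should get.
    """
    if any(count < 1 for count in shares.values()):
        raise ValueError("every component needs at least one core")
    if sum(shares.values()) != total_cores:
        raise ValueError("shares must add up to total_cores")
    ranges = {}
    start = 0
    for name, count in shares.items():
        end = start + count - 1
        # taskset accepts a single core ("7") or a range ("0-3")
        ranges[name] = str(start) if count == 1 else f"{start}-{end}"
        start = end + 1
    return ranges


def taskset_prefix(core_range):
    """Arguments to put in front of a command to pin it to core_range."""
    return ["taskset", "-c", core_range]


if __name__ == "__main__":
    # Hypothetical split for one 8C machine hosting all three components.
    split = core_ranges(8, {"tikv": 4, "tidb": 3, "pd": 1})
    for component, cores in split.items():
        print(component, " ".join(taskset_prefix(cores)))
```

With fixed ranges like these, the components no longer compete for the same cores, which makes results from a co-located layout easier to compare with a layout on separate machines.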

| username: 托马斯滑板鞋 | Original post link

Why is your latency so high? Do you have monitoring screenshots of the host resources (the time-series graphs covering the test period)?

| username: 快乐的非鱼 | Original post link

@yilong This test was run after warming up, but the machine performance was poor. That is probably because the machines were locally purchased hardware with the nodes running as VMware virtual machines on top, which gave unsatisfactory performance. Over the weekend, I bought 5 machines on Alibaba Cloud: 2 machines with 16C 32G (2 PD, 2 TiDB) and 3 machines with 32C 64G (3 TiKV, 200G SSD). Compared to the local setup, the dashboard showed much better memory and CPU figures. In this test, TiDB's performance was about the same at 1 million, 10 million, and 50 million rows. Compared to MySQL, there was a 10-20% improvement for small data volumes and a 50-150% improvement for large data volumes.
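
For readers comparing their own runs, percentages like the ones above can be computed as a relative change against the MySQL baseline. This is a minimal sketch that assumes a higher-is-better metric such as TPS or QPS; the post does not say which metric was used, and the numbers in the example are invented for illustration, not taken from the test report.

```python
def improvement_pct(baseline, candidate):
    """Relative improvement of candidate over baseline, in percent.

    Assumes a higher-is-better metric (for example TPS). For latency,
    where lower is better, the sign would have to be flipped.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) * 100.0 / baseline


if __name__ == "__main__":
    # Invented throughput numbers, for illustration only.
    mysql_tps = 1000.0
    tidb_tps = 1200.0
    print(f"{improvement_pct(mysql_tps, tidb_tps):.1f}% over baseline")
```

Note that a 150% improvement under this definition means 2.5 times the baseline throughput, not 1.5 times.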