Performance Comparison Test of Tidb vs Mysql 8 on Low-Configuration Hosts

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tidb vs Mysql 8 低配主机性能测试比对

| username: 托马斯滑板鞋

Tidb: 7.1.1
MySQL: percona-server-8.0.33-25

Background:

Exploring the feasibility of replacing a single-master MySQL MGR cluster with TiDB in a low-spec host environment to accelerate business SQL and improve host resource utilization.

I. Test Environment:

  • Alibaba Cloud
Purpose Cloud Host Model CPU Memory Storage Space I/O Performance IOPS Price
Benchmark Host ecs.c7.8xlarge 32C 32G 200G 46MB/s 11000 3351.13 RMB/month
Database Host ecs.c7.4xlarge 16C 32G 1000G 440MB/s 50000 1802.78 RMB/month

II. Deployment Architecture:

Database Type Architecture Total Price
TiDB Three nodes, each with 1 TiDB + 1 PD + 1 TiKV 5,408.34 RMB/month
MySQL Two nodes, master-slave, enhanced semi-synchronous 3,605.56 RMB/month

III. Some Parameter Adjustments:

1. Disable transparent huge pages on the main host
2. Deploy haproxy on the benchmark host, pointing to three TiDB servers
3. Change the log level of TiDB server and TiKV to error
4. Allocate 24G to innodb buffer

IV. TPCC Test Results

  • Data volume of 1000 warehouses

V. Sysbench Test Results

  • 32 tables, each with 10 million rows of data

oltp_select_point (point query)

oltp_read_write (read-write)

VI. Others

1. The performance gap on 8C16G is almost the same
2. TiDB itself did not OOM without any resource restrictions
3. During the test, the three TiDB nodes could evenly reduce CPU idle to single digits; MySQL had a lot of IO wait
| username: Fly-bird | Original post link

The original poster is awesome :grinning:

| username: Miracle | Original post link

Boss, how long does it take to generate data for 1000 warehouses in the tpcc test? Why does it take me almost half an hour to generate data for just a few dozen warehouses?

| username: 托马斯滑板鞋 | Original post link

:joy: Logs are lost, anyway it’s slower than MySQL. I usually just dump data in the background.

| username: yiduoyunQ | Original post link

The performance test data for v7.1 can also be directly referenced from the Cloud official documentation https://docs.pingcap.com/tidbcloud/v7.1.0-performance-benchmarking-with-tpcc and the sysbench section.

| username: ajin0514 | Original post link

Awesome

| username: 路在何chu | Original post link

TiDB is definitely much faster.

| username: TiDBer_小阿飞 | Original post link

Have the prices been estimated?

| username: 托马斯滑板鞋 | Original post link

The price is taken from the monthly price on the ECS purchase settlement page in the Alibaba Cloud console. :upside_down_face:

| username: OnTheRoad | Original post link

This kind of comparison doesn’t hold much significance, does it?

| username: 托马斯滑板鞋 | Original post link

Because the resource utilization rate of single-master MGR is not high, I am considering whether TiDB can replace it on a low-configuration cluster to improve resource utilization. Additionally, the public TPC-C model cannot be tested on MGR (lacks primary key), so I can only use master-slave semi-synchronous simulation for comparison. The test involving sensitive data in the business scenario will not be disclosed.

| username: xingzhenxiang | Original post link

I feel that the distributed advantages of TiDB have not been demonstrated.

| username: 托马斯滑板鞋 | Original post link

In actual stress testing, the resource utilization of the three TiDB hosts is very low, which is much better compared to a single MySQL MGR master. :kissing_closed_eyes:

| username: xingzhenxiang | Original post link

This works, MGR is rarely used.

| username: TiDBer_nwwtuEWS | Original post link

Hello, I am experiencing a similar issue when using tiup bench tpcc to stress test a single MySQL instance. The tiup bench tpcc check for data consistency is extremely time-consuming. I would like to know if you have encountered a similar situation or if you can provide some optimization suggestions.

Details are as follows:

  • Database host configuration: 16 cores, 128GB memory, 1TB high-performance disk, virtual memory disabled.
  • MySQL was installed on Ubuntu 20.04 using apt-get install mysql-server, version 8.0. The only configuration optimizations made in /etc/mysql/mysql.conf.d/mysqld.cnf were:
    • innodb_buffer_pool_size = 96G
    • innodb_write_io_threads = 16
    • innodb_read_io_threads = 16

The benchmarking machine has the same configuration as the MySQL host, using 16 threads to stress test 8 warehouses:

tiup bench tpcc check --warehouses 8 --threads 16

During the process, the database CPU usage is around 800%, which is about 50% of the 16-core CPU. Memory and I/O usage are both very low. The benchmarking machine’s CPU, memory, and I/O usage are also very low. Since the test is conducted over an internal network, network bandwidth is not a bottleneck. However, the check duration is around 1 hour, which seems abnormal.

Could you please help me understand what might be causing this issue? Thank you very much.

| username: 托马斯滑板鞋 | Original post link

Slower than TiDB, couldn’t find the source code. I suspect that the check phase is performing multi-table joins (MySQL’s performance in this area is quite poor). Looking forward to an expert from the original team to help answer this :upside_down_face:
P.S: I suggest directly judging data consistency through logs.

| username: TiDBer_nwwtuEWS | Original post link

Got it, thanks.

| username: TiDBer_uI8QIp1t | Original post link

Hello, I deployed TiDB on 3 servers, each with KV, DB, and PD, just like you. Then, the Java package only accesses the DB on one of the nodes. The configuration is based on the three-node mixed deployment article. However, the concurrency handled by the Java package’s interface is very low, with responses coming in batches of 2 or 3 interfaces. Is this because I didn’t deploy HAProxy, or did I miss some configuration?

| username: andone | Original post link

Thank you, noted.

| username: Jellybean | Original post link

TiDB has advantages in scenarios with large-scale data (more than 50 million rows in a single table). Additionally, when conducting performance tests, it’s important to deploy each instance node separately; otherwise, there will be significant mutual interference, affecting the accuracy of the results.

In cases where the data volume is not large, MySQL has greater advantages in terms of performance and cost efficiency.

If the original poster intends to conduct a comprehensive performance test on TiDB, there is still a lot of room for optimization.