Performance Differences Between TiDB and MySQL

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb与MySQL性能差距

| username: TiDBer_djgos04V

[Test Environment for TiDB]
Recently, I have been researching TiDB and set up a three-node TiDB cluster. When running a sysbench bulk insert test, I found that the TiDB cluster performs significantly worse than a single-node MySQL instance, even though the machine configurations are the same. The TiDB cluster runs on three machines, each hosting PD, TiDB, and TiKV.

Test command:

sysbench bulk_insert --threads=16 --time=60 --report-interval=5 --mysql-host=*** --mysql-port=4000 --mysql-user=root --mysql-password=****** --mysql-db=sbtest --tables=10 --table-size=1000000 run

TiDB test results:
[image not preserved]

MySQL test results:
[results not preserved]

Could someone explain what is causing such a large performance gap? Otherwise, I won’t be able to explain it later :cry:

| username: Miracle | Original post link

How is the resource configuration of TiDB compared to MySQL?

| username: tidb菜鸟一只 | Original post link

Refer to this:
How to Benchmark TiDB Using Sysbench | PingCAP Documentation Center

| username: TiDBer_djgos04V | Original post link

The machines are all 8 Core + 16G + 100G.

| username: Miracle | Original post link

When I tested it before, it was also a mixed deployment with a configuration similar to yours. MySQL also performed slightly better, but the gap was not as extreme as yours.

| username: Fly-bird | Original post link

Personally, I think the comparison should be made from a distributed perspective. Deploy a production-style topology with 3 PD, 2 TiDB, and 3 TiKV nodes, and then test the performance.

After all, the product architecture design is different.

| username: Christophe | Original post link

Which version is it?

| username: zhanggame1 | Original post link

That’s normal. My tests also yielded similar results. Simply having an empty database won’t make writes faster than MySQL.

| username: TiDBer_djgos04V | Original post link

Version 7.1.1

| username: Kongdom | Original post link

The application scenarios are different and the two products compete in different arenas, so there’s no need to force a comparison.

| username: 托马斯滑板鞋 | Original post link

:upside_down_face: That’s normal; the network overhead alone is more than double. You can also test point_select, where I found the difference to be about 3 times.
However, when it comes to multi-table joins in TPC-C, it’s the other way around :joy:
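
(Editor's illustration) The network-overhead point above can be made concrete with a back-of-envelope sketch. It assumes a single synchronous client thread and counts one round-trip time per network hop. Every number and hop count here is a hypothetical assumption chosen for illustration, not a measurement from this thread, and the real TiDB write path (timestamp requests to PD, two-phase commit, Raft replication) is more involved than this model.

```python
# Back-of-envelope latency model. All values are illustrative assumptions,
# not measurements from this thread.

def statement_latency_ms(network_round_trips, rtt_ms, local_work_ms):
    """Latency = local processing time plus one RTT per network round trip."""
    return local_work_ms + network_round_trips * rtt_ms

def throughput_per_thread(latency_ms):
    """Statements per second for one synchronous client thread."""
    return 1000.0 / latency_ms

RTT_MS = 0.5         # assumed LAN round-trip time
LOCAL_WORK_MS = 0.5  # assumed CPU + disk time, taken as equal for both systems

# Single-node database: client -> server (1 round trip).
single = statement_latency_ms(1, RTT_MS, LOCAL_WORK_MS)

# Simplified distributed path (assumed): client -> SQL layer,
# SQL layer -> storage leader, leader -> follower replication (3 round trips).
distributed = statement_latency_ms(3, RTT_MS, LOCAL_WORK_MS)

print(f"single-node: {single} ms, {throughput_per_thread(single):.0f} stmt/s per thread")
print(f"distributed: {distributed} ms, {throughput_per_thread(distributed):.0f} stmt/s per thread")
```

Under these assumed numbers the per-thread latency doubles, so with a fixed thread count (such as --threads=16) throughput halves. A distributed system typically recovers throughput through higher concurrency and more nodes rather than lower per-statement latency.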

| username: zhanggame1 | Original post link

If the data volume is not large enough, a single machine definitely has an advantage, as it avoids a lot of distributed overhead.

| username: TiDBer_djgos04V | Original post link

However, I also created a single-node test environment using the playground, and the results were still about the same as on the cluster. :cry:

| username: 托马斯滑板鞋 | Original post link

It’s normal and related to the underlying storage structure. When I tested the playground on Alibaba Cloud, it was even 2% slower than the cluster. :rofl:

| username: 托马斯滑板鞋 | Original post link

You can read up on the multi-level structure of TiDB’s LSM tree, and even compare it with the two-level structure + B-tree used by OB (OceanBase).
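
(Editor's illustration) For readers unfamiliar with LSM trees, here is a minimal toy sketch of the idea: writes go to an in-memory table that is flushed to immutable sorted runs, and reads may have to check several runs from newest to oldest. The class name TinyLSM and its methods are hypothetical and written only for this example. It is not how TiKV or RocksDB is implemented, and it omits the write-ahead log, leveled compaction, and bloom filters.

```python
import bisect

class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus sorted immutable runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}            # most recent writes
        self.runs = []                # sorted lists of (key, value), newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the memtable out as a new sorted run and start a fresh memtable.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Check the memtable first, then every run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for run in reversed(self.runs):   # oldest first
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []

db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # triggers the first flush
db.put("a", 3)
db.put("c", 4)   # triggers the second flush
print(db.get("a"), db.get("b"), len(db.runs))
db.compact()
print(len(db.runs))
```

In this toy, a read for "b" has to look past the newest run before finding the key in the older one, which illustrates why read cost in an LSM tree depends on how many runs or levels exist and why compaction matters.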

| username: 大飞哥online | Original post link

Try a different approach and give it another shot :joy:

| username: zhanggame1 | Original post link

A 7- to 8-fold difference is too extreme.

| username: 有猫万事足 | Original post link

I can only tell that your boss thinks highly of TiDB. Have you considered asking your boss to pay for you to get a certification? :rofl: Take advantage of this great opportunity.

| username: cassblanca | Original post link

Your boss has already tested it for you. :joy:

| username: 大飞哥online | Original post link

Your boss is very optimistic about TiDB :grinning: