Poor Sysbench Performance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: sysbench性能差

| username: TiDBer_WSzLrdh1

[Test Environment] TiDB
[TiDB Version] 7.1.1
[Reproduction Path] sysbench --db-driver=mysql --mysql-host=xxx --mysql-port=xxx --mysql-user=xxx --mysql-password='xxx' --mysql-db=xxx --db-ps-mode=disable --table_size=25000 --tables=150 --rand-type=uniform --db-ps-mode=auto --events=0 --threads=128 --report-interval=1 --time=300 oltp_read_write run
[Encountered Issue: Symptoms and Impact]
[Screenshot not preserved]

[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page

[Attachments: Screenshots/Logs/Monitoring]
profiling_2023-10-30_12-39-59.zip (1.2 MB)

| username: Miracle | Original post link

This CPU is really being pushed to its limit…
You could try upgrading the CPU.

| username: 像风一样的男子 | Original post link

Check the CPU, memory, and disk monitoring to see which ones have reached their limits.

| username: TiDBer_WSzLrdh1 | Original post link

Yes, the TiDB node does have relatively few CPU cores, but the performance shouldn’t be this poor. My understanding is that it should reach at least tens of thousands, so something may still be misconfigured.

| username: TiDBer_WSzLrdh1 | Original post link

The TiDB CPU is fully utilized. Is there any way to optimize it through parameter tuning?

| username: 托马斯滑板鞋 | Original post link

I haven’t checked the monitoring data yet, but your latency is a bit too high. Normally, it should be around 10-20 ms.

| username: 像风一样的男子 | Original post link

Is the disk a mechanical hard drive (HDD) or an SSD?

| username: 托马斯滑板鞋 | Original post link

Try setting tidb_executor_concurrency to 1 and test again.
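
A minimal sketch of how this suggestion might be applied from a shell. It assumes a `mysql` client is available; the host, port, and user in the usage comment are placeholders, not values from this thread. The helper name `executor_concurrency_sql` is hypothetical. A GLOBAL set is used on the assumption that the sysbench connections are opened afterwards, since a session-level change would only affect the connection that issues it.

```shell
#!/bin/sh
# Print the SQL that sets tidb_executor_concurrency and then shows its value.
# The statements are only printed; pipe them into a mysql client to apply them.
executor_concurrency_sql() {
    value="${1:-1}"   # target concurrency, defaults to 1 as suggested above
    printf 'SET GLOBAL tidb_executor_concurrency = %s;\n' "$value"
    printf "SHOW GLOBAL VARIABLES LIKE 'tidb_executor_concurrency';\n"
}

# Example with placeholder connection values (replace with your own):
#   executor_concurrency_sql 1 | mysql -h 127.0.0.1 -P 4000 -u root -p
executor_concurrency_sql 1
```

Printing the statements first makes it easy to review them before they are sent to the cluster.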

| username: TiDBer_WSzLrdh1 | Original post link

The disk is SSD.

| username: TiDBer_WSzLrdh1 | Original post link

Another symptom: when running sysbench, it hangs for a long time at the “Initializing worker threads…” step.

| username: TiDBer_WSzLrdh1 | Original post link

In TiDB 5.0 and later, tidb_enable_async_commit and tidb_enable_1pc default to ON for newly created clusters, but remain OFF in clusters that were upgraded from an earlier version. You can check the current values of these two parameters in your cluster by executing the following SQL statements:

SHOW GLOBAL VARIABLES LIKE 'tidb_enable_async_commit';
SHOW GLOBAL VARIABLES LIKE 'tidb_enable_1pc';

If you want to enable these two features, you can execute the following SQL statements:

SET GLOBAL tidb_enable_async_commit = ON;
SET GLOBAL tidb_enable_1pc = ON;

Note that enabling these two features may have an impact on the performance and stability of your cluster, so it is recommended to conduct sufficient testing before enabling them in a production environment.

| username: 托马斯滑板鞋 | Original post link

It improved a bit, but it’s still not enough 🙃

| username: TiDBer_WSzLrdh1 | Original post link

Yes, it has some effect. Are there any other parameters that can be adjusted?

| username: 托马斯滑板鞋 | Original post link

Have you tried reducing the concurrency to 32? --threads=32
Did you implement load balancing on the two TiDB instances?

| username: TiDBer_WSzLrdh1 | Original post link

Okay, currently we are only testing the limits of a single TiDB, so we have not implemented load balancing.

| username: 托马斯滑板鞋 | Original post link

I suggest starting with 8 concurrent threads, then gradually increasing to 16, 32, and so on.
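
The ramp-up described above could be sketched roughly as follows. The flags are copied from the reproduction command earlier in the thread, with the `xxx` connection values left as placeholders; the helper name `ramp_commands` and the doubling schedule up to 128 are assumptions made for illustration. The function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Print one sysbench command line per thread count, doubling from 8 up to a maximum.
# Nothing is executed here; the xxx connection values are placeholders.
ramp_commands() {
    max_threads="${1:-128}"
    threads=8
    while [ "$threads" -le "$max_threads" ]; do
        echo "sysbench --db-driver=mysql --mysql-host=xxx --mysql-port=xxx --mysql-user=xxx --mysql-password=xxx --mysql-db=xxx --tables=150 --table-size=25000 --rand-type=uniform --db-ps-mode=auto --threads=$threads --report-interval=1 --time=300 oltp_read_write run"
        threads=$((threads * 2))
    done
}

ramp_commands 128
```

Comparing throughput and latency at each step should show the point where adding threads stops increasing throughput and only raises latency.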

| username: 有猫万事足 | Original post link

The CPU is maxed out; that’s where the bottleneck is.
Among CPU, IO, and memory, whichever one is maxed out while the others still have headroom is the bottleneck.
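
As a rough illustration of checking whether the CPU is the saturated resource, the sketch below computes host-wide CPU utilization from two samples of the first line of `/proc/stat`. It is Linux-only, measures the whole host rather than the tidb-server process, and the helper name `cpu_busy_pct` is hypothetical. The cluster's own monitoring dashboards remain the more complete source.

```shell
#!/bin/sh
# Compute CPU busy percentage between two "cpu ..." lines taken from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
cpu_busy_pct() {
    # $1 = earlier sample line, $2 = later sample line
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user through steal
          t[NR] = total
          d[NR] = $5 + $6 }                                 # idle + iowait
        END { dt = t[2] - t[1]; di = d[2] - d[1]
              if (dt <= 0) { print 0; exit }
              printf "%d\n", (dt - di) * 100 / dt }'
}

# Usage on a Linux host: take two samples one second apart.
if [ -r /proc/stat ]; then
    a="$(head -n 1 /proc/stat)"; sleep 1; b="$(head -n 1 /proc/stat)"
    echo "CPU busy: $(cpu_busy_pct "$a" "$b")%"
fi
```

A value that stays near 100% on the TiDB host during the test, while disk and memory show headroom, would be consistent with the CPU being the bottleneck.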

Your approach now should be to trade space for time. If IO and memory are still far from their limits, see whether you can enlarge the various caches in TiDB/TiKV to reduce CPU work and improve the system as a whole.

| username: 像风一样的男子 | Original post link

Here are the results of my previous stress test:
sysbench oltp_write_only --threads=128 --rand-type=uniform --db-driver=mysql --mysql-db=sbtest --mysql-host=10.20.10.71 --mysql-port=4002 --mysql-user=root --mysql-password='$9LBzi4_&JF6u301=e' --tables=16 --table-size=10000000 --report-interval=10 --time=300 run

| username: 有猫万事足 | Original post link

In the absence of load balancing, is the CPU of TiKV fully utilized? If not, there is still room for improvement.

| username: Fly-bird | Original post link

Check both the CPU and disk resources.