What is best topology for my three dedicated server?

Hi,
I have three dedicated servers with below specs but I still can not get desired performance from tidb !

Can you suggest a recommended topology please with assuming I want to configure from scratch?

Application environment:

Production use
Debian 10
Cluster configured with TiUP command

TiDB version: v6.4.0

Resource allocation:

All servers are same with this specs each one (3 servers) :

  • 1 TB nvme storage
  • 32 cores AMD EPYC™ 7502P
  • 128 GB DDR4 ECC

I’ll appreciate any help.
Thanks

What performance do you get? How are you testing this? Is throughput or latency the main issue?

Thanks fir replying,

I’m just comparing queries running time with current MySQL server. TiDB is faster than MySQL but I want to make sure I’m using BEST available configuration (topology) for my current hardware and I’m using its maximum potential.

Is there any tool to recommend best topology according to hardware ?

For example one of these configs is theoretically better for sure right ? I want to know this then will going to check it deep.


+----------+-----------+-----------+
| Server 1 | Server 2  | Server 3  |
+----------+-----------+-----------+
| 1 TiKV   | 1 TiKV    | 1 TiKV    |
| 1 PD     | 1 PD      | 1 TiFLASH |
| 2 TiDB   | 1 TiFLASH | 2 TiCDC   |
+----------+-----------+-----------+



+----------+----------+-----------+
| Server 1 | Server 2 | Server 3  |
+----------+----------+-----------+
| 1 PD     | 3 TiKV   | 3 TiFLASH |
| 2 TiDB   | 1 TiCDC  | 1 TiCDC   |
|          |          |           |
+----------+----------+-----------+


Thank you

  1. I assume all three servers have identical specs.
  2. While it is possible to run with 3 servers we recommend for production deployments that TiKV and TiDB are deployed on separate servers and the same for TiFlash.
  3. For PD and TiKV we recommend 3 instances for redundancy. For TiKV a multiple of 3 is recommended.
  4. For TiDB, TiFlash and TiCDC we recommend 2 instances (or more)

For the first image:
a. Not all services have the minimum number of instances to be redundant.

I would recommend:
Server1: PD, TiKV, TiDB, TiCDC
Server 2: PD, TIKV, TiDB, TiFlash
Server 3: PD, TiKV, TiFlash, TiCDC

Depending on your server config this is a lot to combine on a single server, but it might be fine. I assume that all servers are in different availability zones?
If you use Kubernetes with our operator it should be able to figure out what the best way is to combine things without you having to define this upfront.

Note that MySQL with a correctly sized InnoDB buffer pool is going to serve many queries directly from memory if the data set is small enough. To really see the advantages of TiDB the dataset has to be bigger to outgrow what a single machine can hold in memory.

Depending on the size of the server you might want to consider not using TiFlash and allow more resources for TiKV and TiDB.

Can you share information about how you are testing and what numbers you get?

And please make sure you enable the plan cache and configure some tables to use TiFlash replicas if you keep using that.

And v6.4.0 is an older version and not an LTS version. Consider upgrading to v6.5.1 if you want/need LTS or v7.0.0 if you want a DMR release.

1 Like

Thank you dveeden, Your answer was exactly what I expected.

I assume that all servers are in different availability zones?

No all servers are dedicated but are in same datacenter.

If you use Kubernetes with our operator it should be able to figure out what the best way is to combine things without you having to define this upfront.

I’m using tiup command to manage and deploy cluster.

Depending on the size of the server you might want to consider not using TiFlash and allow more resources for TiKV and TiDB.

About using TiFLASH or not using it I’m not sure yet. In my tests I can not see significant difference when using it and when its disabled ! maybe my dataset is not big enough or not has queries to use TiFLASH like group by and …

Can you share information about how you are testing and what numbers you get?

Sure. I’m comparing same codebase and database with MySQL and TiDB. TiDB usually is at least same or maximum 10x faster than TiDB but I except even more performance from TiDB. I also tested it with sysbench but I don’t know how to analyze the results or with what index to compare.

There is my analyze results. If could help let me know please I can make more tests .

sysbench --config-file=sysbenchconfig oltp_point_select --tables=32 --table-size=10000000 run

SQL statistics:
    queries performed:
        read:                            19982547
        write:                           0
        other:                           0
        total:                           19982547
    transactions:                        19982547 (33304.10 per sec.)
    queries:                             19982547 (33304.10 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          600.0019s
    total number of events:              19982547

Latency (ms):
         min:                                    0.16
         avg:                                    0.48
         max:                                  798.04
         95th percentile:                        0.56
         sum:                              9592880.46

Threads fairness:
    events (avg/stddev):           1248909.1875/349.36
    execution time (avg/stddev):   599.5550/0.01
sysbench --config-file=sysbenchconfig oltp_update_index --tables=32 --table-size=10000000 run

SQL statistics:
    queries performed:
        read:                            0
        write:                           2893142
        other:                           48485
        total:                           2941627
    transactions:                        2941627 (4902.65 per sec.)
    queries:                             2941627 (4902.65 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          600.0063s
    total number of events:              2941627

Latency (ms):
         min:                                    0.54
         avg:                                    3.26
         max:                                45541.28
         95th percentile:                        6.43
         sum:                              9598796.81

Threads fairness:
    events (avg/stddev):           183851.6875/1244.99
    execution time (avg/stddev):   599.9248/0.00

Thank you

Depending on the size of the server you might want to consider not using TiFlash and allow more resources for TiKV and TiDB.

About using TiFLASH or not using it I’m not sure yet. In my tests I can not see significant difference when using it and when its disabled ! maybe my dataset is not big enough or not has queries to use TiFLASH like group by and …

Did you do something like ALTER TABLE ... SET TIFLASH REPLICA 2 to make sure there is actually a TiFlash replica of the tables that you need? The EXPLAIN... output should also tell you if TiFlash is used or not.

With 3 copies of the data and 3 servers every server hold the full dataset. Once you outgrow this and when you have 6 servers with 3 copies each server only holds half of the dataset. Queries have to go from TiDB to TiKV and back.

With MySQL and 3 servers ever server has the full dataset and can quickly answer queries with the local data. There is no strong consistency, which means that there might be replication delay causing outdated data to be returned to the application. (unless you use Group Replicaton with the right settings). With TiDB the results are consistent. And with MySQL growing to 6 servers means you need to somehow shard the data, MySQL won’t do this for you.

Did you do something like ALTER TABLE ... SET TIFLASH REPLICA 2 to make sure there is actually a TiFlash replica of the tables that you need? The EXPLAIN... output should also tell you if TiFlash is used or not.

Yes, I always use this command to enable TiFLASH for a database.

With 3 copies of the data and 3 servers every server hold the full dataset. Once you outgrow this and when you have 6 servers with 3 copies each server only holds half of the dataset. Queries have to go from TiDB to TiKV and back.

With MySQL and 3 servers ever server has the full dataset and can quickly answer queries with the local data. There is no strong consistency, which means that there might be replication delay causing outdated data to be returned to the application. (unless you use Group Replicaton with the right settings). With TiDB the results are consistent. And with MySQL growing to 6 servers means you need to somehow shard the data, MySQL won’t do this for you.

Yes, I also think that the larger the database is and the bigger it is, the better TiDB can show its potential compared to a smaller database.

So I’m going to use this topology for my servers as you said its best possible option .

Thank you very much :tulip: :hibiscus: