Background:
3 physical machines, each with 80 CPUs and 2 NUMA nodes. Each machine hosts 2 TiDB servers, 1 PD, and 1 TiKV, with the 2 TiDB servers bound to the 2 NUMA nodes respectively. In sysbench tests, binding NUMA improved QPS and TPS by about 19% at 10 concurrent threads. However, as concurrency increased, the improvement shrank; at 500 concurrent threads, QPS and TPS were actually about 1% lower than without NUMA binding.
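For reference, the layout described above roughly corresponds to the following TiUP topology fragment for one of the machines (the IP and ports are placeholders, not from the post); `numa_node` is the TiUP field that binds an instance to a NUMA node and requires numactl to be installed on the host:

```yaml
# One of the 3 physical machines; the other two are analogous.
tidb_servers:
  - host: 10.0.1.1          # placeholder IP
    port: 4000
    status_port: 10080
    numa_node: "0"          # first TiDB instance pinned to NUMA node 0
  - host: 10.0.1.1
    port: 4001              # second instance on the same host needs distinct ports
    status_port: 10081
    numa_node: "1"          # second TiDB instance pinned to NUMA node 1
pd_servers:
  - host: 10.0.1.1          # PD not NUMA-bound in this test
tikv_servers:
  - host: 10.0.1.1          # TiKV not NUMA-bound in this test
```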
Is this test result reasonable?
How much performance improvement did you observe after binding NUMA?
You only bound the TiDB servers, but not PD and TiKV? They should all be bound. Also, check the CPU utilization at 500 concurrent threads.
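To illustrate, binding PD and TiKV as well just means adding `numa_node` to their entries too; which node each is pinned to below is an assumption for the sketch, not a recommendation from the thread (the tidb_servers section stays as in the fragment above):

```yaml
pd_servers:
  - host: 10.0.1.1          # placeholder IP, as above
    numa_node: "0"          # pin PD to a NUMA node as well
tikv_servers:
  - host: 10.0.1.1
    numa_node: "1"          # pin TiKV to a NUMA node as well
```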
You mean that each physical machine co-deploys 2 TiDB servers, 1 PD, and 1 TiKV, and the 2 NUMA nodes are each allocated to one of the TiDB servers?
At 500 concurrent threads the CPU utilization is only about 60%. Could PD or TiKV have hit a bottleneck? Check their monitoring to see.
The point of binding is to prevent components from using CPUs across NUMA nodes. Do you have a load balancer such as HAProxy in front of the 6 TiDB servers? You could start by testing with 3 TiDB servers, placing PD and TiDB on one NUMA node and TiKV on the other, then adjust based on actual resource usage.
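A minimal sketch of that suggested layout for one machine, using the same placeholder IP as above: a single TiDB instance sharing NUMA node 0 with PD, and TiKV alone on NUMA node 1; adjust after observing actual resource usage:

```yaml
tidb_servers:
  - host: 10.0.1.1
    port: 4000
    status_port: 10080
    numa_node: "0"          # TiDB shares NUMA node 0 with PD
pd_servers:
  - host: 10.0.1.1
    numa_node: "0"          # PD on the same node as TiDB
tikv_servers:
  - host: 10.0.1.1
    numa_node: "1"          # TiKV gets NUMA node 1 to itself
```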
Is there a load balancer like HAProxy in front of the 6 TiDB servers? ---- Yes, the backend TiDB servers are accessed through HAProxy.
CPU utilization is only around 60%, and each machine deploys just one PD and one TiKV, so why bind NUMA? --> Because two TiDB servers are deployed on each physical machine, and we also wanted to measure how much performance improvement binding NUMA brings to TiDB.