Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: The impact of NUMA core binding on performance when deploying TiDB (TIDB部署时候NUMA绑核对性能影响)
【TiDB Usage Environment】Production Environment
【TiDB Version】V6.5.8
【Encountered Problem: Issue Phenomenon and Impact】The impact of NUMA core binding on performance during TiDB deployment. I have searched many documents online, and some sysbench benchmark results suggest that enabling NUMA binding can improve performance by about 20%. Has anyone tested this in a production environment?
There will be performance differences in production as well. The exact gap depends on the workload, concurrency, and other factors; it becomes especially noticeable once the database starts to hit a bottleneck.
NUMA binding is mainly effective when configured at the TiKV layer, right? Does tidb-server also need to be bound to specific cores? This is my first time using it, so I'm not very familiar with it.
Is NUMA binding only used for mixed deployments? Is core binding still necessary for a normal deployment?
Yes, it is needed. A single tidb-server typically cannot make effective use of more than about 24 CPU cores, so if the machine has many CPUs it is still recommended to bind them. For example, on a machine with 128 cores, 512 GB of RAM, and 8 NUMA nodes, each node has 16 cores and 64 GB of RAM. If this machine only runs tidb-server, it is recommended to deploy 4 or 8 tidb-server instances: with 4, bind each tidb-server to 2 NUMA nodes; with 8, bind each tidb-server to 1 NUMA node.
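As a sketch, the four-instance layout above could be expressed in a TiUP topology file via the `numa_node` field, which TiUP passes to `numactl` when starting each instance (the IPs and ports here are hypothetical, and `numactl` must be installed on the host for the binding to take effect):

```yaml
# Hypothetical excerpt from topology.yaml: four tidb-server instances
# on one 128-core / 8-NUMA-node host, each bound to two NUMA nodes.
tidb_servers:
  - host: 10.0.1.10
    port: 4000
    status_port: 10080
    numa_node: "0,1"   # started as: numactl --cpunodebind=0,1 --membind=0,1 ...
  - host: 10.0.1.10
    port: 4001
    status_port: 10081
    numa_node: "2,3"
  - host: 10.0.1.10
    port: 4002
    status_port: 10082
    numa_node: "4,5"
  - host: 10.0.1.10
    port: 4003
    status_port: 10083
    numa_node: "6,7"
```

For the 8-instance variant, each entry would instead bind to a single node (`numa_node: "0"`, `"1"`, and so on).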
It depends on the machine. Generally, we will bind the cores.
Indeed, that makes sense. Although TiDB can push computation down to TiKV, tidb-server still handles work such as merging globally sorted results, so performance should improve after binding cores. I plan to test in our environment next week. Thanks for the support.
Another point: older machines generally used the UMA (Uniform Memory Access) architecture, whereas current machines use NUMA (Non-Uniform Memory Access). Under NUMA, cross-node memory access reduces CPU and memory efficiency, so binding a process to a NUMA node lets its CPUs access their local memory, which improves efficiency. Binding also provides relatively better resource isolation between instances.
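To see the non-uniformity concretely, a small sketch (assuming a Linux host; the paths are the standard kernel sysfs layout) can read the node-to-node distances the kernel reports, where the diagonal value (typically 10) is local access and larger off-diagonal values represent the extra cost of remote access:

```python
from pathlib import Path

def numa_distances():
    """Return {node_name: [distances to each node]} from sysfs,
    or None when no NUMA information is exposed (e.g. non-Linux host)."""
    base = Path("/sys/devices/system/node")
    nodes = sorted(base.glob("node[0-9]*"))
    if not nodes:
        return None
    return {
        n.name: [int(x) for x in (n / "distance").read_text().split()]
        for n in nodes
    }

if __name__ == "__main__":
    dist = numa_distances()
    if dist is None:
        print("no NUMA sysfs information on this host")
    else:
        # A single-node machine shows one row like: node0 [10].
        # A two-socket host typically shows something like
        # node0 [10, 21] / node1 [21, 10] - the 21s are remote access.
        for node, row in sorted(dist.items()):
            print(node, row)
```

This only reports the topology; the actual binding is done by `numactl` (or TiUP's `numa_node` setting), but checking the distances first tells you how much cross-node traffic is likely to cost on a given machine.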
If a machine hosts only a single node, the difference with or without NUMA binding is not significant. However, if a machine hosts multiple nodes, it is better to bind each instance to a different NUMA node.
You can try it in a test environment first. It’s more practical to conduct an experiment than just reading the documentation, and the conclusions you draw yourself will be more profound.
This is more noticeable on ARM hosts, where the performance gap can be around 5x. On x86, a data warehouse product from a neighboring team measured a gap of about 40% in its tests. However, binding to a NUMA node also limits the memory each process can use, so this trade-off needs to be balanced.
I’m so surprised that there’s such a significant improvement on the ARM architecture.
Okay, planning to test next week.
The ARM architecture is particularly suitable for this kind of operation.
You still need to run comparative tests with NUMA binding enabled and disabled.
It is generally not recommended to use NUMA with most databases.
Let’s see if there are any test reports to study.
Once a conclusion is reached, please update the group.
We are looking forward to it. We plan to deploy on 6 servers and haven't decided yet whether to use NUMA binding or how to lay out the deployment.