Impact of Three-AZ Deployment on TiDB Performance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB产品三AZ部署对性能的影响

| username: residentevil

[TiDB Usage Environment] Production Environment
[TiDB Version] V6.5.8
[Encountered Issue: Problem Phenomenon and Impact] For high availability, we plan to deploy the cluster across 3 AZs within the same region, with network latency between the 3 AZs of less than 1ms. We want to know how this architecture affects the performance of online read and write requests. For example, how much will the overall read latency increase?

| username: WinterLiu | Original post link

What is an AZ?

| username: residentevil | Original post link

For example, in the Beijing area there are three data centers, A, B, and C, with network latency between them of less than 1ms.

| username: dba远航 | Original post link

Cross-data-center deployment doubles the network latency, which has a noticeable impact. It depends on the business's tolerance level.

| username: zhanggame1 | Original post link

This has very high network requirements.

| username: residentevil | Original post link

Have you tested it?

| username: residentevil | Original post link

Yes, but for high availability and given cost budget constraints, the three-AZ solution is still needed.

| username: Kongdom | Original post link

:yum: Availability Zone (AZ)

| username: residentevil | Original post link

I’ve reviewed this solution, but I’m not sure about its impact on the execution time of business SQL. It seems we really need to conduct a stress test on this.

| username: lemonade010 | Original post link

When you deploy, you need to carefully consider the distribution of machines, label configuration, and network bandwidth. 1ms should be acceptable, but the overall performance loss can only be determined through actual testing.

| username: Kongdom | Original post link

:thinking: Yes, this probably needs to be stress-tested to know for sure.

| username: tidb菜鸟一只 | Original post link

Are the three data centers in the same city?

| username: 小龙虾爱大龙虾 | Original post link

You can actually test this: find an environment to simulate it in and use the Chaos Mesh tool to inject network latency, for example as sketched below.
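
For what it's worth, here is a minimal sketch of injecting that latency with Chaos Mesh on a Kubernetes-based test cluster, using the kubernetes Python client. It assumes Chaos Mesh is already installed; the namespace, pod labels, and latency values are hypothetical placeholders, not anything from this thread.

```python
# Minimal sketch: add ~1ms of extra latency to TiKV pod traffic with Chaos Mesh.
# Assumes a Kubernetes test deployment with Chaos Mesh installed; the namespace
# and label selector below are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()

network_chaos = {
    "apiVersion": "chaos-mesh.org/v1alpha1",
    "kind": "NetworkChaos",
    "metadata": {"name": "simulate-cross-az-delay", "namespace": "tidb-cluster"},
    "spec": {
        "action": "delay",                     # inject latency (not loss/corruption)
        "mode": "all",                         # apply to all matching pods
        "selector": {
            "namespaces": ["tidb-cluster"],
            "labelSelectors": {"app.kubernetes.io/component": "tikv"},
        },
        "delay": {"latency": "1ms", "jitter": "0.1ms"},
    },
}

# Create the chaos experiment; delete the same object later to remove the delay.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="chaos-mesh.org",
    version="v1alpha1",
    namespace="tidb-cluster",
    plural="networkchaos",
    body=network_chaos,
)
```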

| username: residentevil | Original post link

Yes, three data centers in the same city.

| username: tidb菜鸟一只 | Original post link

Then consider which topology suits you: Multiple AZs in One Region Deployment for TiDB | PingCAP Docs
Deploying across three centers in the same city will definitely improve availability, but performance will indeed decrease, mainly due to the network. Write latency will increase by at least twice the inter-AZ latency, and reads will generally incur at least one extra inter-AZ latency.

| username: danghuagood | Original post link

TiDB has relatively high network requirements. If you want to deploy across AZs (Availability Zones), it’s best to keep network latency within 1ms. Additionally, it depends on the tolerance of the business. Some businesses cannot accept even 1ms of latency, in which case cross-AZ deployment is not feasible. If the business can accept 1ms of latency, then it can be considered.

| username: Jellybean | Original post link

Without considering caching and other factors, we can roughly estimate:

  1. For a write request, the 2PC process in TiDB needs to fetch TSO from PD twice, involving two round-trip gRPC requests; a read request involves one.
  2. Interaction between TiDB and TiKV: for an update, the row is first read back into TiDB memory and then written to TiKV, involving two round-trip gRPC requests; a select involves one, and an insert also reads and locks, involving two round-trip gRPC requests.
  3. Within TiKV, assuming three replicas and ignoring asynchronous replication details, the raft log is replicated from the leader to the two followers, involving two round-trip gRPC requests, and the apply-log stage also involves two round-trip gRPC requests.

You can see from the above process that if the network latency between AZs increases by 0.1ms, this delay will be magnified several times in a single request. Of course, this is a theoretical estimate. In practice, if there is a dedicated line, the latency might be very low. It is recommended to conduct thorough validation and testing if you have an actual environment.
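
To make the amplification concrete, here is a rough back-of-the-envelope sketch in Python that just multiplies out the round-trip counts listed above. The counts follow this estimate; the same-AZ and cross-AZ round-trip times are assumed placeholder values, and real behavior also depends on TSO caching, leader placement, and batching.

```python
# Rough estimate of how inter-AZ latency gets amplified in a single request,
# following the round-trip counts sketched above. All numbers are
# illustrative placeholders, not measurements.

def estimate_latency_ms(rtt_ms: float) -> dict:
    """Sum the network round trips for a simple read and a simple write."""
    # Write path (ignoring TSO caching/batching):
    #   2 x TSO from PD + 2 x TiDB<->TiKV + 2 x raft log replication
    #   + 2 x apply-stage round trips
    write_rtts = 2 + 2 + 2 + 2
    # Read path: 1 x TSO from PD + 1 x TiDB<->TiKV (leader read)
    read_rtts = 1 + 1
    return {"write_ms": write_rtts * rtt_ms, "read_ms": read_rtts * rtt_ms}

# Assume ~0.1 ms round trips inside one AZ vs. ~1 ms across AZs (placeholders).
for rtt in (0.1, 1.0):
    print(f"rtt={rtt}ms -> {estimate_latency_ms(rtt)}")
# With these assumptions, a write whose every hop crosses AZs picks up roughly
# 8 * (1.0 - 0.1) = 7.2 ms of extra network time compared to keeping every
# hop inside one AZ.
```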

There are also users who have implemented multi-center deployment solutions, indicating that this is a feasible solution.

| username: residentevil | Original post link

If doubling the latency is acceptable, the time consumption at the network layer is indeed unavoidable.

| username: residentevil | Original post link

According to this process, the time consumption will increase multiple times. I will verify it once I have the environment ready. I understand that the time consumption mainly manifests in two areas:

  1. TiDB → PD [obtaining TSO]: after all, only the PD leader provides the TSO service, so requests from the TiDB nodes in the other two AZs to PD will indeed add latency.
  2. TiDB → TiKV: if local access can be guaranteed [TiDB → TiKV within the same AZ], it would be fine; otherwise, the impact would be too significant.

| username: Jellybean | Original post link

In fact, TiDB has a cached TSO, which can greatly alleviate this issue.

By default, TiKV reads and writes go to the leader replica, and you can also control where replicas are stored through labels or placement rules. Placing the Region leaders in the same AZ as TiDB can also significantly optimize this issue.
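
As a hedged illustration of that leader-placement point, the sketch below uses Placement Rules in SQL to pin a table's Region leaders to one AZ while keeping one follower in each of the other two. The host, credentials, zone label values, and table name are hypothetical placeholders, and the `+zone=...` constraints only work if they match the labels actually configured on the TiKV stores.

```python
# Sketch: pin a table's Region leaders to the AZ where most TiDB traffic runs,
# using Placement Rules in SQL. Connection details, zone label values, and the
# table name are hypothetical placeholders.
import pymysql

conn = pymysql.connect(host="127.0.0.1", port=4000, user="root", password="")
with conn.cursor() as cur:
    # Leaders stay in az-1; one follower each in az-2 and az-3 keeps the three
    # replicas spread across the three AZs for availability.
    cur.execute(
        'CREATE PLACEMENT POLICY leader_in_az1 '
        'LEADER_CONSTRAINTS="[+zone=az-1]" '
        'FOLLOWER_CONSTRAINTS="{+zone=az-2: 1,+zone=az-3: 1}"'
    )
    cur.execute("ALTER TABLE test.orders PLACEMENT POLICY = leader_in_az1")
conn.commit()
conn.close()
```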