TiProxy Q&A & Future Plans

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiProxy 问题解答 & 未来规划

| username: djshow832-PingCAP

Hello everyone, I am a developer from the TiProxy team. I am very happy to see that everyone likes TiProxy. We have read every comment about TiProxy and decided to answer your questions here.

TiProxy has only just reached GA. Is anyone using it? Is it stable?

  • TiProxy has been running in TiDB Serverless since January 2023. Every SQL statement you run on TiDB Serverless passes through TiProxy, and every TiDB Pod scheduling, scaling, and upgrade operation in TiDB Serverless relies on TiProxy’s connection migration. See the blog post “Maintaining Database Connectivity in Serverless Infrastructure with TiProxy”.

  • TiProxy has been running in TiDB Dedicated since April 2023, where commercial customers have tested it in production, with many successful cases of connection retention and load balancing.

  • TiProxy was originally designed to support the elastic scaling and rapid iteration of TiDB in the cloud. We later saw that on-premises users had the same needs, so we started integrating it with the ecosystem tools. The basic functionality of TiProxy on-premises is the same as in the cloud.

TiProxy was declared GA in a TiDB DMR version. Can it be used in production environments?

  • TiProxy is released independently of TiDB. TiProxy v1.0.0 is the GA version and can be used in production environments.

  • TiProxy does not have to be used with TiDB v8.0.0; it can be used with any LTS version of TiDB from v6.5.0 onward.

Can TiProxy replace HAProxy?

  • You need to weigh functionality against performance. If TiProxy’s features matter more to you, replace HAProxy. One reference point: on TiDB Cloud, TiProxy cannot replace the NLB, so using it means adding an extra component to the link, which increases latency, instance costs, and cross-availability-zone traffic costs. Some customers are still willing to use it. By comparison, the cost of replacing HAProxy is much lower, so we believe there are definitely scenarios for it.

  • If you use a VIP to make TiProxy highly available, you still need to deploy keepalived manually, because TiUP does not integrate keepalived.

Why is TiProxy’s performance lower than HAProxy’s?

  • For MySQL traffic, HAProxy works as an L4 proxy and does not need to parse MySQL packets. TiProxy is an L7 proxy, and connection migration requires parsing MySQL packets. Many L4 proxy products already exist, whereas an L7 proxy can do more.

  • TiProxy’s network architecture is still a simple goroutine-per-connection model, and goroutine switching adds overhead.
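
To make the L4/L7 difference concrete, here is a minimal Python sketch of the kind of parsing an L7 proxy has to do and an L4 proxy can skip. It relies only on the publicly documented MySQL wire format: a 3-byte little-endian payload length, a 1-byte sequence ID, then the payload, where a client command starts with a command byte such as 0x03 for COM_QUERY. The function names are made up for illustration and this is not TiProxy’s code. It also ignores details such as payloads split across several packets and query attributes.

```python
from typing import Tuple

COM_QUERY = 0x03  # command byte for a text query in the MySQL protocol


def build_query_packet(sql: str, seq: int = 0) -> bytes:
    """Build a COM_QUERY packet (illustrative helper, not a TiProxy API)."""
    payload = bytes([COM_QUERY]) + sql.encode("utf-8")
    # Header: 3-byte little-endian payload length + 1-byte sequence ID.
    return len(payload).to_bytes(3, "little") + bytes([seq]) + payload


def parse_header(data: bytes) -> Tuple[int, int]:
    """Return (payload_length, sequence_id) from the 4-byte packet header."""
    if len(data) < 4:
        raise ValueError("a MySQL packet header is 4 bytes")
    return int.from_bytes(data[0:3], "little"), data[3]


def describe_packet(data: bytes) -> str:
    """Describe a packet; an L4 proxy would forward the bytes without this step."""
    length, seq = parse_header(data)
    payload = data[4:4 + length]
    if len(payload) < length:
        raise ValueError("truncated packet")
    # A command packet starts a new exchange, so its sequence ID is 0.
    if seq == 0 and payload[:1] == bytes([COM_QUERY]):
        return "COM_QUERY: " + payload[1:].decode("utf-8", errors="replace")
    return f"packet seq={seq} len={length}"


if __name__ == "__main__":
    print(describe_packet(build_query_packet("SELECT 1")))
```

Even this small amount of per-packet work is something a plain TCP forwarder never does, which is part of the performance gap described above.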

What are TiProxy’s future plans?

TiProxy has huge potential. Here are our current plans (they may be adjusted at any time, and the timeframes below are estimates, not commitments):

  • High Availability & Business Continuity

    • (Within six months) More comprehensive TiDB health checks, such as migrating connections away when a TiDB instance cannot connect to PD or TiKV, so that the business recovers quickly

    • (Within six months) Quickly migrate connections away when a TiDB instance is at risk of OOM, reducing the impact on the business; the large SQL statement itself usually runs until OOM and cannot be migrated

    • (Long-term) Keep connections alive when TiDB goes offline unexpectedly (that is, not as part of a planned maintenance operation)

    • (Long-term) Integrate keepalived into TiUP, so that small-scale clusters can easily make TiProxy highly available

    • (Long-term) Modularize TiProxy’s functionality, so that features can be extended or updated online at any time

  • Performance & Cost

    • (Within six months) Route to the local TiDB, mainly to reduce cross-availability-zone traffic fees in the cloud; this can also reduce network latency in cross-data-center deployments and when TiProxy and TiDB are deployed together

    • (Within six months) Load balancing based on TiDB CPU usage (rather than the number of connections), so that TiDB resources are fully utilized even when workloads on different connections vary greatly

    • (Long-term) Optimize the network model to improve throughput

  • Stability

    • (Within one year) Configure TiDB instance quotas for each tenant, achieving instance-level resource isolation at the computing layer; together with resource isolation at the storage layer (Resource Control or Placement Rules), this forms a complete multi-tenant solution

    • (Long-term) Record traffic and replay it on a new TiDB cluster to verify compatibility with the new TiDB version, for worry-free upgrades

    • (Long-term) Rate limit when TiDB load is high, to avoid crashing TiDB or degrading the service level

  • Security

    • (Within one year) Support certificate-based authentication, compatible with TiDB
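
As an illustration of the CPU-based load balancing item in the plan above, the following Python sketch contrasts the two policies. The Backend type, its fields, and the sample numbers are hypothetical. How TiProxy will actually collect CPU usage and choose a backend is not described in this post, so treat this as a sketch of the idea only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    """A TiDB instance as seen by the proxy (hypothetical structure)."""
    name: str
    connections: int
    cpu_usage: float  # fraction of CPU in use, from 0.0 to 1.0


def pick_by_connections(backends: List[Backend]) -> Backend:
    """Connection-count policy: route to the instance with the fewest connections."""
    return min(backends, key=lambda b: b.connections)


def pick_by_cpu(backends: List[Backend]) -> Backend:
    """CPU policy: route to the least busy instance; connections break ties."""
    return min(backends, key=lambda b: (b.cpu_usage, b.connections))


if __name__ == "__main__":
    # tidb-0 has few connections, but they run heavy queries.
    backends = [
        Backend("tidb-0", connections=10, cpu_usage=0.90),
        Backend("tidb-1", connections=50, cpu_usage=0.20),
    ]
    print("by connections:", pick_by_connections(backends).name)
    print("by CPU usage:  ", pick_by_cpu(backends).name)
```

With these sample numbers, the connection-count policy keeps sending new connections to the instance that is already busy, while the CPU policy sends them to the idle one.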

If you have any suggestions or ideas, feel free to leave a comment or raise an issue at https://github.com/pingcap/tiproxy!

Finally, please stay patient and stay excited!

| username: Kamner | Original post link

Do we have any test results for comparison?

“TiProxy does not necessarily have to be used with TiDB v8.0.0; it can be used with any LTS version of TiDB v6.5.0 and above.”

That’s great, looking forward to the continuous improvement of TiProxy’s features.

| username: ShawnYan | Original post link

Ti liked it :+1:

| username: Kongdom | Original post link

:+1: TiCool!

| username: Soysauce520 | Original post link

Super excited about this one: load balancing based on TiDB CPU usage (rather than the number of connections), which can fully utilize TiDB resources even when the workloads on different connections vary significantly.

| username: djshow832-PingCAP | Original post link

Refer to the official performance test report: TiProxy Performance Test Report (TiProxy 性能测试报告 | PingCAP 文档中心)

| username: DBAER | Original post link

Really good.

| username: db_user | Original post link

That’s quite powerful.

| username: Mingdr | Original post link

Looking forward to the implementation of these two features.

| username: TiDBer_jYQINSnf | Original post link

:+1: :+1: :+1:

| username: pepezzzz | Original post link

The ability of TiProxy to manage VIP is a highly desired feature for offline deployment. However, integrating keepalived might not be the best choice. Isn’t the VRRP protocol too heavy? The new generation of MySQL cluster management software rarely uses keepalived. Wouldn’t it be better to implement VIP management using a raft leader?

| username: Jellybean | Original post link

I am looking forward to load balancing based on TiDB CPU usage, the plugin functionality, and the traffic mirroring feature.

| username: wangkk2024 | Original post link

Impressive :+1:t2:

| username: TIDB-Learner | Original post link

GOAT (Greatest of All Time)

| username: TiDBer_21wZg5fm | Original post link

Sure, let me test it.

| username: djshow832-PingCAP | Original post link

Thank you for the suggestion. It is indeed better to let TiProxy manage the VIP itself. However, I would like to make one adjustment: instead of a raft election among the TiProxy instances, use leader election on PD’s etcd, because TiDB’s normal operation already depends on PD. In the case of a network partition, etcd will always elect a TiProxy in the same partition as the PD leader, which ensures routing to a functioning TiDB, whereas a separate raft election might pick a TiProxy in another partition and leave the cluster completely unusable.

Additionally, I think that letting users set a preferred primary or configure weights, as in keepalived, might adapt better to different scenarios:

  • Since there is only one primary instance, its hardware configuration might need to be very high. To save costs, users might want a 32-core primary and a 16-core standby. As long as the 32-core instance is alive, it should always be the primary.
  • Users have two data centers, but to save costs, small clusters only use software load balancing. For high availability, TiProxy is deployed in both data centers, but users want the TiProxy in the same data center as the application to be the primary.
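
The preference idea above could be sketched as follows, in Python. The Instance type, the weight field, and elect_primary are hypothetical names used only for illustration. A real implementation would sit on top of the etcd election and would have to deal with failback timing and flapping, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    """A TiProxy instance taking part in primary selection (hypothetical)."""
    name: str
    weight: int  # higher weight means stronger preference to be primary
    alive: bool


def elect_primary(instances: List[Instance]) -> Optional[Instance]:
    """Pick the alive instance with the highest weight; the name breaks ties."""
    candidates = [i for i in instances if i.alive]
    if not candidates:
        return None
    # Sort key: highest weight first, then the smallest name for determinism.
    return min(candidates, key=lambda i: (-i.weight, i.name))


if __name__ == "__main__":
    big = Instance("tiproxy-32c", weight=100, alive=True)
    small = Instance("tiproxy-16c", weight=50, alive=True)
    print(elect_primary([small, big]))

    # If the preferred instance fails, the standby takes over.
    big_down = Instance("tiproxy-32c", weight=100, alive=False)
    print(elect_primary([small, big_down]))
```

The same weights can express the second scenario by giving the TiProxy in the application’s data center the higher value.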

I don’t have a deep understanding of the business, so please correct me if I am wrong.

| username: chris-zhang | Original post link

Ti liked it