What? Is there an even better and more cost-effective option than TiDB Community Edition?

translator_bot · June 22, 2024, 9:38pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 什么？还有比 TiDB 社区版更好更省的选择？

| username: Billmay表妹

Preface

A few days ago, I saw Mr. Ma’s article on Zhihu: https://zhuanlan.zhihu.com/p/589970576. Below is the complete original text:

Every once in a while, TiDB releases some big news about architectural evolution: TiFlash and HTAP in 2020, MPP in 2021, and TiDB Cloud in 2022. As we approach the end of the year, we are excited to share another piece of big news: TiDB Serverless, built on the next-generation cloud-native architecture, is now online.

Targeting Cost-Effective Scenarios

TiDB has always been designed for large-scale, critical online businesses, and our product positioning has leaned towards such scenarios. In reality, as a general-purpose database, TiDB also plays a significant role for many users in countless non-critical or small to medium-scale scenarios. Examples include historical data queries, real-time data services and insights, warm data storage, and SMB workloads. These scenarios have quite different priorities from critical online businesses: they are more cost-sensitive, have a higher storage-to-compute resource ratio, and place more emphasis on elasticity and on-demand scaling. A single TiDB product trying to cater to all of these scenarios would fall short and be ambiguously positioned. The newly launched TiDB Serverless Tier is designed to solve this problem.

New Cloud-Native and Serverless Tier

Cloud-native has always been a goal for many database vendors, but very few can clearly explain what cloud-native means. As one such vendor, we believe cloud-native means leveraging cloud infrastructure to provide capabilities far superior to private deployments. For example, Snowflake, one of the pioneers of cloud-native architecture, uses cloud object storage and virtual machine resource pools to offer very low-cost storage and highly elastic computing capabilities, “superpowers” that no privately deployed data warehouse platform can match. Delegating storage to cloud object storage gives the database extremely high availability and durability, but it also requires careful handling of the high latency that comes with it. For that reason, relying heavily on S3 as storage has so far been a design exclusive to analytical databases. But TiDB has taken a new step.

Under the new cloud-native architecture, TiDB takes a novel approach: cheap and reliable object storage serves as the main storage, assisted by local caching, to achieve a lower-cost, more elastic, and even higher-performance storage architecture. In the previous architecture, data is stored separately in each TiKV’s RocksDB, and each write is synchronized to each replica through the Raft Log. In the new architecture, the Raft Log transmission mechanism is retained to ensure fast writes, while persistent data is synchronized to the other replicas via S3. This design gains many cloud-native advantages without introducing higher latency.
Additionally, computing resources are provided by pooled virtual machines, allowing computing nodes (TiDB and TiFlash) to elastically change according to the load at any time.

Less Consumption

In the new architecture, a TiKV write no longer needs to be applied separately on every replica. It only needs to be applied on the primary replica and is then propagated to the other replicas via object storage. This reduces the CPU cost of a write from about three times the cost of a single apply to slightly more than one, achieving a 30% to 50% improvement in CPU efficiency (or cost reduction) across the storage layer as a whole.
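
As a rough illustration of how "three applies down to slightly more than one" could translate into a 30% to 50% saving for the whole storage layer, here is a back-of-the-envelope sketch. It is not taken from the article: the share of storage-layer CPU spent applying writes (`write_fraction`) and the exact per-write cost in the new design (`new_write_cost`) are assumptions chosen only to show the shape of the calculation.

```python
def storage_cpu_saving(write_fraction, replicas=3, new_write_cost=1.1):
    """Estimate the fraction of storage-layer CPU saved.

    write_fraction: assumed share of storage-layer CPU spent applying writes
                    in the old design (hypothetical input, between 0 and 1).
    replicas:       number of replicas that each apply a write in the old design.
    new_write_cost: assumed cost of a write in the new design, measured in
                    single applies ("slightly more than one").
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    # Only the write-apply share shrinks, from `replicas` applies to `new_write_cost`.
    return write_fraction * (1.0 - new_write_cost / replicas)


# Hypothetical write shares; the article does not state these numbers.
for share in (0.5, 0.6, 0.8):
    print(f"write share {share:.0%}: saving {storage_cpu_saving(share):.0%}")
```

Under these assumed inputs the estimate lands between roughly 32% and 51%, which is in the same range as the figure quoted above. The real saving depends on the workload.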

Higher Stability, Less Resource Reservation

Because the main storage is now shared object storage, operations that used to intermittently interfere with normal workloads, such as LSM compaction, Analyze Table, Add Index, and even BR, can be delegated to independent microservices that obtain resources and run on demand. Previously, users needed to reserve a quarter to a third of their resources for these operations. In the new architecture these reservations are no longer needed, and performance is more stable. Meanwhile, since these heavyweight operations no longer have to be held back to protect business stability, operations such as backups can run an order of magnitude faster.

More Friendly to Warm Data Storage

In the new design, different Regions no longer share the same LSM tree. This significantly reduces the number of levels, improves read and write performance, and allows much larger Region sizes than before, which reduces the maintenance overhead associated with Raft Regions. It also allows the storage capacity of a single TiKV node to go far beyond the current 4 TB limit. For warm data storage scenarios, we can choose nodes with fewer CPUs and larger storage (a 1x to 2x improvement in the storage-to-compute ratio), greatly reducing the computing resources required per unit of storage.

Ultra-High Elasticity

In the previous design, elasticity in the TiDB computing layer was relatively easy to achieve, but scaling the storage layer required the Region Leader to write replica data to the target node in order to migrate it. Since this consumes resources, we had to limit the speed of replica migration to avoid affecting online businesses. In the new architecture, data is stored in object storage with almost unlimited bandwidth, and data balancing is limited only by the ingress bandwidth of the node itself, allowing the storage layer to scale 30 times faster or more. This greatly enhances TiDB’s ability to handle frequent traffic fluctuations: users can plan resources only for the load they actually have, for example using different amounts of resources during the day and at night, to significantly reduce costs. Additionally, under Serverless, TiDB combined with the resource pool can adjust resources elastically based on load, so there is no need to pay for idle resources during periods of low load.
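
To make the "30 times faster" claim concrete, here is a small sketch of the arithmetic. The two throughput figures are hypothetical and chosen only so that their ratio is 30; the article gives the ratio, not the absolute numbers.

```python
def transfer_hours(data_gb, throughput_mb_per_s):
    """Hours needed to move data_gb gigabytes at a sustained throughput.

    Uses 1 GB = 1000 MB for simplicity.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 1000 / throughput_mb_per_s / 3600


# Hypothetical figures, not taken from the article.
THROTTLED_MIGRATION_MB_S = 30   # replica migration, rate-limited to protect online traffic
NODE_INGRESS_MB_S = 900         # pulling from object storage, bounded by node ingress

data_gb = 1000  # rebalance 1 TB onto a new node
old = transfer_hours(data_gb, THROTTLED_MIGRATION_MB_S)
new = transfer_hours(data_gb, NODE_INGRESS_MB_S)
print(f"throttled migration: {old:.1f} h, object storage: {new:.1f} h, speedup: {old / new:.0f}x")
```

With these assumed rates, moving 1 TB drops from a bit over nine hours to under twenty minutes. That difference is what makes scaling in and out within a single day practical.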

So? What Now?

TiDB is commonly perceived as best suited to medium and large data volumes (TB scale and above). After all, at a scale that a single MySQL instance can handle, the previous TiDB design did not offer better performance or cost-effectiveness. Moreover, although TiDB has good elasticity, we often see users whose load differs greatly between day and night but whose cluster cannot scale quickly enough to save resources. And in warm data storage scenarios with medium to low load, TiDB’s inherent overhead makes some users concerned about the cost of ownership.
But under the new architecture, Serverless Tier provides a better choice. When the business is starting out and load is low, it offers better cost-effectiveness than MySQL, unique HTAP capabilities without the need to build a complex analytical platform, and built-in high availability so there is no need to worry about business continuity. As the business grows, users do not need to re-plan and select a new database: TiDB Serverless can continue to provide good performance and elastic resources as the load increases. There is no need to prepay for potential future business growth, which makes it a choice worth considering in the current economic environment.

Welcome to Try

For small-scale applications under 5 GB, the new cloud-native architecture with Serverless is already available for free to users on TiDB Cloud (AWS). If you want to try larger-scale scenarios, feel free to contact us for a trial, or scan the code to join the group and discuss with us.

| username: tidb菜鸟一只

Not bad, good stuff. Now that the company’s business is unified on the cloud, having to deploy a K8S cluster first and then deploy TiDB on it is indeed troublesome and not very user-friendly. Serverless TiDB is the way to go.

| username: ShawnYan

The new cloud-native architecture combined with Serverless is now available for free to a wide range of users on TiDB Cloud (AWS).

| username: Billmay表妹

If you have any questions related to serverless, feel free to raise them at https://asktug.com/c/ecosystem/tidb-serverless/420024

| username: 望海崖2084

I am still a beginner and have a few questions to ask:

What does “serverless” mean? Can it be simply understood as a cloud database?
I tried to connect from Python on Windows by following the example, but it got stuck on SSL. After downloading cacert.pem, I got this error:
OperationalError: (2026, 'SSL connection error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. Error 10060/0x0000274C')
Could you please explain what SSL is and how to use it?

| username: tidb菜鸟一只

I connect from a Linux machine, and I need to specify the SSL CA file when connecting: mysql --connect-timeout 15 -u '' -h gateway01.us-west-2.prod.aws.tidbcloud.com -P 4000 -D test --ssl-mode=VERIFY_IDENTITY --ssl-ca=/etc/pki/tls/certs/ca-bundle.crt -p. When using a GUI client on Windows, you generally need to configure the SSL CA file and the SSL mode as well, right?
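
For the Python side of the question, the same settings could look roughly like the sketch below. It assumes the PyMySQL driver, which may not be the driver used in the original example, and it reads credentials from two environment variables whose names (`TIDB_USER`, `TIDB_PASSWORD`) are made up for illustration. The host is the one from the command above, and `cacert.pem` is the CA bundle mentioned in the question.

```python
import os


def build_tls_connect_args(host, user, password, ca_path, database="test", port=4000):
    """Build keyword arguments for pymysql.connect() with TLS verification enabled."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "database": database,
        "connect_timeout": 15,        # same idea as --connect-timeout 15
        "ssl_ca": ca_path,            # same idea as --ssl-ca
        "ssl_verify_cert": True,      # verify the server certificate chain
        "ssl_verify_identity": True,  # also verify the hostname, like VERIFY_IDENTITY
    }


def demo():
    """Connect and print the server version. Needs PyMySQL and network access."""
    import pymysql  # imported here so the helper above works without the driver

    args = build_tls_connect_args(
        host="gateway01.us-west-2.prod.aws.tidbcloud.com",
        user=os.environ["TIDB_USER"],          # hypothetical variable name
        password=os.environ["TIDB_PASSWORD"],  # hypothetical variable name
        ca_path="cacert.pem",                  # path to the downloaded CA bundle
    )
    with pymysql.connect(**args) as conn, conn.cursor() as cur:
        cur.execute("SELECT VERSION()")
        print(cur.fetchone())
```

Call `demo()` once both variables are set. If it still fails with error 2026 and a timeout like the one above, the TLS settings are probably not the cause. It is more likely that the host cannot be reached at all.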

| username: sunxiaoguang

Serverless is a general concept. You can refer to the FAQ or other Serverless-related documentation. https://docs.pingcap.com/tidbcloud/serverless-tier-faqs#serverless-tier-faqs
It looks like there is a network issue. You can first use traceroute to check if there are any problems with the network path, or try using a VPN to connect. The service itself does not restrict access from within China, but whether you need a VPN depends on your network situation due to the long connection path.
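
Alongside traceroute, a quick way to tell a network problem from an SSL problem is to check whether a plain TCP connection to the endpoint opens at all. The sketch below uses only the Python standard library. The host and port are the ones quoted earlier in this thread, and the 5-second timeout is an arbitrary choice.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


def check_gateway():
    """Check the TiDB Cloud gateway quoted in this thread (needs network access)."""
    host = "gateway01.us-west-2.prod.aws.tidbcloud.com"
    ok = can_reach(host, 4000)
    print(f"{host}:4000 reachable: {ok}")
    return ok
```

If `check_gateway()` reports False, the SSL handshake never gets a chance to start, which would match the Windows error 10060 (connection timed out) in the question.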

| username: Christophe

Why does tidbcloud.com require a VPN to access?

| username: sunxiaoguang

@Christophe This is not an active block, but whether it can be accessed depends on the local network’s ability to access overseas sites.

| username: jansu-dev

Recently, while trying out actix_web and sqlx (both Rust) in a demo project, I also tried using TiDB serverless.

Firstly, I think serverless is still a relatively advanced concept in China.
At the same time, I ran into the problem of needing a VPN to reach the service. Does the official team plan to address this at the product level in the future? For putting serverless into production, this still significantly affects usability.
Lastly, when I raised practical issues about Serverless on the TiDB Forum, the engineers’ feedback was very timely and excellent. → How to connect TiDB serverless in grafana with TLS? - #5 by jansu-dev

@sunxiaoguang

| username: sunxiaoguang

Thank you all for your interest in the serverless product. Given the resources currently available, R&D and product delivery are limited to overseas AWS regions. Of course, we will expand to GCP and Azure in the future. Services in China will be scheduled later, when we have the capacity, based on how well the product capabilities of the AWS China regions match our needs and on market maturity.

| username: 霸王龙的日常

TiDB Serverless eliminates the need to prepay for database expenses for potential future business growth, which is indeed a worthwhile option to consider in the current economic environment.

| username: ddmasato

Give it a try.