How to Build a TiDB Database in a Microservices Architecture

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 微服务下TIDB数据库该如何建设

| username: 你的选择

In a distributed microservices architecture, the system may be split by functionality (e.g., into an order center and an inventory center…). In TiDB, should the database be built as one or split into multiple databases? If split into multiple databases, will there be performance issues with cross-database queries, and does TiDB natively support cross-database distributed transactions?

| username: 人如其名 | Original post link

Whether you create one database or several in TiDB does not matter much; a single TiDB cluster is enough. Cross-database queries within one TiDB cluster do not incur a performance penalty, and TiDB supports distributed transactions across databases. It is also recommended to deploy separate TiDB-server compute nodes for batch jobs and more complex queries, keeping them apart from the TP (transaction processing) workload.

| username: 你的选择 | Original post link

Hello, thank you for your reply. Supporting cross-database transactions seems to be a relatively difficult topic in the industry. Of course, there are third-party middleware solutions (such as Alibaba Seata, but the efficiency is relatively poor). How does TiDB handle this? Where can I find relevant information?

| username: 啦啦啦啦啦 | Original post link

You mean cross-cluster, right? The cross-database mentioned above refers to cross-schema within the same cluster. Supporting transactions across schemas is not difficult and does not require middleware. If there is such a need, just put the data in one cluster.
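
To illustrate what "transactions across schemas in the same cluster" looks like from the application side, here is a minimal sketch. It uses Python's built-in `sqlite3` with two attached in-memory databases purely as a stand-in so the example runs without a cluster; it is not TiDB itself. The schema and table names (`orders.order_items`, `inventory.stock`) and the helper functions are hypothetical. Against TiDB you would connect through a MySQL-compatible driver instead, and the placeholder syntax would likely be `%s` rather than `?`.

```python
import sqlite3


def open_demo_cluster() -> sqlite3.Connection:
    """Stand-in for one cluster: a single connection that sees two schemas."""
    # isolation_level=None gives manual control over BEGIN/COMMIT/ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("ATTACH DATABASE ':memory:' AS orders")
    conn.execute("ATTACH DATABASE ':memory:' AS inventory")
    conn.execute("CREATE TABLE orders.order_items (sku TEXT, qty INTEGER)")
    conn.execute(
        "CREATE TABLE inventory.stock ("
        "sku TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
    )
    conn.execute("INSERT INTO inventory.stock VALUES ('sku-1', 5)")
    return conn


def place_order(conn: sqlite3.Connection, sku: str, qty: int) -> bool:
    """Write to both schemas in one transaction; roll back both on failure."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO orders.order_items (sku, qty) VALUES (?, ?)", (sku, qty)
        )
        # Fails with an IntegrityError if stock would drop below zero.
        conn.execute(
            "UPDATE inventory.stock SET qty = qty - ? WHERE sku = ?", (qty, sku)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # The order row inserted above is undone together with the stock update.
        conn.execute("ROLLBACK")
        return False
```

The point of the sketch is only that both writes share one connection and one transaction, so no middleware is involved; the second of two calls to `place_order(conn, "sku-1", 3)` fails the stock check and leaves no orphaned order row.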

| username: 人如其名 | Original post link

He probably hasn’t used a distributed database and doesn’t know that TiDB is designed to solve such problems. For him, what he wants is likely just a cluster with multiple schemas (databases).

| username: 你的选择 | Original post link

Indeed, I haven’t used a distributed database before. What I mean is actually different schemas (for example, one database for the order center and another database for the inventory center). Can TiDB handle distributed transactions based on microservices? For instance, if the order center calls the inventory center’s service, the inventory center succeeds, but the order center fails, how does the inventory center roll back (does TiDB natively support this)?

| username: 你的选择 | Original post link

Multiple microservices, each corresponding to a database (schema). Actually, what I want to ask is how TiDB solves cross-service distributed transactions, or does it not support them?

| username: tidb菜鸟一只 | Original post link

You can think of TiDB as a very powerful single-node MySQL database.

| username: 啦啦啦啦啦 | Original post link

As I understand it, within the same cluster each microservice corresponds to a database (schema). This is essentially no different from accessing a single database; a schema is only a logical grouping.

| username: 数据小黑 | Original post link

Several separate issues are being mixed together here. With MySQL, multiple schemas often live on multiple MySQL instances because MySQL's performance is bounded by a single machine. For large systems, vertically partitioning the database and using Seata for distributed transactions is a reasonably good solution. With TiDB underneath, you need to think from a different perspective. TiDB's performance and storage are effectively “unlimited”, so the business side can simply keep creating schemas and tables without worrying too much. Before middleware like Seata existed, how did the inventory center roll back if it succeeded but the order center failed? Both operations were placed in one transaction, and the same approach works in TiDB.

| username: h5n1 | Original post link

Refer to this: a cluster with 6000+ databases.

| username: 你的选择 | Original post link

So TiDB itself solves elastic scaling for relational databases and supports distributed transactions across multiple nodes. However, such a transaction has to run on a single database connection (although it can span schemas). Once the system is split into microservices, calls between services go through APIs and no longer share a connection. My understanding is that TiDB cannot solve the distributed transaction problems introduced by inter-microservice calls, even if every service is backed by the same TiDB cluster. In that case, upper-layer middleware such as Seata is needed.

| username: 你的选择 | Original post link

Yes, but what are TiDB’s recommended measures for microservice architecture? Should different schemas be built for different microservices, or should all microservices share one schema? If TiDB’s underlying structure truly makes no difference between these two approaches, then it would be better to split them vertically by function. Each microservice, whether at the application layer or the storage layer, should perform its own duties without interfering with each other. Otherwise, it would become a very large cloud-based SaaS monolithic application.

| username: 你的选择 | Original post link

A cluster with over 6000 databases feels like too many. Is the application layer also using a microservices architecture? If so, how do you solve the issue of distributed transactions across services?

| username: 你的选择 | Original post link

Actually, there is another core issue. Once the system is split into microservices, a service should in principle not access another service's database directly. It should go through that service's API; otherwise the services become coupled to each other, which violates the design principles of microservices. This is why we cannot simply wrap the calls in a single TiDB transaction.

| username: ealam_小羽 | Original post link

In this case, it doesn’t matter which database is used. The development team needs to consider distributed transaction solutions for microservices. Generally, these solutions involve some form of compensation or notification mechanism to ensure eventual consistency of transactions. You can refer to this article for more information:
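
As a rough illustration of the compensation approach (not taken from the referenced article), the sketch below runs a sequence of steps and, if one fails, undoes the completed steps in reverse order. Everything here is hypothetical: `run_saga`, `reserve_stock`, `release_stock`, and `create_order` are illustrative names, and the in-memory dictionary stands in for what would be separate service calls in a real system.

```python
from typing import Callable, Dict, List, Tuple

# Each step is an (action, compensation) pair of zero-argument callables.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run actions in order; on failure, run compensations in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo only the steps that actually completed, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Hypothetical stand-ins for the inventory and order services.
stock: Dict[str, int] = {"sku-1": 5}


def reserve_stock() -> None:
    stock["sku-1"] -= 2


def release_stock() -> None:
    stock["sku-1"] += 2


def create_order() -> None:
    raise RuntimeError("order center failed")


def cancel_order() -> None:
    pass  # Nothing to undo in this demo because create_order never succeeds.


saga_ok = run_saga([(reserve_stock, release_stock), (create_order, cancel_order)])
```

In this run the inventory step succeeds, the order step fails, and the compensation restores the stock. A production version would also need to persist saga state and retry compensations that fail, which this sketch leaves out.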

| username: 我是咖啡哥 | Original post link

One system, one TiDB cluster: different applications (functional modules) can create their own databases and use different users. Within a cluster, TiDB supports distributed transactions natively, so the application does not need to handle them.
From a development perspective, you can treat TiDB as a single MySQL instance that can scale infinitely.

| username: hey-hoho | Original post link

What you need is a business-level distributed transaction, which should definitely be handled by the application layer. TCC, Seata, 2PC, and 3PC are all commonly used solutions.

| username: 数据小黑 | Original post link

I understand what you mean. Personally, I don’t quite agree with using Seata or even message middleware to perform two-phase commits to maintain cross-application transactions. We generally use message middleware to maintain eventual consistency, which provides more room for optimization for the system. In our understanding, any two interfaces that require strong transaction guarantees should be within a single application and should not be separated.
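
A minimal sketch of the message-based eventual consistency idea, under stated assumptions: a `deque` stands in for the message middleware, delivery is assumed to be at-least-once (so duplicates can arrive), and the consumer is made idempotent by remembering message IDs. The class and field names (`InventoryService`, `id`, `sku`, `qty`) are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Set


class InventoryService:
    """Consumes order events and deducts stock exactly once per message ID."""

    def __init__(self, stock: Dict[str, int]) -> None:
        self.stock = dict(stock)
        self.processed: Set[str] = set()

    def handle(self, msg: dict) -> None:
        # Skip duplicates: at-least-once delivery may redeliver a message.
        if msg["id"] in self.processed:
            return
        self.stock[msg["sku"]] -= msg["qty"]
        self.processed.add(msg["id"])


def drain(broker: Deque[dict], service: InventoryService) -> None:
    """Deliver every queued message to the consumer."""
    while broker:
        service.handle(broker.popleft())


# The order service has committed its own write and published an event.
broker: Deque[dict] = deque()
event = {"id": "order-1001", "sku": "sku-1", "qty": 2}
broker.append(event)
broker.append(event)  # Simulated duplicate delivery.

inventory = InventoryService({"sku-1": 5})
drain(broker, inventory)
```

In a real service, the processed-ID record and the stock update would be written in the same local database transaction, so that a crash between the two cannot cause a double deduction.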

| username: TIxiaoP | Original post link

Running distributed services does not prevent you from using a single TiDB cluster. If business services require data isolation, it can be achieved at the TiDB user level.