Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
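Original topic: For core transaction backends today, which is more common: sharding (splitting databases and tables) or a distributed database such as TiDB?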
As our company’s business grows, we need to plan for data growth. I have recently been looking into MySQL sharding, and it seems quite complicated; TiDB looks simpler by comparison. I asked some colleagues at large companies, and their core businesses still use sharding. Why don’t big companies use distributed databases like TiDB for their core databases?
Fewer and fewer people use sharding (splitting databases and tables) now. Chinese domestic databases, OceanBase included, no longer take this approach.
Yes, but after consulting with some colleagues from major companies, their core business is still using sharding. If TiDB is so good, why not switch to it?
Tencent’s TDSQL is also a distributed system with sharding, right?
I’m not sure about Tencent, but I know that JD.com and 58.com shard their databases and tables for transactions.
A sharding solution can also be packaged and sold as a product.
Since it is a core service, it was probably written years ago and has been running for many, many years. Do you dare to touch it? TiDB wasn’t that well known a few years ago. In another five to eight years, sharding will probably be almost entirely replaced.
I’ve been working with database sharding for a long time, and replacing it takes too much manpower. Internet business scenarios are simple, and sharding suits them well. It also helps to highlight the “value” of operations and maintenance.
I think the main issue lies in the term “big companies.” In a smaller company, one person can decide to adopt a distributed database with a single word, and there is no need for sharding.
The trend is distributed systems; sharding and partitioning are things of the past.
Distributed databases are used a lot.
I have also used distributed databases in which databases and tables are still split further, with a row limit on each table.
If you put it that way, many core systems are still running on Oracle. Despite years of advocating “de-IOE” (moving off IBM, Oracle, and EMC), many people still can’t let go of Oracle, largely for historical reasons. I can only say that if you are launching a new business today and try to promote the old method of database sharding, it will be very difficult, and very few people will be fooled by it anymore…
Key industries have had their own selection criteria for many years, prioritizing stability and maturity, and progressing step by step. How long has TiDB been around? Its penetration rate is too low, and there are few knowledgeable maintenance personnel. It needs time and product maturity to accumulate.
These are all historical burdens~ We have them too, and we can’t get rid of them~
TiDB’s latency and long-tail issues are currently technical shortcomings in core transaction scenarios.
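To make “long-tail” concrete, here is a minimal sketch with made-up numbers (not measurements from TiDB or any real system). It shows how a small share of slow requests barely moves the average but dominates the 99th percentile, which is the figure core transaction systems tend to care about. The `percentile` helper is a simple nearest-rank implementation written for this illustration.

```python
from statistics import mean

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]

# Synthetic latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [5] * 98 + [200] * 2

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={avg} ms, p50={p50} ms, p99={p99} ms")
```

In this synthetic case the mean and median look healthy while the p99 is dozens of times higher, which is why averages alone say little about tail behavior.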
Switching comes at a cost, especially since sharding is inherently invasive to application code. After finally getting everything stable in production, you now want to switch to a distributed database with just one sentence. It’s impossible for the code not to change, right? This transformation cost is the biggest obstacle. Even concerns about performance ultimately boil down to cost.
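As a rough illustration of what “invasive to the code” means, here is a minimal sketch of application-level shard routing. Everything in it is hypothetical: the shard counts, the `order_db_N` / `orders_N` naming, and the choice of `user_id` as the sharding key are assumptions for the example, not taken from any company mentioned in this thread.

```python
import zlib
from typing import Tuple

DB_COUNT = 4        # hypothetical number of physical databases
TABLES_PER_DB = 8   # hypothetical number of tables in each database

def route(user_id: int) -> Tuple[str, str]:
    """Map a sharding key to a (database, table) pair using a stable hash."""
    slot = zlib.crc32(str(user_id).encode()) % (DB_COUNT * TABLES_PER_DB)
    db_index, table_index = divmod(slot, TABLES_PER_DB)
    return f"order_db_{db_index}", f"orders_{table_index}"

def build_order_query(user_id: int) -> Tuple[str, tuple]:
    """Every query has to pick its physical table before it can run."""
    db, table = route(user_id)
    sql = f"SELECT * FROM {db}.{table} WHERE user_id = %s"
    return sql, (user_id,)

sql, params = build_order_query(42)
print(sql, params)
```

With logic like this spread through the data-access layer (or delegated to sharding middleware), every query must carry the sharding key, and queries without it have to fan out across all shards. Against a distributed SQL database the application would address a single logical `orders` table, so migrating means removing or rewriting this routing, which is the code change the post above refers to.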
I see the current trend is that new systems increasingly use DBMSs guaranteed by distributed protocols like Raft and Paxos. The old architecture methods, whether sharding or master-slave synchronization, are gradually being abandoned.
What you said is very true. It’s not that we don’t want to switch, but the manpower cost of switching is currently too high. This is a gradual process. We should look at whether refactored or new applications choose a distributed database or sharding, rather than focusing on existing systems. Many of the existing ones stay as they are because of historical legacy and cost.
Yes, most transitions cannot be settled in a few words, and it’s not simply a matter of changing the database. There are still many issues, such as modifying the application-side code and switching upstream and downstream systems, which can be quite troublesome. These problems can be solved, but ultimately it comes down to how much it will cost. After all, it involves real financial investment, not just theoretical discussion.