What is the maximum number of rows in the largest table in your database?

translator_bot · June 22, 2024, 2:41am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 请问大家的数据库最大的表有多少行数据

| username: breakyang

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Problem Phenomenon and Impact]
[Resource Configuration] Enter TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Logs/Monitoring]
I would like to know about everyone’s data capacity situation. We have a table with 6 billion rows. Will there be any performance issues with point queries?

translator_bot · June 22, 2024, 2:41am

| username: zhanggame1 | Original post link

No performance issues with partitioning for up to hundreds of millions of rows.

translator_bot · June 22, 2024, 2:41am

| username: hey-hoho | Original post link

No problem, I’ve handled projects at the hundred-billion level before.

translator_bot · June 22, 2024, 2:41am

| username: 我是咖啡哥 | Original post link

If it can handle point queries, what are you worried about?

translator_bot · June 22, 2024, 2:41am

| username: 啦啦啦啦啦 | Original post link

Our largest is over 5 billion, no need to worry about point queries.

translator_bot · June 22, 2024, 2:41am

| username: Jellybean | Original post link

For our partitioned table, the maximum number of rows in a single table is 60 billion. Point queries are very fast and there’s no difference compared to regular tables. With TiFlash, even a full table count(*) query can be completed in about 1 minute. Overall, it’s quite good.

translator_bot · June 22, 2024, 2:41am

| username: breakyang | Original post link

May I ask what is meant by partitioned table here? I just started using TiDB and don’t quite understand, would like to ask for some guidance.

translator_bot · June 22, 2024, 2:41am

| username: Jellybean | Original post link

Partitioned tables are a special type of table organization in TiDB. You can search the official documentation for more details; the official website explains it very clearly.

translator_bot · June 22, 2024, 2:41am

| username: redgame | Original post link

Our 3 billion

translator_bot · June 22, 2024, 2:41am

| username: breakyang | Original post link

How large does a table need to be to require partitioning, or in what scenarios is it needed?

translator_bot · June 22, 2024, 2:41am

| username: zhanggame1 | Original post link

The main function of partitioning in TiDB is to facilitate data deletion. You can directly use DDL to delete that partition. Deleting large tables with ordinary DELETE is very troublesome and requires splitting the SQL to delete in segments.

translator_bot · June 22, 2024, 2:41am

| username: Jellybean | Original post link

TiDB’s underlying storage, through region division and scheduling, naturally supports the storage of extremely large tables and has perfectly solved the problem of sharding in traditional single-machine databases.

A table usually has over a hundred million rows, and only when there are specific business needs do we set it as a partitioned table when creating a new table. In our case, the use of partitioned tables is more for the convenience of operations and maintenance, as well as to utilize the characteristics of hot and cold storage.

Operations and maintenance: Mainly for cleaning up historical expired data.
TiDB’s delete operation is a transactional operation. When a large amount of data (over a million rows) needs to be deleted at once, it requires executing a super-large transaction, which can take hours or even fail due to the transaction being too large. By using the partitioned table feature, we can place expired data (by day, week, month, etc.) in a partition, which can be deleted very quickly (in seconds), achieving the purpose of data deletion. Hours vs. seconds, the choice is obvious.
Using the partition feature, certain partitions can be placed on regular HDDs, while important partitions can be placed on NVME SSDs, allowing administrators to flexibly control data storage and reduce storage costs.

There are many other application scenarios for partitioned tables, and the specific design and use depend on the characteristics of the business itself.

translator_bot · June 22, 2024, 2:41am

| username: breakyang | Original post link

Thank you very much for your detailed answer.

translator_bot · June 22, 2024, 2:41am

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.