For the transaction log table, is it better to use physical sharding or underlying partitioning with TiDB?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 流水记录表,用tidb是做物理分表还是底层分区好

| username: cy6301567

For the transaction log table, is it better to use physical sharding or underlying partitioning with TiDB?

| username: 有猫万事足 | Original post link

How much data? In principle, each table in TiDB is divided into several regions on TiKV. Therefore, physical sharding is completely useless; even if you shard the table, it will still be stored in regions. It is entirely possible that the tables you have sharded are actually stored in a single region.

Partitioning is entirely for the purpose of deleting data or configuring hot and cold data separation based on placement rules. It is purely for management purposes and does not improve concurrent read/write performance.

To improve concurrency, you need to reduce read/write hotspots, add hardware, and horizontally scale read/write capabilities.

| username: cassblanca | Original post link

Whether to use sharding or partitioning depends on your data volume and query business scenarios. If you want to separate online data from historical data, sharding is better. For example, online data can be stored for the last 3 months for ad hoc queries, while data older than 3 months can be considered historical data and used for asynchronous batch queries. As mentioned above, TiDB uses column clusters and regions for physical data storage, which automatically manages splitting and merging. Theoretically, as long as storage performance is sufficient, it is not sensitive to data volumes below the petabyte level.

| username: 像风一样的男子 | Original post link

We stored the time-series message data in a time-series database because storing this data in TiDB is too costly.

| username: zhanggame1 | Original post link

TiDB is designed to solve the problem of MySQL physical sharding, so you can choose partitioning. You can regularly delete partitions of historical data.

| username: cy6301567 | Original post link

Currently, there are 3 KV nodes at the bottom layer, each with 32 CPUs and 64 GB of memory.

| username: zhanggame1 | Original post link

The configuration is not low, just use partitioned tables.

| username: 有猫万事足 | Original post link

This configuration is pretty good.

| username: redgame | Original post link

Partitioned table…