In which version will partitioned table performance see a significant improvement?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 分区表表性能在哪个版本会有很大提升?

| username: wluckdog

In which version will the performance of partitioned tables see a significant improvement?

| username: Kongdom | Original post link

I briefly checked the release records and found the following improvements.

release-4.0.5 Support for pre-splitting partitioned tables

release-4.0.6 Improved write performance for partitioned tables

release-4.0.8 Fixed an unexpected Panic issue when using partitioned tables

release-5.1.4 Support for partition pruning with IN expressions in Range partitioned tables

release-6.0.0 Support for partition pruning with IN expressions in Range partitioned tables

release-6.1.0 Support for dynamic partition pruning

release-6.4.0 Improved performance of statistics collection for partitioned tables

release-6.5.3 Improved speed of TRUNCATE operations on partitioned tables with Placement Rules

release-6.5.3 Fixed performance degradation of TableScan operator in partitioned tables during Region migration

release-7.4.0 Optimized memory usage and performance of ANALYZE on partitioned tables

| username: 江湖故人 | Original post link

Since you plan to upgrade, why not upgrade to the latest version? In addition to performance improvements, there are also more features. For example, the features added in version 7.4 make management very convenient:

  • Convert partitioned tables to non-partitioned tables
  • Partition existing non-partitioned tables
  • Modify the partition type of existing partitioned tables
| username: 小龙虾爱大龙虾 | Original post link

The performance improvement of partitioned tables is highly related to the way the application is used. Partitioning requires careful design; simply using partitioned tables does not guarantee performance improvement.

| username: Jellybean | Original post link

Based on our own experience, the performance of partitioned tables has greatly improved after v6.1.0 supported dynamic partition pruning, as it only needs to scan the partitions that need to be accessed, without scanning the partitions that do not need to be accessed.

| username: Jellybean | Original post link

I agree with this expert’s opinion.
The use of partitioned tables should pay attention to the scenario and method.
Tables that need partitioning are usually large tables with hundreds of millions, billions, or even tens of billions of rows. It is crucial to include the partition column as one of the filtering conditions in the WHERE clause, or explicitly specify accessing only a certain partitioned table.

| username: wluckdog | Original post link

Issue: Using a partition key can prune a specific dimension’s time partition range. However, querying different time ranges for different dimensions might not be as efficient. For example, partitioning by a_time and querying by a_time, b_time, c_time, and d_time.

| username: 像风一样的男子 | Original post link

Is your data time-series data?

| username: wluckdog | Original post link

It is not considered time-series data, as it involves modification and deletion operations.

| username: Jellybean | Original post link

Based on the business access situation, you can only choose the most important and frequently used time column as the partition column. There is no other good way. If it is very necessary to use other columns, consider trading space for time, and use another time column as the partition column, and then create another partition table for use.

| username: dba远航 | Original post link

Always use the latest version, that’s right.

| username: TIDB-Learner | Original post link

I haven’t used the partition table feature yet.

| username: zhanggame1 | Original post link

The recent version has many bugs and poor performance with partitioned tables, but it is continuously improving.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.