Issues with TiDB Time Partitioning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb时间分区问题

| username: TiDBer_Xy7fsN7j

When creating a table, I need to partition the collection_start_time field by day and then by hour. How should I write the partition statement?

| username: xfworld | Original post link

In TiDB, you can use partitioned tables to partition by day and then by hour. Here is an example demonstrating how to create a table partitioned by day and then by hour:

First, create a regular table, defining the necessary columns and data types:

CREATE TABLE my_table (
    id INT,
    name VARCHAR(50),
    created_at DATETIME
);

Next, use the ALTER TABLE statement to convert the table into a partitioned table, partitioned by day and then by hour:

ALTER TABLE my_table
PARTITION BY RANGE(TO_DAYS(created_at) * 24 + HOUR(created_at)) (
    PARTITION p20220101 VALUES LESS THAN (TO_DAYS('2022-01-01') * 24),
    PARTITION p20220102 VALUES LESS THAN (TO_DAYS('2022-01-02') * 24),
    PARTITION p20220103 VALUES LESS THAN (TO_DAYS('2022-01-03') * 24),
    ...
);

In the above example, we use TO_DAYS(created_at) * 24 + HOUR(created_at) to calculate the partition value, converting the date and hour into an integer value. Then, we use the PARTITION BY RANGE clause to specify partitioning based on this integer value.

As needed, you can define more partitions according to your actual situation, with each partition corresponding to different date and hour ranges.

Please note that creating and managing partitioned tables requires some additional considerations, such as partition maintenance and query optimization. It is recommended to read TiDB’s official documentation in detail before using partitioned tables to learn more about partitioned tables and best practices.

Check it out!

| username: dba远航 | Original post link

The original poster probably means using multi-level partitioning, first by range of days, then by range of hours.

| username: TiDBer_Xy7fsN7j | Original post link

In one year, PARTITION p20220103 VALUES LESS THAN (TO_DAYS(‘2022-01-03’) * 24), doesn’t this mean writing 365 lines :joy:

| username: TiDBer_Xy7fsN7j | Original post link

The above method requires constant changes to the table structure over time. That’s not what I want :sweat_smile:

| username: forever | Original post link

What you probably want is something similar to Oracle interval partitioning, where partitions are automatically added when data is inserted.

| username: TiDBer_Xy7fsN7j | Original post link

Yes, could you provide an SQL template?

| username: forever | Original post link

Try this:
Partitioned Table | PingCAP Documentation Center

| username: dockerfile | Original post link

Which AI are you using?

| username: dba远航 | Original post link

You are probably looking for an automatically generated database, which is unlikely. It seems that the MYSQL version does not support it.

| username: xfworld | Original post link

We can only handle it in this compromise way. Currently, only OLAP engines can support such complex partitioning.

| username: TiDBer_小阿飞 | Original post link

I think what he wants is partitioning (by day) and subpartitioning (by hour). I just don’t know if it supports subpartitioning like ORACLE.

| username: 像风一样的男子 | Original post link

It seems that TiDB does not have sub-partitions, so your requirement cannot be met. Your requirement seems a bit special; is it time-series data?

| username: TiDBer_Xy7fsN7j | Original post link

Yes, the data continues to progress over time.

| username: 像风一样的男子 | Original post link

I suggest you research time-series databases.

| username: 有猫万事足 | Original post link

In version 7.5, the syntax for subpartitioning does not produce an error. However, in practice, only the partition is created, without any subpartitions.

| username: TI表弟 | Original post link

Remember to use the UTC time zone when creating the database. There is a pitfall with time zones: if you use the Beijing time zone, there will be an 8-hour gap where data cannot be retrieved.

| username: tidb菜鸟一只 | Original post link

Currently, a table can have a maximum of 8192 partitions. If you create one partition per hour, 8192 partitions won’t even last a year. You would need 8760 partitions for a year…

| username: TiDBer_lBAxWjWQ | Original post link

How many partitions need to be created for this?

| username: TI表弟 | Original post link

The performance is actually quite good for a partition with around 5 million records.