Exporting Large Amounts of Data from Queries

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 查询导出大量数据

| username: Hacker_ythhI9Qd

[TiDB Usage Environment]
TiKV, no TiFlash, 32C 64G

[Phenomenon] Business and Database Phenomenon
Paginated query to export business data to Excel is slow

[TiDB Version]

Currently, exporting business data to Excel through paginated queries, with 10,000 records per page, is slow, especially when the number of pages is large. In MySQL, you can use the last business ID from the sorted results as the initial ID for the next request, as the table’s IDs are incrementally increasing. However, in TiDB, IDs are segmented by region and are not continuously increasing. Do you have any good suggestions for this kind of batch export?

| username: xfworld | Original post link

Pagination optimization has many solutions. You can refer to the description in the specification:


| username: Hacker_ythhI9Qd | Original post link

The business involves multiple table associations, including a header table, a detail table, a configuration table, and a permissions table, currently sorted by creation time. If pagination is handled by sorting according to the ID, the data cannot be displayed in chronological order. If pagination is handled by sorting according to time, batch operations may result in the same timestamp, potentially causing the same data to appear on multiple pages. Additionally, is it recommended to consolidate into a large table, so that only three tables (header table, detail table, permissions table) are associated?

| username: xfworld | Original post link

This depends on the requirements of the scenario. Do whatever is appropriate.

Flattening it into a wide table is also an option, but you need to balance the data replica usage and computation cost.

| username: system | Original post link

This topic will be automatically closed 60 days after the last reply. No new replies are allowed.