Add an Option to Estimate the Size of a Table in BR or Dumpling Tools

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: BR工具或dumpling工具添加估算某张表大小的选项

| username: Kamner

Requirement Feedback
Please clearly and accurately describe the problem scenario, required behavior, and background information to facilitate timely follow-up by the product team.

[Problem Scenario Involved in the Requirement]
How to accurately estimate the actual size of a table. Judging from current community questions, this need comes up often. Before exporting and archiving large tables, or migrating them, the table size has to be estimated first so that storage space can be planned properly.

The SQL query currently provided in the official documentation still returns results that differ significantly from the actual data volume, which is probably related to the table's high watermark.
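For context, here is a minimal Python sketch of the kind of arithmetic a Region-based estimate performs. The post does not say which documentation query is meant, so this assumes the estimate is built from per-Region approximate sizes such as those exposed in `information_schema.TIKV_REGION_STATUS`. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def approximate_table_size_mb(region_rows):
    """Sum per-Region approximate sizes, counting each Region once.

    region_rows: iterable of (region_id, approximate_size_mb) tuples.
    A Region may appear in several rows (for example once for record data
    and once per index), so its sizes are averaged before summing.
    """
    sizes = defaultdict(list)
    for region_id, size_mb in region_rows:
        sizes[region_id].append(size_mb)
    return sum(sum(values) / len(values) for values in sizes.values())


# Hypothetical rows: Region 1 appears twice, Region 2 once.
rows = [(1, 96), (1, 96), (2, 40)]
print(approximate_table_size_mb(rows))
```

Because the inputs are Region-level approximations rather than a count of the table's live rows, a total computed this way may drift from the real data volume.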

[Expected Requirement Behavior]
To estimate the size of a table accurately, the table should be scanned at the physical layer. Could a parameter be added to the BR tool to estimate the data volume, similar to Oracle's expdp ESTIMATE_ONLY (although that is also a logical export and has the same high watermark issue)?

[Alternative Solutions for the Requirement]

[Background Information]
Such as which users will benefit from it, and some usage scenarios. Any API design, model, or diagram would be more helpful.

| username: zhaokede | Original post link

With compression involved it is difficult to estimate, because the result depends on both the compression algorithm and the data stored in the table.
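A small Python sketch of that point: the same number of bytes can compress to very different sizes depending on the content. Here `zlib` from the standard library only stands in for whatever algorithm the storage engine actually uses, and the sample data is made up for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means smaller on disk)."""
    return len(zlib.compress(data)) / len(data)


# Two inputs of the same length (1,000,000 bytes) with very different content.
repetitive = b"status=ok;" * 100_000   # highly repetitive, compresses very well
random_like = os.urandom(1_000_000)    # random bytes, barely compress at all

print(f"repetitive:  {compression_ratio(repetitive):.4f}")
print(f"random-like: {compression_ratio(random_like):.4f}")
```

The repetitive input shrinks to a small fraction of its size while the random input stays roughly the same, so a logical row count or byte count alone does not determine the on-disk size.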