Total Capacity Requirements for Data Migration with TiDB Lightning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb-lightning迁移数据时总容量要求

| username: TiDBer_djgos04V

[TiDB Usage Environment] Production Environment
The official documentation states that the total storage space of the target TiKV cluster must be greater than data source size × number of replicas × 2. For example, with the default of 3 replicas, the total storage space must be more than 6 times the size of the data source. So if I have 1 TB of data in MySQL, the TiDB cluster needs at least 6 TB of free space before the migration. But doesn’t TiDB compress the data? Wouldn’t a large part of that space end up unused?

Also, the official documentation says that Lightning needs a temporary folder when importing data. Suppose I run Lightning on a server that also hosts a TiKV component, with a 20 GB data source, 35 GB of free space on that server, and 150 GB of total capacity in the cluster. Will the import succeed?

| username: tidb菜鸟一只

The documented requirement assumes the worst case. In practice, 1 TB of MySQL data migrated to a 3-replica TiDB cluster would also occupy roughly 1 TB on disk. Still, it is worth planning for the worst case so that the import does not fail partway through and create extra work.
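The rule of thumb quoted in the question can be worked through as a small calculation. This is only a sketch of that arithmetic: the function name and the 150 GB comparison are illustrative, and it says nothing about the separate temporary-folder requirement on the Lightning host.

```python
def required_cluster_space_gb(source_gb: float, replicas: int = 3,
                              safety_factor: float = 2.0) -> float:
    """Worst-case rule from the question: source size x replicas x 2."""
    return source_gb * replicas * safety_factor


# 1 TB (1024 GB) of MySQL data with 3 replicas: 6144 GB, i.e. 6 TB.
print(required_cluster_space_gb(1024))

# The 20 GB scenario from the question: 20 x 3 x 2 = 120 GB.
needed = required_cluster_space_gb(20)
cluster_capacity_gb = 150  # total cluster capacity given in the question
print(needed, cluster_capacity_gb > needed)
```

By this rule alone, the 20 GB import needs more than 120 GB of cluster storage, which the 150 GB cluster satisfies only if most of that capacity is actually free.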

| username: zhanggame1

The original poster is correct: TiDB compresses the data, so some of that space will be freed up.
The compression ratio depends on the data, so it’s best to import a portion first and measure the actual usage. For example, import a 100 GB sample and use the result to estimate the disk overhead of the full import.
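The test-import approach amounts to a linear extrapolation. A minimal sketch, assuming disk usage scales proportionally with source size; the function name and the 120 GB measurement are hypothetical, not figures from this thread.

```python
def estimate_full_usage_gb(sample_source_gb: float, sample_used_gb: float,
                           full_source_gb: float) -> float:
    """Scale the disk usage measured for a sample import up to the full data set."""
    if sample_source_gb <= 0:
        raise ValueError("sample_source_gb must be positive")
    ratio = sample_used_gb / sample_source_gb  # cluster GB used per source GB
    return ratio * full_source_gb


# Hypothetical measurement: a 100 GB sample occupied 120 GB across all replicas.
estimate = estimate_full_usage_gb(100, 120, 1024)
print(f"Estimated usage for 1 TB of source data: {estimate:.0f} GB")
```

The estimate is only as good as the sample: if the sampled tables compress differently from the rest of the data, the real ratio will differ.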

| username: redgame

Leave more headroom. Data differs between environments, and you can’t know in advance how much compression you will get.

| username: 昵称想不起来了

To be on the safe side, it’s better to follow the official recommendations. The official guidance also suggests reserving space so that disk usage stays below the healthy watermark. Otherwise, you will run into problems using the cluster after the migration is complete.
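A quick way to check the watermark point is to project disk usage after the import. This sketch treats the watermark as a parameter: the thread does not give a number, so the 0.8 default and all figures below are placeholder assumptions to be replaced with the values for your cluster.

```python
def usage_after_import(used_gb: float, capacity_gb: float, added_gb: float) -> float:
    """Projected fraction of cluster capacity in use once the import finishes."""
    return (used_gb + added_gb) / capacity_gb


def stays_healthy(used_gb: float, capacity_gb: float, added_gb: float,
                  watermark: float = 0.8) -> bool:
    """True if projected usage stays below the watermark (0.8 is an assumed placeholder)."""
    return usage_after_import(used_gb, capacity_gb, added_gb) < watermark


# Hypothetical: 30 GB already used in a 150 GB cluster, import expected to add 60 GB.
print(usage_after_import(30, 150, 60))
print(stays_healthy(30, 150, 60))
```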