Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: How do you all prepare single-table TB-level test data?
[Test Environment for TiDB] Testing
[TiDB Version] V6.1
[Encountered Issue]
In production, we process single tables whose data volume reaches the TB level.
[Issue Phenomenon and Impact]
We are now preparing data for this single-table TB-level test scenario. We currently insert rows into the table via JDBC, but the throughput is very low. How does everyone prepare single-table TB-level test data efficiently?
tiup bench can generate data.
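For reference, here is a minimal sketch of loading a large dataset with tiup bench; the host, port, database name, and the warehouse/thread counts are placeholders to adjust. Each TPC-C warehouse is roughly on the order of 100 MB, so reaching TB scale takes on the order of ten thousand warehouses.

```shell
# Sketch only: -H/-P/-D and the warehouse/thread counts are placeholders.
# TB-scale data needs roughly 10,000+ warehouses; raise --threads to load faster.
tiup bench tpcc \
  -H 127.0.0.1 -P 4000 -D tpcc \
  --warehouses 10000 --threads 64 \
  prepare
```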
Can a TiDB cluster deployed on K8s use tiup?
TiDB on K8s needs to be managed using TiDB Operator.
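That said, tiup bench itself only needs a MySQL-protocol connection to TiDB, so one possible approach (the namespace and service name below follow the TiDB Operator examples and are assumptions) is to run it from any machine that can reach the TiDB service, for example through kubectl port-forward:

```shell
# Assumptions: namespace "tidb-cluster" and service "basic-tidb" match the
# TiDB Operator examples; substitute your own cluster name and namespace.
kubectl port-forward -n tidb-cluster svc/basic-tidb 4000:4000 &

# tiup bench connects over the MySQL protocol, so the forwarded port is enough.
tiup bench tpcc -H 127.0.0.1 -P 4000 -D tpcc --warehouses 10000 prepare
```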
When the data volume reaches a certain level, sysbench becomes inadequate: the id field in the tables it creates defaults to INT, so once the INT limit (roughly 2.1 billion rows) is reached the load fails, and each row is also very small, so it takes an enormous number of rows to reach TB scale.
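If you still want to use sysbench at this scale, one possible workaround (the Lua script path and the exact column definition below are assumptions that vary by installation) is to widen the id column in sysbench's table definition to BIGINT before running prepare:

```shell
# Assumption: oltp_common.lua lives at this path and defines the id column as
# "INTEGER NOT NULL AUTO_INCREMENT"; verify on your installation first.
sed -i 's/INTEGER NOT NULL AUTO_INCREMENT/BIGINT NOT NULL AUTO_INCREMENT/' \
  /usr/share/sysbench/oltp_common.lua

# Placeholders: connection settings, table count, rows per table, thread count.
sysbench oltp_read_write \
  --mysql-host=127.0.0.1 --mysql-port=4000 --mysql-user=root --mysql-db=test \
  --tables=16 --table-size=500000000 --threads=64 \
  prepare
```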
Is TiDB Operator's cluster management functionality the same as tiup's?
It is best to desensitize production data, back it up, and keep it ready for use at any time. If production data is already synced to a downstream data lake, you can extract or back up large volumes of test data from the data lake without affecting production.
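One way to keep such a desensitized copy ready for reuse (a sketch only; the database name, PD address, and storage path are placeholders, and BR is just one option) is to back it up once with BR and restore it into the test cluster whenever a fresh TB-scale dataset is needed:

```shell
# Placeholders: database name, PD address, and S3 path.
# Back up the desensitized copy once...
tiup br backup db --db masked_prod \
  --pd "pd-host:2379" \
  --storage "s3://bucket/masked-prod-backup"

# ...then restore it into the test cluster on demand, which is much faster
# than regenerating TB of rows over JDBC.
tiup br restore db --db masked_prod \
  --pd "test-pd-host:2379" \
  --storage "s3://bucket/masked-prod-backup"
```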