How do you prepare TB-level test data for a single table?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 请教一下大家,单表TB级别的测数据,大家都是怎么准备的呢?

| username: Lystorm

[TiDB Usage Environment] Testing
[TiDB Version] v6.1
[Encountered Issue]
In production we process single tables at the TB level.
[Issue Phenomenon and Impact]
We are now preparing a test scenario for this TB-level single table. We currently insert the test data through JDBC, but it is far too slow. How do you all prepare single-table test data at the TB scale efficiently?
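Row-by-row JDBC inserts are the slowest possible path; a common alternative is to generate flat files and bulk-load them (for example with TiDB Lightning). A minimal sketch of the file-generation step, assuming a hypothetical three-column schema (`id BIGINT, name VARCHAR, val DOUBLE`) and Lightning's `db.table.N.csv` file-naming convention:

```shell
# Sketch: generate synthetic CSV chunks for bulk import. The schema and the
# database/table names ("test"."big_table") are assumptions for illustration.
mkdir -p /tmp/testdata
for chunk in 1 2; do
  start=$(( (chunk - 1) * 100000 + 1 ))
  end=$(( chunk * 100000 ))
  # Each line: id, a derived name, and a derived double value.
  seq "$start" "$end" \
    | awk '{printf "%d,user_%d,%.4f\n", $1, $1, $1 / 7.0}' \
    > "/tmp/testdata/test.big_table.$chunk.csv"
done
wc -l /tmp/testdata/*.csv
```

Scaling the chunk count and size up (and running several generators in parallel) produces TB-scale input files much faster than inserting through a driver; the import itself then bypasses the SQL layer entirely.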

| username: buddyyuan | Original post link

tiup bench can generate data.
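For example, the TPC-C workload built into `tiup bench` can load data at scale. A sketch, assuming tiup is installed and a TiDB cluster is reachable on 127.0.0.1:4000 (host, port, and database name are assumptions); each TPC-C warehouse is roughly 100 MB, so the warehouse count controls the total volume:

```shell
# Sketch: bulk-load TPC-C data; ~10,000 warehouses approaches 1 TB.
# Connection parameters below are placeholders for your environment.
tiup bench tpcc \
  --host 127.0.0.1 --port 4000 --db tpcc \
  --warehouses 10000 --threads 64 \
  prepare
```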

| username: Lystorm | Original post link

Can a TiDB cluster deployed on Kubernetes (K8s) use tiup?

| username: OnTheRoad | Original post link

A TiDB cluster on Kubernetes needs to be managed with TiDB Operator.
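That said, host-side tools can still reach an in-cluster database. A sketch assuming a TidbCluster named `basic` in namespace `tidb-cluster` (the names used in the TiDB Operator quick-start; yours may differ):

```shell
# Sketch: expose the in-cluster tidb service locally, then point tiup bench at it.
# Cluster name, namespace, and port are assumptions from the Operator quick-start.
kubectl port-forward -n tidb-cluster svc/basic-tidb 4000:4000 &
tiup bench tpcc --host 127.0.0.1 --port 4000 --db tpcc --warehouses 100 prepare
```

For serious TB-scale loads, running the generator inside the cluster (or close to it) avoids pushing all the data through a single port-forward tunnel.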

| username: wuxiangdong | Original post link


| username: Lystorm | Original post link

Once the data volume reaches a certain level, sysbench becomes inadequate: the `id` column in the sbtest tables it creates defaults to `int`, so the load fails once the int range is exhausted, and each row is also very small, so reaching TB scale requires an enormous number of rows.
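One workaround for the overflow, sketched below, is to widen the auto-increment column to `BIGINT` after the initial `sysbench prepare` and before loading further; widening `INT` to `BIGINT` is a lossless column change TiDB supports. Host, port, and database name are assumptions:

```shell
# Sketch: sysbench's default sbtest tables use `id INT AUTO_INCREMENT`, which
# overflows past ~2.1 billion rows. Widen it to BIGINT so the load can continue.
mysql -h 127.0.0.1 -P 4000 -u root sbtest \
  -e "ALTER TABLE sbtest1 MODIFY id BIGINT NOT NULL AUTO_INCREMENT;"
```

This does not fix the small-row-size problem; for wide rows you still need a custom Lua script or a different generator.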

| username: Lystorm | Original post link

Does TiDB Operator provide the same cluster-management functionality as tiup?

| username: 近墨者zyl | Original post link

It is best to desensitize (mask) production data, back it up, and keep the backup ready for use at any time. If the data is already being replicated downstream into a data lake, you can extract or back up large volumes of test data from the data lake side without affecting production.
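The "back it up once, restore on demand" idea can be sketched with BR (TiDB's backup/restore tool). The PD address, bucket, and table names below are assumptions; the backup is taken once from the masked dataset and restored into any test cluster as needed:

```shell
# Sketch: snapshot the masked table to S3-compatible storage once...
tiup br backup table --db test --table big_table \
  --pd "127.0.0.1:2379" --storage "s3://test-data/big_table-snapshot"

# ...then restore it into a test cluster whenever a fresh copy is needed.
tiup br restore table --db test --table big_table \
  --pd "127.0.0.1:2379" --storage "s3://test-data/big_table-snapshot"
```

Restoring a prepared snapshot is typically far faster than regenerating TB-scale data from scratch for every test round.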