Import speed is closely related to your hardware configuration, the target table structure, the number of columns per row, and the SQL involved. Could you provide more details?
Usually, physical import with tools like TiDB Lightning or BR is the fastest, but you also need to consider whether the target cluster can be taken offline and whether there is ongoing business traffic.
It depends on the concurrency. For example, you might load data with 40 concurrent threads; if the insert hotspot problem has already been addressed, fewer threads can achieve the same throughput. The premise is that there are no resource bottlenecks (for example, the disks are NVMe), and some rate-limiting parameters in TiKV may need to be adjusted.
120,000 records per second? I’ve managed to reach 20,000–30,000 records per second with Kettle, but 100,000+ per second seems a bit high, and it also depends on the size of each record. JDBC probably can’t meet this requirement.
I have stress-tested with 500 threads inserting rows one at a time, and throughput exceeded 100,000 rows per second. If you’re just importing data, have each INSERT carry more rows; then a few concurrent threads are enough.
Real-time business insertion?
Try to make each INSERT batch larger, that is, include more rows in the VALUES list.
Try using multiple connections.
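As a minimal sketch of the batching suggestion (the table `t(id, val)`, host, port, and credentials are placeholders, and it assumes MySQL Connector/J on the classpath): with `rewriteBatchedStatements=true`, the driver rewrites a JDBC batch into multi-value INSERTs, so each round trip carries many rows.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchedInsert {
    public static void main(String[] args) throws SQLException {
        // rewriteBatchedStatements=true lets MySQL Connector/J rewrite the
        // batch into multi-value INSERTs, so each statement carries many rows.
        String url = "jdbc:mysql://127.0.0.1:4000/test?rewriteBatchedStatements=true";
        try (Connection conn = DriverManager.getConnection(url, "root", "")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO t (id, val) VALUES (?, ?)")) {
                for (int i = 0; i < 500; i++) {       // batch 500 rows
                    ps.setLong(1, i);
                    ps.setString(2, "row-" + i);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit();
        }
    }
}
```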
Assuming each row is about 1 KB, 120,000 rows per second is roughly 120 MB/s of writes, which is not that large. Try increasing the write buffer size.
You can insert via JDBC with each SQL statement containing 500 rows; have one thread run 20 such statements, and run multiple threads in parallel.
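Roughly, that pattern could look like the sketch below (table name, connection details, and thread count are hypothetical and would need tuning for your cluster):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelJdbcLoad {
    static final int ROWS_PER_STMT = 500;    // 500 rows per INSERT statement
    static final int STMTS_PER_THREAD = 20;  // each thread runs 20 statements
    static final int THREADS = 8;            // tune to your hardware

    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:4000/test"; // placeholder endpoint
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            final int threadId = t;
            pool.submit(() -> {
                // One connection per thread, as suggested above.
                try (Connection conn = DriverManager.getConnection(url, "root", "")) {
                    // Build a multi-value INSERT with ROWS_PER_STMT value groups.
                    StringBuilder sql = new StringBuilder("INSERT INTO t (id, val) VALUES ");
                    for (int i = 0; i < ROWS_PER_STMT; i++) {
                        sql.append(i == 0 ? "(?, ?)" : ", (?, ?)");
                    }
                    try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
                        for (int s = 0; s < STMTS_PER_THREAD; s++) {
                            long base = (long) threadId * STMTS_PER_THREAD * ROWS_PER_STMT
                                    + (long) s * ROWS_PER_STMT;
                            for (int i = 0; i < ROWS_PER_STMT; i++) {
                                ps.setLong(2 * i + 1, base + i);              // id
                                ps.setString(2 * i + 2, "row-" + (base + i)); // val
                            }
                            ps.executeUpdate();
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```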