Importing Data into TiDB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB 导入数据

| username: TiDBer_djgos04V

Is there a way to have TiDB import SQL scripts that weren't exported by Dumpling, such as SQL scripts exported by Navicat?

| username: tidb菜鸟一只 | Original post link

Isn’t an SQL script supposed to be executed directly?

| username: TiDBer_djgos04V | Original post link

The scripts contain a large amount of data, so I'm hoping there's a way to speed up the import.

| username: 啦啦啦啦啦 | Original post link

SQL exported by Navicat can only be imported single-threaded, and the fact that Navicat was used for the export suggests the data volume isn't that large. If you want to speed things up, Dumpling or BR is the better choice.

| username: TiDBer_djgos04V | Original post link

The SQL script provided by the upstream side is over 20 GB and will only get larger in the future. At the moment I also can't use Dumpling against the source database, so there's no good option.

| username: h5n1 | Original post link

Manually split it into multiple files, then name the files according to the Lightning format.

| username: tidb菜鸟一只 | Original post link

In that case, you probably need to write a script to split the SQL file into metadata and SQL data files like those exported by Dumpling. Otherwise, with only a single file, you can't parallelize the import…
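A minimal sketch of such a split script, assuming the upstream dump holds one INSERT statement per line and that the target table already exists in TiDB. The paths and the `mydb`/`mytable` names are placeholders (nothing from the original post), and the chunk size is something to tune yourself:

```python
#!/usr/bin/env python3
"""Sketch: split one large INSERT-only SQL dump into Dumpling-style
chunk files (db.table.000000001.sql, ...) so TiDB Lightning can
import them in parallel. Assumes one INSERT statement per line and
that the table is already created in TiDB."""

import os

SRC = "upstream_dump.sql"      # hypothetical path to the 20 GB dump
OUT_DIR = "lightning_source"   # directory to point Lightning's data source at
DB, TABLE = "mydb", "mytable"  # placeholder schema/table names
LINES_PER_CHUNK = 200_000      # tune chunk size to taste

os.makedirs(OUT_DIR, exist_ok=True)

chunk_no, line_no, out = 1, 0, None
with open(SRC, "r", encoding="utf-8") as src:
    for line in src:
        # keep only data statements; skip comments, SET statements, etc.
        if not line.lstrip().upper().startswith("INSERT"):
            continue
        if out is None or line_no >= LINES_PER_CHUNK:
            if out:
                out.close()
            # Dumpling-style name that Lightning recognizes: db.table.NNNNNNNNN.sql
            path = os.path.join(OUT_DIR, f"{DB}.{TABLE}.{chunk_no:09d}.sql")
            out = open(path, "w", encoding="utf-8")
            chunk_no += 1
            line_no = 0
        out.write(line)
        line_no += 1

if out:
    out.close()
print(f"wrote {chunk_no - 1} chunk files into {OUT_DIR}")
```

Pointing Lightning's data source directory at `lightning_source` should then let it treat each chunk as an independent file and process them in parallel.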

| username: TiDBer_djgos04V | Original post link

Alright, thank you everyone.

| username: cassblanca | Original post link

Is this 20 GB SQL insert script for stress-testing TiDB? :smile: You could split the file into multiple parts, import them into different tables, and then merge the data from those tables together. Using partitioned tables is also an option.

| username: redgame | Original post link

Use Lightning.

| username: 像风一样的男子 | Original post link

Can the data format exported from upstream be changed to CSV? Lightning supports importing CSV files.

| username: zhanggame1 | Original post link

Using Navicat to export as CSV is faster than SQL. Just create the table in advance, then import it with Lightning.
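If the CSV route is taken and the export still ends up as one huge file, a similar sketch can cut it into several files with names Lightning recognizes (`db.table.N.csv`). Again, the paths and `mydb`/`mytable` names are placeholders, and a naive line-based split assumes no quoted fields contain embedded newlines:

```python
#!/usr/bin/env python3
"""Sketch: split one large CSV export (e.g. from Navicat) into smaller
chunks named db.table.NNN.csv so Lightning can import them in parallel.
The header row is copied into every chunk, so keep Lightning's CSV
header option enabled. Assumes no embedded newlines inside fields."""

import os

SRC = "mytable_export.csv"     # hypothetical Navicat CSV export
OUT_DIR = "lightning_csv"      # Lightning data source directory
DB, TABLE = "mydb", "mytable"  # placeholder schema/table names
ROWS_PER_CHUNK = 500_000

os.makedirs(OUT_DIR, exist_ok=True)

with open(SRC, "r", encoding="utf-8", newline="") as src:
    header = src.readline()
    chunk_no, rows, out = 0, ROWS_PER_CHUNK, None
    for line in src:
        if rows >= ROWS_PER_CHUNK:
            if out:
                out.close()
            chunk_no += 1
            path = os.path.join(OUT_DIR, f"{DB}.{TABLE}.{chunk_no:03d}.csv")
            out = open(path, "w", encoding="utf-8", newline="")
            out.write(header)  # repeat the header in every chunk
            rows = 0
        out.write(line)
        rows += 1
    if out:
        out.close()

print(f"wrote {chunk_no} CSV chunks into {OUT_DIR}")
```

Make sure Lightning's CSV settings (separator, delimiter, header) match what the export tool actually produced, otherwise the rows won't parse.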

| username: 啦啦啦啦啦 | Original post link

:rofl: The person providing the upstream data is quite something. Exporting 20 GB with Navicat to a local machine would be extremely slow. It's better to solve this at the source; splitting the SQL file locally, or any similar workaround, is quite troublesome. Alternatively, exporting to CSV with Navicat is an option, but that export will definitely be much slower than using Dumpling.

| username: zhanggame1 | Original post link

Exporting 20 GB locally with Navicat isn't that slow. In my test, exporting to my own laptop ran at about 100,000 rows per second, and a single 1 GB table took less than 10 minutes. Navicat can open multiple windows for concurrent operations, which speeds things up significantly; with two windows open, 1 GB can take less than 5 minutes.

| username: 啦啦啦啦啦 | Original post link

:rofl: Isn’t this still slow?

| username: zhanggame1 | Original post link

Find a tool yourself that exports CSV; it should be faster.

| username: 像风一样的男子 | Original post link

Why not use mysqldump on the server to export? Isn’t it faster?