Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: Is IMPORT INTO in 7.5 a physical import or a logical import?
- The recently released 7.5 LTS makes the IMPORT INTO feature generally available (IMPORT INTO | PingCAP Documentation Center). It integrates the physical import capability of tidb-lightning into TiDB compute nodes, so a large-scale data import can be completed with a single SQL statement, which greatly simplifies writing very large volumes of data.
I'll use your post to do some research on this too.
According to the description, it should work the same way as Lightning: parse the CSV file >> generate SST files >> import the SST files into TiKV. That is Lightning's local-mode import, so it is closer to a “physical import.”
The original text explains: “The IMPORT INTO statement uses TiDB Lightning’s physical import mode to import data in CSV, SQL, PARQUET, and other formats into an empty table in TiDB.”
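For anyone who wants to try it, here is a minimal sketch of what such an import might look like. The table name, columns, and file path are made up for illustration, and the statement assumes the CSV has a header row and sits at an absolute path readable by the TiDB node the session is connected to. Check the IMPORT INTO documentation for your version before relying on any option.

```sql
-- Hypothetical target table. IMPORT INTO expects the table to
-- already exist and to be empty.
CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer VARCHAR(64),
    amount DECIMAL(10, 2)
);

-- Hypothetical file path. skip_rows = 1 skips the CSV header line.
IMPORT INTO orders FROM '/data/orders.csv' WITH skip_rows = 1;

-- List import jobs and their current status.
SHOW IMPORT JOBS;
```

Note that no INSERT statements are generated here: the single statement hands the file to the import pipeline, which is the point of contrast with a logical import.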
It has a lot of side effects, so use it with caution in a production environment.
It should be a physical import.
I'll take this opportunity to spend some time studying it.
Directly generating SST files and persisting them to TiKV, instead of generating SQL statements for execution, is physical import.
I don’t see any advantages.
Doesn’t the documentation say it is a physical import?
It is a physical import mode that allows data to be inserted directly into TiKV nodes in the form of key-value pairs, bypassing the SQL interface.
According to the official documentation, it can insert KV pairs directly without going through the SQL interface, which should greatly improve efficiency.
It should still be physical import.
Since it runs on the TiDB server node, it’s clearly a logical import, because it doesn’t import KV (key-value) pair data directly.
The backend mode corresponding to the physical import mode is “local”.
The backend mode corresponding to the logical import mode is “tidb”.