Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: INSERT语句的select中不会走tiflash么?
[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.3
[Reproduction Path] None
[Encountered Problem: Problem Phenomenon and Impact]
During optimization today, I found that the select in insert does not use the TiFlash component? Even with a hint, it doesn’t work. I have been using version 4.0, and after upgrading to a higher version, I didn’t pay attention to it.
I searched the community and found that this is not supported in higher versions? Can anyone using a higher version verify this?
Does the official team plan to support this in new versions? It seems the issue has been stagnant for four years.
We also have this requirement, where the analysis results are directly written to another table. If TiFlash cannot be used, it is indeed quite unfortunate.
Inserting into select with different MySQL isolation levels requires locking, which TiFlash cannot achieve. Moreover, TiFlash might fail relatively often, making it difficult to ensure linear consistency.
TiFlash feels inherently less reliable than TiKV, and the risks outweigh the benefits.
I don’t understand, does this have anything to do with locking? Whether I query directly or query and save to another table, it feels like the requirements for the source data are the same.
Is it also risky if I query and display directly through TiFlash? It doesn’t seem like it should be.
It should refer to transaction locking, right?
Actually, the official version can support insert into select import without transactions (supporting TiFlash), which is faster.
Oh, I see what you mean. Indeed, it’s difficult to control transactions here. But it’s definitely impossible to support insert without transactions.
You can enable this parameter.
The official documentation explains this point: 使用 TiDB 读取 TiFlash | PingCAP 文档中心
Alright, it is already supported. Thanks to @WalterWj for the guidance.
Although v6.5 supports it, it is an experimental feature, so it is disabled by default.
Final summary statement.
Supported starting from v6.5.0, but as an experimental feature,
Officially GA in version v7.1.0
It is possible to force the use of TiFlash.
I tried it, but it doesn’t work even with force.
Forcing it won’t work either
In TiDB, the behavior of the INSERT ... SELECT
statement does have some peculiarities, especially when it involves the TiFlash component. TiFlash is TiDB’s columnar storage engine, primarily used to accelerate analytical queries. For mixed OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) workloads, TiDB provides an efficient solution through TiFlash.
INSERT ... SELECT
and TiFlash
By design, the INSERT ... SELECT
operation is mainly aimed at transactional data modification and data migration scenarios. This operation typically involves a large amount of row-level data reading and writing. TiFlash, on the other hand, is optimized for columnar storage analytical queries, particularly those involving a large number of column calculations but fewer rows.
Therefore, even if you use a query hint like /*+ READ_FROM_STORAGE(TIFLASH[table_name]) */
to instruct the TiDB optimizer to use TiFlash, operations like INSERT ... SELECT
may still not choose TiFlash for execution. This is primarily because:
-
Execution Plan Selection: The TiDB optimizer selects the most appropriate execution plan based on the nature of the operation and the current data distribution. For large-scale row operations, even if TiFlash is specified, the optimizer may consider using RowStore (TiKV) to be more efficient.
-
Transaction Consistency and Real-time Performance: There is a slight delay in data synchronization between TiFlash and TiKV. For INSERT ... SELECT
operations that require real-time data consistency, using TiKV can ensure the transaction’s isolation level and consistent view.
Solutions and Recommendations
If you indeed need to leverage TiFlash’s columnar storage advantages through the INSERT ... SELECT
operation, here are a few possible solutions:
-
Analytical Query Optimization: If your goal is to optimize analytical queries, consider first INSERT
ing the data to be analyzed into a temporary table, then perform analytical queries on that temporary table, explicitly using TiFlash.
-
Adjust Data Model or Query: Evaluate whether you can avoid relying on INSERT ... SELECT
by adjusting the data model or restructuring the query logic, which may help better utilize TiFlash.
-
Contact Support: If you encounter performance issues or have specific use cases requiring TiFlash support for INSERT ... SELECT
, it is recommended to contact PingCAP’s technical support or community for assistance to understand if there are any latest optimization solutions or configuration adjustment suggestions.
In summary, the reason INSERT ... SELECT
does not use TiFlash is due to the decisions made by the internal TiDB optimizer and the design positioning of TiFlash. In scenarios where you need to optimize query performance using TiFlash, you may need to adjust the data processing logic or query method to achieve this.