Does the SELECT in an INSERT statement not use TiFlash?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: INSERT语句的select中不会走tiflash么?

| username: Kongdom

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.3
[Reproduction Path] None
[Encountered Problem: Problem Phenomenon and Impact]
During optimization today, I found that the select in insert does not use the TiFlash component? Even with a hint, it doesn’t work. I have been using version 4.0, and after upgrading to a higher version, I didn’t pay attention to it.
I searched the community and found that this is not supported in higher versions? Can anyone using a higher version verify this?
Does the official team plan to support this in new versions? It seems the issue has been stagnant for four years. :joy:

| username: 源de爸 | Original post link

We also have this requirement, where the analysis results are directly written to another table. If TiFlash cannot be used, it is indeed quite unfortunate.

| username: DBAER | Original post link

Inserting into select with different MySQL isolation levels requires locking, which TiFlash cannot achieve. Moreover, TiFlash might fail relatively often, making it difficult to ensure linear consistency.

| username: zhanggame1 | Original post link

TiFlash feels inherently less reliable than TiKV, and the risks outweigh the benefits.

| username: Kongdom | Original post link

:flushed: I don’t understand, does this have anything to do with locking? Whether I query directly or query and save to another table, it feels like the requirements for the source data are the same.

| username: Kongdom | Original post link

:joy: Is it also risky if I query and display directly through TiFlash? It doesn’t seem like it should be.

| username: 托马斯滑板鞋 | Original post link

It should refer to transaction locking, right?
Actually, the official version can support insert into select import without transactions (supporting TiFlash), which is faster.

| username: Kongdom | Original post link

Oh, I see what you mean. Indeed, it’s difficult to control transactions here. But it’s definitely impossible to support insert without transactions. :thinking:

| username: Soysauce520 | Original post link

You can enable this parameter.

| username: Soysauce520 | Original post link

The official documentation explains this point: 使用 TiDB 读取 TiFlash | PingCAP 文档中心

| username: Kongdom | Original post link

Alright, it is already supported. Thanks to @WalterWj for the guidance.
Although v6.5 supports it, it is an experimental feature, so it is disabled by default.

| username: Kongdom | Original post link

:+1: :+1: :+1: However, in v6.5 it is still an experimental feature, so it is not enabled.

| username: Kongdom | Original post link

Final summary statement.

Supported starting from v6.5.0, but as an experimental feature,

Officially GA in version v7.1.0

| username: xiaoqiao | Original post link

Force to go down.

| username: 呢莫不爱吃鱼 | Original post link

It is possible to force the use of TiFlash.

| username: Kongdom | Original post link

I tried it, but it doesn’t work even with force.

| username: Kongdom | Original post link

Forcing it won’t work either :yum:

| username: Hacker_QGgM2nks | Original post link

In TiDB, the behavior of the INSERT ... SELECT statement does have some peculiarities, especially when it involves the TiFlash component. TiFlash is TiDB’s columnar storage engine, primarily used to accelerate analytical queries. For mixed OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) workloads, TiDB provides an efficient solution through TiFlash.

INSERT ... SELECT and TiFlash

By design, the INSERT ... SELECT operation is mainly aimed at transactional data modification and data migration scenarios. This operation typically involves a large amount of row-level data reading and writing. TiFlash, on the other hand, is optimized for columnar storage analytical queries, particularly those involving a large number of column calculations but fewer rows.

Therefore, even if you use a query hint like /*+ READ_FROM_STORAGE(TIFLASH[table_name]) */ to instruct the TiDB optimizer to use TiFlash, operations like INSERT ... SELECT may still not choose TiFlash for execution. This is primarily because:

  1. Execution Plan Selection: The TiDB optimizer selects the most appropriate execution plan based on the nature of the operation and the current data distribution. For large-scale row operations, even if TiFlash is specified, the optimizer may consider using RowStore (TiKV) to be more efficient.

  2. Transaction Consistency and Real-time Performance: There is a slight delay in data synchronization between TiFlash and TiKV. For INSERT ... SELECT operations that require real-time data consistency, using TiKV can ensure the transaction’s isolation level and consistent view.

Solutions and Recommendations

If you indeed need to leverage TiFlash’s columnar storage advantages through the INSERT ... SELECT operation, here are a few possible solutions:

  • Analytical Query Optimization: If your goal is to optimize analytical queries, consider first INSERTing the data to be analyzed into a temporary table, then perform analytical queries on that temporary table, explicitly using TiFlash.

  • Adjust Data Model or Query: Evaluate whether you can avoid relying on INSERT ... SELECT by adjusting the data model or restructuring the query logic, which may help better utilize TiFlash.

  • Contact Support: If you encounter performance issues or have specific use cases requiring TiFlash support for INSERT ... SELECT, it is recommended to contact PingCAP’s technical support or community for assistance to understand if there are any latest optimization solutions or configuration adjustment suggestions.

In summary, the reason INSERT ... SELECT does not use TiFlash is due to the decisions made by the internal TiDB optimizer and the design positioning of TiFlash. In scenarios where you need to optimize query performance using TiFlash, you may need to adjust the data processing logic or query method to achieve this.

| username: Kongdom | Original post link

:+1: GTP is still okay

| username: Hacker_PtIIxHC1 | Original post link

:+1: