Why are some of our SQL queries not using TiFlash? What factors determine whether TiFlash is used or not?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 我们的有些SQL就是不走tiflash是怎么回事?走不走tiflash和啥有关系

| username: jaybing926

[TiDB Usage Environment] Production Environment
[TiDB Version]
tidb v5.4.3

[Encountered Problem: Phenomenon and Impact]
Recently, a new business went online, and the high query frequency caused the TiDB node CPU to reach 100%. Therefore, we want to route the queries to TiFlash. After adding the relevant table data to TiFlash, the queries still do not use TiFlash as shown by the desc command.
Attempts made:

  1. Setting session-level variables via command line did not work:
    set @@session.tidb_isolation_read_engines = "tiflash,tikv,tidb";
  2. Manually adding hint in SQL statements also did not work. The queries still do not use TiFlash.

So, I am very troubled and would like to ask how to make the queries use TiFlash?

| username: ShawnYan | Original post link

Have you confirmed that the TiFlash replica has been successfully created?

How about trying again with tidb_isolation_read_engines = tidb,tiflash?

| username: jaybing926 | Original post link

Yes, the three tables used in the query are all 1.

| username: jaybing926 | Original post link

Setting tidb_isolation_read_engines = tidb,tiflash works. It uses TiFlash now. But the query speed is still not as fast as KV. What’s going on, experts?

| username: Kongdom | Original post link

There is no hint writing in the screenshot, right?
Is it that desc doesn’t work, or explain analyze doesn’t work either?

| username: jaybing926 | Original post link

Yes, the hint wasn’t captured in the screenshot. Do these two methods have different effects? I usually only use desc to test this.

| username: 像风一样的男子 | Original post link

For high concurrency simple queries, TiFlash is not as efficient as KV.

| username: redgame | Original post link

The database’s decision to use TiFlash is poor.

| username: 啦啦啦啦啦 | Original post link

Is it because SQL BINDING is used, that its priority is higher than hint, or is it because the client version is low and --comments is not added?

| username: tidb菜鸟一只 | Original post link

What makes this SQL query suitable for TiFlash?

| username: Kongdom | Original post link

Try using a hint. It seems that regardless of whether TiFlash should be used, as long as there is a hint, it should follow the hint.

| username: TiDBer_QHSxuEa1 | Original post link

TiFlash is columnar storage, suitable for aggregation operations such as group by summation. This SQL is a row query, so TiDB choosing TiKV row storage is definitely more appropriate.

| username: zhanggame1 | Original post link

The query frequency is very high, not suitable for TiFlash. TiFlash has poor concurrency.

| username: Hacker007 | Original post link

SQL statements that are suitable for TiFlash will use it, but sometimes row-based TiKV is better.

| username: dba-kit | Original post link

Before version 7.X, TiFlash would scan all the data first and then perform calculations. Therefore, if your table is very large and the SQL query only needs a portion of the data, TiFlash won’t be able to accelerate the process and might even be slower than TiKV. This is why the optimizer prefers to use TiKV.

I suggest upgrading to version 7.1 and trying again; you should notice a significant performance improvement.

| username: dba-kit | Original post link

The optimization points are mentioned in the release notes, and the applicability of TiFlash is greater after the upgrade. However, caution is still needed in a production environment. If there are partitioned tables in the cluster, it is best to wait for version 7.1.2 before using it. You can experiment in a test environment first.

| username: 大飞哥online | Original post link

Your SQL involves a three-table left join without any aggregation or heavy computation, so there’s no need to use TiFlash. The optimizer has also determined that TiFlash would be slower than TiKV, so it chose to use TiKV instead.

| username: 昵称想不起来了 | Original post link

For simple row queries without the need for aggregation tasks, TiKV is more suitable in principle.

| username: ShawnYan | Original post link

Those two query conditions are point get, using TiKV is sufficient.