In what scenarios is it appropriate to enable TiFlash, and what should be noted?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 什么场景下开启tiflash比较合适,需要注意什么?

| username: TIDB-Learner

In what scenarios is it more appropriate to enable TiFlash, and what should be noted?

| username: xiaoqiao | Original post link

Large, wide tables

| username: zhanggame1 | Original post link

For tables with large amounts of data that require OLAP operations, such as applying aggregate functions to some of the columns.
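Since TiFlash replicas are enabled per table, a minimal sketch of what "enabling it" for one such table could look like is below. The `ALTER TABLE ... SET TIFLASH REPLICA` statement and the `information_schema.tiflash_replica` table come from TiDB; the helper names and the `sales.orders` table are made up for illustration, and a replica count of 1 is only an assumption.

```python
def quote_ident(name: str) -> str:
    """Quote a MySQL/TiDB identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def set_tiflash_replica_sql(schema: str, table: str, replicas: int = 1) -> str:
    """Build the DDL that asks TiDB to keep `replicas` TiFlash copies of one table.

    A count of 0 removes the TiFlash replica again.
    """
    if replicas < 0:
        raise ValueError("replica count must be >= 0")
    return (
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"SET TIFLASH REPLICA {replicas}"
    )


# Parameterized status query; pass (schema, table) as parameters to your
# MySQL-compatible driver. AVAILABLE and PROGRESS show whether replication is done.
REPLICA_STATUS_SQL = (
    "SELECT AVAILABLE, PROGRESS FROM information_schema.tiflash_replica "
    "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
)

# Hypothetical table name, for illustration only.
print(set_tiflash_replica_sql("sales", "orders"))
```

The statements are only built here, not executed, so you can review them before running them against a cluster.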

| username: ShawnYan | Original post link

You can first take a look at this collection post:

| username: DBAER | Original post link

AP scenarios can actually leverage ClickHouse.

| username: 小龙虾爱大龙虾 | Original post link

When TiKV can no longer handle the workload.

| username: Jellybean | Original post link

Analytical computing tasks involving large amounts of data, especially complex AP scenarios such as multi-table join computations.

| username: YuchongXU | Original post link

AP computation

| username: 随缘天空 | Original post link

OLAP scenarios, such as statistical analysis and report display.

| username: ShawnYan | Original post link

That is because TiFlash was originally developed on the basis of ClickHouse, which is no secret. You can still see parts of ClickHouse in the TiFlash source code commit history.

| username: 有猫万事足 | Original post link

Simply put, if your SQL contains a GROUP BY clause and takes a particularly long time to compute, you should consider using TiFlash. If you can read execution plans, check the plan nodes for aggregation operators such as HashAgg and StreamAgg.

The tables do not have to be large or wide, but if your scenario involves multi-table joins and aggregation computations, be sure to use TiFlash with MPP. Only in MPP mode can hash joins be pushed down to TiFlash for execution.
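To make the plan check above concrete, here is a rough sketch that scans the `id` and `task` columns of TiDB `EXPLAIN` output for aggregation operators and reports whether any operator runs as an `mpp[tiflash]` task. The `PlanRow` type, the helper names, and the `orders`/`customers` query are hypothetical, and the sample rows are hand-written for illustration, not real plan output.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Statements you might run before inspecting the plan (illustrative query).
MPP_SESSION_SQL = ["SET @@session.tidb_allow_mpp = 1"]
EXAMPLE_EXPLAIN_SQL = (
    "EXPLAIN SELECT c.region, SUM(o.amount) "
    "FROM orders o JOIN customers c ON o.customer_id = c.id "
    "GROUP BY c.region"
)


class PlanRow(NamedTuple):
    op_id: str  # EXPLAIN "id" column, e.g. "└─HashAgg_8"
    task: str   # EXPLAIN "task" column, e.g. "root", "cop[tikv]", "mpp[tiflash]"


AGG_PREFIXES = ("HashAgg", "StreamAgg")
TREE_CHARS = " │├└─"  # indentation characters TiDB draws in front of operator ids


def aggregation_operators(rows: Iterable[PlanRow]) -> List[Tuple[str, str]]:
    """Return (operator id, task) for every HashAgg/StreamAgg node in the plan."""
    result = []
    for row in rows:
        name = row.op_id.lstrip(TREE_CHARS)
        if name.startswith(AGG_PREFIXES):
            result.append((name, row.task))
    return result


def uses_mpp(rows: Iterable[PlanRow]) -> bool:
    """True if at least one operator is scheduled as an MPP task on TiFlash."""
    return any(row.task == "mpp[tiflash]" for row in rows)


# Hand-written sample rows (not real output): aggregation done on TiKV and root.
sample = [
    PlanRow("HashAgg_20", "root"),
    PlanRow("└─TableReader_22", "root"),
    PlanRow("  └─HashAgg_8", "cop[tikv]"),
    PlanRow("    └─TableFullScan_19", "cop[tikv]"),
]
print(aggregation_operators(sample), uses_mpp(sample))
```

If a slow GROUP BY query shows its aggregation only on `root` or `cop[tikv]`, that is a hint it might be a candidate for TiFlash; whether it actually gets faster still has to be measured.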

| username: 呢莫不爱吃鱼 | Original post link

Wide tables that were originally on ClickHouse can all be moved to TiDB.

| username: TiDBer_QYr0vohO | Original post link

Large tables that need OLAP.

| username: Blylei | Original post link

It is more appropriate to enable it when there is an OLAP requirement. Enabling it in an OLTP scenario feels like a bit of a waste of resources.

| username: TiDBer_JUi6UvZm | Original post link

If you are already using TiDB and have AP requirements, then just use TiFlash. Be sure to pay attention to resource isolation so that it does not affect the neighboring TiKV nodes. Also, don’t saturate the network, since network bandwidth is shared.

| username: Hacker_QGgM2nks | Original post link

Actually, it’s just large SQL, AP computation, and complex queries.

| username: Hacker_PtIIxHC1 | Original post link

AP requests (i.e., large SQL queries for statistical analysis)

| username: 友利奈绪 | Original post link

Analytical SQL that is not part of the business workload.

| username: Swan | Original post link

The concept is great, but it’s not that convenient in practice.

| username: TiDBer_fbU009vH | Original post link

When the data volume is large and the query involves column-level operations such as aggregation or sorting, columnar scanning is used by default.
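If you want to check that automatic choice rather than rely on it, one option is to pin the storage engine with TiDB's `READ_FROM_STORAGE` optimizer hint and compare the two runs. A small sketch, assuming a hypothetical `orders` table aliased as `o`; the helper names are made up:

```python
def read_from_storage_hint(engine: str, table_alias: str) -> str:
    """Build a READ_FROM_STORAGE hint that pins one table to TiFlash or TiKV."""
    engine = engine.upper()
    if engine not in ("TIFLASH", "TIKV"):
        raise ValueError("engine must be 'tiflash' or 'tikv'")
    return f"/*+ READ_FROM_STORAGE({engine}[{table_alias}]) */"


def select_with_engine(select_body: str, engine: str, table_alias: str) -> str:
    """Prefix the text that follows SELECT with the engine hint."""
    return f"SELECT {read_from_storage_hint(engine, table_alias)} {select_body}"


# Hypothetical query body; run both variants (for example under EXPLAIN ANALYZE)
# and compare the timings yourself.
body = "o.status, COUNT(*) FROM orders o GROUP BY o.status ORDER BY o.status"
for engine in ("tiflash", "tikv"):
    print(select_with_engine(body, engine, "o"))
```

The TiFlash variant only works once the table has an available TiFlash replica.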