If table t1 does not have a columnar replica, but table t2 does, is it supported to filter data from the columnar replica of t2 and then JOIN with row-based t1?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 如果一个表t1没有列存副本,另一个表t2有列存副本,支持从t2的列存副本过滤数据然后与行存 t1 JOIN 吗

| username: TiDBer_v5ifqMzA

If table t1 does not have a columnar replica, but table t2 does, is it supported to filter data from the columnar replica of t2 and then JOIN with row-based t1?

| username: WalterWj | Original post link

Mixing row and column storage is supported. You can check the relevant content on the TiFlash official website: 使用 TiFlash | PingCAP 归档文档站

| username: Running | Original post link

Yes, you can force it to use a specific engine or let the optimizer choose on its own.

| username: Kongdom | Original post link

Support this scenario.

| username: ljluestc | Original post link

In TiDB, you can filter data from the columnar storage replica of table t2 and then perform a JOIN with the row storage table t1. This scenario is supported and can be achieved using TiDB’s query capabilities.

Here is an example to illustrate this process:

Suppose you have two tables, t1 (row storage) and t2 (columnar storage).

You can filter data from the columnar storage replica of t2 using conditions or predicates. For example, you can use the WHERE clause to filter rows based on specific conditions.

Example: SELECT * FROM t2 WHERE column = ‘value’;

After filtering data from t2, you can perform a JOIN operation with the row storage table t1 using the appropriate JOIN condition.

Example: SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;

The query will join the filtered results from t2 with the corresponding rows from t1 based on the specified JOIN condition.

It is important to note that JOIN operations between row storage tables and columnar storage tables may affect performance due to differences in their storage formats. Columnar storage tables are optimized for analytical workloads and may not perform as well as traditional row-based operations.

Additionally, keep in mind that TiDB’s query optimizer will analyze the query and determine the most efficient execution plan based on statistics and other factors. It may choose to perform optimizations such as filter or projection pushdown, utilizing indexes, or other optimization techniques to improve performance.

Overall, TiDB supports filtering data from columnar storage replicas and then performing JOINs with row storage tables. However, it is recommended to carefully analyze the query execution plan, monitor performance, and consider the specific characteristics of your data and workload to ensure optimal query performance.

| username: redgame | Original post link

Support this kind of thing

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.