Questions about Multi-Table Queries in TiDB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb多表查询的疑问

| username: lijin

distSQL is responsible for executing complex queries. The official course says that for multi-table queries, due to incomplete data, the corresponding data from each table needs to be loaded into memory for querying. I don’t think that’s the case. Single-table queries can also have table data scattered across various TiKV nodes. Since single-table queries can achieve operator pushdown, why can’t multi-table queries do the same? It’s just that there are more simple queries to convert, right?

| username: Miracle | Original post link

When multiple tables are joined, operator pushdown is also possible. This depends on whether the conditions are met, allowing some data to be filtered at the TiKV layer. However, the join operation will be performed in TiDB. You can write two joins and check the execution plan to see this.

| username: 春风十里 | Original post link

Multi-table joins cannot be pushed down to TiKV for execution, but the conditions corresponding to individual tables in multi-table queries can be pushed down to TiKV. This reduces the network traffic from TiKV to the TiDB server and the memory usage of the TiDB server.

| username: lijin | Original post link

Okay, thank you.

| username: 江湖故人 | Original post link

Mr. Li used to use Gauss, right? :smile:

| username: zhanggame1 | Original post link

The join operation in TiDB consumes a lot of memory, and if there is not enough memory, it will result in an OOM (Out of Memory) error.

| username: lijin | Original post link

What do you mean by that? :smirk:

| username: lijin | Original post link

Got it, thank you!

| username: TIDB-Learner | Original post link

The general optimization suggestion of having the smaller table drive the larger table in join connections is also effective in TiDB, right?

| username: Jellybean | Original post link

It is effective. The internal optimizer will choose the appropriate driving table (build first, then probe) for the join. For more details, you can refer to: 用 EXPLAIN 查看 JOIN 查询的执行计划 | PingCAP 文档中心

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.