Questions about the Principles of TiDB Index Join

translator_bot · June 23, 2024, 1:46am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb index join 原理疑问

| username: Raymond

May I ask why TiDB builds a hash table from the inner table's rows during an index join? I had always assumed that for a query such as a join b on a.id=b.id (with a as the outer table), rows are read from table a and then joined one by one against table b using the join condition. Why is a hash table built on table b?

| username: xfworld

If you think of TiDB as a single-node database, where all the data sits on one node, this description will not make sense.

Now assume the data is distributed across multiple nodes. The inner worker has to fetch data from N nodes and return it. Once that data comes back, how is it matched with the outer worker's data? The simplest structure is a hash table, which matches rows by key.

In addition, the returned data has to be held temporarily and can only be released once the computation against the outer worker's data is complete…

| username: forever

You should understand once you read this sentence.

| username: 人如其名

In a traditional database, each outer-table row is matched against the inner table one row at a time, by looking up the related records in the cache. TiDB, however, has a distributed architecture that separates storage from computation. Querying row by row would mean a TiKV access for every row, and therefore a great deal of network traffic. So TiDB makes some optimizations:

Instead of processing row by row, it processes in batches, moving from Row to Chunk. Leaving parallelism aside, the outer side first reads a chunk of data (the chunk size grows gradually), then applies filtering, deduplication, sorting, and other operations to form a set of keyRanges. These are handed to the TiDB backend, which packages them into cop_task tasks and sends them to TiKV for execution. But once TiKV has returned all the inner-table data for an outer chunk, nothing in that data says which outer row in the chunk each record matches. A hash table is therefore used to organize the data, so that the outer chunk can be matched against the rows fetched from the inner table.

In short: a traditional database does not need a hash table, because the outer table is processed one row at a time and matched directly. TiDB organizes data in chunks, so it needs something similar to a hash join on a small table.
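To make the batching idea concrete, here is a minimal Python sketch, not TiDB's actual code. `INNER_TABLE`, `fetch_inner_rows`, and `index_join_chunk` are hypothetical names invented for illustration, and the "TiKV request" is simulated with an in-memory list. It shows why the fetched inner rows need to be organized by join key before the outer chunk can be matched against them.

```python
from collections import defaultdict

# Hypothetical stand-in for the inner table stored in TiKV.
INNER_TABLE = [
    {"id": 1, "v": "b1"},
    {"id": 2, "v": "b2"},
    {"id": 2, "v": "b2x"},
    {"id": 4, "v": "b4"},
]


def fetch_inner_rows(keys):
    """Simulate one batched request for all inner rows whose id is in keys.

    The rows come back in storage order and carry no pointer to the
    outer rows that asked for them.
    """
    wanted = set(keys)
    return [row for row in INNER_TABLE if row["id"] in wanted]


def index_join_chunk(outer_chunk):
    """Join one chunk of outer rows against the inner table on id."""
    # 1. Deduplicate and sort the outer join keys to form the lookup set.
    keys = sorted({row["id"] for row in outer_chunk})
    # 2. One batched fetch instead of one request per outer row.
    inner_rows = fetch_inner_rows(keys)
    # 3. Organize the fetched inner rows in a hash table keyed by join key.
    table = defaultdict(list)
    for inner in inner_rows:
        table[inner["id"]].append(inner)
    # 4. Probe with each outer row, so the output follows the outer order.
    result = []
    for outer in outer_chunk:
        for inner in table.get(outer["id"], []):
            result.append((outer["name"], inner["v"]))
    return result


outer_chunk = [
    {"id": 2, "name": "a2"},
    {"id": 1, "name": "a1"},
    {"id": 2, "name": "a2dup"},
    {"id": 3, "name": "a3"},
]
joined = index_join_chunk(outer_chunk)
```

Without step 3, every outer row would have to rescan the whole batch of returned inner rows to find its matches; the hash table turns that scan into a key lookup.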

| username: Raymond

Regarding "However, after TiKV obtains all the data corresponding to a chunk of the outer table, this data cannot determine which records correspond to this chunk of the outer table": can't this be determined from the join conditions in the SQL statement?

| username: h5n1

github.com/pingcap/tidb

Refine Index Join

opened 09:50AM - 27 Nov 18 UTC

closed 12:00PM - 20 Nov 19 UTC

zz-jason

type/enhancement sig/execution

## Feature Request

At present, the Index Join implementation is not efficient… at some scenarios:

1. It may cause TiDB OOM because it uses the inner table to construct the hash table
2. It can not response to the parent in a short period, because it has to wait to all the inner rows matched the outer join key to be fetched out from TiKV and have build hash table on it, and do the join operation on the main thread.
3. The execution is not efficient, because all the join work are performed in the main thread, the outer and inner workers are only responsible for fetching data from TiKV

**Describe the feature you'd like:**

Split Index Join into two operators:

1. One for keep order. In this operator, the output of the Index Join should be ordered by the outer join key. We can do a Merge Join on a task
2. One for no need to keep order. In this operator, the output of the Index Join can have arbitrary order. In order to limit the memory consumption, we can use the outer rows inside a task to build the hash table and do hash join on the fetched inner rows, return a Chunk as soon as possible.

**Describe alternatives you've considered:**

No

**Teachability, Documentation, Adoption, Migration Strategy:**

After discussing offline, [@yu34po](https://github.com/yu34po) will work on this issue.

github.com/pingcap/tidb

executor: support index nested loop hash join

pingcap:master ← yu34po:indexjoin

opened 06:25AM - 12 Dec 18 UTC

yu34po

+477 -7

### What problem does this PR solve?

https://github.com/pingcap/tidb/issues/8470

### What is changed and how it works?

We'll split index_lookup_join into index_hash_join and index_merge_join. This PR only adds the implementation for **index_hash_join**. If we do not need to keep the result order as the outer plan, we'll build an index_hash_join executor. Otherwise, we use the old index_lookup_join executor.

IndexNestedLoopHashJoin employs one outer worker and N inner workers to execute concurrently. The output order is not promised. The execution flow is very similar to IndexLookUpReader:

1. The outer worker reads N outer rows, builds a task and sends it to the inner worker channel.
2. The inner worker receives the tasks and does 3 things for every task:
   1. builds hash table from the outer rows
   2. builds key ranges from outer rows and fetches inner rows
   3. probes the hash table and sends the join result to the main thread channel.

   Note: `step i` and `step ii` runs concurrently.
3. The main thread receives the join results.

### Check List

Tests

- Existing test case.

Code changes

- Has exported function/method change
- Has exported variable/fields change
- Has interface methods change

Side effects

1. The current implementation of tryToMatch uses an inner iterator and one outer row, this commit will cause performance reduction.
2. After adding a new joiner function trytoMatch(outerIterator, innerRow)/onMissMatch(https://github.com/pingcap/tidb/pull/9286), performance can be recovered.
3. After https://github.com/pingcap/tidb/pull/11832 is merged, the hashtable can be optimized further. **[DONE]**
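The flow described in this PR can be sketched roughly as follows. This is an illustrative Python model only, not the Go executor: `INNER_TABLE`, `fetch_inner_rows`, `run_task`, `index_hash_join`, and the `task_size` and `workers` parameters are all hypothetical, a thread pool stands in for the inner workers, and the sub-steps that the PR runs concurrently are run one after another here for simplicity.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inner table as (join_key, value) pairs in index order.
INNER_TABLE = [(1, "b1"), (2, "b2"), (2, "b2x"), (4, "b4"), (5, "b5")]


def fetch_inner_rows(keys):
    """Simulate fetching the inner rows that match a set of join keys."""
    wanted = set(keys)
    return [row for row in INNER_TABLE if row[0] in wanted]


def run_task(outer_rows):
    """What one inner worker does for one task of outer rows."""
    # i. Build the hash table from the OUTER rows of this task only,
    #    so memory is bounded by the task size, not by the inner table.
    table = defaultdict(list)
    for key, name in outer_rows:
        table[key].append(name)
    # ii. Build the lookup keys from the outer rows and fetch inner rows.
    #     (Sequential here; the PR runs i and ii concurrently.)
    inner_rows = fetch_inner_rows(sorted(table))
    # iii. Probe the hash table with each fetched inner row.
    return [(name, value) for key, value in inner_rows for name in table[key]]


def index_hash_join(outer_rows, task_size=2, workers=2):
    """Split the outer rows into tasks and join them on a pool of workers."""
    tasks = [
        outer_rows[i:i + task_size]
        for i in range(0, len(outer_rows), task_size)
    ]
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(run_task, tasks):
            results.extend(part)
    return results


outer_rows = [(2, "a2"), (1, "a1"), (5, "a5"), (3, "a3")]
hash_joined = index_hash_join(outer_rows)
```

Within a task, the results follow the order of the fetched inner rows rather than the outer rows, which is one reason this variant cannot promise the outer order.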

github.com/pingcap/tidb

executor: support index_look_up_merge_join to speed up index_look_up_join

pingcap:master ← lzmhhh123:dev/index_look_up_merge_join

opened 02:49AM - 06 Mar 19 UTC

lzmhhh123

+887 -42

### What problem does this PR solve?

Support index_look_up_merge_join for index_look_up_join, reference issue #8470.

### What is changed and how it works?

In the process of index_look_up_join execution, if the inner table is `PhysicalIndexScan` and the join keys of the inner table are the suffix of the index, then we can choose index_look_up_merge_join.

**Some details**:

1. Because the merge tasks returned from inner workers are unsorted, the results of index_look_up_merge_join can't keep the order of the outer table by join keys.
2. The time complexity of index_look_up_merge_join are same as index_look_up_join. But it eliminates the process of the hash map. So in theory, it will speed up.

### Check List

Tests

- Unit test
- Integration test

Code changes

- Has exported function/method change
- Has exported variable/fields change

Side effects

- Increased code complexity

Related changes

- Need to cherry-pick to the release branch
- Need to be included in the release note
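For comparison, a merge-based variant needs no hash map at all when both sides of a task are ordered by the join key. The sketch below is a generic two-pointer merge join in Python under that assumption. `merge_join`, `outer_task`, and `inner_rows` are hypothetical names, and the code is not taken from the TiDB executor.

```python
def merge_join(outer_sorted, inner_sorted):
    """Join two lists of (key, payload) pairs that are both sorted by key."""
    result = []
    i = j = 0
    while i < len(outer_sorted) and j < len(inner_sorted):
        outer_key = outer_sorted[i][0]
        inner_key = inner_sorted[j][0]
        if outer_key < inner_key:
            i += 1  # No inner row for this outer key.
        elif outer_key > inner_key:
            j += 1  # No outer row for this inner key.
        else:
            # Find the run of inner rows that share this key.
            j_end = j
            while j_end < len(inner_sorted) and inner_sorted[j_end][0] == outer_key:
                j_end += 1
            # Pair every outer row with this key against that run.
            while i < len(outer_sorted) and outer_sorted[i][0] == outer_key:
                for _, inner_value in inner_sorted[j:j_end]:
                    result.append((outer_sorted[i][1], inner_value))
                i += 1
            j = j_end
    return result


# The outer rows of one task are sorted by join key first; the inner rows
# are assumed to arrive already ordered because they come from an index scan.
outer_task = sorted([(2, "a2"), (1, "a1"), (2, "a2dup"), (3, "a3")])
inner_rows = [(1, "b1"), (2, "b2"), (2, "b2x"), (4, "b4")]
merged = merge_join(outer_task, inner_rows)
```

Each side is walked once (apart from re-reading runs of duplicate keys), so the matching cost is similar to probing a hash table, but no hash map has to be built or held in memory.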

| username: 人如其名

Isn't building the hash table from the outer table what index hash join does?

| username: h5n1

It is index hash join.

| username: 人如其名

Matching rows on the join condition is exactly what happens inside the hash table lookup.