The TiDB optimizer did not rewrite the "or" condition in the join clause

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB优化器未对关联条件中“or”进行改写

| username: 人如其名

For the TPCH table, there is the following test statement (this statement has no meaning, only to examine or rewrite ability):
select count(a.S_NAME), sum(b.s_acctbal) from supplier a, supplier b where a.S_ADDRESS = b.S_ADDRESS or a.S_PHONE = b.S_PHONE;

Check its execution plan:

mysql> explain select count(a.S_NAME), sum(b.s_acctbal) from supplier a, supplier b where a.S_ADDRESS = b.S_ADDRESS or a.S_PHONE = b.S_PHONE;
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows        | task      | access object | operator info                                                                                                                                   |
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| StreamAgg_10                  | 1.00           | root      |               | funcs:count(tpch1.supplier.s_name)->Column#15, funcs:sum(tpch1.supplier.s_acctbal)->Column#16                                                   |
| └─HashJoin_17                 | 10000000000.00 | root      |               | CARTESIAN inner join, other cond:or(eq(tpch1.supplier.s_address, tpch1.supplier.s_address), eq(tpch1.supplier.s_phone, tpch1.supplier.s_phone)) |
|   ├─TableReader_16(Build)     | 100000.00      | root      |               | data:TableFullScan_15                                                                                                                           |
|   │ └─TableFullScan_15        | 100000.00      | cop[tikv] | table:b       | keep order:false                                                                                                                                |
|   └─TableReader_14(Probe)     | 100000.00      | root      |               | data:TableFullScan_13                                                                                                                           |
|     └─TableFullScan_13        | 100000.00      | cop[tikv] | table:a       | keep order:false                                                                                                                                |
+-------------------------------+----------------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)

mysql> explain analyze select a.l_partkey from lineitem a cross join lineitem b limit 10;
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
| id                            | estRows     | actRows  | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                      | operator info         | memory    | disk    |
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
| Limit_12                      | 10.00       | 10       | root      |               | time:2m46.8s, loops:2                                                                                                                                                                                                                                                                                                                                               | offset:0, count:10    | N/A       | N/A     |
| └─HashJoin_13                 | 10.00       | 1024     | root      |               | time:2m46.8s, loops:1, build_hash_table:{total:17.4s, fetch:10.5s, build:6.87s}, probe:{concurrency:3, total:8m23.6s, max:2m48.7s, probe:7m31.4s, fetch:52.1s}                                                                                                                                                                                                      | CARTESIAN inner join  | 1023.6 MB | 1.34 GB |
|   ├─TableReader_18(Build)     | 59986052.00 | 59986052 | root      |               | time:10.3s, loops:58813, cop_task: {num: 2197, max: 281.3ms, min: 1.1ms, avg: 31.4ms, p95: 68.4ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 56.1s, tot_wait: 9.09s, rpc_num: 2197, rpc_time: 1m9s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 5}                                                                                                 | data:TableFullScan_17 | 1.86 MB   | N/A     |
|   │ └─TableFullScan_17        | 59986052.00 | 59986052 | cop[tikv] | table:b       | tikv_task:{proc max:271ms, min:0s, avg: 25.6ms, p80:42ms, p95:58ms, iters:67264, tasks:2197}, scan_detail: {total_process_keys: 59986052, total_process_keys_size: 2159497872, total_keys: 59988249, get_snapshot_time: 204.3ms, rocksdb: {key_skipped_count: 59986052, block: {cache_hit_count: 17464, read_count: 185633, read_byte: 3.77 GB, read_time: 15.7s}}} | keep order:false      | N/A       | N/A     |
|   └─TableReader_16(Probe)     | 0.00        | 3072     | root      |               | time:10.2ms, loops:3, cop_task: {num: 9, max: 11.9ms, min: 956.6µs, avg: 6.62ms, p95: 11.9ms, max_proc_keys: 992, p95_proc_keys: 992, tot_proc: 22ms, tot_wait: 26ms, rpc_num: 9, rpc_time: 59.4ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 5}                                                                                                             | data:TableFullScan_15 | 48.4 KB   | N/A     |
|     └─TableFullScan_15        | 0.00        | 3552     | cop[tikv] | table:a       | tikv_task:{proc max:9ms, min:0s, avg: 2.78ms, p80:9ms, p95:9ms, iters:32, tasks:9}, scan_detail: {total_process_keys: 3552, total_process_keys_size: 697625, total_keys: 3561, get_snapshot_time: 18.2ms, rocksdb: {key_skipped_count: 3552, block: {cache_hit_count: 71, read_count: 14, read_byte: 750.3 KB, read_time: 10.4ms}}}                                 | keep order:false      | N/A       | N/A     |
+-------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------+-----------+---------+
6 rows in set (2 min 48.75 sec)

It can be seen that the use of hash join + internal cross join results in a very slow hash join process. If a union rewrite can be performed, the efficiency in this scenario would be much higher. Therefore, should cost-based execution plan rewriting be enhanced to optimize such issues?

| username: tidb菜鸟一只 | Original post link

I can only say that the optimizer of a database is really difficult. Even an established database like Oracle, which has been around for decades, hasn’t achieved perfection.

| username: h5n1 | Original post link

What others don’t have, I have. What others have, I excel in. This is the goal.

| username: 人如其名 | Original post link

As the saying goes, “What you think of will eventually happen.” Sure enough, we encountered a similar issue in our current production environment. There are quite a few of these associated fields containing OR conditions that were migrated from other established databases. When testing OB at the time of posting, its optimizer could handle the rewrite. Currently, we can only optimize through manual rewriting. The rewrite logic is roughly as follows:

select a.c1, b.c2 from a, b where a.c1 = b.c1 or a.c2 = b.c2

Rewritten as:

select a.pk, b.pk, a.c1, b.c2 from a, b where a.c1 = b.c1
union
select a.pk, b.pk, a.c1, b.c2 from a, b where a.c2 = b.c2

Here, adding a.pk and b.pk is mainly to prevent the union from removing duplicate data, which would violate the original semantics.

If the table does not have a primary key, you need to use the implicit _tidb_rowid instead (thanks to @小龙虾爱上大龙虾 for the assistance).

| username: zhaokede | Original post link

For this, you need to optimize it yourself. The compiler does not perform this kind of optimization. Oracle and MySQL are the same in this regard.

| username: 我是吉米哥 | Original post link

Replace “uion” with “union all” to avoid deduplication. In this case, is it possible not to add a primary key column?

| username: 小龙虾爱大龙虾 | Original post link

I remember this scenario is possible with Oracle :joy_cat: The Oracle optimizer is still :+1:

| username: 像风一样的男子 | Original post link

Is the index merge feature enabled?

| username: zhaokede | Original post link

Does index merging have an impact on this?

| username: zhaokede | Original post link

I found the information and saw it.
Oracle Database introduced more advanced query optimization techniques starting from version 8i, including support for Disjunctive Normal Form (DNF).
Many modern relational database management systems (RDBMS) implement Disjunctive Normal Form (DNF) optimization to improve query performance. Here are some database systems that have implemented DNF optimization:

  1. PostgreSQL - PostgreSQL’s query optimizer can recognize and optimize queries containing OR conditions, converting them into more efficient execution plans.
  2. Oracle Database - Oracle’s optimizer also supports converting complex OR conditions into multiple subqueries and then using UNION ALL to merge the results.
  3. Microsoft SQL Server - SQL Server’s query optimizer can also handle OR conditions by converting them into multiple subqueries to optimize the execution path.
  4. MySQL - MySQL’s InnoDB storage engine can optimize OR conditions in certain cases, especially when applicable indexes are available.
  5. IBM DB2 - DB2’s query optimizer also supports DNF optimization and can effectively handle complex queries containing OR logic.
  6. Google BigQuery - As a cloud-native data warehouse, BigQuery’s optimizer can also handle and optimize queries containing OR conditions.
  7. Amazon Redshift - Amazon Redshift’s query optimizer can recognize and optimize queries containing OR conditions, improving query efficiency.

These database systems, through the intelligent analysis of their internal optimizers, can automatically convert queries containing OR conditions into more efficient forms, such as using UNION ALL instead, thereby avoiding unnecessary repeated calculations and data scans, improving query speed and resource utilization. However, specific optimization strategies and effects may vary depending on the database version and configuration.

| username: TiDBer_QKDdYGfz | Original post link

Awesome, very helpful. Without comparison, there is no harm.

| username: Raymond | Original post link

I don’t know why a Cartesian product is formed after a hash join. This doesn’t seem to be necessary in Oracle.

| username: 时空旅行者 | Original post link

The ORACLE optimizer is also very slow when it includes OR statements, and you need to rewrite it using the union method.

| username: residentevil | Original post link

I can relate, haha.