Has anyone encountered different execution plans for the same SQL statement on different tidb-servers within the same TiDB cluster?

translator_bot · June 21, 2024, 11:22pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 大家遇到过同一个Tidb集群中，不同的tidb-server，执行同一条sql语句，执行计划不一样

| username: TiDBer_oHSwKxOH

【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】
【Reproduction Path】What operations were performed that led to the issue
【Encountered Issue: Problem Phenomenon and Impact】
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots / Logs / Monitoring】

Has anyone encountered a situation where different tidb-servers in the same TiDB cluster execute the same SQL statement but have different execution plans?

translator_bot · June 21, 2024, 11:22pm

| username: tidb菜鸟一只 | Original post link

Could you provide the two execution plans? Did they go to TiKV and TiFlash respectively?

translator_bot · June 21, 2024, 11:22pm

| username: vcdog | Original post link

Post the execution plan immediately.

translator_bot · June 21, 2024, 11:22pm

| username: vcdog | Original post link

Connect to the TiDB server at 10.3.72.95 using the Navicat client and execute the same SQL to view the execution plan, as shown below:

image1373×793 45.3 KB
Connect to the TiDB server at 10.3.72.94 using the Navicat client and execute the same SQL to view the execution plan, as shown below:

image1380×798 54.2 KB

translator_bot · June 21, 2024, 11:22pm

| username: tidb菜鸟一只 | Original post link

Check if the configurations of the two TiDB servers are different. Did one configure TiFlash and the other didn’t?

translator_bot · June 21, 2024, 11:22pm

| username: vcdog | Original post link

The configuration is the same.

translator_bot · June 21, 2024, 11:22pm

| username: dba-kit | Original post link

show variables like ‘%tidb_isolation_read_engines%’;
Take a look.

translator_bot · June 21, 2024, 11:22pm

| username: tidb菜鸟一只 | Original post link

Then check this as well.

translator_bot · June 21, 2024, 11:22pm

| username: h5n1 | Original post link

Try restarting.
You can compare parameter differences through information_schema.cluster_config.
Query variables_info on each TiDB to compare differences.

translator_bot · June 21, 2024, 11:22pm

| username: h5n1 | Original post link

The cluster hasn’t been upgraded, right?

translator_bot · June 21, 2024, 11:22pm

| username: vcdog | Original post link

I haven’t upgraded before; I directly installed version v6.5.0.

translator_bot · June 21, 2024, 11:22pm

| username: vcdog | Original post link

The case has been closed. On the tidb-server at 10.3.72.95, I executed the following commands on five tables:

        ANALYZE table es_charge_incoming_fee;
        ANALYZE table es_charge_incoming_data;
        ANALYZE table es_charge_voucher_project_set;
        ANALYZE table es_charge_pay_mode;
        ANALYZE table es_info_object_and_owner;

After execution, I checked the two tidb-servers again, and this time, the execution plans were the same.

translator_bot · June 21, 2024, 11:22pm

| username: cy6301567 | Original post link

The deployment resources of the services are different, right? Check the health status of the table analysis loaded on the two TiDBs.

translator_bot · June 21, 2024, 11:22pm

| username: zhanggame1 | Original post link

In other words, the statistics cache of different TiDB servers varies.

translator_bot · June 21, 2024, 11:22pm

| username: tidb菜鸟一只 | Original post link

Isn’t the statistical information stored on PD, and is there a difference between the two TiDBs?
I can only suspect that the statistical information of the original table is outdated, and then TiDB used pseudo-random numbers to collect the statistical information when executing SQL, resulting in different random outcomes for the two TiDBs.

translator_bot · June 21, 2024, 11:22pm

| username: zhanggame1 | Original post link

Statistics and region information are stored in PD, but TiDB also needs to cache this data, otherwise the efficiency would be too low.

translator_bot · June 21, 2024, 11:22pm

| username: cassblanca | Original post link

There should be an execution plan cache, similar to Oracle’s hard parsing and software parsing. If statistics need to be cached on each node, then for a database with high load, frequent updates, and many nodes, the update mechanism for this cache would be difficult to design.

translator_bot · June 21, 2024, 11:22pm

| username: TiDBer_oHSwKxOH | Original post link

Execution plans rely on statistics. If the statistics are inaccurate, the execution plan will change.

translator_bot · June 21, 2024, 11:22pm

| username: ajin0514 | Original post link

It’s very strange.

translator_bot · June 21, 2024, 11:22pm

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.