Indexes Repeatedly Fail After Cluster Migration and Upgrade

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 机群迁移升级后,索引反复失效

| username: xmlianfeng

[TiDB Usage Environment] Production Environment
[TiDB Version] 7.1.5
Background:
After migrating from 6.1.5 to 7.1.5, some queries experienced index failure. After re-analyzing, they returned to normal, but after a few hours, the issue reappeared. The table’s health check is 99. Is this a bug in the new version?

Before analysis
explain analyze

After analysis

| username: 像风一样的男子 | Original post link

I suggest you change the issue category to product defect, and there will be product and research personnel following up.

| username: zhaokede | Original post link

Indeed, I have never encountered index repeatedly becoming invalid.

| username: xmlianfeng | Original post link

Okay. This problem is quite annoying.

| username: lemonade010 | Original post link

Is it useful to bind the execution plan?

| username: WalterWj | Original post link

Bind the execution plan.

Try turning off the join reorder optimization: Join Reorder 算法简介 | PingCAP 文档中心

Turn it off at the session level and test it to see.

| username: h5n1 | Original post link

What operations are frequently performed on this table? What is the time range condition in the SQL? How does it change each time? You can capture the information next time it becomes invalid.

| username: tidb菜鸟一只 | Original post link

Is it an index on the time field?

| username: xmlianfeng | Original post link

Yes.

| username: 霸王龙的日常 | Original post link

Are the SQL conditions the same?

| username: tidb菜鸟一只 | Original post link

Indexes on time fields can be affected by constantly changing query conditions. For example, if each query searches for data within the last 15 minutes, and the table’s statistics are older than 15 minutes, the index might not be used. Therefore, it is recommended to specify SQL hints or bind execution plans.

| username: xmlianfeng | Original post link

With subqueries, there is no way to bind the execution plan.

| username: tidb菜鸟一只 | Original post link

Send the SQL.

| username: xmlianfeng | Original post link

select CoreVer, Chl, CoreVer2, Chl2, AppVer, Idfa, Mt, CreateTime, Id, Ip, LastLoginTime as RowUpdateTimestamp, RegCountry, UniqueCdReaderId 
from tidb_cdc_en.AccountInfo 
where mt = 4 
  and idfa <> '' 
  AND LastLoginTime > DATE_ADD(now(), INTERVAL -24 HOUR) 
  AND UniqueCdReaderId IN (
    select UniqueCdReaderId 
    FROM tidb_cdc_en.log_installreferrerlog 
    where createtime > DATE_ADD(now(), INTERVAL -4 HOUR) 
      AND createtime < DATE_ADD(now(), INTERVAL -10 MINUTE) 
      AND Mt = 4 
      AND UacStatus = 0
  );
| username: tidb菜鸟一只 | Original post link

Try changing it to this and see if it can use the index:

select /*+ USE_INDEX(t1, ix_LastLoginTime) */ CoreVer,
       Chl,
       CoreVer2,
       Chl2,
       AppVer,
       Idfa,
       Mt,
       CreateTime,
       Id,
       Ip,
       LastLoginTime as RowUpdateTimestamp,
       RegCountry,
       UniqueCdReaderId
  from tidb_cdc_en.AccountInfo t1
 where mt = 4
   and idfa <> ''
   AND LastLoginTime > DATE_ADD(now(), INTERVAL - 24 HOUR)
   AND UniqueCdReaderId IN
       (select UniqueCdReaderId
          FROM tidb_cdc_en.log_installreferrerlog
         where createtime > DATE_ADD(now(), INTERVAL - 4 HOUR)
           AND createtime < DATE_ADD(now(), INTERVAL - 10 MINUTE)
           AND Mt = 4
           AND UacStatus = 0);
| username: ziptoam | Original post link

TiDB has a query plan cache mechanism. If a suboptimal plan is chosen during the first execution of a query, this plan might be cached and reused. You can try adjusting the relevant parameters of the query plan cache, such as plan-cache-enabled and plan-cache-evict-ttl, or manually clear the cache when encountering performance issues (although this is usually not a long-term solution).