The same SQL executed twice yields inconsistent results, with the two outcomes alternating

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 同一个sql执行两次结果不一致,两个结果来回变化

| username: beebol

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
Executing SQL:

[Encountered Issue: Issue Phenomenon and Impact]
The data volume returned by two queries is inconsistent.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

The table is added to TiFlash
root 11:05: [information_schema]> select * from TIFLASH_REPLICA;
±----------------±--------------------±---------±--------------±----------------±----------±---------+
| TABLE_SCHEMA | TABLE_NAME | TABLE_ID | REPLICA_COUNT | LOCATION_LABELS | AVAILABLE | PROGRESS |
±----------------±--------------------±---------±--------------±----------------±----------±---------+
| ad_common_datas | report_simple_day | 4056 | 2 | | 1 | 1 |

| username: tidb菜鸟一只 | Original post link

Check the execution plan of the SQL to see if it is using TiFlash.
Also, I noticed that you set 2 TiFlash replicas for the table, but only 1 is available?

| username: beebol | Original post link

It went through TiFlash.

| username: xfworld | Original post link

Try adding an ORDER BY.

| username: tidb菜鸟一只 | Original post link

You set 2 TiFlash replicas for the table, but only 1 is available? How many TiFlash nodes do you have?

| username: beebol | Original post link

Adding order by is the same.

| username: beebol | Original post link

3 TiFlash nodes

| username: tidb菜鸟一只 | Original post link

ALTER TABLE a SET tiflash replica 1;—Set the table’s replica to 1 and try, there should be an issue with one of the replicas.

| username: xfworld | Original post link

  1. Is there still business data being appended?
  2. Is the data still being synchronized? Has the synchronization not been completed?

So the data keeps changing…
The data for slot 200006 and 200026 hasn’t changed.

| username: beebol | Original post link

There is no additional data because it is querying historical time. However, there was a delete operation a few days ago, and based on the data volume, it is possible to find the data that was deleted previously.

| username: beebol | Original post link

This is not the number of available replicas, but rather to determine whether this TiFlash is available.

| username: Billmay表妹 | Original post link

Is it possible to see the specific number of actRows returned at each stage in the execution plan of the SQL when an inconsistency occurs in the dashboard?

| username: Billmay表妹 | Original post link

It could be due to data inconsistency among several TiFlash replicas. For example, after you performed a deletion, are there any replicas with inconsistent data?

| username: beebol | Original post link

How can this be checked and verified?

| username: beebol | Original post link

The first query uses TiKV as the storage engine, while the second query uses TiFlash. The execution time for the TiKV query is significantly longer (769.4ms) compared to the TiFlash query (54.3ms). This indicates that TiFlash provides better performance for this specific query.

| username: Billmay表妹 | Original post link

Did you find inconsistency with TiFlash both times? Or was it TiKV once and TiFlash the other time? You can check if the TiKV queries are consistent both times.

| username: Billmay表妹 | Original post link

If the same SQL statement produces inconsistent results when executed twice, it could be due to various reasons such as data inconsistency, transaction isolation levels, or hardware issues.

Here are some possible causes and solutions:

  1. Data Inconsistency: If the data in the table is modified between the two executions, the results may be inconsistent. You can try checking if other processes or transactions are modifying the data during execution.

  2. Transaction Isolation Level: If the transaction isolation level is set to “READ UNCOMMITTED” or “READ COMMITTED,” it may lead to inconsistent results. You can try setting the transaction isolation level to “REPEATABLE READ” or “SERIALIZABLE” to ensure consistency.

  3. Hardware Issues: If there are hardware issues such as disk I/O errors or network problems, it may lead to inconsistent results. You can try checking the hardware status and logs to see if there are any errors or warnings.

Additionally, you can use the “EXPLAIN” command to analyze the execution plan of the SQL statement and see if there are any differences between the two executions. This may help you identify the root cause of the inconsistency.

| username: weixiaobing | Original post link

I suggest checking the execution plan to see if TiFlash is being used. If TiFlash is being used, try forcing it to use TiKV and check again. If there are no issues with TiKV, it might be a known bug.

| username: xfworld | Original post link

Do you suspect that the data deleted in TiKV was not synchronized and released in TiFlash?

Well, considering the version you’re using, it’s not surprising if there are issues. :upside_down_face:

| username: beebol | Original post link

There is an inconsistency with the data checked in TiFlash, but there is no issue with TiKV. It should be a known bug. Thank you very much.