TiDB Query Fails with SQL Error [1105] [HY000]: other error for mpp stream

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb查询报错SQL 错误 [1105] [HY000]: other error for mpp stream

| username: TiDBer_rYOSh9JN

[TiDB Usage Environment] Production Environment
[TiDB Version] V6.5.0
[Reproduction Path] SELECT COUNT(*) FROM table;
[Encountered Problem: Phenomenon and Impact] The table has 70 million rows. After the HTAP feature was enabled, queries ran stably for half a month. Starting today, every query fails immediately with this error:
SQL Error [1105] [HY000]: other error for mpp stream: From MPPquery:444771302863798278:2,task: Poco::Exception. Code: 1000, e.code() = 2, e.displayText() = Exception: Receive TsoResponse failed, e.what() = Exception

The TiFlash log keeps printing this message:
get safe point failed: 9: mismatch cluster id, need 7272615872205770545 but got 0
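
To compare the two IDs in that message programmatically, here is a minimal Python sketch. It assumes the log line has exactly the shape quoted above; the function name `parse_cluster_id_mismatch` and the pattern are illustrative and not part of any TiFlash or PD tooling.

```python
import re

# Pattern for the message shape quoted in this post (an assumption about the format).
MISMATCH_RE = re.compile(
    r"mismatch cluster id, need (?P<need>\d+) but got (?P<got>\d+)"
)


def parse_cluster_id_mismatch(line):
    """Return (need, got) as integers, or None if the line does not match."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return int(match.group("need")), int(match.group("got"))


line = (
    "get safe point failed: 9: mismatch cluster id, "
    "need 7272615872205770545 but got 0"
)
result = parse_cluster_id_mismatch(line)
```

Read literally, `need` is the ID the responding side expects and `got` is the ID it received, so `got 0` suggests the request carried no cluster ID at all.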

I tried to re-establish HTAP by removing and re-adding the TiFlash replica:
ALTER TABLE d.table SET TIFLASH REPLICA 0;
ALTER TABLE d.table SET TIFLASH REPLICA 1;

Queries are now back to normal, but I don’t know the cause or whether it will happen again. Any help is appreciated.

| username: Billmay表妹

Take a look at this issue.

| username: TiDBer_rYOSh9JN

It doesn’t seem to be that issue. We are on V6.5.0, which should no longer have that configuration limitation.

| username: TiDBer_rYOSh9JN

The cluster experienced a power outage and restarted automatically. Could the outage and the subsequent restart have caused this issue?

| username: Jellybean

This error occurs because the cluster ID stored locally in TiKV or TiFlash does not match the cluster ID specified by PD.
When the storage node is initialized for the first time, it will obtain the cluster ID from PD and store it locally. The next time it starts, it will check whether the local cluster ID matches the PD cluster ID. If they do not match, an error will be reported and the process will exit.
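
The startup check described above can be sketched in Python as follows. This is a simplified model of the described behavior, not the actual TiKV or TiFlash code; `check_cluster_id` and `ClusterIdMismatch` are hypothetical names.

```python
class ClusterIdMismatch(Exception):
    """Raised when the locally stored cluster ID differs from the one PD reports."""


def check_cluster_id(local_id, pd_id):
    """Return the cluster ID the node should keep on disk.

    local_id is None on first initialization (nothing stored yet);
    pd_id is the cluster ID obtained from PD.
    """
    if local_id is None:
        # First start: adopt the ID from PD and persist it locally.
        return pd_id
    if local_id != pd_id:
        # Later start with a different ID: report an error and stop.
        raise ClusterIdMismatch(
            f"local cluster id {local_id} does not match PD cluster id {pd_id}"
        )
    return local_id
```

In this model, a node that was initialized against one cluster and then pointed at another would fail on its next start instead of silently joining.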

It is possible that the power outage you mentioned caused this anomaly.

| username: ti-tiger

The issue might be caused by an inconsistency in the cluster ID between TiFlash and PD. The cluster ID is randomly assigned during PD initialization and is used to identify different TiDB clusters. If the cluster IDs of TiFlash and PD do not match, TiFlash cannot obtain TSO from PD, leading to MPP query failures.

Possible reasons for cluster ID inconsistency include:

  • The TiFlash node was mistakenly added to another TiDB cluster, or the data on the TiFlash node was cleared without reinitialization.
  • The PD node was mistakenly added to another TiDB cluster, or the data on the PD node was cleared without reinitialization.
  • The cluster-id parameter in the configuration files of TiFlash or PD was incorrectly modified.

| username: 有猫万事足

The explanations above are quite plausible.

Moreover, your error message also contains:
Receive TsoResponse failed

It is likely that PD is in a state where it cannot even provide TSO.

| username: TiDBer_小阿飞

Experts, how do I fix this? Should I restart TiFlash?

| username: ti-tiger

  • Check the network status between the TiFlash node and other nodes to ensure smooth connectivity and low latency. You can use tools like ping or telnet to test network connectivity and latency.
  • Check whether the cluster-id parameter in the TiFlash node’s configuration file is consistent with the cluster-id of the PD node. You can use the pd-ctl tool to view the cluster-id of the PD node and compare it with the cluster-id in the TiFlash node’s configuration file. If they are inconsistent, you need to redeploy the TiFlash node and ensure the cluster-id matches the PD node.
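
One way to do that comparison, sketched in Python. It assumes the `cluster` command of pd-ctl prints a JSON object with an `id` field, and that the ID to compare against was obtained separately (for example, the `need` value in the TiFlash log message). The function names and the sample text are hypothetical.

```python
import json


def pd_cluster_id(pd_ctl_cluster_output):
    """Extract the cluster ID from JSON text assumed to come from `pd-ctl cluster`."""
    return int(json.loads(pd_ctl_cluster_output)["id"])


def cluster_ids_match(pd_ctl_cluster_output, other_id):
    """Compare PD's cluster ID with an ID obtained elsewhere."""
    return pd_cluster_id(pd_ctl_cluster_output) == int(other_id)


# Sample text shaped like the assumed pd-ctl output; the ID is the one from the log above.
sample = '{"id": 7272615872205770545, "max_peer_count": 3}'
same = cluster_ids_match(sample, "7272615872205770545")
different = cluster_ids_match(sample, 0)
```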

| username: TiDBer_rYOSh9JN

Re-establishing HTAP can fix it; it was probably caused by a power outage in the cluster. TiDB’s robustness in this area still has room for improvement.
ALTER TABLE d.table SET TIFLASH REPLICA 0;
ALTER TABLE d.table SET TIFLASH REPLICA 1;
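
For anyone scripting this workaround, the Python sketch below builds the two statements above plus a follow-up status query. It assumes the `information_schema.tiflash_replica` table with its `AVAILABLE` and `PROGRESS` columns, and a MySQL driver that uses `%s` placeholders; the helper names are hypothetical, and the code only builds SQL text without connecting to a cluster.

```python
def quote_ident(name):
    """Quote a MySQL/TiDB identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def rebuild_tiflash_replica_sql(schema, table, replicas=1):
    """Statements that drop and re-create the TiFlash replica of one table."""
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return [
        f"ALTER TABLE {target} SET TIFLASH REPLICA 0;",
        f"ALTER TABLE {target} SET TIFLASH REPLICA {int(replicas)};",
    ]


def replica_status_query(schema, table):
    """Parameterized query and its parameters for checking replica progress."""
    sql = (
        "SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS "
        "FROM information_schema.tiflash_replica "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
    )
    return sql, (schema, table)


statements = rebuild_tiflash_replica_sql("d", "table")
status_sql, status_params = replica_status_query("d", "table")
```

Running the status query after the second statement shows when the replica is usable again. Note that this workaround re-syncs the table's data to TiFlash and does not explain the underlying cluster ID mismatch.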