TiSpark throws "Call mpp isAlive fail with Exception" when connecting to TiFlash across data centers. Latency between the data centers is tens of milliseconds; there is no exception within the same data center

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tispark跨机房连tiflash抛异常, Call mpp isAlive fail with Exception机房之间有延迟几十ms,同机房无异常。

| username: TiDBer_w4puKrlI

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] No issues within the same data center; the problem occurs only when TiSpark connects across data centers
[Encountered Problem: Problem Phenomenon and Impact]
TiSpark throws an exception when connecting to TiFlash across data centers. The latency between the data centers is tens of milliseconds; connections within the same data center work normally.

Executor.java:1149) ~[?:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_152]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
2024-05-25 08:02:47 [WARN] [storeStatus-thread-0] com.pingcap.tikv.region.RegionStoreClient#1223 - Call mpp isAlive fail with Exception
shade.io.grpc.StatusRuntimeException: UNIMPLEMENTED
    at shade.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at shade.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at shade.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at com.pingcap.tikv.region.RegionStoreClient.isMppAlive(RegionStoreClient.java:1219) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at com.pingcap.tikv.TiSession.lambda$null$0(TiSession.java:254) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at java.util.concurrent.ConcurrentHashMap.replaceAll(ConcurrentHashMap.java:1610) ~[?:1.8.0_152]
    at com.pingcap.tikv.TiSession.lambda$getStoreStatusCache$1(TiSession.java:252) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_152]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_152]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_152]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_152]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
2024-05-25 08:02:47 [WARN] [storeStatus-thread-0] com.pingcap.tikv.region.RegionStoreClient#1223 - Call mpp isAlive fail with Exception
shade.io.grpc.StatusRuntimeException: UNIMPLEMENTED
    at shade.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:244) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at shade.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:225) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at shade.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:142) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at com.pingcap.tikv.region.RegionStoreClient.isMppAlive(RegionStoreClient.java:1219) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at com.pingcap.tikv.TiSession.lambda$null$0(TiSession.java:254) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at java.util.concurrent.ConcurrentHashMap.replaceAll(ConcurrentHashMap.java:1610) ~[?:1.8.0_152]
    at com.pingcap.tikv.TiSession.lambda$getStoreStatusCache$1(TiSession.java:252) ~[tispark-assembly-3.3_2.12-3.1.5.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_152]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_152]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_152]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_152]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_152]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_152]
2024-05-25 08:02:49 [INFO] [dispatcher-Executor] org.apache.spark.executor.YarnCoarseGrainedExecutorBackend#61 - Driver commanded a shutdown
2024-05-25 08:02:49 [INFO] [CoarseGrainedExecutorBackend-stop-executor] org.apache.spark.storage.memory.MemoryStore#61 - MemoryStore cleared
2024-05-25 08:02:49 [INFO] [CoarseGrainedExecutorBackend-stop-executor] org.apache.spark.storage.BlockManager#61 - BlockManager stopped
2024-05-25 08:02:49 [INFO] [shutdown-hook-0] org.apache.spark.util.ShutdownHookManager#61 - Shutdown hook called

| username: TiDBer_w4puKrlI

There are no issues with TiDB connecting to TiFlash across data centers.

| username: yytest

The error message suggests either a network communication problem or a server-side problem with the operation being attempted. Specifically, shade.io.grpc.StatusRuntimeException: UNIMPLEMENTED usually means that the gRPC method the client is calling is not implemented on the server side. This can be caused by a change in the server's API or by a configuration error.
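
To check which gRPC status codes actually appear in an executor log, a small script along the following lines may help. It is only a sketch: the function names, the hint table, and its wording are illustrative assumptions, not part of TiSpark or gRPC. The status names themselves (UNIMPLEMENTED, DEADLINE_EXCEEDED, UNAVAILABLE) are standard gRPC status codes. The point is that a call failing purely because of cross-data-center latency would more typically surface as DEADLINE_EXCEEDED or UNAVAILABLE, whereas UNIMPLEMENTED means the server replied and rejected the method.

```python
import re
from collections import Counter

# Hypothetical hint table: the keys are standard gRPC status names,
# the explanatory wording is an assumption made for this sketch.
HINTS = {
    "UNIMPLEMENTED": "server replied, but does not implement the called method",
    "DEADLINE_EXCEEDED": "call timed out, consistent with latency or a slow server",
    "UNAVAILABLE": "connection-level failure, consistent with a network problem",
}

# Matches the status name that follows "StatusRuntimeException: " in a log line.
STATUS_RE = re.compile(r"StatusRuntimeException: ([A-Z_]+)")


def count_grpc_statuses(log_text):
    """Count how often each gRPC status name appears in the log text."""
    return Counter(STATUS_RE.findall(log_text))


def explain(counts):
    """Return one human-readable line per status name, sorted by name."""
    return [
        f"{name} x{n}: {HINTS.get(name, 'no hint available')}"
        for name, n in sorted(counts.items())
    ]


# Two WARN entries taken from the log in this thread (stack frames omitted).
sample_log = """\
2024-05-25 08:02:47 [WARN] [storeStatus-thread-0] com.pingcap.tikv.region.RegionStoreClient#1223 - Call mpp isAlive fail with Exception
shade.io.grpc.StatusRuntimeException: UNIMPLEMENTED
2024-05-25 08:02:47 [WARN] [storeStatus-thread-0] com.pingcap.tikv.region.RegionStoreClient#1223 - Call mpp isAlive fail with Exception
shade.io.grpc.StatusRuntimeException: UNIMPLEMENTED
"""

for line in explain(count_grpc_statuses(sample_log)):
    print(line)
```

If only UNIMPLEMENTED shows up, as in the excerpt above, it seems worth comparing the server and TiSpark versions in the two data centers before attributing the warning to network delay.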

| username: 友利奈绪

Is it a configuration issue?