Query engine configured only for TiFlash and TiDB, but not TiKV, results in query error

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 查询引擎只配置为tiflash、tidb,而不配置tikv,查询报错

| username: wenyi

[TiDB Usage Environment] Test
[TiDB Version] 6.5.2
set @@session.tidb_isolation_read_engines = 'tiflash,tidb';
[Resource Configuration]
3 PD + TiDB nodes, co-deployed: each node has a 32-core CPU, 128 GB memory, and a mechanical disk (i.e., each of these nodes runs both TiDB and PD)
3 TiKV nodes: each has a 32-core CPU, 128 GB memory, and an SSD
1 TiFlash node: 32-core CPU, 256 GB memory, and an SSD
The query returns an error.
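
The failing setup can be reproduced and sanity-checked with a few statements. This is a sketch; `test.t` is a hypothetical table name standing in for the table being queried:

```sql
-- Restrict this session to reading through TiFlash (plus TiDB's own
-- memory tables) only; TiKV is excluded as a fallback for the optimizer.
SET @@session.tidb_isolation_read_engines = 'tiflash,tidb';

-- Confirm the current setting.
SELECT @@session.tidb_isolation_read_engines;

-- With TiKV excluded, a table is only readable if it has an available
-- TiFlash replica; check the replica state for the table being queried.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't';
```

If `AVAILABLE` is 0 or `PROGRESS` is below 1, TiFlash cannot yet serve that table, which by itself can make TiFlash-only reads fail.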


Specific logs:
[2023/05/09 14:50:10.630 +08:00] [ERROR] [MPPTask.cpp:429] ["task running meets error: Code: 0, e.displayText() = DB::Exception: write to tunnel which is already closed, Receiver cancelled, reason: Push mpp packet failed. Receiver state: CLOSED, e.what() = DB::Exception, Stack trace:\n\n\n 0x17221ce\tDB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+24256974]\n \tdbms/src/Common/Exception.h:46\n 0x6e23f34\tDB::MPPTunnel::write(std::__1::shared_ptr<DB::TrackedMppDataPacket>&&) [tiflash+115490612]\n \tdbms/src/Flash/Mpp/MPPTunnel.cpp:168\n 0x16f5d6c\tDB::HashPartitionWriter<std::__1::shared_ptr<DB::MPPTunnelSet> >::partitionAndEncodeThenWriteBlocks() [tiflash+24075628]\n \tdbms/src/Flash/Mpp/HashPartitionWriter.cpp:103\n 0x6dc7042\tDB::ExchangeSenderBlockInputStream::readImpl() [tiflash+115109954]\n \tdbms/src/DataStreams/ExchangeSenderBlockInputStream.cpp:43\n 0x61609a5\tDB::IProfilingBlockInputStream::read(DB::PODArray<unsigned char, 4096ul, Allocator, 15ul, 16ul>&, bool) [tiflash+102107557]\n \tdbms/src/DataStreams/IProfilingBlockInputStream.cpp:75\n 0x6160695\tDB::IProfilingBlockInputStream::read() [tiflash+102106773]\n \tdbms/src/DataStreams/IProfilingBlockInputStream.cpp:43\n 0x6dcd95e\tDB::ParallelInputsProcessor<DB::UnionBlockInputStream<(DB::StreamUnionMode)0, true>::Handler, (DB::StreamUnionMode)0>::work(unsigned long, DB::ParallelInputsProcessor<DB::UnionBlockInputStream<(DB::StreamUnionMode)0, true>::Handler, (DB::StreamUnionMode)0>::WorkingInputs&) [tiflash+115136862]\n \tdbms/src/DataStreams/ParallelInputsProcessor.h:270\n 0x6dcd476\tstd::__1::__function::__func<DB::ParallelInputsProcessor<DB::UnionBlockInputStream<(DB::StreamUnionMode)0, true>::Handler, (DB::StreamUnionMode)0>::process()::'lambda'(), std::__1::allocator<DB::ParallelInputsProcessor<DB::UnionBlockInputStream<(DB::StreamUnionMode)0, true>::Handler, (DB::StreamUnionMode)0>::process()::'lambda'()>, void ()>::operator()() [tiflash+115135606]\n \t/usr/local/bin/…/include/c++/v1/__functional/function.h:345\n 0x1804c9b\tDB::ExecutableTask<std::__1::packaged_task<void ()> >::execute() [tiflash+25185435]\n \tdbms/src/Common/ExecutableTask.h:52\n 0x18081e3\tDB::DynamicThreadPool::executeTask(std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_delete<DB::IExecutableTask> >&) [tiflash+25199075]\n \tdbms/src/Common/DynamicThreadPool.cpp:101\n 0x1807840\tDB::DynamicThreadPool::fixedWork(unsigned long) [tiflash+25196608]\n \tdbms/src/Common/DynamicThreadPool.cpp:115\n 0x1808932\tvoid std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, std::__1::thread DB::ThreadFactory::newThread<void (DB::DynamicThreadPool::*)(unsigned long), DB::DynamicThreadPool*, unsigned long&>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, void (DB::DynamicThreadPool::*&&)(unsigned long), DB::DynamicThreadPool*&&, unsigned long&)::'lambda'(auto&&...), DB::DynamicThreadPool*, unsigned long> >(void*) [tiflash+25200946]\n \t/usr/local/bin/…/include/c++/v1/thread:291\n 0x7f177819eea5\tstart_thread [libpthread.so.0+32421]\n 0x7f17775a396d\t__clone [libc.so.6+1042797]"] [source=MPPquery:441349571902701578:3,task] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTaskManager.cpp:152] ["Begin to abort query: 441349571902701578, abort type: ONERROR, reason: From MPPquery:441349571902701578:3,task: Code: 0, e.displayText() = DB::Exception: write to tunnel which is already closed, Receiver cancelled, reason: Push mpp packet failed. Receiver state: CLOSED, e.what() = DB::Exception,"] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTaskManager.cpp:195] ["Remaining task in query 441349571902701578 are: MPPquery:441349571902701578:1,task MPPquery:441349571902701578:12,task MPPquery:441349571902701578:11,task MPPquery:441349571902701578:13,task MPPquery:441349571902701578:3,task "] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:441349571902701578:1,task, abort type: ONERROR"] [source=MPPquery:441349571902701578:1,task] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:441349571902701578:1,task] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:441349571902701578:12,task, abort type: ONERROR"] [source=MPPquery:441349571902701578:12,task] [thread_id=281]
[2023/05/09 14:50:10.630 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:441349571902701578:12,task] [thread_id=281]
[2023/05/09 14:50:10.631 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:441349571902701578:11,task, abort type: ONERROR"] [source=MPPquery:441349571902701578:11,task] [thread_id=281]
[2023/05/09 14:50:10.632 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:441349571902701578:11,task] [thread_id=281]
[2023/05/09 14:50:10.632 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:441349571902701578:13,task, abort type: ONERROR"] [source=MPPquery:441349571902701578:13,task] [thread_id=281]
[2023/05/09 14:50:10.632 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:441349571902701578:13,task] [thread_id=281]
[2023/05/09 14:50:10.633 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:441349571902701578:3,task, abort type: ONERROR"] [source=MPPquery:441349571902701578:3,task] [thread_id=281]
[2023/05/09 14:50:10.633 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:441349571902701578:3,task] [thread_id=281]
[2023/05/09 14:50:10.633 +08:00] [WARN] [MPPTaskManager.cpp:207] ["Finish abort query: 441349571902701578"] [thread_id=281]
[2023/05/09 14:50:10.647 +08:00] [WARN] [MPPTaskManager.cpp:152] ["Begin to abort query: 441349571902701578, abort type: ONCANCELLATION, reason: Receive cancel request from TiDB"] [thread_id=140]
[2023/05/09 14:50:10.647 +08:00] [WARN] [MPPTaskManager.cpp:167] ["441349571902701578 already in abort process, skip abort"] [thread_id=140]
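
The error itself ("write to tunnel which is already closed, Receiver cancelled") indicates that this MPP task's receiver had already been cancelled, so the root cause may lie elsewhere in the query. As a hedged workaround to keep queries running while this is investigated (not a fix for the underlying MPP failure), the session can either re-allow TiKV reads or disable MPP execution:

```sql
-- Option A: let the optimizer fall back to TiKV again.
SET @@session.tidb_isolation_read_engines = 'tikv,tiflash,tidb';

-- Option B: keep the TiFlash-only restriction but turn off MPP,
-- so TiFlash is read through the coprocessor path instead.
SET @@session.tidb_allow_mpp = OFF;
```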

| username: WalterWj | Original post link

Please report this in the issue feedback section; it doesn't look like expected behavior.

| username: wenyi | Original post link

What do you mean by it not being as expected?

| username: TiDBer_UUTlqVvZ | Original post link

Let me give it a try. Based on the error message you provided, this is an MPP-related error: an exception in an MPP stream on the TiFlash node likely caused the query to fail.

There are many possible causes for an MPP stream exception on a TiFlash node. Some common ones:

  1. Network issues: the connection between the TiFlash node and the TiDB/TiKV nodes is unstable or short on bandwidth, so the MPP stream transfer fails.
  2. Resource limits: the TiFlash node runs out of memory or CPU headroom, so MPP stream processing fails.
  3. TiFlash version issues: the TiFlash version is outdated or carries known bugs that break MPP stream processing.
  4. Query statement issues: the statement itself is problematic, e.g. syntax errors or data type mismatches.
  5. Cluster configuration issues: e.g. incompatible versions across the TiDB/TiKV/TiFlash nodes, or incorrect parameter settings.

You can troubleshoot these issues one by one to determine the specific cause.
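
Some of these checks can be turned into concrete queries against TiDB's system tables. A sketch, assuming a TiDB version (such as 6.5) where these `information_schema` tables exist:

```sql
-- (1)/(5) Component versions across the cluster; mismatched versions
-- between TiDB/TiKV/TiFlash point at a compatibility problem.
SELECT TYPE, INSTANCE, VERSION
FROM information_schema.cluster_info;

-- (2) Current load on the TiFlash node(s), to spot CPU/memory pressure.
SELECT INSTANCE, DEVICE_TYPE, DEVICE_NAME, NAME, VALUE
FROM information_schema.cluster_load
WHERE TYPE = 'tiflash';

-- (4) Re-run the failing statement under EXPLAIN to rule out plan or
-- data-type problems; replace the placeholder with the actual query.
-- EXPLAIN <the failing SELECT ...>;
```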

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.