After Lossy TiKV Data Recovery, TiFlash Reports Sync Errors and No Data Is Synchronized

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiKV 有损数据恢复后 Tiflash 同步数据出现error 错误, 无数据被同步

| username: 济南小老虎

【TiDB Usage Environment】PoC
【TiDB Version】6.5.3
【Reproduction Path】

  1. TiKV runs with a single replica for performance reasons. Two nodes failed, so a lossy data recovery was performed.
  2. TiFlash stopped syncing. Setting the TiFlash replica count to 0 and then back to 1 did not help.
  3. Scaled in TiFlash, then scaled it out again.
  4. TiFlash still does not sync the tables from TiKV, and errors are logged.
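
For readers who want to retrace steps 2 and 3, here is a minimal sketch of the statements and commands those steps usually correspond to. It only builds the strings and executes nothing. The database, table, cluster, node, and topology file names are hypothetical placeholders, and the exact commands the poster ran are not given in the thread.

```python
# Sketch only: builds the SQL and tiup command strings for steps 2 and 3.
# All names passed in (database, table, cluster, node, topology file) are
# hypothetical placeholders, not values from the original post.


def replica_reset_sql(db: str, table: str) -> list[str]:
    """SQL to drop the TiFlash replica, re-create it, and check its progress."""
    target = f"`{db}`.`{table}`"
    return [
        f"ALTER TABLE {target} SET TIFLASH REPLICA 0;",
        f"ALTER TABLE {target} SET TIFLASH REPLICA 1;",
        "SELECT TABLE_SCHEMA, TABLE_NAME, AVAILABLE, PROGRESS "
        "FROM information_schema.tiflash_replica "
        f"WHERE TABLE_SCHEMA = '{db}' AND TABLE_NAME = '{table}';",
    ]


def tiflash_rescale_commands(cluster: str, node: str, topology_file: str) -> list[str]:
    """tiup commands to scale in a TiFlash node, prune it, and scale out again."""
    return [
        f"tiup cluster scale-in {cluster} --node {node}",
        f"tiup cluster display {cluster}",  # wait until the node shows as Tombstone
        f"tiup cluster prune {cluster}",
        f"tiup cluster scale-out {cluster} {topology_file}",
    ]


if __name__ == "__main__":
    for stmt in replica_reset_sql("test", "t1"):
        print(stmt)
    for cmd in tiflash_rescale_commands("poc-cluster", "10.0.0.5:9000", "scale-out.yaml"):
        print(cmd)
```

A replica whose AVAILABLE value stays 0 and whose PROGRESS does not move after these steps matches the symptom described in step 4.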

【Encountered Problem: Phenomenon and Impact】
[2023/09/19 15:04:36.820 +08:00] [WARN] [TiDBSchemaSyncer.h:225] [“apply diff meets exception : DB::TiFlashException: miss table in TiKV : 600140 \n stack is \n 0x15395a4\tDB::TiFlashException::TiFlashException(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, DB::TiFlashError const&) [tiflash+22255012]\n \tdbms/src/Common/TiFlashException.h:250\n 0x60286cc\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyDiff(DB::SchemaDiff const&) [tiflash+100828876]\n \tdbms/src/TiDB/Schema/SchemaBuilder.cpp:521\n 0x5f811d0\tDB::TiDBSchemaSyncer<false, false>::syncSchemas(DB::Context&) [tiflash+100143568]\n \tdbms/src/TiDB/Schema/TiDBSchemaSyncer.h:128\n 0x605d750\tstd::__1::__function::__func<DB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, std::__1::allocatorDB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, bool ()>::operator()() [tiflash+101046096]\n \t/usr/local/bin/…/include/c++/v1/__functional/function.h:345\n 0x5cc872c\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >)::$_1> >(void*) [tiflash+97290028]\n \t/usr/local/bin/…/include/c++/v1/thread:291\n 0xffffac3b87ac\t [libpthread.so.0+34732]\n 0xffffac0a60fc\t [libc.so.6+876796]”] [source=SchemaSyncer] [thread_id=194]
[2023/09/19 15:53:19.486 +08:00] [WARN] [TiDBSchemaSyncer.h:225] [“apply diff meets exception : DB::TiFlashException: miss table in TiKV : 600153 \n stack is \n 0x15395a4\tDB::TiFlashException::TiFlashException(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, DB::TiFlashError const&) [tiflash+22255012]\n \tdbms/src/Common/TiFlashException.h:250\n 0x60286cc\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyDiff(DB::SchemaDiff const&) [tiflash+100828876]\n \tdbms/src/TiDB/Schema/SchemaBuilder.cpp:521\n 0x5f811d0\tDB::TiDBSchemaSyncer<false, false>::syncSchemas(DB::Context&) [tiflash+100143568]\n \tdbms/src/TiDB/Schema/TiDBSchemaSyncer.h:128\n 0x605d750\tstd::__1::__function::__func<DB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, std::__1::allocatorDB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, bool ()>::operator()() [tiflash+101046096]\n \t/usr/local/bin/…/include/c++/v1/__functional/function.h:345\n 0x5cc872c\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >)::$_1> >(void*) [tiflash+97290028]\n \t/usr/local/bin/…/include/c++/v1/thread:291\n 0xffffac3b87ac\t [libpthread.so.0+34732]\n 0xffffac0a60fc\t [libc.so.6+876796]”] [source=SchemaSyncer] [thread_id=198]
[2023/09/19 16:12:03.657 +08:00] [WARN] [TiDBSchemaSyncer.h:225] [“apply diff meets exception : DB::TiFlashException: miss table in TiKV : 600158 \n stack is \n 0x15395a4\tDB::TiFlashException::TiFlashException(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, DB::TiFlashError const&) [tiflash+22255012]\n \tdbms/src/Common/TiFlashException.h:250\n 0x60286cc\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyDiff(DB::SchemaDiff const&) [tiflash+100828876]\n \tdbms/src/TiDB/Schema/SchemaBuilder.cpp:521\n 0x5f811d0\tDB::TiDBSchemaSyncer<false, false>::syncSchemas(DB::Context&) [tiflash+100143568]\n \tdbms/src/TiDB/Schema/TiDBSchemaSyncer.h:128\n 0x605d750\tstd::__1::__function::__func<DB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, std::__1::allocatorDB::SchemaSyncService::SchemaSyncService(DB::Context&)::$_0, bool ()>::operator()() [tiflash+101046096]\n \t/usr/local/bin/…/include/c++/v1/__functional/function.h:345\n 0x5cc872c\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >)::$_1> >(void*) [tiflash+97290028]\n \t/usr/local/bin/…/include/c++/v1/thread:291\n 0xffffac3b87ac\t [libpthread.so.0+34732]\n 0xffffac0a60fc\t [libc.so.6+876796]”] [source=SchemaSyncer] [thread_id=192]
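
Each warning above names a different table ID (600140, 600153, 600158), so it can help to collect every ID that TiFlash reports as missing. The following is a minimal sketch, assuming the log lines keep the `miss table in TiKV : <id>` wording shown above. The function name and the idea of reading the log line by line are illustrative choices, not part of TiFlash.

```python
import re

# Matches the table ID in messages such as "miss table in TiKV : 600140".
MISSING_TABLE_RE = re.compile(r"miss table in TiKV\s*:\s*(\d+)")


def missing_table_ids(log_lines):
    """Return the sorted, de-duplicated table IDs reported as missing."""
    ids = set()
    for line in log_lines:
        ids.update(int(match) for match in MISSING_TABLE_RE.findall(line))
    return sorted(ids)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python missing_ids.py tiflash.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for table_id in missing_table_ids(log_file):
            print(table_id)
```

The resulting IDs can then be compared with the `TIDB_TABLE_ID` column of `information_schema.tables` to see whether TiDB still has metadata for those tables.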

【Resource Configuration】4 × Kunpeng servers (96 cores, 512 GB each), NVMe SSDs, sufficient remaining disk space.
【Attachments: Screenshots/Logs/Monitoring】

| username: tidb菜鸟一只

Then the problem comes from your lossy TiKV recovery… No wonder the TiFlash replica can’t be rebuilt…

| username: 像风一样的男子

Another post… Did the prune succeed after you scaled in?

| username: 济南小老虎

Hmm…

| username: 济南小老虎

Mr. Huang said to keep it to one question per post… So I posted them separately…

| username: 济南小老虎

Is there any way to solve this? TiKV has indeed gone through a lossy recovery.