TiDB v6.6: TiKV goes from Disconnected to Down while running

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tidbv6.6 TIKV 运行中disconnected → down

| username: TiDBer_SfTDh46h

【TiDB Usage Environment】Testing/PoC
【TiDB Version】6.6
【Reproduction Path】Normal usage
【Encountered Issue】A TiKV node went from Disconnected to Down and cannot be started
【Resource Configuration】
【Attachments: Screenshot/Logs/Monitoring】

[2023/03/29 11:28:46.813 +08:00] [FATAL] [lib.rs:497] [“called Result::unwrap() on an Err value: Other(Os { code: 2, kind: NotFound, message: "No such file or directory" })”] [backtrace=" 0: tikv_util::set_panic_hook::{{closure}}\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/lib.rs:496:18\n 1: <alloc::boxed::Box<F,A> as core::ops::function::Fn>::call\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2032:9\n std::panicking::rust_panic_with_hook\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:692:13\n 2: std::panicking::begin_panic_handler::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:579:13\n 3: std::sys_common::backtrace::__rust_end_short_backtrace\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:137:18\n 4: rust_begin_unwind\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:575:5\n 5: core::panicking::panic_fmt\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:65:14\n 6: core::result::unwrap_failed\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1791:5\n 7: core::result::Result<T,E>::unwrap\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1113:23\n server::server::TikvServer::init_storage_stats_task::{{closure}}\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:1563:33\n tikv_util::worker::pool::Worker::spawn_interval_task::{{closure}}\n at 
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/worker/pool.rs:384:17\n <core::future::from_generator::GenFuture as core::future::future::Future>::poll\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:91:19\n yatp::task::future::RawTask::poll\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:59:9\n 8: yatp::task::future::TaskCell::poll\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:103:9\n <yatp::task::future::Runner as yatp::pool::runner::Runner>::handle\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:387:20\n 9: <tikv_util::yatp_pool::YatpPoolRunner as yatp::pool::runner::Runner>::handle\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/yatp_pool/mod.rs:122:24\n yatp::pool::worker::WorkerThread<T,R>::run\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/pool/worker.rs:48:13\n yatp::pool::builder::LazyBuilder::build::{{closure}}\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/pool/builder.rs:114:25\n std::sys_common::backtrace::rust_begin_short_backtrace\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:121:18\n 10: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:551:17\n <core::panic::unwind_safe::AssertUnwindSafe as core::ops::function::FnOnce<()>>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:271:9\n std::panicking::try::do_call\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:483:40\n std::panicking::try\n at 
/rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:447:19\n std::panic::catch_unwind\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14\n std::thread::Builder::spawn_unchecked::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:550:30\n core::ops::function::FnOnce::call_once{{vtable.shim}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:513:5\n 11: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n std::sys::unix::thread::thread::new::thread_start\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/thread.rs:108:17\n 12: start_thread\n 13: __clone\n"] [location=components/server/src/server.rs:1563] [thread_name=background-0]

| username: ffeenn | Original post link

Why did you put the storage directory under /tmp? Check whether files in this directory have been cleaned up automatically. Most systems purge /tmp periodically, so it is puzzling that you placed the data there.
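A quick way to catch this kind of misconfiguration is to check a configured data directory against the usual temporary roots. The following is a minimal sketch, not part of TiDB or TiUP: the function name `is_under_tmp` and the list of roots (`/tmp`, `/var/tmp`) are assumptions for illustration, and the comparison is purely lexical (it does not resolve symlinks or `..`).

```python
from pathlib import PurePosixPath

# Assumed temporary roots; adjust for the system in question.
TMP_ROOTS = ("/tmp", "/var/tmp")


def is_under_tmp(path, tmp_roots=TMP_ROOTS):
    """Return True if `path` is one of the temp roots or lies beneath one.

    The check compares path components only, so "/tmpfoo" is not
    treated as being under "/tmp".
    """
    p = PurePosixPath(path)
    for root in tmp_roots:
        r = PurePosixPath(root)
        if p == r or r in p.parents:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical data directories, for illustration only.
    for candidate in ("/tmp/tikv/data", "/data/tikv"):
        print(candidate, is_under_tmp(candidate))
```

A directory flagged by this check is not necessarily being purged, but it is worth confirming the system's cleanup rules before storing data there.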

| username: TiDBer_SfTDh46h | Original post link

The storage directory still exists, but some files in it have indeed disappeared. /tmp is mounted on a separate SSD. Does the name /tmp itself have an impact?

| username: ffeenn | Original post link

You shouldn’t set it up that way. Remount the disk at a different directory. The lost files cannot be recovered. If the data is not important, reinitialize it at a different path. The rules for cleaning /tmp vary by distribution, so it is worth learning these basics first.

| username: undefined | Original post link

Check whether any cleanup rules exist by running: cat /etc/cron.daily/tmpwatch
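Since the cleanup mechanism differs between distributions, `/etc/cron.daily/tmpwatch` may not exist even when /tmp is being purged. The sketch below lists which of several commonly used cleanup configuration files are present. The candidate list is an assumption (tmpwatch, tmpreaper, and systemd-tmpfiles locations), not an exhaustive inventory, and the helper name `existing_cleanup_configs` is hypothetical.

```python
import os

# Assumed locations of /tmp cleanup rules on common Linux distributions.
CANDIDATE_CONFIGS = (
    "/etc/cron.daily/tmpwatch",      # tmpwatch cron job
    "/etc/tmpreaper.conf",           # tmpreaper
    "/usr/lib/tmpfiles.d/tmp.conf",  # systemd-tmpfiles default rules
    "/etc/tmpfiles.d/tmp.conf",      # systemd-tmpfiles local override
)


def existing_cleanup_configs(candidates=CANDIDATE_CONFIGS, root="/"):
    """Return the candidate paths that exist as regular files under `root`.

    `root` lets the check run against a mounted image or a test directory
    instead of the live filesystem.
    """
    found = []
    for path in candidates:
        full = os.path.join(root, path.lstrip("/"))
        if os.path.isfile(full):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in existing_cleanup_configs():
        print(path)
```

Any file this reports should then be read by hand to see which directories it covers and what age threshold it applies.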