TiKV cannot start normally

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIkv无法正常启动

| username: 小码农-小牛马

【TiDB Usage Environment】Test
【TiDB Version】Release Version: 6.1.0
【Reproduction Path】What operations were performed when the issue occurred: Followed the k8s process for normal deployment
【Encountered Issue: Problem Phenomenon and Impact】TiKV won’t start
【Resource Configuration】NFS serves as the k8s cluster’s StorageClass; the disk is ext4. Deployment followed the official documentation: Get Started with TiDB on Kubernetes | PingCAP Docs
【Attachments: Screenshots/Logs/Monitoring】

[2022/11/28 02:35:21.408 +00:00] [INFO] [server.rs:369] ["connect to PD cluster"] [cluster_id=7170886967293250264]

[2022/11/28 02:35:21.408 +00:00] [INFO] [config.rs:2041] ["readpool.storage.use-unified-pool is not set, set to true by default"]

[2022/11/28 02:35:21.408 +00:00] [INFO] [config.rs:2064] ["readpool.coprocessor.use-unified-pool is not set, set to true by default"]

[2022/11/28 02:35:21.419 +00:00] [WARN] [config.rs:2977] ["memory_usage_limit:ReadableSize(6492433066) > recommended:ReadableSize(6298492416), maybe page cache isn't enough"]

[2022/11/28 02:35:21.469 +00:00] [INFO] [server.rs:1541] ["beginning system configuration check"]

[2022/11/28 02:35:21.470 +00:00] [INFO] [config.rs:891] ["data dir"] [mount_fs="FsInfo { tp: "nfs4", opts: "rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.2.155,local_lock=none,addr=192.168.2.152", mnt_dir: "/var/lib/tikv", fsname: "192.168.2.152:/nfs/share/tidb-cluster-tikv-basic-tikv-0-pvc-5572c5c9-772c-4ba0-84ea-39fb4a18af16" }"] [data_path=/var/lib/tikv]

[2022/11/28 02:35:21.471 +00:00] [WARN] [server.rs:1557] ["check: rocksdb-data-dir"] [err="config fs: data-dir.rotation.get: "192.168.2.152:/nfs/share/tidb-cluster-tikv-basic-tikv-0-pvc-5572c5c9-772c-4ba0-84ea-39fb4a18af16" no device find in block"] [path=/var/lib/tikv]

[2022/11/28 02:35:21.472 +00:00] [INFO] [config.rs:891] ["data dir"] [mount_fs="FsInfo { tp: "nfs4", opts: "rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.2.155,local_lock=none,addr=192.168.2.152", mnt_dir: "/var/lib/tikv", fsname: "192.168.2.152:/nfs/share/tidb-cluster-tikv-basic-tikv-0-pvc-5572c5c9-772c-4ba0-84ea-39fb4a18af16" }"] [data_path=/var/lib/tikv/raft]

[2022/11/28 02:35:21.472 +00:00] [WARN] [server.rs:1565] ["check: raftdb-path"] [err="config fs: data-dir.rotation.get: "192.168.2.152:/nfs/share/tidb-cluster-tikv-basic-tikv-0-pvc-5572c5c9-772c-4ba0-84ea-39fb4a18af16" no device find in block"] [path=/var/lib/tikv/raft]

[2022/11/28 02:35:21.475 +00:00] [INFO] [server.rs:338] [“using config”] [config="{"log-level":"info","log-file":"","log-format":"text","log-rotation-timespan":"0s","log-rotation-size":"300MiB","slow-log-file":"","slow-log-threshold":"1s","panic-when-unexpected-key-or-data":false,"abort-on-panic":false,"memory-usage-limit":6492433066,"memory-usage-high-water":0.9,"log":{"level":"info","format":"text","enable-timestamp":true,"file":{"filename":"","max-size":300,"max-days":0,"max-backups":0}},"quota":{"foreground-cpu-time":0,"foreground-write-bandwidth":"0KiB","foreground-read-bandwidth":"0KiB","max-delay-duration":"500ms"},"readpool":{"unified":{"min-thread-count":1,"max-thread-count":4,"stack-size":"10MiB","max-tasks-per-worker":2000},"storage":{"use-unified-pool":true,"high-concurrency":4,"normal-concurrency":4,"low-concurrency":4,"max-tasks-per-worker-high":2000,"max-tasks-per-worker-normal":2000,"max-tasks-per-worker-low":2000,"stack-size":"10MiB"},"coprocessor":{"use-unified-pool":true,"high-concurrency":3,"normal-concurrency":3,"low-concurrency":3,"max-tasks-per-worker-high":2000,"max-tasks-per-worker-normal":2000,"max-tasks-per-worker-low":2000,"stack-size":"10MiB"}},"server":{"addr":"0.0.0.0:20160","advertise-addr":"basic-tikv-0.basic-tikv-peer.tidb-cluster.svc:20160","status-addr":"0.0.0.0:20180","advertise-status-addr":"basic-tikv-0.basic-tikv-peer.tidb-cluster.svc:20180","status-thread-pool-size":1,"max-grpc-send-msg-len":10485760,"raft-client-grpc-send-msg-buffer":524288,"raft-client-queue-size":8192,"raft-msg-max-batch-size":128,"grpc-compression-type":"none","grpc-concurrency":5,"grpc-concurrent-stream":1024,"grpc-raft-conn-num":1,"grpc-memory-pool-quota":9223372036854775807,"grpc-stream-initial-window-size":"2MiB","grpc-keepalive-time":"10s","grpc-keepalive-timeout":"3s","concurrent-send-snap-limit":32,"concurrent-recv-snap-limit":32,"end-point-recursion-limit":1000,"end-point-stream-channel-size":8,"end-point-batch-row-limit":64,"end-point-stream-batc
h-row-limit":128,"end-point-enable-batch-if-possible":true,"end-point-request-max-handle-duration":"1m","end-point-max-concurrency":4,"end-point-perf-level":0,"snap-max-write-bytes-per-sec":"100MiB","snap-max-total-size":"0KiB","stats-concurrency":1,"heavy-load-threshold":75,"heavy-load-wait-duration":null,"enable-request-batch":true,"background-thread-count":2,"end-point-slow-log-threshold":"1s","forward-max-connections-per-address":4,"reject-messages-on-memory-ratio":0.2,"labels":{}},"storage":{"data-dir":"/var/lib/tikv","gc-ratio-threshold":1.1,"max-key-size":8192,"scheduler-concurrency":524288,"scheduler-worker-pool-size":4,"scheduler-pending-write-threshold":"100MiB","reserve-space":"0KiB","enable-async-apply-prewrite":false,"api-version":1,"enable-ttl":false,"background-error-recovery-window":"1h","ttl-check-poll-interval":"12h","flow-control":{"enable":true,"soft-pending-compaction-bytes-limit":"192GiB","hard-pending-compaction-bytes-limit":"1TiB","memtables-threshold":5,"l0-files-threshold":20},"block-cache":{"shared":true,"capacity":"3715MiB","num-shard-bits":6,"strict-capacity-limit":false,"high-pri-pool-ratio":0.8,"memory-allocator":"nodump"},"io-rate-limit":{"max-bytes-per-sec":"0KiB","mode":"write-only","strict":false,"foreground-read-priority":"high","foreground-write-priority":"high","flush-priority":"high","level-zero-compaction-priority":"medium","compaction-priority":"low","replication-priority":"high","load-balance-priority":"high","gc-priority":"high","import-priority":"medium","export-priority":"medium","other-priority":"high"}},"pd":{"endpoints":["http://basic-pd:2379"],"retry-interval":"300ms","retry-max-count":9223372036854775807,"retry-log-every":10,"update-interval":"10m","enable-forwarding":false},"metric":{"job":"tikv"},"raftstore":{"prevote":true,"raftdb-path":"/var/lib/tikv/raft","capacity":"0KiB","raft-base-tick-interval":"1s","raft-heartbeat-ticks":2,"raft-election-timeout-ticks":10,"raft-min-election-timeout-ticks":10,"raft-max-elect
ion-timeout-ticks":20,"raft-max-size-per-msg":"1MiB","raft-max-inflight-msgs":256,"raft-entry-max-size":"8MiB","raft-log-compact-sync-interval":"2s","raft-log-gc-tick-interval":"3s","raft-log-gc-threshold":50,"raft-log-gc-count-limit":73728,"raft-log-gc-size-limit":"72MiB","raft-log-reserve-max-ticks":6,"raft-engine-purge-interval":"10s","raft-entry-cache-life-time":"30s","split-region-check-tick-interval":"10s","region-split-check-diff":"6MiB","region-compact-check-interval":"5m","region-compact-check-step":100,"region-compact-min-tombstones":10000,"region-compact-tombstones-percent":30,"pd-heartbeat-tick-interval":"1m","pd-store-heartbeat-tick-interval":"10s","snap-mgr-gc-tick-interval":"1m","snap-gc-timeout":"4h","lock-cf-compact-interval":"10m","lock-cf-compact-bytes-threshold":"256MiB","notify-capacity":40960,"messages-per-tick":4096,"max-peer-down-duration":"10m","max-leader-missing-duration":"2h","abnormal-leader-missing-duration":"10m","peer-stale-state-check-interval":"5m","leader-transfer-max-log-lag":128,"snap-apply-batch-size":"10MiB","consistency-check-interval":"0s","report-region-flow-interval":"1m","raft-store-max-leader-lease":"9s","check-leader-lease-interval":"2s250ms","renew-leader-lease-advance-duration":"2s250ms","right-derive-when-split":true,"merge-max-log-gap":10,"merge-check-tick-interval":"2s","use-delete-range":false,"snap-generator-pool-size":2,"cleanup-import-sst-interval":"10m","local-read-batch-size":1024,"apply-max-batch-size":256,"apply-pool-size":2,"apply-reschedule-duration":"5s","apply-low-priority-pool-size":1,"store-max-batch-size":256,"store-pool-size":2,"store-reschedule-duration":"5s","store-low-priority-pool-size":0,"store-io-pool-size":0,"store-io-notify-capacity":40960,"future-poll-size":1,"hibernate-regions":true,"dev-assert":false,"apply-yield-duration":"500ms","perf-level":0,"evict-cache-on-memory-ratio":0.0,"cmd-batch":true,"cmd-batch-concurrent-ready-max-count":1,"raft-write-size-limit":"1MiB","waterfall-metrics":tru
e,"io-reschedule-concurrent-max-count":4,"io-reschedule-hotpot-duration":"5s","inspect-interval":"500ms","report-min-resolved-ts-interval":"0s","reactive-memory-lock-tick-interval":"2s","reactive-memory-lock-timeout-tick":5,"report-region-buckets-tick-interval":"10s","max-snapshot-file-raw-size":"100MiB"},"coprocessor":{"split-region-on-table":false,"batch-split-limit":10,"region-max-size":"144MiB","region-split-size":"96MiB","region-max-keys":1440000,"region-split-keys":960000,"consistency-check-method":"mvcc","enable-region-bucket":false,"region-bucket-size":"96MiB","region-size-threshold-for-approximate":"1440MiB","prefer-approximate-bucket":true,"region-bucket-merge-size-ratio":0.33},"coprocessor-v2":{"coprocessor-plugin-directory":null},"rocksdb":{"info-log-level":"info","wal-recovery-mode":2,"wal-dir":"","wal-ttl-seconds":0,"wal-size-limit":"0KiB","max-total-wal-size":"4GiB","max-background-jobs":3,"max-background-flushes":1,"max-manifest-file-size":"128MiB","create-if-missing":true,"max-open-files":256,"enable-statistics":true,"stats-dump-period":"10m","compaction-readahead-size":"0KiB","info-log-max-size":"1GiB","info-log-roll-time":"0s","info-log-keep-log-file-num":10,"info-log-dir":"","rate-bytes-per-sec":"10GiB","rate-limiter-refill-period":"100ms","rate-limiter-mode":2,"rate-limiter-auto-tuned":true,"bytes-per-sync":"1MiB","wal-bytes-per-sync":"512KiB","max-sub-compactions":1,"writable-file-max-buffer-size":"1MiB","use-direct-io-for-flush-and-compaction":false,"enable-pipelined-write":false,"enable-multi-batch-write":true,"enable-unordered-write":false,"defaultcf":{"block-size":"64KiB","block-cache-size":"2002MiB","disable-block-cache":false,"cache-index-and-filter-blocks":true,"pin-l0-filter-and-index-blocks":true,"use-bloom-filter":true,"optimize-filters-for-hits":true,"whole-key-filtering":true,"bloom-filter-bits-per-key":10,"block-based-bloom-filter":false,"read-amp-bytes-per-bit":0,"compression-per-level":["no","no","lz4","lz4","lz4","zstd","zstd"],
"write-buffer-size":"128MiB","max-write-buffer-number":5,"min-write-buffer-number-to-merge":1,"max-bytes-for-level-base":"512MiB","target-file-size-base":"8MiB","level0-file-num-compaction-trigger":4,"level0-slowdown-writes-trigger":20,"level0-stop-writes-trigger":36,"max-compaction-bytes":"2GiB","compaction-pri":3,"dynamic-level-bytes":true,"num-levels":7,"max-bytes-for-level-multiplier":10,"compaction-style":0,"disable-auto-compactions":false,"disable-write-stall":true,"soft-pending-compaction-bytes-limit":"192GiB","hard-pending-compaction-bytes-limit":"1TiB","force-consistency-checks":false,"prop-size-index-distance":4194304,"prop-keys-index-distance":40960,"enable-doubly-skiplist":true,"enable-compaction-guard":true,"compaction-guard-min-output-file-size":"8MiB","compaction-guard-max-output-file-size":"128MiB","bottommost-level-compression":"zstd","bottommost-zstd-compression-dict-size":0,"bottommost-zstd-compression-sample-size":0,"titan":{"min-blob-size":"1KiB","blob-file-compression":"lz4","blob-cache-size":"0KiB","min-gc-batch-size":"16MiB","max-gc-batch-size":"64MiB","discardable-ratio":0.5,"sample-ratio":0.1,"merge-small-file-threshold":"8MiB","blob-run-mode":"normal","level-merge":false,"range-merge":true,"max-sorted-runs":20,"gc-merge-rewrite":false}},"writecf":{"block-size":"64KiB","block-cache-size":"1201MiB","disable-block-cache":false,"cache-index-and-filter-blocks":true,"pin-l0-filter-and-index-blocks":true,"use-bloom-filter":true,"optimize-filters-for-hits":false,"whole-key-filtering":false,"bloom-filter-bits-per-key":10,"block-based-bloom-filter":false,"read-amp-bytes-per-bit":0,"compression-per-level":["no","no","lz4","lz4","lz4","zstd","zstd"],"write-buffer-size":"128MiB","max-write-buffer-number":5,"min-write-buffer-number-to-merge":1,"max-bytes-for-level-base":"512MiB","target-file-size-base":"8MiB","level0-file-num-compaction-trigger":4,"level0-slowdown-writes-trigger":20,"level0-stop-writes-trigger":36,"max-compaction-bytes":"2GiB","compacti
on-pri":3,"dynamic-level-bytes":true,"num-levels":7,"max-bytes-for-level-multiplier":10,"compaction-style":0,"disable-auto-compactions":false,"disable-write-stall":true,"soft-pending-compaction-bytes-limit":"192GiB","hard-pending-compaction-bytes-limit":"1
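
The `data dir` log lines above already report which filesystem backs the TiKV data directory (`tp: "nfs4"`). As a rough illustration, the following Python sketch pulls that field out of a saved log. The helper names and the list of "network" filesystem types are assumptions made for this example, not anything TiKV defines.

```python
import re

# Matches the filesystem type inside the FsInfo dump printed on TiKV's
# "data dir" log line, e.g.  FsInfo { tp: "nfs4", opts: "...", ... }
# The optional backslash tolerates logs where the inner quotes are escaped.
FS_TYPE = re.compile(r'FsInfo \{ tp: \\?"([^"\\]+)')

# Filesystem types treated as network storage in this sketch (an assumption).
NETWORK_FS = {"nfs", "nfs4", "cifs"}


def data_dir_fs_types(log_text):
    """Return the set of filesystem types reported in 'data dir' log lines."""
    return set(FS_TYPE.findall(log_text))


def on_network_fs(log_text):
    """Return True if any reported data directory sits on a network filesystem."""
    return bool(data_dir_fs_types(log_text) & NETWORK_FS)
```

Feeding the startup log from this post to `on_network_fs` would flag the NFS mount before the later errors are reached.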

| username: 小码农-小牛马 | Original post link

After the container restarts, it keeps showing:

[2022/11/28 02:31:45.126 +00:00] [FATAL] [setup.rs:304] ["invalid configuration: Found raft data set when it should not exist."]

| username: 小码农-小牛马 | Original post link

The documentation I followed is: Get Started with TiDB on Kubernetes | PingCAP Docs

I saw that TiKV did not start, so I did not continue with the monitoring installation process.

| username: 小码农-小牛马 | Original post link

I tried deploying on Alibaba Cloud ACK, but it wasn’t successful. This time, I built a K8s cluster on VMware ESXi (vSphere). Last time, it was suggested that the Alibaba Cloud ECS kernel had been modified, but what is the cause this time? I have never managed to deploy TiDB successfully on a K8s cluster. Guys, please help me out :cold_sweat:

| username: 胡杨树旁 | Original post link

Sorry, I haven’t deployed a cluster using containers. Could this be the issue?

| username: 胡杨树旁 | Original post link

I think the main reason is that the number of connections is too high, causing the CPU to be too busy. You can try to reduce the number of connections and see if it improves.

| username: 小码农-小牛马 | Original post link

Thank you, brother. I also feel there is a problem, but I don’t know why following the official guidelines would lead to such an issue.

| username: 小码农-小牛马 | Original post link

[Two images in the original post are not available.]

| username: wuxiangdong | Original post link

How about using a self-built NFS PVC?

| username: 小码农-小牛马 | Original post link

Did you create this PVC yourself? Currently, basic-tikv is a StatefulSet, which will automatically create PVCs based on the template. I have an NFS StorageClass that provides storage by default. The issue now is that TiKV won’t start. By “creating PVC yourself,” do you mean manually creating a PVC named basic-tikv-tikv-0?

| username: 小码农-小牛马 | Original post link

Oh, I don’t know what to do. Can any expert give me some guidance?
Why does this message appear? ["encryption: none of key dictionary and file dictionary are found."]

| username: wuxiangdong | Original post link

The documentation uses local-volume-provisioner; for NFS, you can switch to nfs-subdir-external-provisioner.

| username: 小码农-小牛马 | Original post link

Brother, I just saw the latest error message

[2022/11/28 06:17:15.709 +00:00] [FATAL] [lib.rs:491] [“failed to open raft engine: Other("[components/raft_log_engine/src/engine.rs:464]: IO Error: fallocate")”] [backtrace=" 0: tikv_util::set_panic_hook::{{closure}}\n at home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/lib.rs:490:18\n 1: std::panicking::rust_panic_with_hook\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:702:17\n 2: std::panicking::begin_panic_handler::{{closure}}\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:588:13\n 3: std::sys_common::backtrace::__rust_end_short_backtrace\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:138:18\n 4: rust_begin_unwind\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:584:5\n 5: core::panicking::panic_fmt\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:143:14\n 6: core::result::unwrap_failed\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1749:5\n 7: core::result::Result<T,E>::expect\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1022:23\n <raft_log_engine::engine::RaftLogEngine as server::server::ConfiguredRaftEngine>::build\n at home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:1434:13\n server::server::TiKvServer::init_raw_engines\n at home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:1468:27\n 8: server::server::run_impl\n at 
home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:129:35\n server::server::run_tikv\n at home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:163:5\n 9: tikv_server::main\n at home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/cmd/tikv-server/src/main.rs:189:5\n 10: core::ops::function::FnOnce::call_once\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5\n std::sys_common::backtrace::__rust_begin_short_backtrace\n at rust/toolchains/nightly-2022-02-14-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:122:18\n 11: main\n 12: __libc_start_main\n 13: \n"] [location=components/server/src/server.rs:1435] [thread_name=main]

I kept missing this line
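
The FATAL line above reports `IO Error: fallocate` while opening the raft engine. One possible explanation, offered here as an assumption rather than something the thread confirms, is that the NFS 4.1 mount shown in the startup log rejects the fallocate(2) system call. Below is a minimal Python sketch for probing a directory. `supports_fallocate` is a hypothetical helper; it calls libc's `fallocate` through `ctypes` and assumes 64-bit Linux. It avoids `os.posix_fallocate` because glibc may emulate that call by writing to the file when the filesystem lacks native support, which would hide the problem.

```python
import ctypes
import errno
import os
import tempfile


def supports_fallocate(directory, size=4096):
    """Try a raw fallocate(2) on a temporary file inside `directory`.

    Returns True if the call succeeds and False if the filesystem reports
    that the operation is unsupported. Any other error is raised as OSError.
    Linux only; assumes a 64-bit off_t.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int

    # The temporary file is removed automatically when the block exits.
    with tempfile.TemporaryFile(dir=directory) as f:
        # mode=0 asks for plain preallocation of `size` bytes from offset 0.
        if libc.fallocate(f.fileno(), 0, 0, size) == 0:
            return True
        err = ctypes.get_errno()
        if err in (errno.EOPNOTSUPP, errno.ENOSYS):
            return False
        raise OSError(err, os.strerror(err))
```

Running `supports_fallocate("/var/lib/tikv")` on a machine or pod that mounts the same volume (and has Python available) would show whether the storage accepts the call.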

| username: 小码农-小牛马 | Original post link

Okay, I’ll give it a try.

| username: wuxiangdong | Original post link

I am also guessing, it might be this issue.

| username: 小码农-小牛马 | Original post link

Uh, bro, it’s done.

| username: 小码农-小牛马 | Original post link

Thank you, bro.
