Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: 两台8核32G的机子带的动每秒150的并发不? (Can two machines with 8 cores and 32 GB each handle 150 concurrent requests per second?)

[TiDB Usage Environment] Production Environment
[TiDB Version]
6.6.0
[Reproduction Path]
[Encountered Problem: Problem Phenomenon and Impact]
Currently, the company has two machines with 8 cores and 32GB each. One machine has 1 TiDB and 1 PD installed, while the other has 3 TiKV instances installed. The CPU on the machine with TiKV is fully utilized.
I am considering scaling out one TiKV instance onto the machine that runs TiDB, and scaling in one of the TiKV instances on the TiKV machine, to spread the load. Will this work?
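For reference, a rough sketch of what that scale-out/scale-in could look like with tiup is below. This only illustrates the mechanics, not whether the move will fix the CPU saturation; the cluster name, the new instance's ports, directories, and "tikv2" label, and the choice of which instance to remove are placeholders, not details taken from the actual cluster.

# Hypothetical scale-out topology: add one TiKV instance on the TiDB/PD machine (172.27.197.96).
# Ports, directories, and the label are examples only; pick values that do not clash with existing services.
cat > scale-out-tikv.yaml <<'EOF'
tikv_servers:
  - host: 172.27.197.96
    port: 20160
    status_port: 20180
    deploy_dir: /tidb/tidb-deploy/tikv-20160
    data_dir: /tidb/tidb-data/tikv-20160
    config:
      log.level: warn
      server.labels:
        host: tikv2
EOF
tiup cluster scale-out <cluster-name> scale-out-tikv.yaml

# Afterwards, scale in one of the three TiKV instances on 172.27.197.97
# (for example the one on port 20162) and wait until the store is fully removed.
tiup cluster scale-in <cluster-name> --node 172.27.197.97:20162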
Additionally, TiDB frequently goes down, and there are some slow queries. The Coprocessor execution time is quite long, but judging from the execution plan below it does not look like an index problem; it seems to be a server resource problem.
id task estRows operator info actRows execution info memory disk
Projection_8 root 0.00 tokvideos.videos.id, tokvideos.videos.mid 10000 time:43.5s, loops:11, Concurrency:OFF 163.4 KB N/A
└─Limit_14 root 0.00 offset:0, count:10000 10000 time:43.5s, loops:11 N/A N/A
└─IndexLookUp_35 root 0.00 10000 time:43.5s, loops:10, index_task: {total_time: 43.5s, fetch_handle: 16.4s, build: 46.3ms, wait: 27s}, table_task: {total_time: 3m14.3s, num: 14, concurrency: 5}, next: {wait_index: 23.5ms, wait_table_lookup_build: 218.3µs, wait_table_lookup_resp: 43.5s} 15.7 MB N/A
├─IndexRangeScan_32(Build) cop[tikv] 0.00 table:v, index:mode_2(mode, status, ischeck, is_vip, ioscheck, createtime), range:(1 1 1 0 0 2023-02-19 12:34:15,1 1 1 0 0 +inf], keep order:true, desc, stats:pseudo 261604 time:16.4s, loops:239, cop_task: {num: 67, max: 5.54s, min: 260.4µs, avg: 315ms, p95: 1.84s, max_proc_keys: 33760, p95_proc_keys: 17376, tot_proc: 19.3s, tot_wait: 1.53s, rpc_num: 67, rpc_time: 21.1s, copr_cache_hit_ratio: 0.49, build_task_duration: 29.7µs, max_distsql_concurrency: 2}, tikv_task:{proc max:5.39s, min:0s, avg: 385.6ms, p80:50ms, p95:1.84s, iters:506, tasks:67}, scan_detail: {total_process_keys: 195077, total_process_keys_size: 19507700, total_keys: 201402, get_snapshot_time: 589.4µs, rocksdb: {delete_skipped_count: 14345, key_skipped_count: 230087, block: {cache_hit_count: 647}}} N/A N/A
└─Selection_34(Probe) cop[tikv] 0.00 isnull(tokvideos.videos.deletetime) 140918 time:3m14.3s, loops:162, cop_task: {num: 35, max: 43.5s, min: 377.9µs, avg: 3.95s, p95: 42.3s, max_proc_keys: 12104, p95_proc_keys: 10912, tot_proc: 2m14.7s, tot_wait: 3.29s, rpc_num: 35, rpc_time: 2m18.3s, copr_cache_hit_ratio: 0.26, build_task_duration: 2.6ms, max_distsql_concurrency: 5}, tikv_task:{proc max:42.8s, min:9ms, avg: 3.94s, p80:4.53s, p95:42.2s, iters:287, tasks:35}, scan_detail: {total_process_keys: 136239, total_process_keys_size: 53236007, total_keys: 211480, get_snapshot_time: 451.6µs, rocksdb: {delete_skipped_count: 45724, key_skipped_count: 296550, block: {cache_hit_count: 490074}}} N/A N/A
└─TableRowIDScan_33 cop[tikv] 0.00 table:v, keep order:false, stats:pseudo 140918 tikv_task:{proc max:42.8s, min:9ms, avg: 3.94s, p80:4.53s, p95:42.2s, iters:287, tasks:35} N/A N/A
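Since the post mentions slow queries and long Coprocessor time, one quick way to list such statements and see where their time goes is TiDB's slow query table. A minimal sketch, run from a MySQL client against the TiDB server in the topology below (the root user is an assumption):

# List the ten slowest recent statements with their coprocessor / process / wait time,
# taken from TiDB's information_schema.slow_query view.
mysql -h 172.27.197.96 -P 4000 -u root -p -e "
SELECT Time, DB, Query_time, Cop_time, Process_time, Wait_time, Query
FROM information_schema.slow_query
ORDER BY Query_time DESC
LIMIT 10;"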
The deployment topology file is as follows:
# Global variables apply to all deployments and are used as default values when a specific deployment value is missing.
global:
  # User running the TiDB cluster.
  user: "tidb"
  # Group specifies the group name the user belongs to (if different from user)
  # group: "tidb"
  # SSH port of the managed cluster servers.
  ssh_port: 22
  # Path to store cluster deployment files, startup scripts, and configuration files.
  deploy_dir: "/tidb/tidb-deploy"
  # TiDB cluster data storage directory
  data_dir: "/tidb/tidb-data"
  # Supported values: amd64, arm64 (default: amd64)
  arch: "amd64"
  # Resource control to limit instance resources.
  # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html
  # The global "resource_control" can be overridden with an instance-level "resource_control".
  resource_control:
    # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#MemoryLimit=bytes
    # Memory limit
    memory_limit: "8G"
    # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#CPUQuota=
    # Percentage specifying how much CPU time the unit should get relative to the total available CPU time on one CPU. Use values > 100% to allocate CPU time on multiple CPUs.
    # Example: CPUQuota=200% ensures that the executed processes never get more than two CPUs' worth of CPU time.
    cpu_quota: "400%"
    # See: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#IOReadBandwidthMax=device%20bytes
    # io_read_bandwidth_max: "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 100M"
    # io_write_bandwidth_max: "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 100M"
# # Monitored variables apply to all machines.
monitored:
  # Communication port for reporting system information on each node in the TiDB cluster.
  node_exporter_port: 9100
  # Blackbox exporter communication port for TiDB cluster port monitoring.
  blackbox_exporter_port: 9115
  # Path to store monitoring component deployment files, startup scripts, and configuration files.
  # deploy_dir: "/tidb-deploy/monitored-9100"
  # Data storage path for monitoring components
  # data_dir: "/tidb-data/monitored-9100"
  # Log storage path for monitoring components
  # log_dir: "/tidb-deploy/monitored-9100/log"
# Server configuration to specify runtime configuration for TiDB components
# All configuration items can be found in the TiDB documentation:
# - TiDB: https://pingcap.com/docs/stable/reference/configuration/tidb-server/configuration-file/
# - TiKV: https://pingcap.com/docs/stable/reference/configuration/tikv-server/configuration-file/
# - PD: https://pingcap.com/docs/stable/reference/configuration/pd-server/configuration-file/
# - TiFlash: https://docs.pingcap.com/tidb/stable/tiflash-configuration
# #
# # All configuration items use dots to represent hierarchy, e.g.:
# #   readpool.storage.use-unified-pool
# # - example: https://github.com/pingcap/tiup/blob/master/embed/examples/cluster/topology.example.yaml
# You can override this configuration with instance-level 'config' fields.
server_configs:
  # tidb:
  tikv:
    # Whether to use a unified read thread pool to handle storage requests
    readpool.storage.use-unified-pool: true
    # Whether to use a unified read thread pool (configured in readpool.unified) to handle coprocessor requests
    readpool.coprocessor.use-unified-pool: true
    # Maximum number of threads in the unified read thread pool
    readpool.unified.max-thread-count: 2
    # Size of the shared block cache
    storage.block-cache.capacity: "3GB"
    # Storage capacity, i.e., the maximum data storage size allowed. If not set, the current disk capacity is used. If multiple TiKV instances are deployed on the same physical disk, this parameter needs to be added in the TiKV configuration
    raftstore.capacity: "50GB"
  pd:
    replication.enable-placement-rules: true
    replication.location-labels: ["host"]
  # tiflash:
  # tiflash-learner:
  # kvcdc:
# PD configuration.
pd_servers:
  - host: 172.27.197.96
    # SSH port of the server.
    # ssh_port: 22
    # PD server name
    # name: "pd-1"
    # Communication port for connecting to TiDB servers.
    # client_port: 2379
    # Communication port between PD Server nodes
    # peer_port: 2380
    # Path to store PD Server deployment files, startup scripts, and configuration files
    # deploy_dir: "/tidb-deploy/pd-2379"
    # Data storage directory for PD Server.
    # data_dir: "/tidb-data/pd-2379"
    # Log storage path for PD Server
    # log_dir: "/tidb-deploy/pd-2379/log"
    # Numa node binding
    # numa_node: "0"
    # The following configuration is used to override the values of `server_configs.pd`.
    config:
      # schedule.max-merge-region-size: 20
      # schedule.max-merge-region-keys: 200000
# TiDB configuration.
tidb_servers:
  - host: 172.27.197.96
    # ssh_port: 22
    # Port to access the TiDB cluster
    port: 4000
    # Port for reporting TiDB server status information.
    status_port: 10080
    # Path to store TiDB server deployment files, startup scripts, and configuration files.
    # deploy_dir: "/tidb-deploy/tidb-4000"
    # Log storage path for TiDB server
    # log_dir: "/tidb-deploy/tidb-4000/log"
    # Recommended to bind numa node
    # numa_node: "1"
# TiKV configuration
tikv_servers:
  - host: 172.27.197.97
    port: 20160
    status_port: 20180
    # numa_node: "0"
    # The following configuration is used to override server_configs.tikv
    config:
      log.level: warn
      server.labels:
        host: tikv1
  - host: 172.27.197.97
    port: 20161
    status_port: 20181
    # numa_node: "1"
    config:
      log.level: warn
      server.labels:
        host: tikv1
  - host: 172.27.197.97
    port: 20162
    status_port: 20182
    # numa_node: "0"
    config:
      log.level: warn
      server.labels:
        host: tikv1
# # Server configuration to specify TiFlash server configuration
tiflash_servers:
  # # The IP address of the TiFlash Server.
  - host: 172.27.197.97
    # tcp_port: 9000
    # http_port: 8123
    # # TiFlash raft service and coprocessor service listening address.
    # flash_service_port: 3930
    # # TiFlash Proxy service port.
    # flash_proxy_port: 20170
    # # Prometheus pulls TiFlash Proxy metrics port.
    # flash_proxy_status_port: 20292
    # # Prometheus pulls the TiFlash metrics port.
    # metrics_port: 8234
monitoring_servers:
  # # The IP address of the Monitoring Server.
  - host: 172.27.197.96
grafana_servers:
  # # The IP address of the Grafana Server.
  - host: 172.27.197.96
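For completeness, a rough sketch of how a topology file like this is typically applied and verified with tiup; the cluster name and file name are placeholders, and SSH options are omitted:

# Deploy and start the cluster from the topology above (the post mentions v6.6.0),
# then check which instances landed on which host.
tiup cluster deploy <cluster-name> v6.6.0 topology.yaml
tiup cluster start <cluster-name>
tiup cluster display <cluster-name>

# To change server_configs on an existing cluster instead of redeploying:
tiup cluster edit-config <cluster-name>
tiup cluster reload <cluster-name>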
[Resource Configuration]
Two machines with 8 cores and 32 GB RAM each: one runs 1 TiDB and 1 PD, the other runs 3 TiKV instances and 1 TiFlash.
[Attachments: Screenshots/Logs/Monitoring]