Configuration of storage.engine parameter set to partitioned-raft-kv in version 7.1

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于7.1版本中storage.engine参数配置为partitioned-raft-kv (On configuring the storage.engine parameter to partitioned-raft-kv in version 7.1)

| username: wenyi

In version 7.1.0, with a data volume of around 10 TB, a region size of 192 MB, and hundreds of regions per node, if storage.engine = partitioned-raft-kv is configured, how many RocksDB instances are there in total across all TiKV nodes? Too many RocksDB instances can actually degrade performance.
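For context, the setup the poster describes would be expressed in the TiKV configuration file roughly as follows. This is a hedged sketch: partitioned-raft-kv was an experimental engine in v7.1, and the exact parameter names and accepted values may differ between releases, so verify against the documentation for your version before applying.

```toml
# Illustrative tikv.toml fragment (not a recommended configuration).

[storage]
# Experimental in v7.1: one RocksDB instance per region group
# instead of a single shared instance per TiKV node.
engine = "partitioned-raft-kv"

[coprocessor]
# The 192 MB region size the poster mentions; with partitioned-raft-kv
# the default region size is reportedly much larger (see the reply below).
region-split-size = "192MiB"
```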

| username: 裤衩儿飞上天 | Original post link

With partitioned-raft-kv, the default region size seems to be 10 GB.

| username: wenyi | Original post link

The region size can be controlled by parameters; I configured it to 192 MB. The documentation says that when using TiFlash, regions larger than 256 MB are not recommended.

| username: ljluestc | Original post link

When addressing performance issues in a TiDB cluster, especially with large datasets and multiple TiKV nodes, there are several factors to consider and concrete measures you can take to improve performance:

Hardware Configuration: Ensure that your hardware resources (such as CPU, memory, and storage) are adequately configured to handle the workload. Consider factors like disk I/O, network bandwidth, and latency. Monitor resource utilization to identify any bottlenecks.

TiDB Configuration: Review and optimize TiDB configuration parameters based on your workload characteristics. Parameters like tidb_index_lookup_concurrency, tidb_index_serial_scan_concurrency, and tidb_hash_join_concurrency can significantly impact query performance. Adjust them according to your specific workload requirements.
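As a sketch of how the variables named above are inspected and adjusted, the following uses standard TiDB SQL. The values shown are placeholders only; appropriate settings depend entirely on your workload and hardware.

```sql
-- Check the current value before changing anything.
SHOW VARIABLES LIKE 'tidb_hash_join_concurrency';

-- Try a new value for the current session first.
SET SESSION tidb_index_lookup_concurrency = 8;
SET SESSION tidb_hash_join_concurrency = 8;

-- Once validated under real load, apply it cluster-wide.
SET GLOBAL tidb_index_lookup_concurrency = 8;
```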

TiKV Configuration: Adjust TiKV configuration parameters to optimize performance. Parameters like raftstore.apply-pool-size and rocksdb.max-sub-compactions can have a significant impact on TiKV performance. Properly tuning these parameters can help improve performance.
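In the TiKV configuration file, those parameters live in the [raftstore] and [rocksdb] sections. The values below are illustrative assumptions, not recommendations; defaults vary by version, so benchmark before and after any change.

```toml
# Hypothetical tikv.toml fragment for the parameters mentioned above.

[raftstore]
# Number of threads applying committed Raft log entries to the state machine.
apply-pool-size = 4

[rocksdb]
# Max concurrent sub-compaction tasks per compaction job.
max-sub-compactions = 3
```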

Placement Rules: Consider checking and adjusting placement rules in the cluster to ensure optimal distribution and load balancing of regions across TiKV nodes. Imbalanced region distribution can lead to performance issues. Use the pd-ctl tool to check region distribution and balance them if necessary.
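A minimal sketch of the pd-ctl checks mentioned above, assuming pd-ctl is on the PATH and PD is reachable at 127.0.0.1:2379 (substitute your own endpoint):

```shell
# Per-store region and leader counts; large imbalances suggest hot stores.
pd-ctl -u http://127.0.0.1:2379 store

# Show the active schedulers (balance-region, balance-leader, etc.).
pd-ctl -u http://127.0.0.1:2379 scheduler show
```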

Monitoring and Metrics: Set up monitoring and metrics collection for the TiDB cluster using tools like Prometheus and Grafana. Monitor key metrics such as CPU usage, memory usage, disk I/O, network traffic, and TiKV thread pool utilization. This will help you identify performance bottlenecks and address issues.

Slow Query Analysis: Identify and optimize slow-running queries. Analyze query plans, identify any missing indexes, and optimize query execution paths. Use TiDB’s built-in EXPLAIN statement to understand how queries are executed and determine areas for improvement.
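As a sketch of that workflow, the query and table below are hypothetical; the point is reading the operator tree for full scans:

```sql
-- EXPLAIN ANALYZE runs the query and reports per-operator actual times.
EXPLAIN ANALYZE
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42
  AND o.created_at >= '2023-01-01';

-- A TableFullScan operator in the output often indicates a missing index,
-- e.g. a composite index on (customer_id, created_at) for this predicate.
```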

Data Partitioning: Consider partitioning large tables based on data access patterns and query requirements. Partitioning can improve query performance by reducing the amount of data scanned for each query.
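For example, a range-partitioned table so that time-bounded queries only scan the relevant partitions. The table is hypothetical; note that in TiDB (as in MySQL) the partitioning columns must be part of every unique key, including the primary key:

```sql
CREATE TABLE events (
    id         BIGINT NOT NULL,
    created_at DATETIME NOT NULL,
    payload    VARCHAR(255),
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```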

TiDB Upgrade: Consider upgrading to the latest version of TiDB. Newer versions often include performance improvements and bug fixes that can enhance overall cluster performance.

Data Model Optimization: Review your data model and schema design. Ensure that appropriate indexes are in place to support query performance. Normalize or denormalize your data structures based on access patterns and query requirements.
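For instance, adding a secondary index that matches a frequent filter (table and column names here are hypothetical):

```sql
-- A composite index covering the common WHERE clause
-- (customer_id equality, then created_at range).
ALTER TABLE orders ADD INDEX idx_customer_created (customer_id, created_at);
```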

| username: redgame | Original post link

Is it that big? Our company’s is very small.