Consultation on Disk IO Issues in New TiDB Environment

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 新tidb环境磁盘io问题咨询

| username: Jolyne

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed that caused the issue
[Encountered Issue: Issue Phenomenon and Impact]
The company is preparing to migrate TiDB from the existing cloud cluster to a self-built cloud. The current cloud cluster uses ordinary SAS disks with read/write IOPS of only about 5,000 :sweat_smile: In the new environment, 8K random write is approximately 160,000 IOPS, mixed read/write is approximately 220,000 IOPS, and 4K random read/write (7:3 mix) is approximately 175,000 IOPS. Is this acceptable?
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: xfworld | Original post link

Run a PoC stress test~

| username: caiyfc | Original post link

Refer to this:

| username: Jolyne | Original post link

Sure, should we use sysbench or TPC-C for testing?

| username: caiyfc | Original post link

You can refer to the following for testing with fio:
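As a rough illustration of what a 4K random mixed read/write test (like the 7:3 mix mentioned above) actually measures, here is a minimal Python sketch. It is not a substitute for fio: the file size, duration, and read ratio below are illustrative assumptions, and it goes through the page cache, so its numbers will be far higher than direct-I/O results.

```python
import os
import random
import time

# Illustrative parameters only: a small 16 MiB scratch file and a short
# run so the sketch finishes quickly; real fio tests use far larger sizes.
PATH = "./io_test_file"
FILE_SIZE = 16 * 1024 * 1024
BLOCK = 4096
READ_RATIO = 0.7   # the 7:3 read/write mix from the question
DURATION = 1.0     # seconds

# Prepare the scratch file.
with open(PATH, "wb") as f:
    f.write(os.urandom(FILE_SIZE))

fd = os.open(PATH, os.O_RDWR)
blocks = FILE_SIZE // BLOCK
payload = os.urandom(BLOCK)

ops = 0
deadline = time.monotonic() + DURATION
while time.monotonic() < deadline:
    offset = random.randrange(blocks) * BLOCK
    if random.random() < READ_RATIO:
        os.pread(fd, BLOCK, offset)     # 4K random read
    else:
        os.pwrite(fd, payload, offset)  # 4K random write
    ops += 1

os.close(fd)
os.remove(PATH)
print(f"~{ops / DURATION:,.0f} ops/s (page-cache numbers; "
      f"fio with direct I/O is the authoritative tool)")
```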

| username: Jolyne | Original post link

Okay, thank you very much.

| username: Jolyne | Original post link

Thank you, I will try them all.

| username: h5n1 | Original post link

Suggestions from the official expert:
IOPS figures split into read and write parts, and the high IOPS advertised for cloud disks are mostly achieved through caching, which boosts read IOPS. Disk performance also includes bandwidth and fdatasync rate. When TiKV writes data it must sync the disk to ensure the data has been flushed from the buffer to the hardware before returning to the business side; it does this through the fdatasync system call.

TiKV disk recommendations: write bandwidth above 2 GB/s, more than 20K fdatasync operations per second, and P99.99 latency below 3 ms in 4 KB high-concurrency direct-write tests. You can test with a recent version of fio or with the pg_test_fsync tool. With fio, add the -fdatasync=1 option; for example, high concurrency with 4K writes and an fdatasync after each write (note that fio's -thread is a boolean flag, so the number of concurrent jobs is set with -numjobs):
fio -direct=0 -fdatasync=1 -iodepth=4 -numjobs=4 -thread -rw=write -ioengine=libaio -bs=4k -filename=./fio_test -size=20G -runtime=60 -group_reporting -name=write_test
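The fio command above can be mimicked in a few lines of Python to show what -fdatasync=1 measures: buffered 4 KiB writes, each followed by a data sync, counted per second. This is only a single-threaded sketch (the file path and duration are arbitrary choices here), not a replacement for fio:

```python
import os
import time

# A minimal sketch of what fio's -fdatasync=1 mode measures: buffered
# 4 KiB writes, each followed by a data sync, counting syncs per second.
# PATH and DURATION are illustrative choices, not TiDB recommendations.
PATH = "./fdatasync_test_file"
BLOCK = os.urandom(4096)
DURATION = 1.0  # seconds

# os.fdatasync is Unix-only; fall back to fsync where it is unavailable.
datasync = getattr(os, "fdatasync", os.fsync)

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
syncs = 0
latencies = []
deadline = time.monotonic() + DURATION
while time.monotonic() < deadline:
    os.write(fd, BLOCK)
    t0 = time.monotonic()
    datasync(fd)  # force the data out of the OS buffer onto the device
    latencies.append(time.monotonic() - t0)
    syncs += 1

os.close(fd)
os.remove(PATH)
latencies.sort()
p99 = latencies[max(int(len(latencies) * 0.99) - 1, 0)]
print(f"{syncs / DURATION:,.0f} fdatasync/s, p99 latency {p99 * 1e3:.2f} ms")
```

Comparing the printed fdatasync/s figure against the 20K/s recommendation gives a quick sanity check before running the full fio test.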

fdatasync performance reference:
Reference 1: non-NVMe SSDs, roughly 5–8K fdatasync/s
Reference 2: early NVMe, roughly 20–50K fdatasync/s
Reference 3: current mature PCIe 3 NVMe, roughly 200–500K fdatasync/s