Can the performance of such SSDs meet the requirements of TiKV in a production environment?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 这样的ssd硬盘性能能否达到tikv的生产环境的要求

| username: TiDBer_ssvwtrcq

Hard disk test results below; the main test is random read/write.

Can this level of performance support a TiKV production environment?

root@localhost:/# sudo fio --filename=/dev/nvme1n1p1 --direct=1 --rw=randrw --bs=4k --numjobs=16 --iodepth=64 --runtime=300 --time_based --group_reporting --name=randrw-test
randrw-test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
...
fio-3.36
Starting 16 processes
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1 [printed once per job, 16 times in total]
Jobs: 16 (f=16): [m(16)][100.0%][r=69.0MiB/s,w=69.1MiB/s][r=17.7k,w=17.7k IOPS][eta 00m:00s]
randrw-test: (groupid=0, jobs=16): err= 0: pid=88434: Thu Jun  6 09:55:32 2024
  read: IOPS=13.4k, BW=52.2MiB/s (54.7MB/s)(15.3GiB/300003msec)
    clat (usec): min=15, max=15414, avg=410.30, stdev=818.46
     lat (usec): min=15, max=15415, avg=410.54, stdev=818.45
    clat percentiles (usec):
     |  1.00th=[   52],  5.00th=[   78], 10.00th=[   82], 20.00th=[   86],
     | 30.00th=[   91], 40.00th=[   98], 50.00th=[  106], 60.00th=[  119],
     | 70.00th=[  147], 80.00th=[  219], 90.00th=[ 2212], 95.00th=[ 2540],
     | 99.00th=[ 3359], 99.50th=[ 3523], 99.90th=[ 3916], 99.95th=[ 4359],
     | 99.99th=[10552]
   bw (  KiB/s): min= 5512, max=89968, per=100.00%, avg=53463.39, stdev=1401.22, samples=9584
   iops        : min= 1378, max=22492, avg=13364.32, stdev=350.37, samples=9584
  write: IOPS=13.4k, BW=52.2MiB/s (54.8MB/s)(15.3GiB/300003msec); 0 zone resets
    clat (usec): min=12, max=168037, avg=780.68, stdev=3732.68
     lat (usec): min=12, max=168037, avg=781.07, stdev=3732.69
    clat percentiles (usec):
     |  1.00th=[   20],  5.00th=[   23], 10.00th=[   25], 20.00th=[   29],
     | 30.00th=[   32], 40.00th=[   38], 50.00th=[   45], 60.00th=[   68],
     | 70.00th=[  122], 80.00th=[  186], 90.00th=[ 2540], 95.00th=[ 3032],
     | 99.00th=[ 5800], 99.50th=[32375], 99.90th=[48497], 99.95th=[55837],
     | 99.99th=[76022]
   bw (  KiB/s): min= 5384, max=91754, per=100.00%, avg=53507.87, stdev=1405.20, samples=9584
   iops        : min= 1346, max=22938, avg=13375.42, stdev=351.36, samples=9584
  lat (usec)   : 20=0.79%, 50=26.40%, 100=27.58%, 250=28.52%, 500=2.92%
  lat (usec)   : 750=0.40%, 1000=0.02%
  lat (msec)   : 2=0.08%, 4=12.41%, 10=0.50%, 20=0.02%, 50=0.32%
  lat (msec)   : 100=0.04%, 250=0.01%
  cpu          : usr=1.05%, sys=2.84%, ctx=8016130, majf=0, minf=73013
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=4008043,4011383,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=52.2MiB/s (54.7MB/s), 52.2MiB/s-52.2MiB/s (54.7MB/s-54.7MB/s), io=15.3GiB (16.4GB), run=300003-300003msec
  WRITE: bw=52.2MiB/s (54.8MB/s), 52.2MiB/s-52.2MiB/s (54.8MB/s-54.8MB/s), io=15.3GiB (16.4GB), run=300003-300003msec

Disk stats (read/write):
  nvme1n1: ios=4004789/4008004, sectors=32040408/32064032, merge=0/0, ticks=1552885/3034191, in_queue=4587077, util=100.00%

Test results from another tool:

root@localhost:/data# sudo sysbench fileio --file-total-size=1G --file-test-mode=rndrw --max-time=300 --max-requests=0 --file-extra-flags=direct run
WARNING: --max-time is deprecated, use --time instead
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Extra file open flags: directio
128 files, 8MiB each
1GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      1032.92
    writes/s:                     688.61
    fsyncs/s:                     2203.75

Throughput:
    read, MiB/s:                  16.14
    written, MiB/s:               10.76

General statistics:
    total time:                          300.0530s
    total number of events:              1177676

Latency (ms):
         min:                                    0.02
         avg:                                    0.25
         max:                                   11.82
         95th percentile:                        0.46
         sum:                               298953.00

Threads fairness:
    events (avg/stddev):           1177676.0000/0.00
    execution time (avg/stddev):   298.9530/0.00

| username: zhanggame1 | Original post link

Not very good. Take a look at my test from yesterday:

root@tidb3:/dev# fio --bs=4k --ioengine=libaio --iodepth=64 --direct=1 --rw=write --numjobs=32 --time_based --runtime=30 --randrepeat=0 --group_reporting --name=fio-read --size=10G --filename=/tidb-data/fiotest
fio-read: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.28
Starting 32 processes
Jobs: 32 (f=32): [W(32)][100.0%][w=3614MiB/s][w=925k IOPS][eta 00m:00s]
fio-read: (groupid=0, jobs=32): err= 0: pid=5267: Thu Jun  6 10:36:53 2024
  write: IOPS=858k, BW=3350MiB/s (3513MB/s)(98.2GiB/30003msec); 0 zone resets
    slat (nsec): min=1323, max=3566.9k, avg=5601.99, stdev=4799.29
    clat (usec): min=36, max=12578, avg=2381.51, stdev=407.84
     lat (usec): min=48, max=12583, avg=2387.25, stdev=407.69
    clat percentiles (usec):
     |  1.00th=[ 1680],  5.00th=[ 2040], 10.00th=[ 2089], 20.00th=[ 2147],
     | 30.00th=[ 2180], 40.00th=[ 2212], 50.00th=[ 2245], 60.00th=[ 2343],
     | 70.00th=[ 2474], 80.00th=[ 2638], 90.00th=[ 2868], 95.00th=[ 3097],
     | 99.00th=[ 3523], 99.50th=[ 3785], 99.90th=[ 5342], 99.95th=[ 6718],
     | 99.99th=[10159]
   bw (  MiB/s): min= 2491, max= 3724, per=99.97%, avg=3349.17, stdev= 9.71, samples=1888
   iops        : min=637779, max=953512, avg=857386.29, stdev=2486.50, samples=1888
  lat (usec)   : 50=0.01%, 100=0.01%, 250=0.02%, 500=0.17%, 750=0.19%
  lat (usec)   : 1000=0.17%
  lat (msec)   : 2=2.54%, 4=96.59%, 10=0.32%, 20=0.01%
  cpu          : usr=3.71%, sys=17.36%, ctx=8969805, majf=1, minf=1798
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,25731196,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3350MiB/s (3513MB/s), 3350MiB/s-3350MiB/s (3513MB/s-3513MB/s), io=98.2GiB (105GB), run=30003-30003msec

| username: 友利奈绪 | Original post link

It also depends on the workload; under high concurrency, the performance might fall short.

| username: TiDBer_QYr0vohO | Original post link

fio keeps warning that a synchronous I/O engine is in use, so iodepth=64 does not take effect and the queue depth is capped at 1. Try an asynchronous I/O engine instead.
Note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1.
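For example, the same random read/write test can be rerun with the asynchronous libaio engine so that the requested queue depth of 64 actually takes effect. This is a sketch reusing the device path and parameters from the original command; adjust them to your environment:

sudo fio --filename=/dev/nvme1n1p1 --direct=1 --rw=randrw --bs=4k --ioengine=libaio --numjobs=16 --iodepth=64 --runtime=300 --time_based --group_reporting --name=randrw-libaio-test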

| username: Kongdom | Original post link

Try using the official test standards.
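For reference, the pre-deployment disk checks in the TiDB documentation use fio with a 32 KiB block size and fdatasync=1. The commands below are reconstructed from memory, so verify the exact parameters against the official docs:

fio -ioengine=psync -bs=32k -fdatasync=1 -thread -rw=randread -size=10G -filename=fio_randread_test.txt -name='fio randread test' -iodepth=4 -runtime=60 -numjobs=4 -group_reporting

fio -ioengine=psync -bs=32k -fdatasync=1 -thread -rw=randrw -percentage_random=100,0 -size=10G -filename=fio_randread_write_test.txt -name='fio mixed randread and sequential write test' -iodepth=4 -runtime=60 -numjobs=4 -group_reporting

The fdatasync=1 flag matters because TiKV syncs its raft log to disk on write, so latency with syncs predicts production behavior better than raw buffered IOPS.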

| username: ziptoam | Original post link

That is indeed the best approach: testing with the official method gives results that match the actual workload most closely.

| username: Hacker_zuGnSsfP | Original post link

For a production environment, it is recommended to use solid-state drives (SSDs) and to set them up in a RAID array, which can improve read and write speeds.
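As a minimal sketch of that suggestion with mdadm, assuming four NVMe drives (the device names are placeholders for whatever your system exposes):

sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
sudo mkfs.ext4 /dev/md0

RAID 10 stripes for throughput while mirroring for redundancy; RAID 5/6 would add parity-write overhead that hurts write latency.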

| username: xiaohaozifeifeifei | Original post link

This disk will definitely work.

| username: 濱崎悟空 | Original post link

You can test it.

| username: YuchongXU | Original post link

It can support it.