Testing the benchmark scripts – 1

I’m getting very slow throughput on these drives during my first set of tests. I have no idea why…

The drive in question is:

$ grep ada2 /var/run/dmesg.boot 
ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
ada2: <ST2000DM001-1CH164 CC24> ATA-8 SATA 3.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad12

Here is the unexpected slow result:

[dan@heckler:~/bin/benchmarks] $ sudo ./both.sh ada2
ada2 is the drive you want to work on.
# gpart create -s GPT ada2
ada2 created
# gpart add -b 1000 -t freebsd ada2
ada2s1 added
# gpart create -s BSD ada2s1
ada2s1 created
# gpart add -t freebsd-ufs ada2s1
ada2s1a added
# gpart show ada2 ada2s1
=>        34  3907029101  ada2  GPT  (1.8T)
          34         966        - free -  (483k)
        1000  3907028128     1  freebsd  (1.8T)
  3907029128           7        - free -  (3.5k)

=>         0  3907028128  ada2s1  BSD  (1.8T)
           0  3907028120       1  freebsd-ufs  (1.8T)
  3907028120           8          - free -  (4.0k)

# newfs -U /dev/ada2s1a
/dev/ada2s1a: 1907728.6MB (3907028120 sectors) block size 32768, fragment size 4096
	using 3048 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
	with soft updates
super-block backups (for fsck_ffs -b #) at:
 192, 1282432, 2564672, 3846912, 5129152, 6411392, 7693632, 8975872, 10258112, 11540352, 12822592, 14104832, 15387072,
 16669312, 17951552, 19233792, 20516032, 21798272, 23080512, 24362752, 25644992, 26927232, 28209472, 29491712, 30773952,
 32056192, 33338432, 34620672, 35902912, 37185152, 38467392, 39749632, 41031872, 42314112, 43596352, 44878592, 46160832,

[cut out lots of output]

 3881340672, 3882622912, 3883905152, 3885187392, 3886469632, 3887751872, 3889034112, 3890316352, 3891598592, 3892880832,
 3894163072, 3895445312, 3896727552, 3898009792, 3899292032, 3900574272, 3901856512, 3903138752, 3904420992, 3905703232,
 3906985472
# /usr/home/dan/bin/benchmarks/basic-benchmarks.sh 
# cat /home/dan/bin/benchmarks/fio-tests/fio.test.7
[global]
directory=/mnt/dan
size=640000k
bs=32k
direct=0
end_fsync=1
sync=1
thread=1

[testing]
rw=write
# fio /home/dan/bin/benchmarks/fio-tests/fio.test.7
testing: (g=0): rw=write, bs=32K-32K/32K-32K/32K-32K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 thread
testing: Laying out IO file(s) (1 file(s) / 625MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/27792K/0K /s] [0 /868 /0  iops] [eta 00m:00s]
testing: (groupid=0, jobs=1): err= 0: pid=100864: Mon Feb 11 04:39:07 2013
  write: io=640000KB, bw=27445KB/s, iops=857 , runt= 23319msec
    clat (usec): min=687 , max=75908 , avg=1162.28, stdev=590.02
     lat (usec): min=688 , max=75909 , avg=1163.31, stdev=590.02
    clat percentiles (usec):
     |  1.00th=[ 1032],  5.00th=[ 1032], 10.00th=[ 1048], 20.00th=[ 1048],
     | 30.00th=[ 1064], 40.00th=[ 1080], 50.00th=[ 1096], 60.00th=[ 1112],
     | 70.00th=[ 1144], 80.00th=[ 1176], 90.00th=[ 1368], 95.00th=[ 1576],
     | 99.00th=[ 1704], 99.50th=[ 1752], 99.90th=[ 4896], 99.95th=[ 4960],
     | 99.99th=[14912]
    bw (KB/s)  : min=23424, max=28167, per=100.00%, avg=27479.48, stdev=743.78
    lat (usec) : 750=0.04%, 1000=0.02%
    lat (msec) : 2=99.60%, 4=0.23%, 10=0.10%, 20=0.01%, 100=0.01%
  cpu          : usr=0.19%, sys=8.78%, ctx=60000, majf=0, minf=18446744073709539480
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=20000/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=640000KB, aggrb=27445KB/s, minb=27445KB/s, maxb=27445KB/s, mint=23319msec, maxt=23319msec
# cat /home/dan/bin/benchmarks/fio-tests/fio.test.8
[global]
directory=/mnt/dan
size=2560000k
bs=2096k
direct=0
end_fsync=1
sync=1 
thread=1

[testing]
rw=write
# fio /home/dan/bin/benchmarks/fio-tests/fio.test.8
testing: (g=0): rw=write, bs=2096K-2096K/2096K-2096K/2096K-2096K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 thread
testing: Laying out IO file(s) (1 file(s) / 2500MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/39665K/0K /s] [0 /18 /0  iops] [eta 00m:00s]
testing: (groupid=0, jobs=1): err= 0: pid=100884: Mon Feb 11 04:40:11 2013
  write: io=2499.3MB, bw=40254KB/s, iops=19 , runt= 63576msec
    clat (msec): min=47 , max=188 , avg=51.95, stdev= 5.72
     lat (msec): min=47 , max=188 , avg=52.06, stdev= 5.72
    clat percentiles (msec):
     |  1.00th=[   49],  5.00th=[   50], 10.00th=[   50], 20.00th=[   51],
     | 30.00th=[   51], 40.00th=[   51], 50.00th=[   52], 60.00th=[   52],
     | 70.00th=[   52], 80.00th=[   53], 90.00th=[   54], 95.00th=[   55],
     | 99.00th=[   64], 99.50th=[   85], 99.90th=[  130], 99.95th=[  190],
     | 99.99th=[  190]
    bw (KB/s)  : min=29168, max=42149, per=100.00%, avg=40302.33, stdev=1569.48
    lat (msec) : 50=13.76%, 100=85.91%, 250=0.33%
  cpu          : usr=0.22%, sys=11.48%, ctx=162224, majf=0, minf=18446744073709539486
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1221/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=2499.3MB, aggrb=40254KB/s, minb=40254KB/s, maxb=40254KB/s, mint=63576msec, maxt=63576msec
# gpart delete -i 1 ada2s1
ada2s1a deleted
# gpart destroy ada2s1
ada2s1 destroyed
# gpart delete -i 1 ada2
ada2s1 deleted
# gpart destroy ada2
ada2 destroyed
ada2 is the drive you want to work on.
# gpart create -s GPT ada2
ada2 created
# gpart add -a 1m -t freebsd-zfs -l Bay1.1 ada2
ada2p1 added
# gnop create -S 4k gpt/Bay1.1
# zpool create -f  -m /mnt example /dev/gpt/Bay1.1.nop
# zpool export example
# gnop destroy gpt/Bay1.1.nop
# zpool import -d /dev/gpt example
# gpart show ada2
=>        34  3907029101  ada2  GPT  (1.8T)
          34        2014        - free -  (1M)
        2048  3907026944     1  freebsd-zfs  (1.8T)
  3907028992         143        - free -  (71k)

# zdb | grep ashift
            ashift: 12
# cat /home/dan/bin/benchmarks/fio-tests/fio.test.7
[global]
directory=/mnt/dan
size=640000k
bs=32k
direct=0
end_fsync=1
sync=1
thread=1

[testing]
rw=write
# fio /home/dan/bin/benchmarks/fio-tests/fio.test.7
testing: (g=0): rw=write, bs=32K-32K/32K-32K/32K-32K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 thread
testing: Laying out IO file(s) (1 file(s) / 625MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/3378K/0K /s] [0 /105 /0  iops] [eta 00m:00s]
testing: (groupid=0, jobs=1): err= 0: pid=101212: Mon Feb 11 04:43:20 2013
  write: io=640000KB, bw=3435.5KB/s, iops=107 , runt=186294msec
    clat (msec): min=8 , max=178 , avg= 9.31, stdev= 5.22
     lat (msec): min=8 , max=178 , avg= 9.31, stdev= 5.22
    clat percentiles (usec):
     |  1.00th=[ 8512],  5.00th=[ 8512], 10.00th=[ 8512], 20.00th=[ 8512],
     | 30.00th=[ 8512], 40.00th=[ 8512], 50.00th=[ 8512], 60.00th=[ 8512],
     | 70.00th=[ 8512], 80.00th=[ 8512], 90.00th=[ 8512], 95.00th=[ 8512],
     | 99.00th=[37632], 99.50th=[44288], 99.90th=[63744], 99.95th=[74240],
     | 99.99th=[91648]
    bw (KB/s)  : min=  707, max= 3760, per=100.00%, avg=3444.36, stdev=707.15
    lat (msec) : 10=97.22%, 20=0.15%, 50=2.31%, 100=0.32%, 250=0.01%
  cpu          : usr=0.10%, sys=1.68%, ctx=39984, majf=0, minf=18446744073709539480
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=20000/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=640000KB, aggrb=3435KB/s, minb=3435KB/s, maxb=3435KB/s, mint=186294msec, maxt=186294msec
# cat /home/dan/bin/benchmarks/fio-tests/fio.test.8
[global]
directory=/mnt/dan
size=2560000k
bs=2096k
direct=0
end_fsync=1
sync=1 
thread=1

[testing]
rw=write
# fio /home/dan/bin/benchmarks/fio-tests/fio.test.8
testing: (g=0): rw=write, bs=2096K-2096K/2096K-2096K/2096K-2096K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 thread
testing: Laying out IO file(s) (1 file(s) / 2500MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/41794K/0K /s] [0 /19 /0  iops] [eta 00m:00s]
testing: (groupid=0, jobs=1): err= 0: pid=102002: Mon Feb 11 04:44:08 2013
  write: io=2499.3MB, bw=54040KB/s, iops=25 , runt= 47358msec
    clat (msec): min=27 , max=158 , avg=38.67, stdev=13.49
     lat (msec): min=27 , max=159 , avg=38.78, stdev=13.49
    clat percentiles (msec):
     |  1.00th=[   28],  5.00th=[   28], 10.00th=[   29], 20.00th=[   30],
     | 30.00th=[   31], 40.00th=[   37], 50.00th=[   38], 60.00th=[   38],
     | 70.00th=[   39], 80.00th=[   45], 90.00th=[   47], 95.00th=[   56],
     | 99.00th=[  101], 99.50th=[  118], 99.90th=[  157], 99.95th=[  159],
     | 99.99th=[  159]
    bw (KB/s)  : min=24952, max=71862, per=100.00%, avg=54160.25, stdev=9294.98
    lat (msec) : 50=94.02%, 100=4.67%, 250=1.31%
  cpu          : usr=0.19%, sys=4.32%, ctx=2425, majf=0, minf=18446744073709539486
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1221/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=2499.3MB, aggrb=54039KB/s, minb=54039KB/s, maxb=54039KB/s, mint=47358msec, maxt=47358msec
#zpool destroy example
# gpart delete -i 1 ada2
ada2p1 deleted
# gpart destroy ada2
ada2 destroyed
[dan@heckler:~/bin/benchmarks] $ 

From what I read, that’s a mere 53 MB/s.

I accidentally ran this with a wildcard, which started four threads.

$ fio ~/bin/benchmarks/fio-tests/fio.test.*
testing: (g=0): rw=write, bs=32K-32K/32K-32K/32K-32K, ioengine=sync, iodepth=1
testing: (g=1): rw=write, bs=32K-32K/32K-32K/32K-32K, ioengine=sync, iodepth=1
testing: (g=2): rw=write, bs=2096K-2096K/2096K-2096K/2096K-2096K, ioengine=sync, iodepth=1
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
testing: (g=3): rw=write, bs=2096K-2096K/2096K-2096K/2096K-2096K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 3 threads and 1 process
testing: Laying out IO file(s) (1 file(s) / 625MB)
testing: Laying out IO file(s) (1 file(s) / 625MB)
testing: Laying out IO file(s) (1 file(s) / 2500MB)
testing: Laying out IO file(s) (1 file(s) / 2500MB)
Jobs: 1 (f=1): [___W] [82.8% done] [0K/27139K/0K /s] [0 /12 /0  iops] [eta 00m:47s]  
/: write failed, filesystem is full
fio: pid=100819, err=28/file:engines/sync.c:67, func=xfer, error=No space left on device

testing: (groupid=0, jobs=1): err= 0: pid=100694: Mon Feb 11 04:34:04 2013
  write: io=640000KB, bw=17595KB/s, iops=549 , runt= 36374msec
    clat (usec): min=830 , max=110018 , avg=1813.77, stdev=2175.01
     lat (usec): min=831 , max=110019 , avg=1814.87, stdev=2175.02
    clat percentiles (usec):
     |  1.00th=[ 1320],  5.00th=[ 1352], 10.00th=[ 1352], 20.00th=[ 1368],
     | 30.00th=[ 1384], 40.00th=[ 1400], 50.00th=[ 1448], 60.00th=[ 1560],
     | 70.00th=[ 1800], 80.00th=[ 2160], 90.00th=[ 2544], 95.00th=[ 2736],
     | 99.00th=[ 3632], 99.50th=[ 4512], 99.90th=[25216], 99.95th=[48384],
     | 99.99th=[97792]
    bw (KB/s)  : min= 9344, max=19968, per=100.00%, avg=17646.65, stdev=1767.85
    lat (usec) : 1000=0.05%
    lat (msec) : 2=72.25%, 4=27.07%, 10=0.43%, 20=0.08%, 50=0.07%
    lat (msec) : 100=0.05%, 250=0.01%
  cpu          : usr=0.04%, sys=5.61%, ctx=59983, majf=0, minf=18446744073709539476
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=20000/d=0, short=r=0/w=0/d=0
testing: (groupid=1, jobs=1): err= 0: pid=100810: Mon Feb 11 04:34:04 2013
  write: io=640000KB, bw=16925KB/s, iops=528 , runt= 37814msec
    clat (usec): min=863 , max=204449 , avg=1885.49, stdev=3639.06
     lat (usec): min=864 , max=204450 , avg=1886.56, stdev=3639.08
    clat percentiles (usec):
     |  1.00th=[ 1320],  5.00th=[ 1352], 10.00th=[ 1352], 20.00th=[ 1368],
     | 30.00th=[ 1384], 40.00th=[ 1400], 50.00th=[ 1448], 60.00th=[ 1560],
     | 70.00th=[ 1864], 80.00th=[ 2160], 90.00th=[ 2480], 95.00th=[ 2672],
     | 99.00th=[ 3664], 99.50th=[ 4832], 99.90th=[86528], 99.95th=[94720],
     | 99.99th=[117248]
    bw (KB/s)  : min= 8651, max=19776, per=100.00%, avg=16989.08, stdev=2528.53
    lat (usec) : 1000=0.05%
    lat (msec) : 2=71.48%, 4=27.71%, 10=0.47%, 20=0.07%, 50=0.10%
    lat (msec) : 100=0.11%, 250=0.02%
  cpu          : usr=0.08%, sys=0.00%, ctx=18446744073709551049, majf=0, minf=18446744073709539435
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=20000/d=0, short=r=0/w=0/d=0
testing: (groupid=2, jobs=1): err= 0: pid=100811: Mon Feb 11 04:34:04 2013
  write: io=2499.3MB, bw=25800KB/s, iops=12 , runt= 99196msec
    clat (msec): min=68 , max=305 , avg=81.12, stdev=14.51
     lat (msec): min=68 , max=305 , avg=81.23, stdev=14.51
    clat percentiles (msec):
     |  1.00th=[   72],  5.00th=[   74], 10.00th=[   75], 20.00th=[   76],
     | 30.00th=[   77], 40.00th=[   77], 50.00th=[   78], 60.00th=[   79],
     | 70.00th=[   80], 80.00th=[   82], 90.00th=[   94], 95.00th=[  100],
     | 99.00th=[  135], 99.50th=[  172], 99.90th=[  281], 99.95th=[  306],
     | 99.99th=[  306]
    bw (KB/s)  : min=15025, max=28881, per=100.00%, avg=25840.62, stdev=2335.92
    lat (msec) : 100=95.09%, 250=4.75%, 500=0.16%
  cpu          : usr=0.02%, sys=3.43%, ctx=40647, majf=0, minf=18446744073709539399
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1221/d=0, short=r=0/w=0/d=0
testing: (groupid=3, jobs=1): err=28 (file:engines/sync.c:67, func=xfer, error=No space left on device): pid=100819: Mon Feb 11 04:34:04 2013
  write: io=1334.6MB, bw=26003KB/s, iops=12 , runt= 52556msec
    clat (msec): min=68 , max=263 , avg=80.37, stdev=14.47
     lat (msec): min=68 , max=263 , avg=80.49, stdev=14.47
    clat percentiles (msec):
     |  1.00th=[   72],  5.00th=[   74], 10.00th=[   75], 20.00th=[   75],
     | 30.00th=[   76], 40.00th=[   77], 50.00th=[   77], 60.00th=[   78],
     | 70.00th=[   79], 80.00th=[   81], 90.00th=[   92], 95.00th=[   99],
     | 99.00th=[  157], 99.50th=[  180], 99.90th=[  265], 99.95th=[  265],
     | 99.99th=[  265]
    bw (KB/s)  : min=15908, max=28434, per=100.00%, avg=26087.23, stdev=2439.66
    lat (msec) : 100=94.95%, 250=4.75%, 500=0.15%
  cpu          : usr=0.00%, sys=0.00%, ctx=18446744073709353641, majf=0, minf=18446744073709538855
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.2%, 4=99.8%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=653/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=640000KB, aggrb=17594KB/s, minb=17594KB/s, maxb=17594KB/s, mint=36374msec, maxt=36374msec

Run status group 1 (all jobs):
  WRITE: io=640000KB, aggrb=16924KB/s, minb=16924KB/s, maxb=16924KB/s, mint=37814msec, maxt=37814msec

Run status group 2 (all jobs):
  WRITE: io=2499.3MB, aggrb=25799KB/s, minb=25799KB/s, maxb=25799KB/s, mint=99196msec, maxt=99196msec

Run status group 3 (all jobs):
  WRITE: io=1334.6MB, aggrb=26002KB/s, minb=26002KB/s, maxb=26002KB/s, mint=52556msec, maxt=52556msec

So, if I read this correctly, we got 84MB/s. But still, I think that’s very dead slow.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

2 thoughts on “Testing the benchmark scripts – 1”

  1. Hi,

    I came here looking for some fio oddities and read with interest what your results were.

    In case you revisit this:
    a) you have to set direct io or you’re not really testing the disk, but the disk plus cache.
    the 800ish iops are a lot more than the disk could handle on its own.

    b) 84mb/s is OK, you might get more by checking that NCQ is enabled and in use for the disk. Not much most probably.

    lz compression for zfs can give an edge here:
    – not for fio since fio uses random test data that can’t be compressed.
    but for realworld use you’ll probably get better performance than normal sata disks could ever handle.

    c) your raid controller has no cache/bbu so it can’t directly help boost things (which isn’t recommended with zfs, but can be faster in real world)

    d) recent patches have increased the queue depth supported by the lsi raid driver (iirc to 256 right now). for ssds we still need more (fast controllers could take a 1000+ queue depth).
    so far FreeBSD still lags on that end.

    Gl with the testing!

Leave a Comment

Scroll to Top