In this post, we’re going to test whether aligning your disk partitions makes a measurable difference. For starters, we’re going to create a deliberately unaligned partition and run a test, then align the partition and rerun the test.
But first, in case you have no idea what I’m talking about, this randomly discovered post may be a useful primer.
Create the scheme on the disk
This command creates a GPT scheme on the drive.
$ sudo gpart create -s GPT ada1
ada1 created
$ gpart show ada1
=>       34  488397101  ada1  GPT  (232G)
         34  488397101        - free -  (232G)
$
Create a FreeBSD partition
This command creates a FreeBSD partition, and I am purposely not aligning it.
$ sudo gpart add -t freebsd -a 574 ada1
ada1s1 added, but partition is not aligned on 4096 bytes
$ gpart show ada1
=>       34  488397101  ada1  GPT  (232G)
         34        540        - free -  (270k)
        574  488396510     1  freebsd  (232G)
  488397084         51        - free -  (25k)
$
As you can see, this partition starts 574 blocks in on the disk, or 574 * 512 = 293888 bytes in. Clearly, not 4K block aligned, as stated in the output of the command, and as shown in the math: 293888 / 4096 = 71.75.
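The same arithmetic can be scripted for any starting sector. This is a minimal sketch, assuming 512-byte logical sectors as in the gpart output above; `is_4k_aligned` is a hypothetical helper name, not a gpart feature. Sector 40 is the start that gpart picks later in this post when given -a 4k.

```shell
#!/bin/sh
# Report whether a partition's starting sector falls on a 4096-byte boundary.
# Assumes 512-byte logical sectors unless a second argument says otherwise.
is_4k_aligned() {
    start_sector=$1
    sector_size=${2:-512}
    offset=$((start_sector * sector_size))
    remainder=$((offset % 4096))
    if [ "$remainder" -eq 0 ]; then
        echo "sector $start_sector: byte offset $offset, aligned"
    else
        echo "sector $start_sector: byte offset $offset, NOT aligned (remainder $remainder bytes)"
    fi
}

is_4k_aligned 574   # the unaligned slice created above
is_4k_aligned 40    # the start used for the aligned slice later on
```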
Create a BSD scheme
Here, we create a BSD scheme inside the slice. We will then create a partition within it and newfs that partition.
$ sudo gpart create -s BSD ada1s1
ada1s1 created
$ gpart show ada1s1
=>        0  488396510  ada1s1  BSD  (232G)
          0  488396510          - free -  (232G)
Create a partition
Let’s create a 90GB partition. That’s big enough for some writes, but smaller than using the whole drive, because we don’t need it all.
$ sudo gpart add -t freebsd-ufs -s 90g ada1s1
ada1s1a added
$ gpart show ada1s1
=>        0  488396510  ada1s1  BSD  (232G)
          0          2          - free -  (1.0k)
          2  188743680       1  freebsd-ufs  (90G)
  188743682  299652828          - free -  (142G)
$
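Going by the two gpart listings above, the offsets stack: ada1s1a starts 2 sectors into a slice that itself starts at sector 574. The sketch below adds nested start sectors and checks the absolute byte position against a 4096-byte boundary. It assumes 512-byte sectors, and `abs_start` is a hypothetical helper name.

```shell
#!/bin/sh
# Sum nested start offsets (in 512-byte sectors) and report the absolute
# byte position and its remainder modulo 4096.
abs_start() {
    total=0
    for s in "$@"; do
        total=$((total + s))
    done
    bytes=$((total * 512))
    echo "sector $total = byte $bytes, remainder $((bytes % 4096))"
}

abs_start 574 2   # slice start on disk, then partition start within the slice
```

By this arithmetic, ada1s1a itself appears to start on a 4K boundary (sector 576) even though the slice does not, so the filesystem under test may be less misaligned than intended. That is worth keeping in mind when reading the results that follow.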
newfs
Let’s create a filesystem on there, so we can write to it.
$ sudo newfs -U /dev/ada1s1a
/dev/ada1s1a: 92160.0MB (188743680 sectors) block size 32768, fragment size 4096
        using 148 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
 192, 1282432, 2564672, 3846912, 5129152, 6411392, 7693632, 8975872,
 10258112, 11540352, 12822592, 14104832, 15387072, 16669312, 17951552, 19233792,
 20516032, 21798272, 23080512, 24362752, 25644992, 26927232, 28209472, 29491712,
 30773952, 32056192, 33338432, 34620672, 35902912, 37185152, 38467392, 39749632,
 41031872, 42314112, 43596352, 44878592, 46160832, 47443072, 48725312, 50007552,
 51289792, 52572032, 53854272, 55136512, 56418752, 57700992, 58983232, 60265472,
 61547712, 62829952, 64112192, 65394432, 66676672, 67958912, 69241152, 70523392,
 71805632, 73087872, 74370112, 75652352, 76934592, 78216832, 79499072, 80781312,
 82063552, 83345792, 84628032, 85910272, 87192512, 88474752, 89756992, 91039232,
 92321472, 93603712, 94885952, 96168192, 97450432, 98732672, 100014912, 101297152,
 102579392, 103861632, 105143872, 106426112, 107708352, 108990592, 110272832, 111555072,
 112837312, 114119552, 115401792, 116684032, 117966272, 119248512, 120530752, 121812992,
 123095232, 124377472, 125659712, 126941952, 128224192, 129506432, 130788672, 132070912,
 133353152, 134635392, 135917632, 137199872, 138482112, 139764352, 141046592, 142328832,
 143611072, 144893312, 146175552, 147457792, 148740032, 150022272, 151304512, 152586752,
 153868992, 155151232, 156433472, 157715712, 158997952, 160280192, 161562432, 162844672,
 164126912, 165409152, 166691392, 167973632, 169255872, 170538112, 171820352, 173102592,
 174384832, 175667072, 176949312, 178231552, 179513792, 180796032, 182078272, 183360512,
 184642752, 185924992, 187207232, 188489472
Simple test
Let’s write a 60GB file into there.
But first, mount it:
$ sudo mount /dev/ada1s1a /mnt
$ ls -l /mnt
total 0
$ cd /mnt
$ sudo mkdir test1
$ sudo chown dan:dan test1
$ cd test1
Now, we can write:
$ dd if=/dev/zero of=FirstTest bs=32k count=15728640

/mnt: write failed, filesystem is full
dd: FirstTest: No space left on device
2626900+0 records in
2626899+0 records out
86078226432 bytes transferred in 647.749411 secs (132888159 bytes/sec)
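A quick check of the numbers in that dd command: dd writes bs × count bytes, so the request can be worked out with shell arithmetic. The sketch below only multiplies the values from the command line and assumes a shell with 64-bit arithmetic. The count works out to far more than 60GB at a 32k block size, which would explain why dd ran until the filesystem was full.

```shell
#!/bin/sh
# How much data does dd request for a given block size and count?
bs=32768            # bs=32k, in bytes
count=15728640      # count from the dd command above
total=$((bs * count))
echo "requested: $total bytes ($((total / 1024 / 1024 / 1024)) GiB)"

# The count that would request 60 GiB at the same block size.
want=$((60 * 1024 * 1024 * 1024))
echo "count for 60 GiB at bs=32k: $((want / bs))"
```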
That filled up the filesystem, but that’s OK; the goal is the rate. We got about 126 MiB/s (132888159 bytes/sec). Let’s try the other HDD. We will do the same steps as above, but for ada2 instead of ada1.
$ gpart show ada2
=>       34  488397101  ada2  GPT  (232G)
         34        540        - free -  (270k)
        574  488396510     1  freebsd  (232G)
  488397084         51        - free -  (25k)
$ gpart show ada2s1
=>        0  488396510  ada2s1  BSD  (232G)
          0          2          - free -  (1.0k)
          2  188743680       1  freebsd-ufs  (90G)
  188743682  299652828          - free -  (142G)
$ dd if=/dev/zero of=FirstTest bs=32k count=15728640

/mnt: write failed, filesystem is full
dd: FirstTest: No space left on device
2626900+0 records in
2626899+0 records out
86078226432 bytes transferred in 692.483754 secs (124303604 bytes/sec)
And that is about 118 MiB/s (124303604 bytes/sec). Hmmm, this is very interesting.
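For reference, the rates quoted here come from dividing dd's bytes/sec figure by 1048576 (one MiB). A small sketch of that conversion, where `to_mib` is a hypothetical helper name:

```shell
#!/bin/sh
# Convert dd's bytes/sec figure to MiB/s (1 MiB = 1048576 bytes).
to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f MiB/s\n", b / 1048576 }'
}

to_mib 132888159   # ada1, unaligned
to_mib 124303604   # ada2, unaligned
```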
Now let’s try 4K aligned partitions
First, we destroy what we already created:
$ sudo gpart delete -i 1 ada2s1
ada2s1a deleted
$ sudo gpart destroy ada2s1
ada2s1 destroyed
$ sudo gpart delete -i 1 ada2
ada2s1 deleted
$ sudo gpart destroy ada2
ada2 destroyed
$ sudo gpart delete -i 1 ada1s1
ada1s1a deleted
$ sudo gpart destroy ada1s1
ada1s1 destroyed
$ sudo gpart delete -i 1 ada1
ada1s1 deleted
$ sudo gpart destroy ada1
ada1 destroyed
Aligned
Now we do the same steps as above, but align the partitions.
$ sudo gpart create -s GPT ada1
ada1 created
$ sudo gpart add -t freebsd -a 4k ada1
ada1s1 added
$ gpart show ada1
=>       34  488397101  ada1  GPT  (232G)
         34          6        - free -  (3.0k)
         40  488397088     1  freebsd  (232G)
  488397128          7        - free -  (3.5k)
$ sudo gpart create -s BSD ada1s1
ada1s1 created
$ sudo gpart add -t freebsd-ufs -s 90g -a 4k ada1s1
ada1s1a added
$ gpart show ada1s1
=>        0  488397088  ada1s1  BSD  (232G)
          0  188743680       1  freebsd-ufs  (90G)
  188743680  299653408          - free -  (142G)
$ sudo newfs -U /dev/ada1s1a
/dev/ada1s1a: 92160.0MB (188743680 sectors) block size 32768, fragment size 4096
        using 148 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at:
 192, 1282432, 2564672, 3846912, 5129152, 6411392, 7693632, 8975872,
 10258112, 11540352, 12822592, 14104832, 15387072, 16669312, 17951552, 19233792,
 20516032, 21798272, 23080512, 24362752, 25644992, 26927232, 28209472, 29491712,
 30773952, 32056192, 33338432, 34620672, 35902912, 37185152, 38467392, 39749632,
 41031872, 42314112, 43596352, 44878592, 46160832, 47443072, 48725312, 50007552,
 51289792, 52572032, 53854272, 55136512, 56418752, 57700992, 58983232, 60265472,
 61547712, 62829952, 64112192, 65394432, 66676672, 67958912, 69241152, 70523392,
 71805632, 73087872, 74370112, 75652352, 76934592, 78216832, 79499072, 80781312,
 82063552, 83345792, 84628032, 85910272, 87192512, 88474752, 89756992, 91039232,
 92321472, 93603712, 94885952, 96168192, 97450432, 98732672, 100014912, 101297152,
 102579392, 103861632, 105143872, 106426112, 107708352, 108990592, 110272832, 111555072,
 112837312, 114119552, 115401792, 116684032, 117966272, 119248512, 120530752, 121812992,
 123095232, 124377472, 125659712, 126941952, 128224192, 129506432, 130788672, 132070912,
 133353152, 134635392, 135917632, 137199872, 138482112, 139764352, 141046592, 142328832,
 143611072, 144893312, 146175552, 147457792, 148740032, 150022272, 151304512, 152586752,
 153868992, 155151232, 156433472, 157715712, 158997952, 160280192, 161562432, 162844672,
 164126912, 165409152, 166691392, 167973632, 169255872, 170538112, 171820352, 173102592,
 174384832, 175667072, 176949312, 178231552, 179513792, 180796032, 182078272, 183360512,
 184642752, 185924992, 187207232, 188489472
$ dd if=/dev/zero of=FirstTest bs=32k count=15728640

/mnt: write failed, filesystem is full
dd: FirstTest: No space left on device
2626900+0 records in
2626899+0 records out
86078226432 bytes transferred in 649.784040 secs (132472054 bytes/sec)
That’s pretty much the same rate.
Again, for ada2:
$ dd if=/dev/zero of=FirstTest bs=32k count=15728640640

/mnt: write failed, filesystem is full
dd: FirstTest: No space left on device
2626900+0 records in
2626899+0 records out
86078226432 bytes transferred in 693.786576 secs (124070182 bytes/sec)
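To put a number on how close the dd runs are, the sketch below computes the percentage change from the unaligned rate to the aligned rate for each drive, using the bytes/sec figures from the dd output in this post. `pct_change` is a hypothetical helper name.

```shell
#!/bin/sh
# Percentage change from the first rate to the second (both in bytes/sec).
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.2f%%\n", (b - a) / a * 100 }'
}

pct_change 132888159 132472054   # ada1: unaligned -> aligned
pct_change 124303604 124070182   # ada2: unaligned -> aligned
```

Both differences come out well under one percent, which is within what run-to-run noise alone could plausibly produce.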
Hmm. Am I doing this wrong? Are these HDDs not subject to a 4K alignment issue?
Sequential doesn’t count
Sequential access doesn’t count: “Alignment doesn’t matter much for sequential access, because the disk can coalesce requests. Only the first and last block will require a read-modify-write” – Dag-Erling Smørgrav
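The point in that quote can be illustrated with a little arithmetic. The sketch below counts how many 4096-byte physical sectors a write only partially covers, since those are the ones that would need a read-modify-write. It assumes a drive with 4096-byte physical sectors that coalesces a contiguous run into a single request; `partial_sectors` is a hypothetical helper name.

```shell
#!/bin/sh
# Count the physical sectors that a write of LEN bytes at byte offset OFF
# covers only partially. Assumes 4096-byte physical sectors.
partial_sectors() {
    off=$1
    len=$2
    phys=4096
    end=$((off + len))
    n=0
    if [ $((off % phys)) -ne 0 ]; then n=$((n + 1)); fi
    if [ $((end % phys)) -ne 0 ]; then n=$((n + 1)); fi
    # A short write whose two ragged ends sit in the same physical sector
    # only touches that one sector.
    if [ "$n" -eq 2 ] && [ $((off / phys)) -eq $(( (end - 1) / phys )) ]; then
        n=1
    fi
    echo "$n"
}

partial_sectors 293888 4096         # one 4 KiB write starting at sector 574
partial_sectors 20480 4096          # one 4 KiB write starting at sector 40
partial_sectors 293888 1073741824   # a 1 GiB contiguous run starting at sector 574
```

Under these assumptions, a misaligned 4 KiB write straddles two physical sectors, while a misaligned 1 GiB sequential run also has only two ragged ends. The cost per byte written is therefore negligible for sequential access and significant for small random writes.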
Instead, let’s try bonnie++ on ada1 with aligned partitions:
$ bonnie++ -s 66000
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
floater.unix 66000M   532  99 113838  20 40954  36  1049  99 116269  19 132.0   5
Latency             16232us     755ms   15295ms   19001us    1624ms    4518ms
Version  1.96       ------Sequential Create------ --------Random Create--------
floater.unixathome. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 31292  58 +++++ +++ +++++ +++ 23477  46 +++++ +++ +++++ +++
Latency               158ms      72us      38us     125ms      35us      37us
1.96,1.96,floater.unixathome.org,1,1359208317,66000M,,532,99,113838,20,40954,36,1049,99,116269,19,132.0,5,16,,,,,31292,58,+++++,+++,+++++,+++,23477,46,+++++,+++,+++++,+++,16232us,755ms,15295ms,19001us,1624ms,4518ms,158ms,72us,38us,125ms,35us,37us
Then the same drive, ada1, unaligned:
$ bonnie++ -s 66000
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
floater.unix 66000M   519  99 109704  20 41924  33  1028  98 114528  18 120.0   7
Latency             16477us     448ms   15903ms   58855us     902ms    4473ms
Version  1.96       ------Sequential Create------ --------Random Create--------
floater.unixathome. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 31825  70 +++++ +++ +++++ +++ 24005  57 +++++ +++ +++++ +++
Latency               111ms      50us      41us     252ms      34us      41us
1.96,1.96,floater.unixathome.org,1,1359185135,66000M,,519,99,109704,20,41924,33,1028,98,114528,18,120.0,7,16,,,,,31825,70,+++++,+++,+++++,+++,24005,57,+++++,+++,+++++,+++,16477us,448ms,15903ms,58855us,902ms,4473ms,111ms,50us,41us,252ms,34us,41us
And now on ada2 (aligned):
$ bonnie++ -s 66000
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
floater.unix 66000M   524  99 121793  23 39647  43  1053  99 120336  20 160.2  12
Latency             16407us     795ms   10873ms   22076us    1623ms     579ms
Version  1.96       ------Sequential Create------ --------Random Create--------
floater.unixathome. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency             62188us      36us      61us   62548us      37us      41us
1.96,1.96,floater.unixathome.org,1,1359202461,66000M,,524,99,121793,23,39647,43,1053,99,120336,20,160.2,12,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,16407us,795ms,10873ms,22076us,1623ms,579ms,62188us,36us,61us,62548us,37us,41us
Now ada2 again, with unaligned partitions:
$ bonnie++ -s 66000
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
floater.unix 66000M   526  99 98514  18 37055  35  1048  99 99222  16 132.4   5
Latency             16232us     588ms   17524ms   47938us    1624ms    4466ms
Version  1.96       ------Sequential Create------ --------Random Create--------
floater.unixathome. -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency             63286us      33us      45us   65937us      33us      46us
1.96,1.96,floater.unixathome.org,1,1359213425,66000M,,526,99,98514,18,37055,35,1048,99,99222,16,132.4,5,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,16232us,588ms,17524ms,47938us,1624ms,4466ms,63286us,33us,45us,65937us,33us,46us
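The four bonnie++ runs are easier to compare side by side. The sketch below pulls a few columns out of the CSV line that bonnie++ prints at the end of each run. The field positions (10 for block write, 12 for rewrite, 16 for block read, 18 for random seeks) are inferred by matching the CSV lines in this post against the tables beside them, not taken from bonnie++ documentation. `summarize` is a hypothetical helper name, and each CSV line is truncated after the seeks columns for brevity.

```shell
#!/bin/sh
# Print block write, rewrite, block read (K/sec) and random seeks (/sec)
# from a bonnie++ 1.96 CSV line. Field numbers are inferred from this post.
summarize() {
    printf '%s\n' "$2" | awk -F, -v label="$1" \
        '{ printf "%s: write=%s rewrite=%s read=%s seeks=%s\n", label, $10, $12, $16, $18 }'
}

summarize "ada1 aligned"   "1.96,1.96,floater.unixathome.org,1,1359208317,66000M,,532,99,113838,20,40954,36,1049,99,116269,19,132.0,5"
summarize "ada1 unaligned" "1.96,1.96,floater.unixathome.org,1,1359185135,66000M,,519,99,109704,20,41924,33,1028,98,114528,18,120.0,7"
summarize "ada2 aligned"   "1.96,1.96,floater.unixathome.org,1,1359202461,66000M,,524,99,121793,23,39647,43,1053,99,120336,20,160.2,12"
summarize "ada2 unaligned" "1.96,1.96,floater.unixathome.org,1,1359213425,66000M,,526,99,98514,18,37055,35,1048,99,99222,16,132.4,5"
```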
Ahhh, yes. As was pointed out today, and earlier (had I recalled it), alignment does not matter much for sequential access.
I’m running some bonnie++ tests now.