Storage Spaces and Slow Parity Performance

Collected from my experience, here is how to deal with slow parity virtual disk performance in Microsoft Storage Spaces. In most cases, recreation will be required, but in some cases you might get away with not having to dump all your data elsewhere. This article presumes a Storage Spaces array with no SSDs and no cache. The problem being solved here is HDDs in Storage Spaces parity volumes delivering 20 to 40 MB/s writes instead of multiples of a single disk’s rated write speed (100 to 300 MB/s).

Update 5/2022: This is by far the most popular article on this website. I have fixed some typos, cleaned up formatting and added some clarifications I am most commonly asked about. Also, the formula for calculating the right AUS and interleave values, as well as the PowerShell command, are now written in red, so you can find them on the page more easily.

If you do a Google search for Storage Spaces parity performance, you will find both enthusiasts and professionals complaining about it. No wonder – the UI is very limited in what it shows to the user and sometimes doesn’t even work properly. Today, we will talk about Interleave sizes, Allocation Unit sizes and how they work together.

Columns

To understand all this, we have to know what columns are in Storage Spaces, since the user interface doesn’t communicate the NumberOfColumns parameter at all. Storage Spaces gives users the opportunity to decide how many physical targets will be used for each write. This is more important than you might think. For parity spaces, this parameter determines the storage efficiency and affects performance as well. With 3 columns, we get 66% efficiency:

Every time anything is written, the data written is 150% of the original, since there is an extra write for parity. Hence, the storage efficiency is 66%, as we are effectively losing one whole disk to parity. Notice, however, that in practice, parity travels across all disks. There is no single dedicated parity disk.

With 4 columns, we get 75% efficiency:

With 4 columns (4 physical disks minimum) we get higher efficiency: from 66 to 75 percent.
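These efficiency numbers fall straight out of the column count. A quick sanity check (illustrative Python only, assuming single parity; none of this is a Storage Spaces API):

```python
def parity_efficiency(columns: int, parity_columns: int = 1) -> float:
    """Storage efficiency of a parity space: data columns / total columns."""
    return (columns - parity_columns) / columns

# 3 columns: 2 data slabs + 1 parity slab per row -> 2/3 efficiency
# 4 columns: 3 data slabs + 1 parity slab per row -> 3/4 efficiency
for cols in (3, 4):
    print(f"{cols} columns: {parity_efficiency(cols):.0%}")
```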

Here comes the kick:

The number of columns is not tied to the number of physical disks in the pool.

Data and parity rotate between physical disks, so you can have a 3-column parity virtual disk in a pool with more than 3 physical disks, and data will be stacked evenly on all disks in the pool, with efficiency still being 66%, as shown below:

3 columns on 4 physical disks.

With parity spaces, the use of columns is quite clear. The same logic applies to other resiliency types as well, however. Consider this two-way mirror space, for example:

A simple 1 column two-way mirror space.

Simple, right? Everything gets written twice. What’s the big deal? Well, what if we increase the number of disks in the pool?

Still a simple 1 column two-way mirror space, where everything gets written twice.

Well, what if we increase the number of columns to 2?

2-column two-way mirror space effectively becomes RAID-10.

Now that we have done that, we have a mirror space with 4 physical drives, which splits data into two slabs and then creates copies of those two slabs. This brings awesome performance benefits, but decreases storage efficiency to 50%.

Interleave

This parameter says how much data fits into a single cell of a column. In a 3-column parity space, an interleave size of 16KB means that our row is 48KB long, but 16KB of it must be reserved for parity, leaving 32KB for data. Hence, every write request will be split into an array of 32KB stripes, and each stripe split into two 16KB slabs. Those two 16KB slabs will be written to individual disks, with a third 16KB slab calculated on the fly as their parity and written to a third disk:

One stripe can handle 32KB of data if Interleave value is set to 16KB.
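Modelled in plain Python (an illustration of the arithmetic only, assuming single parity), the data capacity of one stripe and the number of stripes a write turns into look like this:

```python
import math

def data_per_stripe_kb(columns: int, interleave_kb: int) -> int:
    """User data carried by one stripe: every column holds one interleave-sized
    slab, and one of the columns holds parity instead of data."""
    return (columns - 1) * interleave_kb

def stripes_for_write(write_kb: int, columns: int, interleave_kb: int) -> int:
    """Number of stripes (rows) a write request is split into, rounded up."""
    return math.ceil(write_kb / data_per_stripe_kb(columns, interleave_kb))

# 3 columns with a 16KB interleave carry 32KB of data per stripe, so a 128KB
# write becomes 4 stripes (each: two 16KB data slabs + one 16KB parity slab).
print(data_per_stripe_kb(3, 16), stripes_for_write(128, 3, 16))
```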

What we haven’t been considering all this time is volumes, partitions, allocation units and cluster sizes. Neither the pool nor the space cares about what it is actually writing to its disks. The volume provider has to take care of that, and above that volume is a partition. So the whole diagram gets even more complicated:

A top down layer infrastructure of how data is stored.

This is where performance gets hindered the most. Say that we have 4 disks in a pool, we want the data protected by parity, and we create an NTFS partition on this space, doing it all through the Storage Spaces UI in Control Panel:

Unaligned nonsense created by Windows – this is by default.

If your virtual disk is set to have a maximum capacity below 16TB, Windows will create the partition with an Allocation Unit Size (AUS) of 4KB. If it’s above 16TB, it will switch up to 8KB. This setting cannot be changed later; you have to recreate the partition to set it differently. The performance loss here comes from the work that has to be done to align the data properly. Each write request first has to allocate the right number of units on the partition, 4KB each, and then all of this has to be translated into 256KB slabs which are propagated to the drives. This is an expensive operation and causes the scenario that many Storage Spaces users know by heart: 1 GB of blast speed, then a slowdown to unusable speeds:

Parity write performance dips after 1GB of data written, this is 4 WD REDs in a 4 column parity space.

The 1 GB rush comes from a Windows-default 1 GB write cache on the system disk, which Storage Spaces uses if it considers it worthwhile. How do we fix this? We align the AUS and interleave sizes. Since NTFS offers an AUS of 4, 8, 16, 32 or 64KB, we have to set our interleave size so that the following statement is true for our parity space:

$ntfs_aus = ($number_of_columns - 1) * $interleave
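Expressed as a check (illustrative Python mirroring the formula above; the values below come from this article’s own examples):

```python
def is_aligned(ntfs_aus_kb: int, columns: int, interleave_kb: int) -> bool:
    """True when the data portion of one stripe exactly equals one NTFS
    allocation unit, which is what lets writes bypass the cache."""
    return ntfs_aus_kb == (columns - 1) * interleave_kb

# 3-column parity, 32KB interleave, 64KB AUS: 2 * 32KB == 64KB -> aligned
print(is_aligned(64, 3, 32))   # True
# The Control Panel default for 4 disks: 4 columns, 256KB interleave, 4KB AUS
print(is_aligned(4, 4, 256))   # False
```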

Let’s look at a practical example. Say that we have a 3-column parity space of 3 disks, with an interleave of 32KB and an NTFS partition with an AUS of 64KB. This means that our stripe size (row length over 3 columns) is 3x32KB, or 96KB. Of this, one column is parity, so only 64KB of the stripe is actual data. Now match that with the NTFS AUS of 64KB. Perfect alignment. The secret sauce is that, since Windows 10 version 1809, if you do this, Windows will completely bypass the cache and do full-stripe writes, since it doesn’t have to recalculate the allocation. With this, the speeds are great:

A steady 300MB/s write.

In my experience, logical and physical cluster sizes don’t even come into play for mere mortals here, since all disks I was able to get my hands on are 512e. Windows knows this and will create the pool with both logical and physical sector sizes of 4KB. Google this if you want to know more, but I won’t be considering this topic in this article any further.

Incidentally, the number of disks in the pool has little, if any, impact on the speeds of the virtual disk. You can have a 3-column parity space on 4 disks, which, if set up as above, will perform great. With this in mind, you might have figured out already that there is no viable 4-column configuration for an NTFS partition: there is no NTFS AUS that is divisible by 3 and settable as the interleave size of the virtual disk. Unfortunately, this configuration is exactly what Windows will create if given 4 disks in a pool, as demonstrated above. To gain more parity performance, 5 disks and 5 columns are required. Then the interleave size for the virtual disk is the NTFS AUS divided by 4, on a pool with at least 5 disks.
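That constraint can be enumerated. A small sketch (illustrative Python; the AUS list is the set NTFS offers, and interleaves below 4KB are filtered out as impractical for 4K-sector disks):

```python
NTFS_AUS_KB = (4, 8, 16, 32, 64)  # allocation unit sizes NTFS offers

def aligned_combinations(columns: int) -> dict:
    """Map of NTFS AUS -> interleave (both in KB) that align perfectly for a
    single-parity space with the given column count."""
    data_columns = columns - 1
    return {aus: aus // data_columns
            for aus in NTFS_AUS_KB
            if aus % data_columns == 0 and aus // data_columns >= 4}

print(aligned_combinations(3))  # {8: 4, 16: 8, 32: 16, 64: 32}
print(aligned_combinations(4))  # {} - no NTFS AUS is divisible by 3
print(aligned_combinations(5))  # {16: 4, 32: 8, 64: 16}
```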

As you can see, a lot of thought has to be given to the architecture of Storage Spaces before they are created, because once these values are set, they are hard or impossible to change. For example, if you opted for a permission-less exFAT partition, you can’t change the partition’s size. That operation simply isn’t supported, so adding new disks to the pool will give you no options. If you opted for an NTFS partition, you might be able to shrink it, create a new virtual disk on the pool with correct NumberOfColumns and Interleave values, and keep shrinking the original partition until all of the data is moved over. Of course, the best course of action is to move the data off the pool completely, recreate the virtual disk and move the data back.

The last step is to show how to actually do all this without the Control Panel, since the UI doesn’t give the user the ability to change any of these parameters. It’s quite simple actually, but you will need PowerShell. Not to worry, all Windows 10 installations come with PowerShell preinstalled, so just run it from your Start menu. Microsoft, in all of its wisdom, has included 4 different ways of running PowerShell:

4 options to run PS in Windows 10 x64

Just run the one highlighted above. The procedure is as follows:

1. Create a Pool

Create a Pool as you normally would. You can do this from Control Panel, same as before. If you have a pool already (from past Windows installations, for example), you might need to upgrade the pool. If this is the case, you can launch Storage Spaces from Control Panel and you will see the label “Upgrade pool” by the pool UI area.

2. PowerShell

Run PowerShell as Administrator. Right-click it in the Start menu and select Run as Administrator. When opened, you will need to run a command. In PowerShell, these are called cmdlets:

New-VirtualDisk -StoragePoolFriendlyName "YourPoolsName" -ProvisioningType Thin -Interleave 32KB -FriendlyName "FastParity" -Size 1TB -ResiliencySettingName Parity -NumberOfColumns 3

In this command, the following placeholders can/have to be replaced:

YourPoolsName – Name of the pool. This will not be shown anywhere in Explorer; it’s just the name of the pool that holds all of your drives. Example: POOL1

Interleave – Size of one slab (a single cell of one column). Your future AUS must be wholly divisible by the interleave value*. Example: 32KB

FastParity – Name of the virtual disk (not the partition). Again, this is not used in Explorer. Example: FastParity, or Disk1

Size – You can enter any size; the virtual disk can be smaller or larger than the pool. Storage Spaces will let you create a 20TB virtual disk on a pool with a combined size of 8TB; you can just add more drives to the pool once you start running out of space. You can also increase the size of the virtual disk in the future, but not shrink it. Example: 1TB

NumberOfColumns – Number of columns, including parity columns, as explained above. You can’t set this higher than the number of physical drives in the pool, but you can set it lower, and in some cases you have to in order to avoid performance issues. Example: 3

Placeholders to go over before executing the PS command.


*: Pay attention to this. The interleave value must be lower than or equal to the AUS value. An interleave of 32KB with an AUS of 64KB is OK, because 64/32 = 2. However, an interleave of 64KB with an AUS of 32KB is not OK, because 32/64 = 0.5, which is not a whole number.

Be sure to replace appropriate parameter values before hitting Enter.

Now that your virtual disk (or space) is created, you can head right into the Disk Management console of Windows, where you will find a new, empty disk chilling:

Newly created parity space is visible in Disk Management window.

From here, it’s business as usual. Create a new volume and partition as you normally would, and do not forget to set the AUS correctly. Now you can enjoy full speeds.

Sources:

https://social.technet.microsoft.com/Forums/en-US/64aff15f-2e34-40c6-a873-2e0da5a355d2/parity-storage-space-so-slow-that-its-unusable?forum=winserver8gen

https://social.technet.microsoft.com/wiki/contents/articles/15200.storage-spaces-designing-for-performance.aspx

63 thoughts on “Storage Spaces and Slow Parity Performance”

  1. How would this work with 7x4TB disks with parity? I would need to set an interleave size of 24K which is not viable… thanks for any help

    1. Then this becomes an efficiency discussion. You can create a 3-column parity space on a pool with 7 disks, but your efficiency would be 66% (2 disks’ worth of data per 1 disk’s worth of parity). That means that on 7x4TB disks with 28TB theoretical capacity, you would see around ~18.5TB of usable space. However, if you bump this up to 5 columns, your efficiency jumps to 80% (4 disks’ worth of data for every 1 disk’s worth of parity). This scheme, used on all 7 disks in the pool, would give you 5.6 disks’ worth of capacity, or 22.4TB.

      Your allocation unit size on the partition has to be entirely divisible by the interleave size – so in your case a 256kB allocation unit size would be fully divisible by a 64kB interleave (column width). If you set your virtual disk as 5-column parity with an interleave of 64kB, you would align perfectly. The little downside is that instead of having 6 columns for data and 1 column for parity (which would give you the highest possible storage efficiency), you’ll have 4 columns for data and 1 column for parity, which on 7 disks degrades storage efficiency as per above.

      Logically, this means your options for aligned parity performance are 3 columns, 5 columns, 9 columns, etc. Without the parity column, 2, 4 and 8 are the numbers that allow you to set the interleave (column width) to some fraction of the partition’s allocation unit size by which the AUS is directly divisible.

      1. I am struggling with the calculation for a redundancy factor of 2. I have 7 drives but would prefer a 2-drive failure instead. Is the calculation different in this case? I have similarly reproduced your results with 5 drives in a parity setup with interleave 16KB and AUS 64KB (redundancy factor of 1) with great effect (thank you!!).

        1. I put his formula into excel and was messing around with different values.

          It took me a few minutes to figure it out but when he says

          $ntfs_aus = ($number_of_columns – 1) * $interleave

          I think he really means

          $ntfs_aus is divisible by ($number_of_columns – 1) * $interleave

          Please correct me if I am wrong.

          (I’m pretty sure he mentioned this somewhere else, but I love formulas and it was throwing me for a loop!)

          If I understand correctly the formula for dual parity would be

          $ntfs_aus is divisible by ($number_of_columns – 2) * $interleave (Notice the – 2)

          Suitable column numbers should include 4,6 and 10

          With 7 drives that would give us a maximum of 7 columns and so the nearest valid column number would be 6.

          64 is divisible by (6-2) * 16kb
          64 is divisible by 4 * 16
          64 is divisible by 64

          Do note that the interleave size “16kb” doesn’t affect this equation.

          Even if we went to 32kb it would still be valid.

          128 is divisible by (6-2) * 32kb
          128 is divisible by 4 * 32
          128 is divisible by 128

          1. Upon further review

            $ntfs_aus = ($number_of_columns – 1) * $interleave

            and

            $ntfs_aus is divisible by ($number_of_columns – 2) * $interleave (Notice the – 2)

            Effectively mean the same thing for this context.

            If done correctly, they should always equal each other and be a factor of 2 up to 256kb, which was the whole point lol.

            Sorry for the confusion.

          2. i know this sounds dumb but it would be nice to see how you did this in excel..

  2. Very interesting, thanks. I tried this using your settings and found an initial write speed burst of around 1GB/s, but then the write speed dropped dramatically and remained at or well below 1MB/s.

    How would you determine the interleave and columns that are applied by default? Do you have a powershell command?

    1. Further on this, out of interest, I recreated the storage space using the Control Panel. Then I reformatted the virtual drive to 512K (based on your info that the default interleave size is 256K) and that led to sustained write speeds of 250-300MB/s.

      Initially, when I tested the read speed I was getting just 10MB/s. But then on subsequent testing, getting the sustained read speed of around 300MB/s and that seemed consistent.

      Perhaps I needed to give the initial settings a bit longer.

      1. Reading through your comments, I don’t know your config. How many drives do you have? Are they all the same? How many columns did you go for and what was their interleave width? To provide at least some advice – the PowerShell commands are present in the article, and the burst you see at the beginning of a copy/write seems to be the default cache that SS will create and use if the partition AUS and interleave are not aligned. If those were aligned, the cache would be bypassed entirely.

  3. What would be the ideal settings for 4x2TB drives? I read through your article twice and still do not understand the right combination of interleave, AUS & columns for a given set of disks 🙂 Thanks

    1. The article gives you a formula you can use: $ntfs_aus = ($number_of_columns - 1) * $interleave.

      The left-hand part of the formula can only have values you see in the menu when you’re formatting the drive. Windows will offer you 32 KiB, 64 KiB, etc. You have to make this work with what’s on the right-hand side of the equation. If your selected AUS is 256 KiB, then logically you can’t have 4 columns, because that would mean you have to come up with an interleave size that is 256/3, which is 85 and one third of a kilobyte, and you can’t set that as the interleave. Your number of columns can’t be higher than the number of disks in your space (array), so that pretty much tells you that with 4 disks, you will have to work with 3 columns. This means your effective storage will be 4x2TB times 66 percent; one third of your storage will be taken by parity. With 3 columns, you can figure out the equation on your own.

      1. Thanks! Now I understand it. The only question I have is the interleave size. Since the Windows partition creation wizard allows an AUS of up to 2048KB on an NTFS partition, in a 3-column setup I could go up to 1024KB per interleave. Why 32KB?

        1. I can’t think of a better reason other than this: AUS sizes are usually determined by the workload that the data space is expected to carry. A large partition AUS will benefit storage of large files (like backups), because allocation will be fast and indexing/defrag will be faster and easier. However, when used with small files, it will lead to significant space waste (size vs. size on disk in the properties dialog). A smaller AUS works better with smaller files but is heavier on the number of allocations required when used with larger files. It’s up to you to decide what will work best for you. With the interleave size in SS, I can’t actually remember why you should work with 32KiB and not smaller/larger values. This is the only resource I found that sorta makes sense. It’s just the typical IO transaction size, I guess.

  4. Very interesting and useful article. I experienced this problem at some point, where a 4-disk storage space could only read/write at a rate of 30MB/s, and I decided to rebuild it. I tested its performance after applying the above recommended settings (I used ReFS instead of NTFS) and the storage space delivered ~200MB/s write performance. All looked good until I activated BitLocker encryption. Performance suddenly dropped back to 30MB/s and the computer’s performance degraded significantly, even though the CPU (i9 7900X) was mostly idle!!!

    Why enabling Bitlocker affected the performance of the Storage Space so significantly is what I need to investigate next.

    1. I strongly suggest reading up on my experience with BitLocker on SS in the Data Disaster series. In the second part, I go into detail on how I lost some data using that combination. BitLocker is no joke, and I wouldn’t recommend using it unless your storage setup is completely redundantly backed up to some other physical machine or two. Excerpt:

      Eventually I would come to find that there is no way to recover the crippled array. No tool was able to punch through broken parity array that had BitLocker on it. It was simply too complex of a failure.

      There’s also a lessons learned part at the end.

  5. So, question here, I have four disks and might upgrade to five, each is 4TB. Assuming I get the fifth, what would be better? 5 Columns with AUS=128KB and a 32KB interleave, or go for 9 columns AUS=128KB and 16KB interleave? Or does it even make a difference? OR would there be something better all together?

    1. I just realized, you said you can only do an AUS of up to 64 with NTFS, so that would make 5 columns on 5 disks, with an AUS of 64KB and an interleave of 16KB, ideal, right?

      1. Yes, correct 🙂 The secret sauce to bypassing the cache and avoiding the data-alignment issue is to guarantee that the block size of a write in the space will always match the allocation unit size of the partition. In your case, 5 disks with 5 columns in a single-parity scheme would work best with an AUS of 64 KiB and an interleave of 16 KiB.

  6. Note that it is also possible to use ReFS instead of NTFS (including ReFS with data integrity) and get very decent performance.

    The only small caveat is that you want to make sure that you use 64KB for the ReFS AUS, because the default AUS for ReFS is 4KB, which would force you to use an interleave of 2KB or 1KB, and you sure don’t want an interleave value smaller than the standard 4KB sector size of any modern HDD. Also, since Microsoft decided to remove the ability to format to ReFS in all recent regular versions of Windows 10 and Windows 11 (but thankfully not the ability to use and administer ReFS drives once they have been created), you have to use a “Windows Pro for Workstations” edition to format the virtual drive (noting that you can always create a Windows To Go drive with that edition just to format the drive, if needed).

    Below are the exact commands I used to create a 65.4TB ReFS Storage Space drive, using 5 x 18TB WD180EDGZ drives, and with an interleave of 64 KB / (5 – 1) = 16 KB:

    New-VirtualDisk -StoragePoolFriendlyName “Storage pool” -ProvisioningType Fixed -PhysicalDiskRedundancy 1 -Interleave 16KB -FriendlyName “Data” -ResiliencySettingName Parity -NumberOfColumns 5 -UseMaximumSize

    Then, after first creating an initial NTFS volume with a 64KB AUS to validate the performance, I reformatted the whole volume to ReFS, while making sure that the AUS would be set to 64KB, with:

    Format-Volume -DriveLetter D -FileSystem ReFS -AllocationUnitSize 65536

    Using this, I found I was seeing a sustained write speed of about 500 MB/s, even after turning ReFS Data Integrity on (see https://blog.habets.se/2017/08/ReFS-integrity-is-not-on-by-default.html), which is quite impressive. Of course, there’s a small drop in performance compared to using NTFS, where I saw the same virtual drive sustain a 600 MB/s write speed, but I am very pleased with the result.

    So thank you for helping folks finally get a decent performance with Storage space and parity.

    1. Just an update on my earlier post with the 65.4TB ReFS array. Unfortunately, even when you did everything right, it appears that Microsoft can still screw you up on a whim, because my fast parity array barely lasted a few weeks before Microsoft downgraded it to the dreadful write speeds everyone coming to this post is probably familiar with.

      And the thing is, I literally did not do anything! I just left the machine powered off for a couple of weeks, and when I powered it back on, I had the unpleasant surprise of finding that my beautiful 500MB/s write speed ReFS drive had been downgraded to a paltry 30 MB/s… 🙁

      The only change I can think of is that UEFI CSM was briefly enabled when Windows booted (this ASUS motherboard sometimes feels like enabling UEFI CSM on its own, and I have to force it back to disabled), but I can’t really see how that would have downgraded my write speed. Oh, and of course, I checked that it wasn’t an HDD about to fail or a cabling issue…

      I also know that it is very important that Windows sees the physical drives in the exact same order as they had when the Storage Space pool was created, or else you will get degraded speeds, but Get-PhysicalDisk / Get-VirtualDisk confirm that, as far as Windows should be concerned, my configuration has not changed from how I created the array.

      So, even when manually setting the interleave and everything, Microsoft WILL still screw you up with parity and Storage Spaces!

      This is just so disappointing…

      1. You’re correct, ReFS has had known performance issues, but I’m not familiar with ReFS, so I can’t give help on this.

        1. I have since switched to using ZFS with Debian Linux (and upgraded to an 8 x 18 TB raidz2 array while I was at it) because Storage Spaces is simply not worth it in terms of the amount of cajoling it takes to try to obtain a bare minimum level of performance… and keep it! But before I ditched my Storage Spaces array, I was able to find that the reason my performance was degraded was due to Microsoft issuing unwarranted continuous read I/O on two HDDs from the array during what should be a *pure* write operation. And of course, when you’re issuing simultaneous read + write I/O on HDDs, your performance completely tanks.

          I have not been able to find out why on earth those read operations were being issued. Even with ReFS file integrity turned on, there’s absolutely no reason why writing brand new data should ever induce read operation, especially one that involves reading a very extensive amount of data as I found out.

          Basically, it was as if, when writing a 4 TB file split over 5 HDDs (with parity), and instead of simply writing 1 TB of continuous data on each disk, the people who designed Storage Spaces thought it would be a “great idea” to read between 500 GB to 1 TB of random existing data (WHY?!?!?) at the same time… This makes absolutely ZERO sense when writing new data to a Storage Spaces array. And what’s even more unfathomable is that this behaviour did not occur when the array was first created and only started to manifest itself after a while…

          So, rather than hope on Microsoft to provide an explanation for this utter bullshit of a storage software layer and/or finally fix it, so that people who want to use parity cease to be treated like second class citizens, I ditched Storage Spaces altogether in favour of ZFS, and would encourage anybody who is seriously thinking about storing large amount of data with an actual proper level of performance to do the same. As long as Microsoft can pull bullshit like this Storage Spaces is just not worth it…

          1. So I found this page/thread after already being aware of this write-cache parity bypass with the interleave magic match, in search of why ReFS was working and then basically going back to normal (not bypassing the write cache). What I discovered is that ReFS is initially fast, just like NTFS, but there is a catch: only on a freshly formatted volume. You can fill the entire volume at full speed, but as soon as you delete one file on the volume, the speeds tank.

            For some reason, it’s like as soon as a delete happens, something happens where new writes can’t bypass the write cache. I’m not sure why, but I am determined to figure it out.

            For now I am settling on a mixed Mirror Accelerated Parity, all on the same spinning disks. It has decent sustained speeds even when doing auto moves between tiers on the same disks, and it does not suffer this issue, as the writes first land on the mirror tier – but it’s not as fast as parity while parity is bypassing the write cache.

            For testing: create a small volume, like 300-400 GB, then fresh-format it with ReFS and write about 300GB of new data – all should be fast. Then delete a few files and try adding new data to see the write bypass no longer working, then just format and try again – back to really fast again… So frustrating, because I want ReFS over NTFS on the storage space for auto-healing of corruption.

            If anyone has any ideas please let me know

  7. Hi,
    Great article, thanks a lot.
    I wonder if it is possible to change the interleave and columns on an already active storage space volume.
    I have a running 64K partition on a storage space volume created from the console and would like to upgrade to 4 columns and a 32KB interleave…

    1. I was there. You can only do that if your current storage space occupies less than half of the total pool size. In that case, you can create another storage space configured the way you want and start dragging stuff over to it. When done, simply delete the old storage space and you’re done. It will take a lot of time, because you’re copying data between two spaces that reside on the same pool of disks. If this is not an option for you, a less feasible but doable way is to buy something like an 18TB drive or drives, move the data over to those, reformat your pool, move the data back, and return the drives for a full refund if your jurisdiction has that kind of laws in your favor. For this reason specifically, I built myself a secondary NAS, so that I always have all the data on two different boxes in case I need to do some high-risk stuff like this.

      1. Great! Thanks for the tips.
        Just to be clear: I need to create a very flexible storage pool starting with 5 disks that can grow to 10. I need a 64KB cluster size on the partition due to the nature of the files that will be stored in the SP, several GB each. Is it correct to create a SP with a 16KB interleave and 4 columns to balance storage utilization with performance?

        1. Sorry, just checked that the perfect combination is 5 columns and 16KB interleave!

        2. Just one question: what happens if I add disks to the SP and extend the partition? Will it retain the whole configuration and just extend everything, maintaining performance, columns and interleave?

  8. Great article! It makes the relations between columns, interleaves and AUSes much clearer. But I cannot achieve the best performance even with this understanding.
    I have 10 physical disks and created a new virtual drive with 10 columns, a 16 KB interleave and double parity, and created a volume on it with a 128KB AUS (via PowerShell). Tests show 4000 Mbps of read performance and 1 Mbps of write performance.
    I created a new double-parity vdisk with 6 columns, a 16 KB interleave and a 64KB AUS. The tests’ results are the same. Could you help me with that?

    1. Are you by any chance using SMR disks? Also note that 1 Mbps and 1 MB/s are vastly different speeds. The write performance will be fatally affected even if only one of the drives in the array is SMR. Otherwise your approach seems valid to me; the 16KiB interleave and 128KiB AUS make sense on a 10-column double-parity storage space. If you’re using PMR or CMR disks, please post the exact PowerShell commands you used to create this setup, and also your HW (disks, CPU, RAM and also Windows build).

      1. Thank you for the quick answer.

        The disk model is ST4000NM017A (I didn’t find info if it is SMR). Those disks are in Dell MD 1400 Storage which is connected to a server Dell PowerEdge R340 via Dell 12 Gbps HBA. The number of disks is 10.
        CPU Intel(R) Xeon(R) E-2224 CPU @ 3.40GHz
        RAM 32 GB (2 Dimms x 16 GB DDR-4)
        OS Windows Server 2019 Datacenter Version 10.0 (Build 17763) (x64).

        I’m sorry for my interpretation of the write speed. I use diskspd (Microsoft utility for disk tests https://github.com/Microsoft/diskspd/wiki/Sample-command-lines): diskspd -c2G -w100 -b4K -F8 -r -o32 -W60 -d60 -Sh testfile.dat

        and results are:
        for single disk without storage space: 2,5 MiB/s
        for 10 disks in double parity: 1,5 MiB/s
        for simple storage space: 26 MiB/s

        the command I use to create double parity:
        New-VirtualDisk -FriendlyName “DDisk” -Interleave 16KB -NumberOfColumns 10 -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -UseMaximumSize -StoragePoolFriendlyName DPool |Initialize-Disk -PartitionStyle GPT -PassThru | New-Partition -DriveLetter “D” -UseMaximumSize | Format-Volume -FileSystem NTFS -NewFileSystemLabel “D” -AllocationUnitSize 64KB -Confirm:$false -UseLargeFRS

        1. Hey, thanks for the diskspd reference, I had no idea this existed. I can’t find anything wrong with your PS command used to create the SS vdisk. I couldn’t find much info on the drives either, same as you – it appears it’s some 4TB SAS medium with specific instant-erase features, encryption and all that stuff. I’m sorry I can’t be of more help; I’ll try running this tool on my instance and see where I get. Those numbers are quite weird – if I understood the syntax right, that should be random 4KB writes on a medium with 64KB AUS, which is my first concern. Second, that 2.5 MiB/s on a single disk without SS kind of spells doom for the workload this diskspd tool generates; it appears you’re trying to use your spinning rust as solid-state storage under that load. For reference, can you try something more rednecky, like moving a 10 GB file onto the vdisk, to see what kind of speeds you’re looking at?

          1. It’s strange that the test shows such unbelievable results. I tried robocopy with a 32GB file and it took 43 sec (roughly 700 MB/s as I calculate).

          2. So I understand that simply moving a file onto the medium works as expected, but the test tool that is blasting the volume with random 4KiB write requests is showing slower results? Wouldn’t that be kind of expected?

        2. This is probably way too late, but I just found this site today. If my reading of your OP is correct, then this PowerShell command has the wrong AUS set. You need 128KB for a 10-column dual-parity with a 16KB Interleave ( (10-2)*16=128KB ), which is what you showed in your requirements. The PowerShell you say you used is setting the AUS to half that at 64KB.
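           To spell out that arithmetic in code, here is a quick Python sketch of the rule used throughout this article (the function name is mine, purely for illustration):

```python
def required_aus_kib(columns: int, parity: int, interleave_kib: int) -> int:
    """Data stripe = (columns - parity) * interleave; the volume's
    allocation unit size should equal this, so every cluster write
    fills whole stripes instead of triggering read-modify-write."""
    return (columns - parity) * interleave_kib

print(required_aus_kib(10, 2, 16))  # 128 (KiB) for the 10-column dual-parity setup
print(required_aus_kib(6, 2, 16))   # 64 (KiB) for the 6-column dual-parity vdisk
```

           Note that the second, 6-column attempt above actually had a matching 64KB AUS; it was the 10-column one that needed 128KB.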

  9. Great article, albeit a bit over my head – but every now and then some knowledge drops on my head.

    I am trying a new setup that has 5 x 4TB CMR drives and dual parity, with the intent to add a drive as needed until I hit 8 x 4TB, at which point I will probably go a different direction. But I see that as quite a ways off in the future.

    I see Diz’s post above says “5 disks with 5 columns in a single-parity scheme would work best with an AUS of 64 KiB and interleave of 16KiB”.

    Not sure if this was single or dual parity, but would that hold true when adding drives? Or have I missed something (likely)?

    1. Oh boy. I need more caffeine before posting in the morning. I now see Diz’s setup is single parity.

      Revising my question: would 5 disks with 5 columns in a dual-parity scheme work best with an AUS of 64 KiB and interleave of 16KiB, in a system that will be systematically grown by adding one 4TB drive at a time until a max of 8 x 4TB drives is reached?

      1. You would need at least 7 units for a dual-parity setup. The number of columns only dictates your minimum number of units and your storage efficiency; you can totally have 8 units in a 5-column setup. It only means that after writing the first data stripe across 5 units, the second data stripe continues across the 3 remaining ones and wraps over 2 more units from the original 5. This is visible in the visual aids across the article.

        You can also keep adding disks to your pool and keep increasing the size of your storage space. Storage Spaces does not care if you set the maximum space size larger than that of the pool, but it will warn you once you start creeping towards the pool size. I’ve recently put out the article “planning for a drive failure”; you should read it to understand why a certain amount of empty space in the pool is desirable.

        I think your overall approach is correct. You’re using CMR drives, your AUS and data stripe sizes are all correct, and you’re using a 5-column setup with 8 drives instead of sticking with 3 columns.
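        If it helps to visualize the wrap-around, here is a toy Python sketch of the rotation described above. This is purely illustrative – real Storage Spaces allocates capacity in much larger slabs, not literally stripe by stripe – but the round-robin pattern is the same idea:

```python
def stripe_layout(disks: int, columns: int, stripes: int) -> list:
    """Assign each stripe's column slots to physical disks round-robin,
    illustrating how a 5-column space rotates across 8 disks."""
    layout, nxt = [], 0
    for _ in range(stripes):
        row = [(nxt + c) % disks for c in range(columns)]
        layout.append(row)
        nxt = (nxt + columns) % disks
    return layout

# With 8 disks and 5 columns: stripe 0 lands on disks 0-4,
# stripe 1 on disks 5, 6, 7, 0, 1, and so on.
for row in stripe_layout(8, 5, 3):
    print(row)
```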

  10. Fudge. My PowerShell is weak. I cannot seem to get anything but default values to load without some kind of error.

  11. Alright, I have been struggling with this for several hours. I feel like I grasp your concepts and it makes sense until I try to do this myself.

    I have 8x8TB drives; the only way I can get the AUS divisible by the interleave would be putting 5 columns in there (I am horrible at math, so I know I could be wrong). The problem is it seems that is, what, 21% efficiency? I am sure I am doing this wrong.

    Why I am struggling I don’t know. Could I beg you for some help on this?

    1. Sorry, I forgot to mention my settings:

      I was using AUS 64K, 5 columns, interleave 16K.

      1. Nah, you’re fine. 8 units in a 5-column setup with an interleave of 16KiB gives you a data stripe size of 64KiB. The 21% isn’t your efficiency, it’s your parity loss – your efficiency is 80%.

  12. I am having no luck with this… Not sure why… Server 2022 – 5x18TB drives, 16KB Interleave, 5 columns, 64K AUS ReFS file system. I thought everything was going fine with some test transfers, then I start my copy from another array and performance craters. It also appears to be caching. Any ideas?

    1. UPDATE: I do a test… good performance. Stop the copy. Do the test again, poor performance – clearly caching. Stop the test, wait for cache to settle, reformat partition, start another copy – good performance and perfmon shows cache bypass. I don’t get it…

  13. UPDATE2: Well… I think I have figured out what is going on. But I’m not sure why. If I BitLocker the volume, the cache doesn’t get bypassed, and performance tanks. If I re-format the volume and test without BitLockering the drive, performance is great. Any idea why? On this system, BitLocker ordinarily has a negligible effect on the performance of other (mirrored) volumes.

    1. I’m not sure why BitLocker would tank your performance, but I will caution you against using BitLocker encryption on a Storage Spaces volume. I describe in the Data Disaster series how that combo can make the data irrecoverably lost if anything happens to the volume.

  14. I’m testing out different AUS values on an array of 4x4TB, Interleave 32K, 3 Columns.

    64K AUS works great as expected, but so does 4K, 8K, 16K, and 32K. Why is this the case?

    For my use case 4K AUS is best, storing several hundred thousand 32KB – 4MB files alongside thousands of 2GB – 50GB files. Lots of wasted space with 64K AUS.

    Is it imperative to match AUS with the output of the equation? So far I can see no ill effects using 4K AUS with the above settings.

    1. The AUS should be an integer multiple of your stripe size. If your stripe size is 64KiB (3 columns with 1 parity, 32KiB interleave) and your AUS is 32 KiB, then your AUS/STRIPE ratio is 0.5 so the condition isn’t met.

      I’m thinking either Microsoft has implemented an update that would somehow support your scenario, or something else is at play. Did you try to move sizeable content onto the medium, like a 50 GiB file, to verify that the transfer speed doesn’t tank after 1 or 2 GiBs written? Also, the transfer speed should be about 2 times your single-disk write performance, so depending on your model, up to 400MiB/s.

      Also be sure to check that your drives are not SMR, that would also sort of explain the drops you’re seeing.
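      The alignment condition itself is trivial to check; a quick Python sketch (function name is mine, just to make the rule explicit):

```python
def aus_is_aligned(columns: int, parity: int, interleave_kib: int, aus_kib: int) -> bool:
    """True when the cluster size is a whole multiple of the data stripe,
    so a single cluster write never causes a partial-stripe
    read-modify-write update."""
    stripe_kib = (columns - parity) * interleave_kib
    return aus_kib % stripe_kib == 0

print(aus_is_aligned(3, 1, 32, 64))  # True: one cluster = one full stripe
print(aus_is_aligned(3, 1, 32, 4))   # False: 4 KiB clusters fall short of the 64 KiB stripe
```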

      1. I was able to push 2.21TB to it over gigabit Ethernet in about 7 hours 31 minutes. Transferring large files locally gives me inconsistent results. A 50GB file copied at 400MiB/s to 450MiB/s; only about 18GB of it went into modified memory, and it flushed rather quickly.

        Tried a 137GB file, got about 350MiB/s to 400MiB/s which dropped to about 200MiB/s to 250MiB/s halfway through. Modified memory stayed pretty consistent hovering around ~10GB.

        Copied that same file again and got wildly different results. Fast at first then slowed to a crawl at 20MiB/s. So far this has been pretty confusing.

        All my drives are HGST 4TB HUS724040AL (CMR)

        So for now I have it set back to AUS of 64K and will just take the loss on storing small files. Also probably going to switch to using 5 drives, 5 columns, 1 parity to get better than 66% storage efficiency.

        Thanks so much for putting this article together!

  15. A lot of the examples here use single parity, so for those curious, this was my command to make a 7-drive pool with two parity drives (all 3TB drives): New-VirtualDisk -StoragePoolFriendlyName “RDA” -ProvisioningType Thin -Interleave 16KB -FriendlyName “Array” -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -NumberOfColumns 6 -Size 13.6TB

    I could not figure out how to make it maximum size, so I just took the number they said was the full amount (around 19TB), multiplied it by 0.6, then went under a bit. I don’t have fast drives to transfer data from, but all of the transfers have been happening at the source disk’s maximum speed, plus diskfilltest was operating at around 450MB/s.

    1. I take that back, I calculated space by taking the individual drive size that was reported (2.73TB) and multiplying it by 5, then went a little under.

  16. This post has changed my life! Why on earth doesn’t MS surface this kind of information? My dual-parity storage space now gets around 300MB/sec!!

  17. Great post and information. Got my 5x 4TB setup to saturate 1Gbps Ethernet. Copied ~5TB of data from my backup machine with consistent 120MB/s writes.
    Then, a few weeks later, the system SSD crashed and I had to reinstall Windows on a new system disk. All five disks belonging to the virtual drive (SS) were automatically found and attached. But now I get full write speed for 5 GB, and then the speed drops to a few MB/s or stops completely.
    Is there a way to check that the settings for my virtual disk are still the same? I have no idea what is going on.

  18. This guide saved me a ton of headache (and possibly money), thank you very much!
    I did the 3-disk variant, just as you described, and the 30-40MB/s-ish speeds got up to 250-450MB/s, which is incredible. It dips to 30-50 when copying a lot of images, for example, but I think that’s just natural – small files and HDDs don’t get along well.
    I’m planning on upgrading to 5 disk in the future, so I’ll try out that version too!

    Thank you again!

  19. Hi,

    I seem to be missing an important point here:

    When writing an NTFS cluster of 64 kB, the correct interleave size provides perfect alignment and max performance. But most of the time I’m not writing exactly 64 kB; most writes are larger than that.

    Will NTFS always issue writes to the storage system in cluster size, one AUS at a time? Will it not request to write a blob greater than that, leaving it to the storage subsystem to divvy it up?

    Regards

  20. Hi, this guide was very helpful in increasing write speeds. I have 5x3TB WD30EFRX in single parity. I have set interleave 32 and AUS 128. With interleave 16 and AUS 64, the performance is lower in my case.

  21. This has been a great education. Thank You!
    That being said, I have 8 mixed drives ranging from 500GB to 4TB. All drives separately achieve write speeds of 150 to 200 MB/s. I have my pool set to 5 columns, single parity, and have tried every interleave and AUS under the sun. Any help would be appreciated.

    1. @Anonymous, mixed drives may not be the best choice. For optimal performance, you definitely want 8 identical drives. In addition, once you’ve done that, I believe you want 7 columns for 8 drives. Unfortunately, if the drives must be mixed, YMMV on performance, and finding the optimal configuration becomes much more complex.

      1. Correction, I believe 8 columns (or more) for 8 drives is optimal. From this article, it looks like 5 columns can work, but I don’t believe it will provide maximum throughput.

      2. Ok, final correction: I still believe you’re asking for trouble mixing disks like that, but on the columns point I clearly got it wrong. 8 columns will not allow you to choose a proper AUS, so 5 was a fine choice. Sorry for the confusion.
