When configuring the disk drives on a Windows or Linux server, it is preferable to use RAID (redundant array of inexpensive disks) instead of a single hard drive. However, not all RAID configurations are equal, and some are preferable to others. In short, we always recommend and use RAID-1 mirroring instead of RAID-0 disk striping (or RAID-5 striping with parity) for servers.
RAID-0 requires two or more disk drives, RAID-5 requires at least three, and both spread the data evenly across all of the drives. An easy way to visualize RAID-0 on two drives is to imagine a document with ten pages. Page 1 is stored on the first hard drive, page 2 on the second drive, page 3 on the first drive again, and so on. As the document is saved or retrieved, both hard drives work to return the data, providing faster performance than reading the entire document from a single drive. This is known as split seeking.
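The ten-page picture can be sketched in a few lines of Python. This is only an illustration of the round-robin idea: the function name `stripe_pages` is made up for this example, and a real array stripes fixed-size blocks rather than document pages.

```python
def stripe_pages(num_pages, num_drives):
    """Assign pages 1..num_pages to drives in round-robin order."""
    drives = [[] for _ in range(num_drives)]
    for page in range(1, num_pages + 1):
        # Page 1 goes to drive 1, page 2 to drive 2, page 3 back to drive 1, ...
        drives[(page - 1) % num_drives].append(page)
    return drives


layout = stripe_pages(10, 2)
for index, pages in enumerate(layout, start=1):
    print(f"drive {index}: pages {pages}")
# drive 1: pages [1, 3, 5, 7, 9]
# drive 2: pages [2, 4, 6, 8, 10]
```

Each drive holds only half of the document, which is why both drives can work on a read at once, and also why neither drive is useful on its own.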
RAID-5 uses one extra drive's worth of capacity for parity checking. Standard RAID-5 distributes the parity information across all drives; configurations that dedicate a single drive to parity are more properly called RAID-3 or RAID-4.
The intended benefit of a striped array is improved performance over a single drive, and with parity (RAID-5) the system also provides fault tolerance when a disk drive develops an error or fails completely.
However, RAID-0 configurations have no redundancy and cannot run with a degraded array. The server will shut down or fail to start up until the failed drive is replaced and the data is restored. On a simple striped two-drive array, a failure of either drive results in the loss of all data and requires a restore from backup. We do not recommend using RAID-0 for servers.
With RAID-5, the additional drive provides parity checking so that any one of the drives can fail and its contents can be regenerated using the parity data.
Once the failed drive in an array with parity is replaced, the missing data must be rebuilt by reading the entire contents of the remaining drives and then calculating the missing data. Some RAID controllers require rebuilding the array before the operating system is started, resulting in delays of hours before the server is available.
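To see why the rebuild has to read every remaining drive, here is a minimal sketch of XOR parity, the calculation RAID-5 relies on. The three four-byte blocks and the helper name `xor_blocks` are invented for illustration; a real controller works on much larger blocks and rotates the parity position from stripe to stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three data drives (sample contents are made up).
data_blocks = [b"ACCT", b"DATA", b"2003"]
parity = xor_blocks(data_blocks)

# Simulate losing the second drive: only the other data blocks and the
# parity block survive, and ALL of them must be read to recover the loss.
surviving = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(surviving)

assert rebuilt == data_blocks[1]
print("rebuilt block:", rebuilt)
```

The same calculation is repeated for every stripe on the array, so the rebuild time grows with the total size of the drives, not with the amount of data actually stored on them.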
When a RAID controller card is used, Windows Server software disables the write cache and advanced write cache options. Most RAID cards have dedicated cache RAM, and some are battery backed to provide protection from unexpected shutdowns, but these integrated caches are small (32 MB or 64 MB) in comparison to the cache RAM available on a Windows server, which can be 1 GB or more.
Windows Servers use all available unused RAM for caching. For example, on a Windows 2003 server with 4 GB RAM, up to 3 GB RAM can be available for read/write caching after the operating system and programs are loaded.
When using RAID-5, we recommend configuring a fourth drive as a hot spare. This allows the RAID controller to automatically fail over to the spare drive and rebuild the array using the spare.
Another limitation of RAID-0 systems involves disk expansion. When an existing array is expanded, the next drive must be the same size as or larger than the existing drives. But when a larger disk is provided, the array manager will only expand the volume in an increment equal to an existing drive. For example, on a 3-drive array using 74 GB SCSI drives, only the first 74 GB of the fourth drive will be added to the array; the remaining space is unused and unavailable to the array.
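The arithmetic behind the expansion limit is simple enough to sketch. The function `expand_array` and the 146 GB replacement size are assumptions chosen for this example; the rule it encodes (only an increment equal to an existing member is used) is the one described above.

```python
def expand_array(existing_sizes_gb, new_drive_gb):
    """Return (gb_added_to_array, gb_left_unused) for a newly added drive."""
    member_size = min(existing_sizes_gb)
    if new_drive_gb < member_size:
        raise ValueError("new drive must be at least as large as existing members")
    # Only a slice equal to the existing member size joins the array.
    return member_size, new_drive_gb - member_size


# Three 74 GB drives, expanded with a hypothetical 146 GB drive.
added, unused = expand_array([74, 74, 74], 146)
print(f"added to array: {added} GB, unused: {unused} GB")
# added to array: 74 GB, unused: 72 GB
```

In this example nearly half of the new drive is paid for but never used by the array.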
Over the life of a system, adding drives to an array as more space is required means the original, aging drives stay in service while the total number of drives grows, and both factors increase the chance that the array suffers a drive failure.
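A rough calculation shows how the drive count alone affects the odds. This sketch assumes drives fail independently and uses a purely hypothetical 3% chance per drive per year; it ignores the aging of the original drives, which would make the real picture worse.

```python
def any_drive_fails(per_drive_probability, drive_count):
    """Probability that at least one of drive_count independent drives fails."""
    return 1 - (1 - per_drive_probability) ** drive_count


# Hypothetical figure for illustration only, not a measured failure rate.
ANNUAL_FAILURE_PROBABILITY = 0.03

for count in (2, 3, 4, 6):
    chance = any_drive_fails(ANNUAL_FAILURE_PROBABILITY, count)
    print(f"{count} drives: {chance:.1%} chance of at least one failure per year")
```

Whatever per-drive figure is used, the result rises with every drive added, and on a RAID-0 array any single failure means restoring everything from backup.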
The better alternative to RAID-0 or RAID-5 is RAID-1, also called disk mirroring. RAID-1 uses two drives and operates the drives as duplicates, saving all information simultaneously to both drives. This feature can be configured either through a controller card or the Windows server operating system.
In the same way that RAID-0 and RAID-5 perform split seeks, Windows servers will perform split seeks when reading data from mirrored drives, resulting in improved performance over a single drive.
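The behavior of a mirror can be modeled with a small toy class. Everything here (the class `MirroredVolume`, its methods, and the use of dictionaries as "drives") is hypothetical and is not how Windows implements mirroring; it only illustrates that every write lands on both drives, reads can alternate between them, and one drive can fail without losing data.

```python
import itertools


class MirroredVolume:
    """Toy model of a two-drive mirror; each drive is a dict of block -> data."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = set()
        self._next_drive = itertools.cycle([0, 1])

    def write(self, block, data):
        # Every write goes to all healthy drives, keeping them identical.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def read(self, block):
        # Alternate reads between the drives, skipping a failed one.
        for _ in range(len(self.drives)):
            index = next(self._next_drive)
            if index not in self.failed:
                return index, self.drives[index][block]
        raise IOError("both drives have failed")

    def fail(self, index):
        self.failed.add(index)


volume = MirroredVolume()
volume.write(7, b"payroll")
print(volume.read(7))  # served by drive 0
print(volume.read(7))  # served by drive 1
volume.fail(0)
print(volume.read(7))  # still readable, now from drive 1 only
```

Because each drive holds a complete copy, the surviving drive can keep serving requests by itself, which is the key difference from a striped array.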
When RAID-1 disk mirroring is configured in the Windows Server disk utility, Windows also provides the option of 3-second write caching or 10-second advanced write caching. With write caching enabled, all available server RAM can be used to save data for up to ten seconds, allowing the server to prioritize read requests ahead of write requests, resulting in significant performance gains.
When write caching is enabled in Windows, the server should be protected with a battery backup (“UPS”) to prevent unexpected shutdowns due to power failure. As long as the UPS provides more than ten seconds of shutdown protection, the server has enough time to flush all unsaved data from the RAM cache onto the hard drive, ensuring no data is lost.
RAID-1 mirroring provides better fault tolerance and recovery than RAID-5, since a failure of either disk will not shut down the server. When Windows server software is used for RAID-1 mirroring, a replacement drive can be re-mirrored while the server is running, eliminating the restore delay required with RAID-0 and the rebuilding delay required with RAID-5.
In conclusion, the best choice for a server with disk fault tolerance is software RAID-1 disk mirroring.