9916 Brooklet Drive Houston, Texas 77099 Phone 832-327-0316 www.safinatechnolgies.com

RAID Made Easy

By Jon L. Jacobi, PCWorld

What is RAID, why do you need it, and what are all those mode numbers that are constantly bandied about?

RAID stands for "Redundant Array of Independent Disks" or "Redundant Array of Inexpensive Disks," depending on who you talk to. Note that the word array is included in the acronym, so saying "RAID array," as a lot of people do, is redundant.

Back when hard drives were less capacious and more expensive, RAID was created to combine multiple, less-expensive drives into a single, higher-capacity and/or faster volume. On top of that, it was designed to facilitate redundancy, also known as fault tolerance or failover protection, so that the array and its data remain usable when a drive fails. You'll often hear about "1-disk" or "2-disk" redundancy, which refers to the number of drives that can fail while the array remains viable.

Redundancy is important for a small business, as drive failure does happen. RAID's data redundancy offers no protection against data lost to malware, theft, or natural disaster, and it's certainly no substitute for proper backup practices, but it does provide a fail-safe against hardware failure.

RAID has levels, or methods by which the drives are ganged together; commonly people refer to levels by number. The three most common levels in the consumer and small-office markets are RAID 0, RAID 1, and RAID 5. However, you'll encounter numerous other options too, including levels 6, 10, 5+1, JBOD ("just a bunch of disks"), and Microsoft's virtual disk RAID, as well as abstracted RAID implementations such as Drobo BeyondRAID, Netgear X-RAID, and Synology SHR.

Common RAID Modes

RAID 0

Picture the 0 in the RAID 0 name as an oval racetrack and you've divined its primary purpose: faster performance.
RAID 0 distributes data across multiple drives (for example, block A goes to and from drive 1, block B goes to and from drive 2), which permits increased write and read speeds. This approach is often referred to as striping, and other modes (as you'll see later) employ the technique as well.

Regrettably (and dangerously, if you aren't aware of the risks), RAID 0 offers no protection against drive failure, since this mode does not write any duplicate or parity information. Hence, when a drive fails, you end up with a puzzle that's missing pieces. In such a situation, your data is quite possibly gone, though you can find service providers that might be able to recover it for a lot of money.

RAID 1

RAID 1 writes and reads the same data to pairs of drives; it's also referred to as mirroring. The drives are equal partners: should either fail, you can continue working with the good one until you can replace the bad one. RAID 1 is the simplest, easiest method to create failover disk storage. However, it costs you a whopping 50 percent of your total available drive capacity; for example, two 1TB drives in a mirrored array net you only 1TB of usable space, not 2TB.
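The two placement schemes described so far, striping and mirroring, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: drives are modeled as dictionaries, striping is plain round-robin, and the names `stripe_location` and `mirror_write` are hypothetical helpers, not part of any real RAID controller or library.

```python
def stripe_location(block_index, num_drives):
    """Return (drive, row) for a block under simple round-robin striping."""
    return block_index % num_drives, block_index // num_drives


def mirror_write(drives, block_index, data):
    """Write the same block to every drive in a mirrored set."""
    for drive in drives:
        drive[block_index] = data


# Striping: blocks 0..3 (A, B, C, D) alternate between two drives.
layout = [stripe_location(i, 2) for i in range(4)]
# layout == [(0, 0), (1, 0), (0, 1), (1, 1)]

# Mirroring: both drives end up with identical contents,
# so either one can keep serving data if the other fails.
drive_a, drive_b = {}, {}
mirror_write([drive_a, drive_b], 0, b"block A")
```

In the striped layout each drive holds only half of the blocks, which is why losing one drive leaves the "puzzle that's missing pieces"; in the mirrored pair each drive holds everything, which is where the 50 percent capacity cost comes from.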
You may have as many pairs of mirrored drives as your RAID controller allows. And in the unlikely event that said consumer-grade controller supports duplex reading, RAID 1 can provide an increase in read speeds by fetching blocks alternately from each drive.

RAID 5

This RAID mode offers both speed and data redundancy. RAID 5 writes data to and reads from multiple disks, and it distributes parity data across all the disks in the array. Parity data is a smaller amount of data derived mathematically from a larger set that can accurately describe that larger amount of data, and thus serves to restore it. Since parity information is distributed across all the drives, any one drive can fail without causing the entire array to fail.

RAID 5 uses the equivalent of one drive's capacity for parity information (one-third of the total in a three-disk array), and requires a minimum of three disks to implement. Since data is read from multiple disks, performance can improve under RAID 5, though some users report that RAID 5 slows performance greatly when it's processing multiple reads in a server situation.

JBOD

This is shorthand for "just a bunch of disks." It's not actually RAID, but it is often available as an option on multidisk storage boxes that offer RAID. JBOD offers no speed increase or redundancy. Rather, it simply concatenates a group of disks into a single volume. Data is written to the first drive until it's full, then to the second until it is full, and so on, until the last drive has no more room. Even though many network-attached storage devices provide this option, we don't recommend using it unless it's the only thing available, you really need a single large volume, and you don't have the choice of using RAID 0 (an unlikely circumstance).

Drive Extender

Microsoft has abandoned this technology, which was formerly employed on NAS boxes running Microsoft Windows Home Server (prior to WHS 2011).
A smart file replication methodology, Drive Extender allowed you to configure which data would be replicated on a folder-by-folder basis.

Abstracted RAID

Drobo, Intel, Netgear, and Synology all offer what is frequently referred to as abstracted or virtual RAID. You'll even find a form of abstracted RAID in Windows 7, and Storage Spaces in Windows 8 takes the idea even further.

"Abstracted" means that instead of using physical disks as the building blocks of an array, this arrangement employs virtual volumes (or virtual disks, in Microsoft's parlance). Virtual volumes are handy in that they might take up only part of a disk, or they can expand across multiple disks. For instance, you could have a virtual volume that consumes all of one 500GB disk and half of a 1TB disk. You would then have a second virtual volume that uses the remaining 500GB on the 1TB drive. The RAID software manages them behind the scenes, and they appear as a single storage unit to the user, if so desired.

Abstracted RAID allows you to mix different-capacity drives and varying levels of fault tolerance, as well as to expand capacity automatically without user intervention. Without it, changing RAID levels involves backing up all the data, reconfiguring, and then copying the data back, a time-consuming and sometimes technically challenging activity.
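As a rough illustration of the virtual-volume example above (one 500GB disk and one 1TB disk), the sketch below models each virtual volume as a list of extents carved out of physical disks. The data layout and the helper names `volume_size`, `disk_usage`, and `fits` are assumptions made for this example; real products such as BeyondRAID, X-RAID, SHR, or Storage Spaces use their own internal formats.

```python
from collections import defaultdict

# Physical disks and their capacities in GB (from the example in the text).
disks_gb = {"disk1": 500, "disk2": 1000}

# Hypothetical model: each virtual volume is a list of (disk, GB) extents.
volumes = {
    "volume1": [("disk1", 500), ("disk2", 500)],  # all of disk1, half of disk2
    "volume2": [("disk2", 500)],                  # the rest of disk2
}


def volume_size(extents):
    """Total size of one virtual volume in GB."""
    return sum(size for _, size in extents)


def disk_usage(all_volumes):
    """GB allocated on each physical disk across every virtual volume."""
    used = defaultdict(int)
    for extents in all_volumes.values():
        for disk, size in extents:
            used[disk] += size
    return dict(used)


def fits(all_volumes, capacities):
    """True if no physical disk is over-allocated."""
    usage = disk_usage(all_volumes)
    return all(size <= capacities[disk] for disk, size in usage.items())
```

Here `volume_size(volumes["volume1"])` is 1000, and `fits(volumes, disks_gb)` confirms that the two volumes together use exactly the space the two disks provide.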
Hot Spare

RAID arrays sometimes employ a hot spare, which is simply an extra disk preinstalled in the NAS box or system that serves to replace a failed disk. This setup allows the rebuilding of the array to proceed automatically without user intervention.

Maximum Redundancy

You'll encounter three other RAID options that can be useful. They aren't often found in consumer-level RAID boxes, though they are present in some business-oriented NAS boxes.

RAID 6 is very much like RAID 5: It has distributed parity info, as RAID 5 has, but it also has twice as much of it, which means that RAID 6 can withstand the loss of two drives. With RAID 6, a second set of parity information is distributed across the drives, to the obvious detriment of total capacity. Nevertheless, in situations where you need the highest level of fault tolerance, RAID 6 is a good choice.

RAID 10, also referred to as RAID 1 + 0, stripes data (RAID 0) across mirrored pairs (RAID 1) of drives. With this arrangement, you get back some of the write speed that RAID 1 can cost you, but you need at least four drives to implement it, and 50 percent of the total drive capacity becomes devoted to redundancy.

Conversely, RAID 0 + 1 mirrors (RAID 1) striped pairs (RAID 0) of drives. As with RAID 10, you regain some of the write speed that RAID 1 can cost you. Again, you need at least four drives, and you spend 50 percent of the total drive capacity on redundancy.

Other RAID Options

The RAID specification includes several other levels aside from the ones addressed above; however, these are not commonly used anymore.

RAID 2 distributes data across multiple drives at the bit level (the smallest unit of computer information with a value of either 0 or 1) instead of at the block level. RAID 2 writes Hamming ECC (error-correcting code) recovery information to dedicated parity disks at the byte level, which requires a lot of processing power.
RAID 3 is another mode that got kicked off the consumer island because it doesn't use data blocks; it distributes data across multiple drives as bytes (8 bits), and like RAID 2 it stores parity information on a dedicated drive.

RAID 4 fell into disuse because it distributes data across multiple drives as blocks and stores all parity information on dedicated parity drives; if a dedicated parity drive fails, the entire array remains unprotected until it's replaced and the information is reconstructed. This weakness is also inherent in RAID 2 and 3.

Choosing RAID: A Cheat Sheet

Trying to determine which RAID level is best for you? Here's our take.

First off, use hardware RAID over software RAID when you have a choice. Software RAID is fast, but many implementations tend to rebuild at the drop of a hat, reducing performance while the rebuild is in progress.

Use RAID 0 when all you want is faster performance with large files, and you don't need fault tolerance. (But be sure to back up your drives regularly.)

Use RAID 1 when you have only two drives and you want to protect against drive failure.

Use RAID 5 when you have more than two drives and you want a hedge against drive failure.

Use abstracted RAID to derive maximum storage from a collection of drives of different capacities.
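The cheat sheet above can be read as a small decision procedure. The sketch below encodes it literally; the function name `suggest_raid_level` and its parameters are hypothetical, and it captures only the four rules listed, not the hardware-versus-software advice or any of the less common levels.

```python
def suggest_raid_level(num_drives, need_fault_tolerance, mixed_capacities=False):
    """Return the cheat sheet's suggestion for a given setup.

    Assumption: mixed-capacity drives take priority, because abstracted
    RAID is the only option the cheat sheet names for that case.
    """
    if num_drives < 2:
        raise ValueError("RAID needs at least two drives")
    if mixed_capacities:
        return "abstracted RAID"
    if not need_fault_tolerance:
        return "RAID 0"
    if num_drives == 2:
        return "RAID 1"
    return "RAID 5"
```

For example, `suggest_raid_level(2, True)` returns `"RAID 1"`, and `suggest_raid_level(4, True)` returns `"RAID 5"`. Whatever the suggestion, the earlier caveat still applies: none of these levels replaces regular backups.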
Understanding RAID

Advantages and Disadvantages of Each Level

There are different kinds of RAID setups, which provide various levels of data transfer speed and data backup or storage reliability. You can choose whether you want your drives to be very fast, very safe, or a combination of the two. RAID 0 is the fastest, RAID 1 is the safest, and RAID 5 is a great combination of safe and fast. The number of disks you use in your RAID setup is part of this formula.

Introduction to different RAID levels

RAID 0 (striping)
Description: Stripes two or more hard drives together and treats them as one large volume. For example, two 250GB drives combined in a RAID 0 configuration create a single 500GB volume.
Advantages: RAID 0 is used by those wanting the most performance out of two or more drives. Because a little of the data is kept on each drive, performance increases as more disks are added. In the ideal case, writing to 10 drives is roughly 10 times faster than writing to one drive. This is handy if you need large and fast volumes.
Disadvantages: Every drive has a limited life and each disk adds another point of failure to the RAID. Every disk in a RAID 0 is critical; losing any of them means the entire RAID (and all of the data) is lost.

RAID 1 (mirroring)
Description: Mirroring creates an exact duplicate of a disk on the fly. Every time you write information to one drive, the exact information is written to the other drive in your mirror. Important files (accounting, financial, personal records) are commonly backed up with a RAID 1.
Advantages: This is the safest option for your data. If one drive is lost, your data still exists in its complete form, and takes no time to recover.
Disadvantages: Your investment in data safety increases your disk costs since multiple, mirrored drives are seen as one.

RAID 2
Description: A rare implementation of striping similar to RAID 0; it stripes at the bit level instead of by blocks.

RAID 3
Description: An implementation of parity striping.
Disadvantages: Its limitation is that it cannot service multiple requests.

RAID 4
Description: Parity striping at the block level with an entire disk dedicated to parity data. Similar to but less common than RAID 5.

RAID 5 (parity striping)
Description: A common RAID setup for volumes that are larger, faster, and safer than any single hard drive. Parity striping at the block level with user data and parity data striped across all disks. At least three disks are required for RAID 5. No matter how many disks are used, an amount equal to one of them will be used for the parity data and cannot be used for user data.
Advantages: You can lose any one disk and not lose your data. Just replace the disk with a new one.

RAID 6
Description: Very similar to RAID 5, but adds an additional parity block.
Advantages: It allows for the failure of two disks simultaneously with no data loss.
Disadvantages: Slightly slower than RAID 5 on writes, but there is no added delay for reads.

RAID 10 (RAID 1+0)
Description: RAID 10 works by striping and mirroring your data across at least four disks.
Advantages: RAID 10 is secure because mirroring duplicates all your data. It's fast because the data is striped across two or more disks, meaning chunks of data can be read from and written to different disks.

RAID 50 (RAID 5+0)
Description: A RAID 50 combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5. This is a RAID 0 array striped across RAID 5 elements. It requires at least six drives.
Advantages: Provides a great balance between storage performance, storage capacity, and data integrity that's not necessarily found in other RAID levels. One drive from each of the RAID 5 sets could fail without loss of data.
Disadvantages: The time spent in recovery (detecting and responding to a drive failure, and the rebuild process to the newly inserted drive) represents a period of vulnerability to the RAID set.

RAID 60 (RAID 6+0)
Description: A RAID 60 combines the straight block-level striping of RAID 0 with the distributed double parity of RAID 6. That is, a RAID 0 array striped across RAID 6 elements. It requires at least eight drives.
Advantages: A great fit when you need higher usable capacity and better reliability.
Disadvantages: Slight loss in write speed and performance.
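The parity mentioned for RAID 5 is commonly computed with a bytewise XOR across the data blocks in a stripe. The sketch below assumes that simple XOR scheme and ignores how parity is rotated across disks; `xor_blocks` is a hypothetical helper written for this illustration, not an API from any RAID product.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks in one stripe, plus the parity block derived from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second block: XOR the survivors with the parity
# block and the missing data comes back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Because XOR-ing a value with itself cancels out, any single missing block can be recovered from the remaining blocks plus parity. That is why the array survives one failed disk, and why the parity costs the equivalent of one disk regardless of how many disks are in the set.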
Hardware vs Software RAID

RAID can be implemented in hardware, in the form of special disk controllers that are typically built into a multi-drive enclosure, or in software, with an operating system module that takes care of the housekeeping required for data to be written properly to the disks used in the RAID configuration. The Windows, Mac OS X, and Linux operating systems all offer the ability to create a RAID configuration without any additional software.

The drawback to using your operating system, or other software, to create a RAID is that it will add to the compute load on your computer, which will likely slow your computer's performance. Using a hardware RAID system, in an external drive enclosure or an expansion card installed in the computer, would not slow down your computer's performance.

How do I RAID?

You need at least two hard drives and a way to RAID them, whether via software or hardware. Many CRU hard drive enclosures perform RAIDs on the device, so you don't need additional software beyond a simple configurator (most CRU enclosures can be configured on the enclosure itself, without any additional software required). Some RAID levels require at least three disks, and some need four or more.

You'll want to buy matching drives for your RAID, so plan accordingly. If you attempt to RAID disks of different sizes together, most RAID methods treat each of the disks as if it were the same size as the smallest disk in the RAID. (The exception to this is a non-RAID option where spanning is involved; the full capacity of each disk is used in that case. Similar to RAID, a spanned volume stores data on more than one drive, yet it appears as one volume. The difference is that spanned volumes are not at all redundant, since a single drive is filled before data is written to the next drive in the sequence.)
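The capacity rules scattered through this guide (smallest-disk sizing, one disk's worth of parity for RAID 5, 50 percent for mirroring, full capacity for spanning) can be pulled together in a small calculator. This is a sketch under the assumptions stated in the comments; `usable_capacity` and its level names are hypothetical, and real controllers reserve some additional space for metadata.

```python
def usable_capacity(disk_sizes_gb, level):
    """Estimate usable capacity in GB for a list of disk sizes.

    Assumption: apart from spanning, every disk is treated as if it
    were the size of the smallest disk in the set.
    """
    if not disk_sizes_gb:
        raise ValueError("at least one disk is required")
    n = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)

    if level == "span":      # JBOD-style spanning: full capacity, no redundancy
        return sum(disk_sizes_gb)
    if level == "raid0":     # striping: no capacity lost, no redundancy
        if n < 2:
            raise ValueError("RAID 0 needs at least two disks")
        return smallest * n
    if level == "raid1":     # a single mirrored pair
        if n != 2:
            raise ValueError("this sketch models RAID 1 as one pair of disks")
        return smallest
    if level == "raid5":     # one disk's worth of capacity goes to parity
        if n < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return smallest * (n - 1)
    if level == "raid10":    # striped mirrored pairs: half the capacity
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least four")
        return smallest * n // 2
    raise ValueError(f"unknown level: {level}")
```

For instance, `usable_capacity([500, 1000, 1000], "raid5")` returns 1000, because each disk counts as 500GB and one disk's worth goes to parity, whereas `usable_capacity([500, 1000, 1000], "span")` returns the full 2500. The gap between those two figures is the cost of mixing drive sizes in a conventional RAID.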