RAID in a nutshell

A Redundant Array of Independent Disks (RAID) is a collection of disks managed by specialized array management software that coordinates their activities. An array's member disks are part of a disk subsystem: the disks and the hardware that powers, packages, and connects them to host computer systems and controls their operation.

Array management software may be host-based (executing in a host computer) or subsystem-based (executing in an intelligent disk controller). In either case, its functions are to:

- present the array's storage capacity to the hosting environment as one or more virtual disks with the desired balance of cost, data availability, and I/O performance.
- mask the array's internal complexity from the hosting environment by transparently mapping its available storage capacity onto its member disks and converting I/O requests directed to virtual disks into operations on member disks.
- recover from disk and path failures and provide continuous I/O service to the hosting environment.

Array management software creates virtual disks for application use. From an application standpoint, virtual disks are functionally identical to physical disks, but with superior reliability and, in most cases, superior performance.

The "redundant" in RAID is achieved by dedicating part of an array's storage capacity to check data. Check data can be used to regenerate individual blocks of data from a failed disk as they are requested by applications, or to reconstruct the entire contents of a failed disk to restore data protection after a failure. The most common forms of check data are a mirror (identical) copy of user data and shared parity, in which check data is computed with the exclusive OR function from the corresponding blocks of user data on the other member disks, so that any one missing block can be recomputed from the rest.

Different combinations of mapping and check data are called RAID Levels.* Of the seven well-defined RAID Levels, three are in common use. Level 1 uses mirroring for data protection and may incorporate striping.
Striping refers to the placement of consecutive sequences of data blocks on successive array members. Striping balances the I/O load across the members, which increases performance. Levels 3 and 5 both use parity for data protection and almost always incorporate striping. RAID Levels 3 and 5 use different algorithms for updating both user data and check data in response to application write requests.

RAID is a component technology that can be combined with other technologies to create highly available, high-performance storage systems.

* See "Choosing the right RAID for you" (page 9) for more details on the key RAID levels. See "Where does RAID fit in your total storage strategy?" (page 6) for an illustration depicting striping.
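The striping described above can be sketched as a simple address calculation. This is only an illustration, not the mapping used by any particular product: the function name `map_virtual_block` and the parameters `num_disks` and `stripe_depth` (the number of consecutive blocks placed on one member before moving to the next) are assumptions introduced for the example.

```python
from typing import Tuple


def map_virtual_block(virtual_block: int, num_disks: int, stripe_depth: int) -> Tuple[int, int]:
    """Map a virtual disk block number to (member disk, block on that member).

    Hypothetical helper: consecutive runs of `stripe_depth` blocks are placed
    on successive member disks, wrapping around after the last member.
    """
    if num_disks < 1 or stripe_depth < 1:
        raise ValueError("num_disks and stripe_depth must be at least 1")
    if virtual_block < 0:
        raise ValueError("virtual_block must be non-negative")
    # Which run of consecutive blocks, and the position inside that run.
    chunk, offset = divmod(virtual_block, stripe_depth)
    # Which row of the array the run falls in, and which member holds it.
    row, disk = divmod(chunk, num_disks)
    return disk, row * stripe_depth + offset


if __name__ == "__main__":
    # Four members, two blocks per run: blocks 0-1 land on member 0,
    # blocks 2-3 on member 1, and so on, wrapping back to member 0 at block 8.
    for block in range(10):
        print(block, map_virtual_block(block, num_disks=4, stripe_depth=2))
```

Because neighbouring runs of blocks sit on different members, a burst of requests to adjacent data is shared among several disks rather than queued on one.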
RAID can protect your most valuable asset: the data that drives your business.

For even the simplest business tasks, you depend on reliable, current information. Now, more than ever before, readily accessible data is vital to keeping your enterprise up and running. While keeping that information available was once complex and costly, especially in a distributed environment, today's RAID technology makes high data availability simple and affordable.

What's a RAID array, anyway?

A disk array is a collection of disks controlled by a common array management function which coordinates their activities. If the management function provides RAID capability, the array may be called a RAID array. A storage enclosure that holds disks, one or more intelligent controllers, and possibly other storage devices is called a storage system.

If the array management function is provided by firmware within an intelligent controller, then the array's disks all reside within a single storage system. This configuration is often called subsystem-based RAID. If, however, the array management function is provided by software that executes in the hosting environment, the disks comprising a single RAID array may be parts of different storage systems. This configuration is often called host-based RAID.

RAID (Redundant Array of Independent Disks) is a way of coordinating multiple disk drives to protect against loss of data availability if one of them fails. When it is coupled with other technologies for highly available systems, RAID technology can help a storage system provide on-line data access that doesn't break down. Such technologies include:

- uninterruptible power control systems,
- redundant power supplies and fans,
- intelligent controllers that can back each other up, and
- operating environments that can detect and respond to storage system recovery actions.

In most cases, the second-generation RAID systems available today deliver improved I/O performance as well.
With RAID, you can balance the cost, availability, and I/O performance of your storage system to meet your business and application needs, and adjust configurations to meet your requirements as they change.

The Buyer's Guide to RAID can help you understand RAID technology and its role in highly available storage systems. In consultation with your subsystem supplier, it can equip you to choose the right RAID system for your applications.

Digital Equipment Corporation offers both host-based and subsystem-based RAID products that are fully tested and certified for operation in multiple computing environments, including:

- Novell NetWare
- SunOS
- Sun Solaris
- Microsoft Windows NT
- SCO UNIX
- HP-UX
- IBM AIX
- OpenVMS
- Digital UNIX
Aren't data reliability and data availability the same thing?

In discussions of RAID, the terms data reliability and data availability frequently come up.

Data reliability, usually expressed as Mean Time to Data Loss (MTDL), is the average length of time a storage system can be expected to perform without a failure that causes loss or destruction of data. Highly available storage systems that include redundant components are usually designed so that single component failures do not cause data loss. Data reliability is therefore usually much higher than individual component reliability in these systems. Since RAID protects against loss of data due to failure of a disk, it increases MTDL substantially (from hundreds of thousands to millions of hours in typical modern storage systems).

Data availability, usually expressed as Mean Time to Availability Loss (MTAL), is a storage system's ability to deliver correct data on demand. Data may be intact, but if the only access path to it has failed, it is not available. To provide extended data availability, a storage system must protect not only against data loss (e.g., using RAID), but also against failures that could cause the hosting environment to lose access to data. Dual-redundant controllers, dual paths to disks, and redundant power supplies and host I/O buses, with host support for these features, are typical components of a storage system that provides high data availability.

What do you want from your storage system?

On-line storage systems are essentially simple. They store data and deliver it on request back to the hosting environment. The qualities that make a storage system more or less desirable are also quite simple. The ideal storage system would cost very little, store and deliver data on demand, and never fail.

Technologies that provide very high data reliability and availability do exist, at a cost. Technologies that provide ultra-high I/O performance are also available, again at a cost.
And while low-cost storage systems are also available, they don't offer the performance or the data reliability and availability of more expensive solutions. In general, you must decide which of the three properties of storage (low cost, high availability, and high I/O performance) is most important, and choose accordingly. RAID technology expands your options substantially.

[Figure: storage configurations positioned among three goals: Lowest Cost, Highest Performance, and Highest Availability. Configurations shown: JBOD (Just a Bunch of Disks), RAID 0, RAID 0+1, RAID 5, RAID 3, and RAID 1.]
RAID: one part of the solution

Prior to the introduction of RAID, a disk failure meant that the processing of the disk's data stopped. Recovery entailed a lengthy operation using backup tapes and transaction journals or, even worse, re-doing work to restore current data.

Enter RAID. The introduction of commercial RAID, in the form of disk mirroring (also called shadowing), changed all this. Also known as RAID Level 1, disk mirroring allows processing to continue even after a disk fails. Sophisticated mirroring systems can even use a pre-designated spare disk to restore protection without interrupting data processing. Disk mirroring also improves I/O performance for many applications because concurrent read requests can be directed to different disks.

About the only drawback to disk mirroring is cost. For every byte of on-line storage required, you must purchase, house, power, cool, and attach an additional byte. And although the cost of disk storage is steadily decreasing, IS budgets are being squeezed even more rapidly. Paying twice for storage is clearly undesirable for all but the most critical data.

Doesn't replacing a failed component always affect availability?

Replacing a failed component in a storage system can affect data availability in different ways:

- Cold swap means that a system must be powered down before the failed component can be replaced. Cold swapping is common with logic modules and nonredundant power supplies.
- Warm swap means that a system must be quiesced, but need not be powered down, for component replacement. Some storage systems require warm swapping of disks.
- Hot swap means that a failed component can be replaced while the system containing it continues to operate. The ability to hot swap major components is a common feature of storage systems built for high-availability operations.
- Hot spares are pre-installed spare components that are powered and ready to operate without human intervention.
Many highly available storage systems provide for pre-installation of spare components that are automatically brought into service if a primary component fails. Spare disks (in RAID systems), power supplies, and fans are often provided.
Two of the RAID Levels identified by researchers at the University of California, Levels 3 and 5, have proven commercially attractive, and most RAID storage systems offer one or both of these in some form. Levels 3 and 5 use part of the disk array capacity to store parity (check data).

RAID Levels 3 and 5 differ in the way they coordinate member disk activities to satisfy application I/O requests, and as a result tend to work better in different environments.

In a RAID Level 3 array, the disks are physically or logically synchronized, and each contributes equally to satisfy every I/O request made to the array (an arrangement called parallel access). Storage systems that contain RAID Level 3 arrays perform well for applications that transfer large files, but less well in applications that make frequent requests for smaller amounts of data.

In a RAID Level 5 array, the disks are allowed to operate independently (an arrangement called independent access) so that, in principle, the array may satisfy multiple application I/O requests concurrently. RAID Level 5 arrays perform well for either large-file I/O or transaction-like I/O as long as the majority of the I/O requests are application read requests. Application write requests, which require both user data and check data to be updated, perform less well.

[Figure: Mirroring RAID, showing a 2-way mirrorset and a 3-way mirrorset. One member of each mirrorset provides capacity; the remaining members provide data redundancy.]

[Figure: Parity RAID. Capacity equal to one member of the array provides redundancy for the data capacity of the remaining members of the array.]

[Figure: Hot spare. Spares may be dedicated to a single array or available to any array in the system.]
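To see why application writes cost more than reads in an independent-access parity array, consider one common way of keeping parity current when a single data block changes. The sketch below is an assumption-laden illustration, not a description of any vendor's update algorithm: the helper names `xor_blocks` and `updated_parity` are hypothetical, and the update rule shown (new parity = old parity XOR old data XOR new data) is the widely used read-modify-write approach.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive OR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Parity after one data block is overwritten (read-modify-write).

    Removing the old data's contribution and adding the new data's
    contribution avoids reading the other members of the array.
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


if __name__ == "__main__":
    # Three data blocks on three members, parity held on a fourth.
    data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
    parity = xor_blocks(xor_blocks(data[0], data[1]), data[2])

    # One application write to the second block. In this scheme it costs
    # two reads (old data, old parity) and two writes (new data, new parity).
    new_block = b"\xaa\xbb"
    parity = updated_parity(data[1], parity, new_block)
    data[1] = new_block

    # The shortcut agrees with recomputing parity from every data block.
    full = xor_blocks(xor_blocks(data[0], data[1]), data[2])
    print(parity == full)
```

In this scheme a single application write turns into four member-disk operations, which is one way to understand the write penalty noted above.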
Where does RAID fit in a storage hierarchy?

RAID Levels 3 and 5, known as parity RAID, add a new dimension to the cost-availability-performance hierarchy of on-line storage. With parity RAID, protection against single disk failure can be had for a much smaller cost premium (usually 10% to 35%) than with mirroring. Many storage systems allow you to choose the number of disks to be arrayed, effectively allowing you to choose the cost premium within broad limits.

There is a price to be paid for this low-cost availability, however. The two forms of parity RAID, Levels 3 and 5, have very different I/O performance characteristics. Early RAID systems forced you to choose between RAID Levels 3 and 5, effectively dedicating each array to a single type of application.

With today's sophisticated storage systems, you no longer need to choose between RAID Levels 3 and 5. Some new storage systems allow mixed arrays within the same storage subsystem. Others automatically and dynamically adjust between the two forms of parity RAID as the I/O load changes, and augment the RAID capability with cache to further mitigate performance differences. The resulting storage systems offer high-performance access to protected data for almost any application.

Retrieving data from a failed disk

Regeneration is the on-demand recreation of user data from a failed array member disk. Regeneration not only recovers data from failed disks, it can also recover data when a hard media error occurs. Data regeneration in a RAID Level 1 array consists simply of delivering an alternate copy of data. In a parity RAID array, data regeneration is performed using the parity and data from corresponding locations on the array's surviving member disks.*

Reconstruction is the restoration of data protection when a replacement for a failed member disk has been made available. Reconstruction consists of block-by-block regeneration of the user and check data from the failed disk, and writing it to the replacement disk.
Reconstruction is performed transparently by the array's management function, but requires considerable I/O resources, and may therefore impact application performance. Some RAID array implementations allow the user to choose whether to use all available I/O resources for reconstruction (restoring data protection as quickly as possible), or to guarantee some level of resource availability for applications.

* Computed by using the exclusive OR function.

Parity RAID still has one drawback: restoring data protection in a parity RAID array takes longer and requires more I/O resources than with mirroring RAID. Thus, while parity RAID costs less per stored byte than mirrored storage, it also opens a longer window of time during which data access can be lost if a second disk fails before reconstruction is complete.
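The regeneration and reconstruction described above can be sketched with the exclusive OR function. This is a minimal illustration under stated assumptions: the function names `parity_of`, `regenerate`, and `reconstruct` are hypothetical, each disk is modelled as a list of equal-length byte blocks, and parity is kept on one member for simplicity (where parity is stored does not change the arithmetic).

```python
from functools import reduce
from operator import xor
from typing import List, Sequence


def parity_of(blocks: Sequence[bytes]) -> bytes:
    """Exclusive OR of corresponding bytes across equal-length blocks."""
    if len({len(b) for b in blocks}) > 1:
        raise ValueError("blocks must be the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


def regenerate(surviving_blocks: Sequence[bytes]) -> bytes:
    """Recreate one missing block from the surviving data and parity blocks."""
    return parity_of(surviving_blocks)


def reconstruct(surviving_disks: Sequence[Sequence[bytes]]) -> List[bytes]:
    """Rebuild the full contents of a failed member, block by block."""
    return [regenerate(row) for row in zip(*surviving_disks)]


if __name__ == "__main__":
    data_disks = [
        [b"\x01\x02", b"\x0a\x0b"],
        [b"\x10\x20", b"\x30\x40"],
        [b"\xff\x00", b"\x55\xaa"],
    ]
    parity_disk = [parity_of(row) for row in zip(*data_disks)]

    # Suppose the second data disk fails.
    survivors = [data_disks[0], data_disks[2], parity_disk]

    # Regeneration: serve one requested block on demand.
    print(regenerate([disk[0] for disk in survivors]) == data_disks[1][0])

    # Reconstruction: rebuild every block for the replacement disk.
    print(reconstruct(survivors) == data_disks[1])
```

Note that reconstruction reads every block of every surviving member, which is why it consumes so much I/O capacity compared with copying a mirror.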
How cache broadens RAID's appeal

A cache is a solid-state memory used as an intermediate stage in the I/O path between host memory and storage devices. A read cache holds data on its way to the host; a write cache holds data on its way to the storage device.

A read cache improves I/O performance by anticipating that certain data will be required soon, and holding it in the cache. A read-ahead cache anticipates that data adjacent to recently read data will be read soon, and pre-reads it. A most-recently-used read cache anticipates that recently read or written data will be required again soon, and holds it as long as possible. In either case, when a host requests cached data, it can be delivered almost immediately, without waiting for disk seeking and rotation. Application requests are satisfied more rapidly, and response improves.

A write cache holds data waiting to be written to storage devices. If response to an application write request is delayed until data is written to the storage device, the cache is called a write-through cache. If response occurs while data is still waiting to be written, it is a write-behind cache.

Where does RAID fit in your total storage strategy?

Parity RAID in a highly available storage system offers a dramatic increase in cost-effective data availability. Newer RAID products also offer improved performance compared to independently managed disks.

Parity RAID offers affordable protection for a far greater portion of a typical enterprise's data than would be possible if only mirroring technology were available. Because it allows greater quantities of storage to be managed as a single disk, RAID can also simplify life for your system administrator.

However, RAID does not replace the need to back up data. Most data loss is caused by human error, not storage system failures, so thorough backup processes are still required.

[Figure: Conventional disks (data distributed sequentially). A hot spot (the 80/20 effect) falls on one disk, and the entire system is bottlenecked by the slowest data access.]

[Figure: Striped array. The hot spot is spread across multiple disks, improving overall performance.]

Striping distributes frequently accessed data (hot spots) and the associated workload across multiple disk spindles, providing better application and subsystem performance.
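The hot-spot effect in the illustration can be imitated with a small counting exercise. Everything in this sketch is an assumption made for illustration: the helper names, the four-disk layout with 1,000 blocks per disk, the stripe depth of 8 blocks, and the synthetic workload in which 80% of requests fall on a 200-block hot region.

```python
from collections import Counter
from typing import Callable, Iterable, List


def sequential_disk(block: int, blocks_per_disk: int) -> int:
    """Conventional layout: each disk holds one contiguous range of blocks."""
    return block // blocks_per_disk


def striped_disk(block: int, num_disks: int, stripe_depth: int) -> int:
    """Striped layout: runs of stripe_depth blocks rotate across the disks."""
    return (block // stripe_depth) % num_disks


def load_per_disk(accesses: Iterable[int], locate: Callable[[int], int], num_disks: int) -> List[int]:
    """Count how many requests each disk has to service."""
    counts = Counter(locate(block) for block in accesses)
    return [counts.get(disk, 0) for disk in range(num_disks)]


NUM_DISKS, BLOCKS_PER_DISK, STRIPE_DEPTH = 4, 1000, 8

# Synthetic workload: 800 requests to a 200-block hot region, 200 requests elsewhere.
hot = [i % 200 for i in range(800)]
cold = [200 + (i * 19) % 3800 for i in range(200)]
workload = hot + cold

conventional = load_per_disk(workload, lambda b: sequential_disk(b, BLOCKS_PER_DISK), NUM_DISKS)
striped = load_per_disk(workload, lambda b: striped_disk(b, NUM_DISKS, STRIPE_DEPTH), NUM_DISKS)

if __name__ == "__main__":
    print("conventional:", conventional)  # one disk services most of the requests
    print("striped:     ", striped)       # requests are shared far more evenly
```

With the conventional layout the busiest disk handles the whole hot region, while the striped layout hands each disk a similar share of the same requests.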
Deciding if you need RAID

With RAID, you can match your storage resources to your data access needs more closely than ever before. Parity RAID, in particular, creates new possibilities for you and your system administrator.

If performance and availability are unequivocally your highest priorities, mirroring is your best choice. Some systems are capable of making three or more copies of data so that consistent backups can be made while protected data is available for application use. Some are also capable of striping data across multiple sets of mirrored disks for improved performance.

For data that's not clearly mission-critical, you may need to analyze the cost benefits of availability. Parity RAID allows you to match the cost of protection to the value of the data.

Finally, for data that meets none of the above criteria, it may be worth asking whether you need it stored on-line at all. For such data, it may make sense to migrate directly from a RAID array to a near-line or off-line storage facility.

How cache broadens RAID's appeal (continued)

Write-behind cache speeds write performance, but presents a risk. If the storage system responds to a write request with a completion indication, the application may take further action based on the assumption that its data is safely stored on a disk. If, however, a system failure (most often due to a power failure) occurs before the data is actually written, then all record of the pending update is lost. Application processing may therefore be inconsistent with disk contents.

To guard against this possibility, many storage systems with write cache provide auxiliary power of some kind to preserve cache contents until the system can restart and write data to media. This makes the cache non-volatile. A non-volatile write cache is called a write-back cache, since data in it can safely be written back to disk after the response to the application.
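The difference between write-through, write-behind, and write-back behaviour can be shown with a toy model. This is a sketch only: the class `WriteCache`, its methods, and the use of a dictionary to stand in for the disk are all assumptions introduced for illustration, and real controllers are far more involved.

```python
from typing import Dict


class WriteCache:
    """Toy write cache in front of a 'disk' modelled as a dictionary."""

    def __init__(self, disk: Dict[int, bytes], write_behind: bool, non_volatile: bool = False):
        self.disk = disk
        self.write_behind = write_behind
        self.non_volatile = non_volatile
        self.pending: Dict[int, bytes] = {}

    def write(self, block: int, data: bytes) -> str:
        if self.write_behind:
            # Completion is reported while the data is still only in cache.
            self.pending[block] = data
        else:
            # Write-through: completion is reported only after the disk is updated.
            self.disk[block] = data
        return "complete"

    def flush(self) -> None:
        """Write all pending updates to the disk."""
        self.disk.update(self.pending)
        self.pending.clear()

    def power_failure(self) -> None:
        """A volatile cache loses pending updates; a non-volatile one keeps them."""
        if not self.non_volatile:
            self.pending.clear()

    def restart(self) -> None:
        """After restart, whatever survived in the cache is written to media."""
        self.flush()


if __name__ == "__main__":
    for non_volatile in (False, True):
        disk: Dict[int, bytes] = {7: b"old"}
        cache = WriteCache(disk, write_behind=True, non_volatile=non_volatile)
        cache.write(7, b"new")   # the application is told the write completed
        cache.power_failure()
        cache.restart()
        print("non-volatile" if non_volatile else "volatile", disk[7])
```

In the volatile case the disk still holds the old contents even though the application was told the write completed, which is exactly the inconsistency that a non-volatile (write-back) cache is meant to prevent.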
Hardware RAID versus Software RAID

The data mapping and protection algorithms that make up RAID technology are so complex that a software implementation is a virtual necessity. Nevertheless, the term hardware RAID has come into use to describe RAID functionality that is implemented in the firmware of an intelligent storage controller. This may be contrasted with software RAID, which executes in a host computer.

Hardware RAID usually offers superior I/O performance, with little or no impact on host computing capability. In general, hardware RAID requires the purchase of storage systems with that capability built in. Software RAID, on the other hand, can be much less expensive to implement using already-existing storage components, and can offer superior data availability, since it can allow RAID array member disks to be spread across multiple storage systems or, in multi-host environments, even across host computers.

Which RAID is the right RAID depends on whether you are adding to an existing system or installing a new system, as well as on your I/O performance and data availability requirements.

Can you afford to use RAID? More important, can you afford not to?

You can estimate the cost of unavailable data. For example, in a 200-node network supporting telemarketers with a labor cost of $25 per hour, an hour without the data that supports these employees means $5,000 in lost labor cost.

The cost of lost labor may be only the tip of the iceberg. If each of those employees averages $200 per hour in sales, loss of access to the data they need to work means $40,000 in lost orders per hour. At least some of these will be irretrievable.

Even if you're not losing sales directly, loss of access to data can also impede such customer activities as service, order tracking, or customer account inquiries. Losses of good will and customer satisfaction are difficult to quantify, but clearly they can eventually impact your success.
Compare these costs to the incremental cost of adding RAID to a storage system, and RAID usually emerges as a clear winner.
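The arithmetic behind this comparison is simple enough to write down. The sketch below reuses the figures from the telemarketing example; the function names are hypothetical, and the parity premium formula (one extra disk's worth of capacity spread over the remaining disks) is an assumption derived from the earlier statement that capacity equal to one member provides redundancy for the rest.

```python
def hourly_labor_loss(employees: int, labor_cost_per_hour: float) -> float:
    """Labor cost of one hour without access to data."""
    return employees * labor_cost_per_hour


def hourly_sales_loss(employees: int, sales_per_hour: float) -> float:
    """Orders lost in one hour without access to data."""
    return employees * sales_per_hour


def parity_cost_premium(num_disks: int) -> float:
    """Extra capacity bought for protection, as a fraction of usable capacity.

    Assumes one disk's worth of parity in an array of num_disks equal disks.
    """
    if num_disks < 2:
        raise ValueError("a parity array needs at least two disks")
    return 1 / (num_disks - 1)


if __name__ == "__main__":
    print(hourly_labor_loss(200, 25))    # lost labor per hour of downtime
    print(hourly_sales_loss(200, 200))   # lost orders per hour of downtime
    for disks in (4, 6, 11):
        print(disks, f"{parity_cost_premium(disks):.0%}")
    # Mirroring, by comparison, carries a 100% premium: every byte is bought twice.
```

Under these assumptions, arrays of roughly 4 to 11 disks give premiums between about 33% and 10%, which is in line with the "usually 10% to 35%" range quoted earlier for parity RAID.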
Choosing the right RAID for you

Most early RAID systems implemented one or more of the RAID technologies outlined in the University of California research without much embellishment. A by-product of this implementation style was that RAID array I/O performance was very dependent on application I/O load. A conventional wisdom grew up around RAID array I/O performance, as summarized below:

- RAID Level 1 performs well for a variety of applications, especially when multiple mirrored disk pairs have data striped across them.
- RAID Level 3 performs very well in applications that transfer large files, but poorly in transaction applications.
- RAID Level 5 performs very well in applications whose I/O loads consist mostly of read requests, and very poorly in applications whose I/O loads include a high proportion of writes.

The result of this was that early RAID arrays had to be matched to applications as described in the table below.

Today, RAID systems offer enhanced RAID capabilities. Perhaps the most important is the addition of write-back cache. Another is the ability of firmware to switch between parallel access (RAID Level 3) and independent access (RAID Level 5) as the I/O load changes. And you can improve performance and lower costs with new backplane RAID controllers, which allow you to configure an entire RAID system inside your server cabinet or expansion storage enclosure and mix RAID levels to meet application needs.

The net result is that today's RAID systems are much more flexible and provide a number of performance, availability, and ease-of-management features.

RAID Array Type: Mirrored (RAID Level 1)
  Performance Characteristics: High read performance, both for transaction and large file applications. Minor write penalty compared to individual disks.
  Application Environments: Mission-critical applications such as system disks, root master files, database journals, etc.

RAID Array Type: Parallel Access (RAID Level 3)
  Performance Characteristics: High large file performance. Low transaction performance.
  Application Environments: High-volume data collection, such as seismic or telemetric. Processing of large images. Batch processing of large files.

RAID Array Type: Independent Access (RAID Level 5)
  Performance Characteristics: High transaction performance for read-mostly I/O loads. Fairly high performance in reading large files. Low performance in any application that predominantly writes data.
  Application Environments: Interactive transaction processing, multi-user file services. Generally, office environment applications.
Other types of RAID

As RAID has evolved, the terminology has grown to include several other product and technology names that include the term RAID:

- RAID Level 6. University of California researchers identified an additional mapping and protection model that built on the original parity RAID work by incorporating two independent parity schemes. Two variations of this ultimately emerged, but with the same net effect: RAID Level 6 provides protection against the failure of any two disks in an array. Its cost/performance factors have not made it a popular contender against simple parity RAID.
- RAID Levels 2 and 4. These RAID Levels are not widely used, because other levels provide comparable benefits at a lower cost.
- RAID Level 0. University of California research identified a mapping of virtual disk block addresses to member disk block addresses in a regular repeating pattern, called striping. This form of data mapping improves the I/O performance of a non-RAID disk array by balancing the I/O load across all of the array's disks. Called RAID Level 0, it's not true RAID, because it doesn't provide data redundancy or protection.

As RAID has become popular, vendors seeking to identify with it, yet establish uniqueness, have created their own variations, using terms such as RAID 53, RAID Level 7, and RAID 10 (also called RAID 0+1). In general, these terms refer either to combinations of the seven basic models or to their combination with other technologies such as cache or parallel, asynchronously operating processes within an intelligent controller.
Summing up

RAID's single greatest benefit is enhanced data availability. Remember, though, that RAID only protects against disk failure. For truly high availability, the entire storage system must be engineered from the ground up to include such features as redundant power and cooling, redundant controllers with failover capability and host support, and hot-swappable major components.

The availability of cache and of parity RAID that dynamically adapts to I/O load changes puts to rest much of the earlier discussion surrounding the I/O performance of RAID arrays and their sensitivity to application I/O load characteristics. This brings the benefits of RAID to a much wider set of applications.

Technology advancements and decreasing costs have combined to make RAID an affordable, effective part of a total solution that assures your data is always accessible.

StorageWorks RAID Subsystems

The following RAID subsystems are available from Digital today:

- RAID Array 410: Available for UNIX-based systems. Supports Hewlett-Packard HP-UX, IBM AIX, and Sun Microsystems SunOS and Solaris environments.
- RAID Array 230: Available for Intel and Alpha PCI bus systems. Supports Digital UNIX, OpenVMS, Windows NT, Novell NetWare, and SCO UNIX operating systems.
- RAID Array 210: Available for Intel and Alpha EISA bus systems. Supports Digital UNIX, OpenVMS, Windows NT, Novell NetWare, and SCO UNIX operating systems.

StorageWorks RAID Controllers

You can configure your own RAID subsystem with the following RAID-capable controllers:

- HSZ Array Controllers: Support fast-wide differential SCSI; OpenVMS, Digital UNIX, and NT. Combine the best of RAID 3 and 5 by dynamically adapting between I/O-intensive and data transfer-intensive applications.
- HSJ Array Controllers: Support CI; VMS only. Combine the best of RAID 3 and 5 by dynamically adapting between I/O-intensive and data transfer-intensive applications.
- HSD Array Controllers: Support DSSI; VMS only. Combine the best of RAID 3 and 5 by dynamically adapting between I/O-intensive and data transfer-intensive applications.

StorageWorks RAID Software

- RAID Software for OpenVMS: Supports RAID Levels 0 and 5. Can be used in conjunction with Volume Shadowing for OpenVMS to gain Level 0+1.
- Volume Shadowing for OpenVMS: Supports RAID Level 1. Can be used in conjunction with RAID Software for OpenVMS to gain Level 0+1.
- Logical Storage Manager for Digital UNIX: Supports RAID Level 0, Level 1, and Level 0+1.
A RAID System Checklist

As RAID systems have matured, it's become much easier for you to choose. Most modern RAID systems integrate RAID with other technologies to provide good across-the-board I/O performance and high data availability for substantially lower cost than fully mirrored storage. You want to be sure that the RAID systems you consider incorporate and effectively use these advanced features. The following checklist can help you evaluate RAID systems.

Basic Function: Does the RAID system provide the necessary data mapping and protection models for the anticipated usage?

- Mirroring (RAID Level 1) for mission-critical data?
- Parallel access (RAID Level 3) for large file applications?
- Implementation of mixed arrays?

Availability Features: Can the RAID system's data protection be tuned to meet precise application requirements?

- Are the minimum and maximum parity array sizes adequate?
- Does the RAID system support the required range of disk capacities?
- Does the RAID system support simple disk striping?
- Does the RAID system support multiple arrays of different types operating concurrently?

Failure Recovery Features: Does the RAID system provide adequate features for recovering from component failures?

- Can spare disks be pre-designated so that unattended restoration of data protection after a disk failure is possible?
- Is hot swapping of disks, power supplies, fans, and any other components critical to operation possible?
- Does the RAID system support dual-redundant controllers, and can one controller assume the other's workload in the event of a failure?
- Does the RAID system incorporate an environmental monitoring unit that can provide warning of dangerous conditions such as high temperature and low power?
- Does the intended hosting environment support the failure recovery features?

I/O Performance Features: Does the RAID system effectively exploit the basic technologies to provide superior I/O performance that is independent of I/O load characteristics?

- Does the RAID system incorporate some technique such as write-back cache to alleviate the parity RAID update performance shortcomings?
- Can the RAID system dynamically switch between parallel and independent access update algorithms according to the I/O load?
- Does the RAID system allow the system administrator to balance I/O resources between reconstruction and application requirements?
Digital believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Digital is not responsible for inadvertent errors. Digital conducts its business in a manner that conserves the environment and protects the safety and health of its employees, customers, and the community.

Digital, Digital Equipment Corporation, StorageWorks, the StorageWorks logo, OpenVMS and Digital UNIX are trademarks of Digital Equipment Corporation. Hewlett-Packard and HP-UX are registered trademarks of Hewlett-Packard Company. IBM and AIX are registered trademarks of International Business Machines Corporation. Intel is a trademark of Intel Corporation. Microsoft is a registered trademark of Microsoft Corporation. NT is a trademark of Microsoft Corporation. Novell and NetWare are registered trademarks of Novell, Inc. RAB is a certification mark of the RAID Advisory Board, St. Peter, MN. SunOS is a trademark of Sun Microsystems, Inc. Solaris is a registered trademark of Sun Microsystems, Inc. SCO is a trademark of Santa Cruz Operations, Inc. UNIX is a registered trademark licensed exclusively by X/Open Company Ltd.

Copyright 1995 EC-G Digital Equipment Corporation. All rights reserved.