Technology White Paper Fibre Channel Optimizing Data Storage Performance for Fibre Channel Networks Abstract Many high-speed, real-time systems acquire data from specialized sensors and then store that data for future processing. Due to the high data rates typical of these Sensor-to-Storage (STS) systems, a high-performance Fibre Channel network is often used for moving the data to storage. The advantages of Fibre Channel, however, can be compromised by various elements within the host s Operating System (OS). One such element is the OS host File System (FS), which is used to organize and track files, and provide other user conveniences. The file system consumes host resources, can slow down system speed, and it can lead to nondeterministic performance in which data from the highspeed sensors may be permanently lost. This paper describes an optimized protocol for Fibre Channel, Curtiss-Wright Controls Electronic Systems FibreXpress Raw Initiator (RI). It bypasses the file system of the OS, thereby helping to enhance Fibre Channel performance for STS applications. Curtiss-Wright Controls RI protocol also enables the development of a deterministic system which is readily transportable among a number of different hardware and software environments. Examining a Typical Sensor-To-Storage System Many applications require the acquisition and storage of data that is received from specialized sensors. Examples include medical imaging; video-on-demand; Figure 1: Example Sensor-to-Storage System video editing; digital communications; telecom; and radar, sonar and infrared scanning applications. Figure 1 illustrates such a Sensor-to-Storage (STS) system. STS applications import data from a sensor via a special Host Bus Adapter (HBA), and then export the data to a storage device. Data is stored so that it can be processed either right away (e.g., displayed) or at a much later time (e.g., end of test flight). The data is typically both continuous (at least for a defined period) and copious (50 to 200+ MB/s). Page 1
Many modern STS systems are using Fibre Channel (FC) for the storage solution. FC was originally defined for storage, and it provides the advantages of an industry-standardized technology, including cost effectiveness and market-driven technology advancement. FC allows the link to operate faster (1 Gb/s and 2 Gb/s) than other widespread storage technologies like SCSI or IDE drives. Figure 1 shows HBAs plugged into the PCI bus on a generic Single-Board Computer (SBC). SBCs are common in industrial-control and military applications, but a conventional desktop PC or workstation might serve as the host in other applications. The HBAs provide the interface to the various data sources and storage destinations. The storage subsystem illustrated in the example uses both a FC HBA and a FC disk drive. While FC has many advantages for STS applications, FC performance can be compromised by various elements within the system. This paper will explore where those compromises may exist, and how to improve or eliminate them. In particular, the discussion will highlight data storage activities as a key area where FC system performance can be improved for STS applications. What Is a File System? To understand how FC can limit the performance of an STS system, it is necessary to examine in more detail how data is placed into storage. In a conventional (non-sts) computer system, data storage is performed through the use of data files and a file system. even be integrated into an OS and provide additional capabilities like backup or security features. A familiar example of a FS is the one used by Windows on workstations, home PCs, or laptop computers. In Unix -based systems, such as Linux, the basic concept behind the file system is the same, although the implementation details differ. A FS manages the storage of files by name, type, size (MBytes), location, date, and permissions. Windows stores this information in a File Allocation Table (FAT) before writing to disk and it must check the FAT prior to reading from disk. Because an OS always provides a FS to organize and track files, and because its operation is so transparent to the user, it can be easy to take the FS for granted and not even realize it s there. Conversely, because the FS plays such a fundamental role in any PC or OS, it can be difficult to imagine a computer system or network without one. For STS systems, however, eliminating the FS can be a key to improved performance. This is because the functions and capabilities provided by a FS consume significant system resources. The alternative to the use of a FS is to store the data in a raw format. A raw format allows the reading and writing of continuous data to be performed under the direct control of the system engineer and his/her application--without any unnecessary functions or undesirable overhead which can reduce overall performance. A file is a collection of data or information that has a name (i.e., filename). Almost all information stored in a computer must be in a file. There are many different types of files, each storing different types of information. Familiar examples include executable files, data files, text files, and directory files. A FS is a method of storing and organizing these individual files so that they can be more easily retrieved by the user. The OS always provides a FS to organize and track files, typically using file directories to organize files into a tree structure. Separate FS can Page 2
The Data Storage Requirements of an STS System An STS system, due to its unique need to handle large quantities of data at high-speeds in real-time requires a data storage system with characteristics which would not be critical in a desktop computer or a computer network in a standard office setting. These characteristics are of critical importance, however, in a high-speed Sensor-To-Storage system. Low Latency, Deterministic STS systems often have a number of sensors gathering and sending enormous quantities of data in real-time. As a result, there are many time-critical operations that can t wait for slower system components and actions. These operations may include getting data to storage without delay, so that the processing can occur in realtime. Or, it may be necessary to know exactly when the data write is completed, so that the system can be reconfigured. Whatever the application, it is essential that the data be written quickly and without delay. Because of this time-critical nature, STS systems require that data storage operations have low latency (short time delays) and that they be deterministic (it is well-known in advance when the data will be written). By contrast, an example of a system that is not deterministic is a typical desktop PC. When a user saves a word processing file, there can be a delay of several seconds until the file is actually written to disk. And, this delay can be different every single time, depending on what other activities the PC is undertaking at the moment the user saves the file. In an STS system, these unpredictable delays virtually guarantee the loss of sensor data, and this unpredictability must be eliminated. medium happen to be running under different OSs or using different architectures, the use of a conventional FS renders this problem virtually insoluble. The solution is to avoid the use of a FS in the first place. Maximum Storage Bandwidth An STS system needs as much bandwidth as possible. This is necessary to preclude data loss no matter how fast the data is being collected--even when the system has multiple sensors, each producing hundreds of megabytes of data per second. Extra bandwidth also allows data to be read concurrently without affecting the data storage process. Why the File System Interface Is Undesirable for STS Applications With an understanding of the storage system requirements for an STS system, it will now be shown how the various components of a typical file system are undesirable for STS applications. Cache Memory In a non-sts system, it is often beneficial to have data cached from a storage device, if the data is read or written frequently. Caching can improve the speed of storage operations in a non-sts system. However, caching data is unnecessary in an STS system because data is written just once, and read at a later date. STS systems do not have the repetitive Figure 2: Effect of Bandwidth on Data Storage Concurrent Access In some STS systems, it is necessary for data to be processed as soon as it is stored. This processing may need to be accomplished with--or may even require-- more than one system to communicate with the storage medium. This need for concurrent data access can severely reduce the set-up and configuration options available to the designer of the STS system. And if the various systems communicating with the storage Page 3
reading and/or writing of data which makes caching beneficial. In addition, since caching is limited to the size of physical RAM available, it would be of limited practicality and effectiveness on the very large data transfers which are typical of an STS system. A caching system holds data to be written (stored) until it decides the time is correct. Not only does this increase the latency of the system (delaying writes), it also makes the system non-deterministic by taking control of when the writes will occur away from the system designer. As stated previously, the optimum storage system for an STS must have the lowest possible latency, and it must be deterministic. Caching is clearly at odds with these two important design goals. Interface Limitations Most OSs are designed to support only a few different file systems. They are designed this way to make it easy to move files between similar systems. This support for only a small number of file systems becomes a problem in STS applications if concurrent access to the data is required by different architectures and operating systems. File Management An FS uses the storage medium in a dynamic manner: writing, moving, and deleting data. This often leaves various-sized holes in the storage medium. The FS will compensate for this by breaking a file into sections that fit into these holes, in a process known as fragmentation. Fragmentation adds to seek times because the data is not contiguous, and the head of Page 4 Figure 3: RI Contiguous Data Storage the disk drive must spend time moving to the different physical locations where each fragment of the file is located. This can adversely affect latency and throughput if the fragmentation is severe enough. An STS system will typically read and write data contiguously, so that having the FS handle things dynamically is not only unnecessary, but detrimental to the system. A High-Speed Solution Which Bypasses the File System One method of implementing a high-speed STS system which avoids a file system is through the use of an optimized protocol with appropriate supporting drivers and HBAs. One such system is the FibreXpress Raw Initiator (RI) protocol, drivers, and FX100 and FX200 Host Bus Adapters. RI allows applications to access the disk drives without using a file system, while providing the necessary data-storage control for an STS system. Bypassing the file system gives the application total control over the transactions performed on the disks, and eliminates any overhead that could be inserted by a file system. Functions built into RI complete DMA transactions to and from an FC storage device. The result is deterministic transfer of continuous high-speed data, suitable for STS applications. In a heterogeneous real-time system, a large amount of custom software would typically be required for each different computer/operating system to directly access disks. The RI application-programming interface
(API) dramatically reduces this need for custom programming, however, thereby allowing applications to be easily ported from platform to platform. RI has been ported successfully to hardware platforms including x86, Sparc, and PowerPC, and to software environments including Linux (PowerPC and x86), Windows NT/2000, Solaris, and VxWorks. It is important to note that a file system bypassing technology such as RI is essential for high-performance, but this one technology alone will not completely optimize overall system performance. To obtain optimum performance, it is necessary to eliminate redundant memory operations on the SBC after the initial storage command is issued. There are additional potential bottlenecks to FC system performance, as discussed in the following section. Other Fibre Channel Performance Limitations for STS Systems While bypassing the OS File System eliminates a primary limitation to FC performance for STS systems, other potential limitations must be considered. These include devices such as the FC link, FC disks, the sensor HBA, the PCI bus, CPU speed, and SBC architecture. The FC link Fibre Channel system components are available for operation at 1.0625 Gb/s or 2.125 Gb/s, and they provide maximum throughputs of approximately 102 MB/s and 204 MB/s respectively. The faster of the two speeds, 2.125 Gb/s Fibre Channel, may be required for certain applications. While 2.125 Gb/s Fibre Channel HBAs are widely available in the PCI form factor used in workstations and desktop computers, not many are available in the smaller PMC form factor suitable for SBCs. For STS systems, a critical hard drive parameter is the speed at which they can store data. The minimum data transfer rate necessary for an FC drive to avoid creating a bottleneck in an STS system is 36 MB/s (sustained rate). This assumes a 2 Gb/s FC network, with six disks in an array: 6 disks x 36 MB/s each = 216 MB/s total. In addition to providing the necessary bandwidth, a disk array such as a RAID provides fault tolerant functions. The overhead necessary for this capability will reduce throughput slightly. However, this theoretical 216 MB/s figure for the FC drive array should still be sufficient for the 204 MB/s maximum throughput of a 2 Gb/s FC link. Sensor HBA The sensor HBA must be able to able to transmit data faster than the sensor can produce it. Curtiss-Wright Controls FibreXtreme Serial FPDP HBA s are examples of sensor interfaces. The SL100 version operates at 1 Gb/s and can transfer data at nearly 106 MB/s due to fine-tuning of the link protocol. The SL240 version operates at 2.5 Gb/s and can transfer data at nearly 247 MB/s realistically. The 247 MB/s rate exceeds the 2 Gb/s FC rate (204 MB/s) at face value. However, neither rate is absolutely guaranteed in every OS on every host platform. Each system is different. PCI Bus Bandwidth The bandwidth of the host computer s PCI bus can also limit FC performance. PCI buses can be implemented with either 32-bit or 64-bit width, and transfer speeds of either 33 MHz or 66 MHz. As a worst-case scenario, consider a generic PC with a 32-bit/33 MHz The FC Disks Fibre Channel disk drives are designed for operation at 1.0625 Gb/s or 2.125 Gb/s, and many have the ability to automatically configure themselves to the correct speed ( auto negotiate ). These auto negotiating drives should be specified if operation at either or both FC data rates is anticipated. Page 5
PCI bus. This bus would provide a maximum data transfer rate of 132 MB/s, and would therefore be a limiting factor for a 2.125 Gb/s FC network (with a maximum potential bandwidth of 204 MB/s). Many PCs, workstations, and SBCs are available today with 64-bit buses. Although still at 33 MHz, a 64-bit bus increases bandwidth to 264 MB/s. This is sufficient for a 2.125 Gb/s FC network. Even more bandwidth is available from workstations with 64-bit /66 MHz PCI buses. And the newer PCI-X specification allows speed enhancements to 100 MHz and beyond. The FX200 boards support speeds as high as 133 MHz. However, the designer must keep in mind that few SBCs are available with a 66 MHz PCI bus, much less a (100 MHz-plus) PCI-X bus. CPU Speed CPU speed is well recognized as an important contributor to computer system performance. While a faster CPU is invariably better, as we ve seen in this paper, there are many other bottlenecks and limitations which can significantly compromise the throughput of an STS system. Conclusions To reliably store data coming from high-speed sources using FC components, the following techniques should be employed: Bypass the File System built into the OS. Perform PCI-to-PCI DMA without CPU and memory intervention Use a high performance FC HBA and storage device. Use a high-speed HBA to input the data from the source and to provide direct PCI data transfer. If all these measures are taken, it is possible to develop a system which provides deterministic transfer of continuous high-speed data, even for the most demanding STS application. Product specifications mentioned herein are subject to change without notice. FibreXpress is a registered trademarks of Curtiss-Wright Controls Electronic Systems. All other trademarks or registered trademarks mentioned herein are the sole property of their respective owners. 2004, Curtiss- Wright Controls Electronic Systems, All Rights Reserved. Copyright 2009, Curtiss-Wright Controls Electronic Systems All Rights Reserved. MKT-ES-Storage-for-FC-Networks-010510v2 Page 6