Performance of optimized software implementation of the iscsi protocol
Fujita Tomonori and Ogawara Masanori
NTT Network Innovation Laboratories
1-1 Hikarinooka, Yokosuka-Shi, Kanagawa, Japan

Abstract— The advent of IP-based storage networking has brought to market specialized network adapters that directly support TCP/IP and storage protocols, promising performance comparable to that of specialized high-performance storage networking architectures. This paper describes an efficient software implementation of the iscsi protocol on a commodity networking infrastructure. Though several studies have compared the performance of specialized network adapters with that of commodity network adapters, our iscsi implementation, unlike the straightforward iscsi implementations used in previous studies, eliminates data-copying overhead. To achieve this, we modified a general-purpose operating system using techniques studied in the literature for improving TCP performance and features that commodity Gigabit Ethernet adapters support, and we quantified their effects. Our microbenchmarks show that, compared with a straightforward iscsi driver that does not use these techniques, the iscsi driver with these optimizations reduces CPU utilization from 39.4% to 30.8% when writing with an I/O size of 64 KB. When reading, however, any performance gain is negated by the high cost of operations on the virtual memory system.

I. INTRODUCTION

To cope with the rapidly growing volume of data, many companies have started to adopt new storage architectures, in which servers and storage devices are interconnected by specialized high-speed links such as Fibre Channel. These architectures, called storage networking architectures, have superseded traditional storage architectures, in which servers and storage devices are directly connected by system buses. Storage networking architectures make storage easier to expand, distribute, and administer than the traditional architectures do.
With advances in Ethernet technology, a more advanced storage networking technology, the iscsi protocol [1], is on the horizon. It encapsulates a block-level storage protocol, the SCSI protocol, in the TCP/IP protocol and carries packets over IP networks. The iscsi protocol has some advantages over other storage networking architectures: it is based on two well-known technologies, the SCSI protocol and the TCP/IP protocol, which have been used for many years; storage networking can be integrated with mainstream data communications by using Ethernet networks; and the many engineers who are familiar with these technologies may provide economic and management advantages. (The authors' current affiliation is NTT Cyber Solutions Laboratories.)

Though the iscsi protocol is fully compatible with commodity networking infrastructures, specialized network adapters, such as TCP Offload Engines (TOEs) and Host Bus Adapters (HBAs), exist on the market. Vendors claim that, by offloading the overheads of IP-based storage networking, these adapters can achieve performance comparable to existing specialized storage networking architectures. A TOE adapter removes the TCP/IP protocol overhead from the host CPU by directly supporting the TCP/IP protocol functions. An HBA adapter goes further, offloading both the TCP/IP protocol and data-copying overheads by directly supporting the TCP/IP protocol and storage protocol functions. Several studies have compared the performance of specialized network adapters and commodity Gigabit Ethernet adapters. This paper analyzes how high a level of performance the iscsi protocol can provide without specialized hardware. We made changes to the Linux kernel to get the best performance from commodity networking infrastructures, using techniques studied in the literature for improving TCP performance.
We also used three features that commodity Gigabit Ethernet adapters support: checksum offloading, jumbo frames, and TCP segmentation offloading (TSO). Unlike the straightforward iscsi implementations used in previous studies, ours avoids copying between buffers in the virtual memory (VM) system and those in the network subsystem. This paper focuses on the implementation of an initiator, i.e., a client of a SCSI interface that issues SCSI commands to request services from a target, typically a disk array, that executes SCSI commands. However, most of the discussion also applies to a target when it is implemented in software, as in some iscsi appliances on the market. The outline of the rest of this paper is as follows. Section II summarizes related work. Section III covers the optimizations needed to achieve high performance and discusses some implementation details. Section IV presents our performance results. Section V summarizes the main points and concludes the paper.

II. RELATED WORK

Some performance studies of the iscsi protocol have been conducted in the past. Sarkar et al. [2] evaluated the performance of a software implementation and of implementations with TOE and HBA adapters. Stephen Aiken
et al. [3] contrasted the performance of a software implementation with that of Fibre Channel. Wee Teck Ng et al. [4] investigated the performance of the iscsi protocol over wide-area networks with high latency and congestion. Unlike our iscsi implementation, their software implementations are straightforward: they are based on unmodified TCP/IP stacks, copy data between buffers in the VM system and those in the network subsystem of the operating system, and do not exploit some of the features that the network adapter supports. We cannot directly compare our performance results with those obtained in their studies of TOE adapters, HBA adapters, and Fibre Channel, owing to differences in the experimental infrastructure. However, we believe that an indirect comparison between their studies and ours may provide some useful information. A number of papers point out that TCP performance on high-speed links is limited by the host system and study several optimization approaches. Chase et al. [5] outline a variety of approaches. In this paper, we adopt a technique called page flipping (or page remapping) to avoid copying, and we evaluate how much it affects the performance of an optimized software implementation of the iscsi protocol. Unlike many studies [6]–[8] on TCP performance, this paper does not address the avoidance of copying data between the kernel and user address spaces. Instead, it focuses on avoiding redundant copying between buffers in different subsystems of the operating system; that is, our goal is single-copy from the application's point of view. Although techniques studied in the literature for avoiding copies between the kernel and user address spaces might yield better performance, making them work with the iscsi protocol would require extensive modifications to operating-system subsystems such as the file systems, the memory-management architecture, and the network subsystem.
Much of the TCP overhead on the host system is inherent in its design. This has led to many new low-overhead networking protocols, called user-level or lightweight protocols [9]–[11], and in turn to specialized file systems such as DAFS [12]. These protocols, free from the negative inheritances of the TCP protocol, can achieve better performance. Furthermore, to take advantage of the merits of both the new protocols and the TCP protocol, integrating them with the TCP protocol has been addressed.

Commonly cited disadvantages of IP-based storage networking relative to existing specialized storage networking architectures are the TCP/IP protocol and data-copying overheads. They can decrease application performance by consuming CPU and memory-system bandwidth. Whereas general SCSI host bus adapter drivers move data directly between buffers in the VM system and the target, straightforward iscsi drivers copy data between buffers in the VM system and those in the network subsystem. We were able to find four implementations of an iscsi initiator [13]–[16] for the Linux operating system that are licensed under open-source licenses [17]. All of them copy data between the VM cache and the network subsystem. However, the copying can be avoided in some cases. We discuss techniques for copy avoidance and the issues of implementing them in the Linux kernel in the following sections.

Fig. 1. SCSI driver architecture. (The figure contrasts a general SCSI driver, which moves data between the virtual memory system and a SCSI host bus adapter over the system bus, with an iscsi driver, which passes data from the VM cache through the network subsystem's buffers to a network adapter, incurring an extra data copy.)

III. OPTIMIZATION TECHNIQUES

Figure 1 shows the difference between the architectures of general SCSI drivers and iscsi drivers, and the data flow inside an operating system 1. In general, an iscsi initiator is implemented as a SCSI host bus adapter driver.
However, unlike general SCSI drivers, which interact with their own host bus adapter hardware, an iscsi driver interacts with the network subsystem.

1 We assume that cached file system data are managed by a virtual memory system, as in the Linux kernel.

A. Network adapter support

Though commodity Gigabit Ethernet adapters, unlike specialized network adapters, cannot directly process the TCP/IP protocol and the iscsi protocol, they provide some features that are useful for offloading the TCP/IP protocol and data-copying overheads that IP-based storage networking incurs. We use the following three features.

Checksum offloading. Calculating checksums to detect the corruption of a packet is an expensive operation. Computing checksums on the network adapter instead of the host CPU can improve performance, and it is common
today. Six of the nine Gigabit Ethernet chipsets that the Linux kernel version 2.6.0-test1 supports provide this feature. Checksum offloading works on both the transmitting and receiving sides 2.

Jumbo frames. Large packets can improve performance by reducing the number of packets that the host processes. However, standard IEEE 802.3 Ethernet frames limit the MTU size to 1500 bytes at the most. Nowadays most Gigabit Ethernet adapters support jumbo frames, which allow the host to use a larger MTU size (typically up to 9000 bytes). Seven of the nine Gigabit Ethernet chipsets that the Linux kernel version 2.6.0-test1 supports provide this feature. Jumbo frames are mandatory for copy avoidance when reading data.

TCP segmentation offloading. The network subsystem must break a packet whose size is larger than the TCP segment size into smaller pieces at the TCP layer before it is passed to the IP layer. TCP segmentation offloading (TSO) can improve performance by executing some of this work on the network adapter; it is relatively new. Two of the nine Gigabit Ethernet chipsets that the Linux kernel version 2.6.0-test1 supports provide this feature. Though copy avoidance is possible without TSO, TSO can improve performance when writing large amounts of data.

B. Write

Avoiding copies when data travel from an initiator to a target (i.e., write) is easier than when data travel from a target to an initiator (i.e., read). Instead of copying data from the VM cache into buffers in the network subsystem, our iscsi driver hands references to buffers in the VM system over to the network subsystem when writing. In the Linux kernel, an sk_buff structure, called an skbuff, is used for memory management in the network subsystem; it is equivalent to an mbuf structure in the BSD lineage.
An skbuff holds not only its own buffer space but also references to page structures, each describing a page, to support the avoidance of redundant copying between the VM cache and skbuffs. The Linux kernel also provides interfaces, called the sendpage interfaces, that use these data structures: they pass references to page structures describing pages in the VM system to the network subsystem. A network adapter driver then receives skbuffs holding the references, and the contents of the referenced pages are transferred to the wire. By using these interfaces, our iscsi driver can avoid copying data between the VM cache and skbuffs without modifications to the Linux kernel. Our iscsi driver performs the following operations when writing.

2 Some network adapters can compute checksums only on transmitted packets.

1) The SCSI subsystem receives references to page structures describing dirty pages 3 from the VM system and hands them over to the iscsi driver.
2) The iscsi driver associates these references with skbuffs for iscsi packets by using the sendpage interfaces.
3) The references are passed on to a network adapter driver, and the contents of the referenced pages are transmitted by using direct memory access (DMA).

Unlike copy avoidance between the kernel and user address spaces, this scheme does not require pinning and unpinning the transferred pages during DMA transfers, because the VM system increases the usage count of these pages. This guarantees that the VM system does not reclaim the pages until the iscsi command related to them finishes. To get the best performance improvement from copy avoidance, the network adapter has to be able to calculate data checksums during DMA transfers. Without this feature, the performance gain from copy avoidance becomes negligible [5], because in the Linux kernel the TCP stack integrates calculating the checksum over the data with copying the data.
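To make the last point concrete, the following user-space sketch contrasts a two-pass copy-then-checksum with a single pass that folds a 16-bit one's-complement (Internet-style) checksum into the copy loop, in the spirit of the kernel's integrated copy-and-checksum path. This is an illustration only: the function names are ours, and the real kernel routines differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement checksum. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Two passes over the data: copy it, then checksum it. */
uint16_t copy_then_checksum(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    memcpy(dst, src, len);            /* pass 1: copy     */
    for (i = 0; i + 1 < len; i += 2)  /* pass 2: checksum */
        sum += (uint32_t)((dst[i] << 8) | dst[i + 1]);
    if (len & 1)
        sum += (uint32_t)(dst[len - 1] << 8);
    return fold(sum);
}

/* One pass: accumulate the checksum while the bytes are copied, as the
 * Linux TCP stack does with its integrated copy-and-checksum routines. */
uint16_t csum_and_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        dst[i] = src[i];
        dst[i + 1] = src[i + 1];
        sum += (uint32_t)((src[i] << 8) | src[i + 1]);
    }
    if (len & 1) {
        dst[len - 1] = src[len - 1];
        sum += (uint32_t)(src[len - 1] << 8);
    }
    return fold(sum);
}
```

Once the copy is eliminated, the single-pass trick no longer applies: a software checksum would need its own pass over the data, so the gain from copy avoidance largely evaporates unless the adapter computes the checksum during the DMA transfer.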
1) iscsi digest: Without modifications to the Linux kernel, it is thus possible to avoid copying when writing data in most cases. However, this scheme for copy avoidance does not work when the data digest feature of the iscsi protocol is enabled. The iscsi protocol defines a 32-bit CRC digest over an iscsi packet to detect corruption of iscsi PDU (protocol data unit) headers and/or data, because the 16-bit checksum used by TCP to detect the corruption of a TCP packet was considered too weak for the requirements of storage over long-distance data transfers [18]. These digests are called the header digest and the data digest, respectively. When these digests are used, the iscsi driver operates in a slightly different way.

1) The SCSI subsystem receives references to dirty pages from the VM system and hands them over to the iscsi driver.
2) The iscsi driver associates these references with skbuffs for iscsi packets by using the sendpage interfaces.
3) If the header digest feature is enabled, the iscsi driver computes a digest over the iscsi PDU header.
4) If the data digest feature is enabled, the iscsi driver computes a digest over the iscsi PDU data, that is, the contents of the dirty pages.
5) The references are passed on to a network adapter driver, and the contents of the referenced pages are transmitted by using DMA.

3 A dirty page is a page whose contents have been modified in the VM system and must be written to persistent storage.

Note that the Linux kernel permits modifying the contents of a page while they are being written to persistent storage. Therefore, a process or a subsystem can modify the contents of a page at any time during the above operations. If the contents of a transferred page are modified between the fourth and fifth operations, the target sees the combination
of the data digest computed over the old contents and the new contents themselves. Verification of the data digest therefore fails on the target. In this case, the iscsi driver would have to recompute the data digest over the new contents of the page before the network adapter transmits them, but the driver has no opportunity to do so. Hence the iscsi driver must keep a transferred page untouched from the time it computes the data digest until the iscsi command related to the page finishes; a synchronization scheme to guarantee this is necessary for our copy avoidance scheme to work with the data digest feature. The current version of our iscsi driver simply copies data between the VM cache and skbuffs instead of using the sendpage interfaces when the data digest feature is used. The header digest feature, on the other hand, always works with our copy avoidance scheme, because the header of an iscsi PDU can be modified only by the iscsi driver.

C. Read

With regard to reading data, copy avoidance is more problematic, because the iscsi driver has to place incoming data from the network subsystem at arbitrary virtual addresses specified by the VM system without copying the data. An HBA adapter, which processes the iscsi and network protocols, can determine the virtual addresses chosen by the VM system for the incoming data and place the data directly in the appropriate locations, so the driver does not have to do anything further. A commodity network adapter, on the other hand, cannot process these protocols and therefore cannot place incoming data directly at the virtual addresses specified by the VM system. One technique for avoiding copies with a commodity network adapter is page remapping, which changes the address bindings between buffers in the VM system and the network subsystem. This scheme is widely used in research systems; however, it has some restrictions. First, the maximum transfer unit (MTU) size must be larger than the system virtual memory page size.
Second, the network adapter must place incoming data such that the data the VM system requires are aligned on page boundaries after the Ethernet, IP, TCP, and iscsi headers are removed. Third, page remapping requires the high-cost operations of manipulating data structures related to the memory management unit (MMU) and flushing the address-translation cache, called the translation lookaside buffer (TLB). These operations are particularly slow in symmetric multiprocessor (SMP) environments. The first restriction means that the technique does not work with standard IEEE 802.3 Ethernet frame sizes, and thus jumbo frames are mandatory. The second can be satisfied with some modifications to a general-purpose operating system, which we explain in subsequent paragraphs. The operations in the third can be avoided for data movement between buffers in the VM system and those in the network subsystem 4, though they are mandatory for copy avoidance between the kernel and user address spaces. The reason is that in UNIX operating systems any kernel virtual address will do for a page in the VM system, as long as the page is mapped. In general, pages in the VM system are temporarily mapped into the kernel virtual address space: when a process or a subsystem reads or writes the contents of a page, the page is mapped at some kernel virtual address, and when the operation finishes, the page is unmapped; the next time, the page may be mapped at another kernel virtual address. That is, a process does not directly see the kernel virtual address at which a page is mapped. This does not hold if a page in the VM system is mapped at user virtual addresses: for example, if files or devices are mapped into user address space with the mmap system call, high-cost MMU operations are inevitable, because a process directly sees the user virtual addresses, and a new page must therefore be mapped at the same user virtual addresses.
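The contrast between the copy path and the remap path can be sketched with a toy user-space analogy, in which an explicit table of page pointers stands in for the MMU bindings that real page remapping manipulates. This is purely illustrative; all names here are ours, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* A toy "page table": virtual page i resolves through a pointer,
 * standing in for the MMU address bindings of a real kernel. */
static uint8_t *page_table[16];

/* Copy-based receive: move the payload from the network buffer into
 * the page currently bound to the virtual page (one full-page copy). */
void receive_copy(int vpage, const uint8_t *netbuf)
{
    memcpy(page_table[vpage], netbuf, TOY_PAGE_SIZE);
}

/* Remap-based receive: retarget the binding at the network buffer's
 * page instead of copying. This works only if the adapter deposited a
 * full page of payload at a page-aligned address -- the jumbo-frame
 * and alignment restrictions discussed above. The old page is returned
 * to the caller, mirroring the reference that the modified VM system
 * keeps so the pages can be restored when the page is reclaimed. */
uint8_t *receive_remap(int vpage, uint8_t *netbuf_page)
{
    uint8_t *old = page_table[vpage];
    /* In a real kernel this assignment is an MMU update plus a TLB
     * flush -- the expensive part of page remapping. */
    page_table[vpage] = netbuf_page;
    return old;
}
```

The pointer flip costs O(1) regardless of page size, whereas the copy path touches every byte; the kernel-side price of the flip is the MMU and TLB work noted in the comment, which is what the restrictions above try to confine to cases where it is cheap.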
Our iscsi driver requires modifications to two parts of the Linux kernel: the memory management in the network subsystem and the VM cache management.

1) Memory management in the network subsystem: The alloc_skb function does not allow the buffer space for an skbuff to be allocated with arbitrary alignment, so we slightly modified the function and the sk_buff structure. The network adapter driver allocates every skbuff in expectation of receiving an iscsi packet containing the response to a READ command. Therefore, after the network subsystem and the iscsi driver finish processing the protocol headers, the iscsi PDU data, that is, the requested data stored on the target's disk, shows up on page boundaries. When one iscsi PDU consists of several TCP packets, this scheme can be applied only to the first TCP packet, because the header length of the first packet differs from that of the rest. Therefore, after using page remapping for the first TCP packet, our iscsi driver copies the data in the remaining packets.

2) VM cache management: Figure 2 shows how the copy avoidance scheme works after the iscsi driver finishes processing an iscsi PDU header. Figure 2(a) shows the data structures just before page remapping: page structure A describes page A in the VM system, and page structure B describes page B, which is used as the buffer space of an skbuff. Figure 2(b) shows the structures after page remapping: page structure A now describes page B in the VM system, page structure B and page A are unused, and the sk_buff structure is freed. If necessary, the iscsi driver performs MMU operations to change the address bindings of these pages. This scheme allows the contents of the page that page structure A describes to be updated without copying. Before page remapping, when a process or a subsystem tries to read

4 The term page remapping means changing address bindings by performing MMU operations. Therefore, page remapping may not be the most appropriate name for a scheme that avoids copying between buffers in the VM system and those in the network subsystem.
the contents of the page that page structure A describes, page A is mapped into the kernel virtual address space, and it sees the contents of page A. After page remapping, page B is mapped into the kernel virtual address space, and it sees the contents of page B.

Fig. 2. VM cache and skbuff management. (a) Before page remapping: in the VM cache, an address_space points to page structure A, which describes page frame A; a socket buffer's sk_buff points to page structure B, which describes page frame B. (b) After page remapping: page structure A describes page frame B.

To implement our scheme, we made two modifications to the Linux VM system. First, in the modified VM system, the relationship between a page structure and the page that it describes is dynamic. In the stock Linux VM system, a page structure is allocated for every page frame when the system boots, and the relationship between them is static. We add the index of a page frame to the page structure to enable it to describe any page. This modification increases the overhead of some memory-management operations. For example, it becomes more complicated to determine the kernel virtual address at which a page is mapped from the page structure describing it, and likewise to find the page structure describing a page from the page's physical address. To keep the modifications to a minimum, we implement page remapping by using a special kernel virtual address space, called high memory, which is normally used on hosts with a large amount of physical memory. Though the original Linux VM system maps most pages into the kernel virtual address space all the time, some pages are temporarily mapped into the high memory address space as the need arises. In the Linux VM system on the 32-bit Intel architecture, pages are mapped into the kernel virtual address space starting at 3 GB; without high memory, therefore, only about 1 GB of main memory can be used.
The iscsi driver expands this address space and uses some amount of physical memory exclusively for page remapping. Second, in the modified VM system, a page structure holds a reference to another page structure, as shown in Figure 2(b). This extension allows the page structures to be restored to their original state when the page is reclaimed. The old page is not freed until the new page is freed or replaced with another page.

D. Another potential solution

Another potential way to avoid copying when reading would be to directly replace one page structure with another. As shown in Figure 2(a), we could directly replace page structure A with page structure B. This solution requires extensive modifications to the Linux kernel, because a page structure contains various data structures that other subsystems use. Take, for example, the case where a process is waiting for the contents of a page to be updated by using data structures in the page structure that describes the page. If the page structure is replaced, the operating system must guarantee that the replaced page structure remains untouched until all processes using it to wait for the contents resume.

IV. PERFORMANCE

We performed microbenchmarks to get the basic performance characteristics of our iscsi driver.

A. Experimental infrastructure

Initiator: The Dell Precision Workstation 530 has two 2-GHz Xeon processors with 1 GB of RDRAM main memory, an Intel 860 chipset, and an Intel PRO/1000 MT Server Adapter connected to a 66-MHz, 64-bit PCI slot. The network adapter supports checksum offloading on both the transmitting and receiving sides, jumbo frames, and TSO. The initiator runs a modified version of the Linux kernel and our iscsi initiator implementation, which is based on the source code of the IBM iscsi initiator [16], [19] for the Linux kernel 2.4 series. The operating system has an 18-GB partition on the local disk, and the iscsi driver uses a 73.4-GB partition on the target server.
Target: The IBM TotalStorage IP Storage 200i is an iscsi appliance with two 1.13-GHz Pentium III processors, 1 GB of PC133 SDRAM main memory, and an Intel PRO/1000 F network adapter connected to a 66-MHz, 64-bit PCI slot. Our system houses 10,000-RPM Ultra SCSI disks. While the details of the software have not been disclosed, the appliance is believed to run the Linux kernel and an iscsi target implementation that runs in kernel mode. The initiator and the target are connected by an Extreme Summit 7i Gigabit Ethernet switch. The iscsi initiator and target implementations conform to iscsi draft version 8. Though this is not the latest draft, we believe the difference does not significantly affect performance under our experimental conditions, i.e., high-speed, low-latency networks.

B. Measurement tool

We implemented microbenchmarks that run in kernel mode and interact directly with the VM system in order to bypass the VM cache. All iscsi commands request operations on the same disk block address, which gives the best cache usage on the target; the effect of target-side factors, such as disk performance, on the results is thus kept to a minimum. To measure CPU utilization and where CPU time is spent while running the microbenchmarks, we use the OProfile suite [20], a system-wide profiler for the Linux kernel. OProfile uses the Intel hardware performance counters [21] to report detailed execution profiles.

C. Configuration

We conducted the microbenchmarks with an MTU of 1500 bytes (standard Ethernet) or 9000 bytes (Ethernet with jumbo frames).
The microbenchmarks were run with the following four configurations: zero-copy, checksum offloading, and TSO, for which the iscsi driver avoids copying data between the VM cache and skbuffs and uses the checksum offloading and TSO that the network adapter supports; zero-copy and checksum offloading, for which the iscsi driver avoids copying data and uses checksum offloading; checksum offloading, for which the iscsi driver copies data in the same way that a straightforward iscsi driver does and uses checksum offloading; and no optimizations, for which the iscsi driver copies data and uses neither checksum offloading nor TSO. The microbenchmarks read or write data repeatedly with I/O sizes ranging from 512 bytes to 64 KB. Except in the 64-KB case, one I/O request is converted into one iscsi command; for 64 KB, one I/O request is converted into two iscsi commands because of a restriction on the maximum I/O size on the target. We report the averages of five runs for all experiments.

D. Results

1) CPU utilization and bandwidth: Figure 3(a) shows CPU utilization during the write microbenchmarks with an MTU of 1500 bytes. Except for I/O sizes ranging from 512 bytes to 1024 bytes, the two configurations in which the iscsi driver avoids copying, zero-copy, checksum offloading, and TSO and zero-copy and checksum offloading, yield lower CPU utilization than the others. This means that copy avoidance improves performance at the large I/O sizes, and that at the small I/O sizes the cost of the sendpage interfaces is slightly higher than the cost of copying the data. Though TSO is used at all I/O sizes, it is effective only for I/O sizes greater than 2048 bytes, because at block sizes of 512 bytes to 1024 bytes no TCP segmentation happens; that is, one TCP/IP packet can carry all the data constituting one iscsi PDU. Accordingly, for I/O sizes over 2048 bytes, the zero-copy, checksum offloading, and TSO configuration yields lower CPU utilization than the zero-copy and checksum offloading configuration does. But the gain from TSO is not clear-cut.
For example, at 32 KB, CPU utilization is 29.0% in the zero-copy, checksum offloading, and TSO configuration and 30.2% in the zero-copy and checksum offloading configuration. As explained in Section III, the gap between the checksum offloading and no optimizations configurations is small. We see the largest gain from copy avoidance at 64 KB: with the 1500-byte MTU, CPU utilization is 30.0% in the zero-copy, checksum offloading, and TSO configuration, 30.8% in the zero-copy and checksum offloading configuration, 39.4% in the checksum offloading configuration, and 42.0% in the no optimizations configuration. We could not determine why the zero-copy and checksum offloading configuration yields lower CPU utilization than the zero-copy, checksum offloading, and TSO configuration at the 2048-byte block size. Figure 3(b) shows throughput during the write microbenchmarks with the 1500-byte MTU; all configurations provide comparable throughput. Figures 4(a) and 4(b) show CPU utilization and throughput during the write microbenchmarks with the 9000-byte MTU. These results are similar to those with the 1500-byte MTU but, as expected, CPU utilization with the 9000-byte MTU is lower, and throughput is better. Figures 5(a) and 5(b) show CPU utilization and throughput during the read microbenchmarks with the 1500-byte MTU. As explained in Section III, our copy avoidance scheme is impossible with the 1500-byte MTU, and TSO has no effect on the read microbenchmarks; therefore, only the checksum offloading and no optimizations configurations are presented. As in the write microbenchmarks, the checksum offloading configuration yields slightly lower CPU utilization than the no optimizations configuration, and both configurations perform comparably with regard to throughput.
Fig. 3. Results of the write microbenchmarks with a 1500-byte MTU: (a) CPU utilization (%); (b) throughput (MB/s). The configurations shown are zero-copy, checksum offloading, and TSO; zero-copy and checksum offloading; checksum offloading; and no optimizations.

Fig. 4. Results of the write microbenchmarks with a 9000-byte MTU: (a) CPU utilization (%); (b) throughput (MB/s).

Figures 6(a) and 6(b) show CPU utilization and throughput during the read microbenchmarks with the 9000-byte MTU. Though the zero-copy and checksum offloading configuration is feasible below 4096 bytes (the system virtual memory page size) through a combination of page remapping and data copying, its gain is smaller than at I/O sizes of 4096 bytes and above; we therefore concentrate on the results at 4096 bytes and above for this configuration. While at 16 KB and above both page remapping and data copying are required, at 8192 bytes the iscsi driver does not need to copy at all; it only needs to manipulate some memory-management data structures. It therefore gets the largest gain from the optimizations at 8192 bytes. However, we cannot see a clear gain: CPU utilization is 21.1% in the zero-copy and checksum offloading configuration and 21.3% in the checksum offloading configuration at 8192 bytes. Moreover, at 4096 bytes and over 16 KB, the checksum offloading configuration yields the lower CPU utilization. We investigate the reasons for this behavior in subsequent paragraphs. In the read microbenchmarks, the iscsi driver provides comparable throughput in all configurations with the 9000-byte MTU, as it does with the 1500-byte MTU.

2) Distribution of CPU time: Next, we investigate where the CPU time is spent while running the microbenchmarks.
Tables I and II show the distribution of CPU time during the write and read microbenchmarks for the 8192-byte I/O size, respectively. We chose the 8192-byte I/O size because it most clearly shows the characteristics of the zero-copy and checksum offload configuration in the read microbenchmarks.

Fig. 5. Results of read microbenchmarks with a 1500-byte MTU: (a) read CPU utilization (%); (b) read throughput (MB/s).

Fig. 6. Results of read microbenchmarks with a 9000-byte MTU: (a) read CPU utilization (%); (b) read throughput (MB/s).

As mentioned in the previous paragraph, we cannot see a clear gain from TSO, so we omit the results of the zero-copy, checksum offload, and TSO configuration. There are six major sources of overhead: the iSCSI driver, copying data and computing checksums, the SCSI subsystem, the network adapter driver, the TCP/IP stack, and the VM system. The results show that in the write microbenchmarks, the combination of copy avoidance and checksum offload decreases the CPU time spent on copying data and calculating checksums. We also clearly see the effect of a large MTU: it greatly decreases the CPU time spent in the TCP/IP stack and also decreases the time spent in the network adapter driver. The read microbenchmarks behave like the write microbenchmarks: the combination of copy avoidance and checksum offload decreases the CPU time spent on copying data and calculating checksums, and a large MTU decreases the CPU time spent in the TCP/IP stack and the network adapter driver. The major difference is that the optimization for copy avoidance increases the CPU time spent in the VM system and the TCP/IP stack. The increased VM overhead is mainly due to the complexity of the modified VM system. As explained in Section III, in the modified VM system the relationship between a page structure and the page it describes is dynamic, which increases the cost of memory-management operations. This in turn increases the CPU time spent in the TCP/IP stack, which uses these operations frequently. As a result, the performance gain from copy avoidance is negated.
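To illustrate why 8192-byte reads are the best case for the scheme, here is a simplified model (our own sketch, not the paper's code) in which payload bytes filling whole page-aligned 4096-byte pages can be remapped into the VM cache, while any partial page must still be copied:

```python
PAGE_SIZE = 4096  # system virtual memory page size

def remap_plan(io_size: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return (pages_remapped, bytes_copied) for a page-aligned read buffer.

    Full pages are handed to the VM cache by remapping; the trailing
    partial page, if any, must be copied. Simplified model of the
    copy-avoidance scheme described in Section III.
    """
    full_pages = io_size // page_size     # whole pages -> remap
    leftover = io_size % page_size        # partial page -> copy
    return full_pages, leftover

# 8192-byte reads need no copying at all -- the best case:
print(remap_plan(8192))
# Sub-page reads must be copied entirely:
print(remap_plan(1024))
```

This captures only the alignment arithmetic; in the actual driver, reads larger than 16 KB span multiple TCP segments, so some data arrives misaligned in the socket buffers and must be copied even when the I/O size is a page multiple.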
TABLE I. Distribution of CPU time (percent) during the write microbenchmarks for the 8192-byte I/O size. Rows: idle, iSCSI driver, SCSI subsystem, copy and checksum, network adapter driver, TCP/IP, VM system; columns: the checksum offload and the zero-copy and checksum offload configurations at each MTU.

TABLE II. Distribution of CPU time (percent) during the read microbenchmarks for the 8192-byte I/O size, with the same rows and columns as Table I. With the 1500-byte MTU the zero-copy and checksum offload configuration is not applicable (n/a); with the 9000-byte MTU it spends 2.12% in the iSCSI driver, 1.29% in the SCSI subsystem, 0.16% on copy and checksum, 2.18% in the network adapter driver, 4.79% in TCP/IP, and 4.89% in the VM system.

These investigations also give some idea of how much performance the iSCSI protocol could provide with specialized adapters. The important observation is that processing the iSCSI protocol and the SCSI subsystem consumes a relatively small amount of CPU time. For example, during the write microbenchmarks in the zero-copy and checksum offload configuration with the 1500-byte MTU, they account for only 8.9% of the processing time in which the CPU is not idle. The largest source of overhead is the TCP/IP protocol, which accounts for 4.8% of the total processing time.

V. CONCLUSION

This paper presents an efficient implementation of an iSCSI initiator with a commodity Gigabit Ethernet adapter and experimentally analyzes its performance. Our implementation of an iSCSI initiator avoids copying between the VM cache and buffers in the network subsystem by using page remapping, and exploits features that commodity Gigabit Ethernet adapters support, i.e., checksum offload, jumbo frames, and TCP segmentation offloading (TSO). Our analysis of writing showed that, in our microbenchmarks, the combination of copy avoidance and checksum offload reduces CPU utilization from 39.4% to 30.8% with an I/O size of 64 KB as compared with a straightforward iSCSI driver that copies data. In addition, the copy avoidance technique that we used for writing can be easily integrated into the Linux kernel by using existing interfaces.
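The 8.9% figure is a share of non-idle time, which is the fair basis for comparing profiles whose idle fractions differ. A small helper makes the arithmetic explicit (the bucket values below are hypothetical, not the paper's measurements):

```python
def non_idle_shares(profile: dict[str, float]) -> dict[str, float]:
    """Convert absolute CPU-time percentages (including an 'Idle' bucket)
    into each component's share of non-idle time, in percent."""
    busy = sum(v for k, v in profile.items() if k != "Idle")
    return {k: 100.0 * v / busy for k, v in profile.items() if k != "Idle"}

# Hypothetical profile: 70% idle, 30% busy.
profile = {
    "Idle": 70.0,
    "iSCSI driver": 1.8,
    "SCSI subsystem": 0.9,
    "Copy and checksum": 0.3,
    "Network adapter driver": 3.0,
    "TCP/IP": 15.0,
    "VM system": 9.0,
}
shares = non_idle_shares(profile)
# iSCSI driver + SCSI subsystem as a share of non-idle time:
print(round(shares["iSCSI driver"] + shares["SCSI subsystem"], 1))  # 9.0
```

Dividing by busy time rather than total time keeps the protocol-overhead comparison meaningful even when one configuration leaves the CPU idle far more than another.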
However, we did not see a clear gain from TCP segmentation offloading. With regard to reading, the copy avoidance technique requires modifications to the Linux kernel, and the overhead due to our modifications to the VM system negates any performance gain that copy avoidance could provide. Our experiments show that the CPU is not the performance bottleneck in any configuration and that the iSCSI driver provides comparable throughput with each MTU. The quantitative analysis shows that processing the iSCSI protocol and the SCSI subsystem consumes a relatively small amount of CPU time (8.9% of the non-idle processing time during the write microbenchmarks with a 1500-byte MTU) and that the largest overhead comes from processing the TCP/IP protocol with standard IEEE 802.3 Ethernet frame sizes.

REFERENCES

[1] J. Satran, K. Meth, C. Sapuntzakis, M. Chadalapaka, and E. Zeidner, "IPS Internet Draft: iSCSI," January 2003.
[2] P. Sarkar, S. Uttamchandani, and K. Voruganti, "Storage over IP: When Does Hardware Support Help?" in Conference on File and Storage Technologies (FAST), San Francisco, CA: USENIX, January 2003.
[3] S. Aiken, D. Grunwald, A. R. Pleszkun, and J. Willeke, "A Performance Analysis of the iSCSI Protocol," in 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies, San Diego, CA: IEEE, April 2003.
[4] W. T. Ng, B. Hillyer, E. Shriver, E. Gabber, and B. Özden, "Obtaining High Performance for Storage Outsourcing," in Conference on File and Storage Technologies, Monterey, CA: USENIX, January 2002.
[5] J. S. Chase, A. J. Gallatin, and K. G. Yocum, "End-System Optimizations for High-Speed TCP," IEEE Communications, vol. 39, no. 4, 2001.
[6] Hsiao-keng Jerry Chu, "Zero-Copy TCP in Solaris," in USENIX 1996 Annual Technical Conference, San Diego, CA, January 1996.
[7] V. S. Pai, P. Druschel, and W. Zwaenepoel, "IO-Lite: A Unified I/O Buffering and Caching System," ACM Transactions on Computer Systems, vol. 18, no. 1, 2000.
[8] J. C. Brustoloni and P. Steenkiste, "Interoperation of Copy Avoidance in Network and File I/O," in INFOCOM '99, New York: IEEE, March 1999.
[9] S. Pakin, M. Lauria, and A. Chien, "High Performance Messaging on Workstations: Illinois Fast Messages (FM) for Myrinet," in Supercomputing '95, San Diego, CA, December 1995.
[10] D. Dunning, G. Regnier, G. McAlpine, D. Cameron, B. Shubert, F. Berry, A. M. Merritt, E. Gronke, and C. Dodd, "The Virtual Interface Architecture," IEEE Micro, vol. 18, no. 2, March/April 1998.
[11] T. von Eicken, A. Basu, V. Buch, and W. Vogels, "U-Net: A User-Level Network Interface for Parallel and Distributed Computing," in Fifteenth ACM Symposium on Operating Systems Principles, December 1995.
[12] M. DeBergalis, P. Corbett, S. Kleiman, A. Lent, D. Noveck, T. Talpey, and M. Wittle, "The Direct Access File System," in 2nd USENIX Conference on File and Storage Technologies, San Francisco, CA, March 2003.
[13] Intel iSCSI Reference Implementation, intel-iscsi/.
[14] InterOperability Laboratory.
[15] Cisco iSCSI Initiator Implementation, linux-iscsi/.
[16] IBM iscsi Initiator Implementation, developerworks/projects/naslib/. [17] Open Source Initiative (OSI), [18] K. Z. Meth and J. Satran, Design of the iscsi Protocol, in th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies. San Diego, California: IEEE, April 3, pp [19] K. Z. Meth, iscsi Initiator Design and Implementation Experience, in Tenth NASA Goddard Conference on Mass Storage Systems and Technologies Nineteenth IEEE Symposium on Mass Storage Systems. Maryland, USA: IEEE, April 2. [] OProfile - A System Profiler for Linux, [21] Intel, The IA-32 Intel Architecture Software Developer s Manual, Volume 3: System Programming Guide, 3.
More informationHow To Test A Microsoft Vxworks Vx Works 2.2.2 (Vxworks) And Vxwork 2.4.2-2.4 (Vkworks) (Powerpc) (Vzworks)
DSS NETWORKS, INC. The Gigabit Experts GigMAC PMC/PMC-X and PCI/PCI-X Cards GigPMCX-Switch Cards GigPCI-Express Switch Cards GigCPCI-3U Card Family Release Notes OEM Developer Kit and Drivers Document
More informationIBM ^ xseries ServeRAID Technology
IBM ^ xseries ServeRAID Technology Reliability through RAID technology Executive Summary: t long ago, business-critical computing on industry-standard platforms was unheard of. Proprietary systems were
More informationBridging the Gap between Software and Hardware Techniques for I/O Virtualization
Bridging the Gap between Software and Hardware Techniques for I/O Virtualization Jose Renato Santos Yoshio Turner G.(John) Janakiraman Ian Pratt Hewlett Packard Laboratories, Palo Alto, CA University of
More informationEnd-to-end Data integrity Protection in Storage Systems
End-to-end Data integrity Protection in Storage Systems Issue V1.1 Date 2013-11-20 HUAWEI TECHNOLOGIES CO., LTD. 2013. All rights reserved. No part of this document may be reproduced or transmitted in
More informationStorage Protocol Comparison White Paper TECHNICAL MARKETING DOCUMENTATION
Storage Protocol Comparison White Paper TECHNICAL MARKETING DOCUMENTATION v 1.0/Updated APRIl 2012 Table of Contents Introduction.... 3 Storage Protocol Comparison Table....4 Conclusion...10 About the
More informationIncreasing Web Server Throughput with Network Interface Data Caching
Increasing Web Server Throughput with Network Interface Data Caching Hyong-youb Kim, Vijay S. Pai, and Scott Rixner Computer Systems Laboratory Rice University Houston, TX 77005 hykim, vijaypai, rixner
More informationCloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com
Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...
More informationOracle Database Scalability in VMware ESX VMware ESX 3.5
Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises
More information