Building High-Performance iSCSI SAN Configurations. An Alacritech and McDATA Technical Note




Internet SCSI (iSCSI): Reaching Maximum Performance

IP storage using Internet SCSI (iSCSI) provides an opportunity for many organizations looking to extend existing Fibre Channel SANs to stranded servers, or looking to deploy Ethernet-based SANs. It can reduce costs by allowing IT managers to take advantage of existing, familiar Ethernet networks. Often the biggest question raised about a new technology like iSCSI is whether it can provide more than simple functionality and connectivity. That question is addressed by pairing fast, wire-speed TCP/IP products common in networking today with fast storage systems. This combination can offer maximum performance and efficiency comparable to many Fibre Channel-based solutions.

iSCSI encompasses several components of storage networking, including host-based connection devices commonly referred to as initiators, and storage systems known as targets.

iSCSI initiators can take several forms, including:
- Host Bus Adapters (HBAs) with the iSCSI initiator implemented in the hardware adapter card
- Software initiators running over standard network interfaces, such as Network Interface Cards (NICs), TCP Offload Engine (TOE) NICs (TNICs), or Host Bus Adapters

iSCSI targets also come in various forms, including:
- Disk storage systems
- Tape storage systems
- IP storage switches

Since completion of the iSCSI standard in early 2003, a number of standards-compliant products have entered the IP storage market. Many of these products have completed interoperability and conformance testing from organizations such as the University of New Hampshire's Interoperability Lab [1] and Microsoft's Designed for Windows Logo Program [2]. While many of these products have achieved general interoperability and functionality, most lack the levels of performance necessary to compete against other technologies in the global storage market.

Despite the lack of performance from some IP storage devices, there are a few iSCSI products that can be used to build high-performance iSCSI Storage Area Networks (SANs) that meet high transaction-rate and/or high throughput requirements. This note examines the performance capabilities of the McDATA IPS Multi-Protocol Storage Switches and of servers equipped with the Microsoft iSCSI Software Initiator and hardware accelerators, such as the Alacritech Accelerator family of TNICs or the Alacritech iSCSI Accelerator family of iSCSI HBAs, used to build such a high-performance iSCSI SAN.

[1] http://www.iol.unh.edu/consortiums/iscsi/index.html
[2] http://www.microsoft.com/windowsserversystem/storage/technologies/iscsi/default.mspx

Performance Test Objectives

The objective of the iSCSI performance testing was to show the McDATA IPS Multi-Protocol Storage Switch, in conjunction with Alacritech iSCSI Accelerators and the Microsoft iSCSI Software Initiator on Windows-based servers, sustaining wire-speed iSCSI throughput at larger transaction sizes as well as a substantial transaction rate, measured in Input/Output Operations per Second (IOPs). When a single IP storage Gigabit Ethernet link is saturated, the full-duplex wire-speed throughput is over 210 Megabytes per second; for more details, see Appendix B: Storage Networking Bandwidth. This test looks at a balanced single test configuration tuned for performance over a broad spectrum of operation sizes. Tests and iSCSI SANs can be more specifically tuned for optimal throughput or optimal transaction rate.
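The objectives distinguish throughput (MB/s) from transaction rate (IOPs); the two are tied together by the operation size. As a rough illustration only, and not part of the original test tooling, the following Python sketch shows that relationship using the Iometer convention of 1 MB = 1,048,576 bytes:

# Relationship between transaction rate (IOPs), transfer size, and throughput.
# Illustrative sketch; the constants below are not taken from the test report.
MB = 1024 ** 2  # 1 MB = 1,048,576 bytes, matching the Iometer definition

def throughput_mb_s(iops, transfer_bytes):
    """Payload throughput in MB/s for a given I/O rate and transfer size."""
    return iops * transfer_bytes / MB

def iops_needed(target_mb_s, transfer_bytes):
    """I/O rate needed to sustain a given payload throughput."""
    return target_mb_s * MB / transfer_bytes

# Saturating one direction of a Gigabit Ethernet link (~113 MB/s) takes
# roughly 1,800 IOPs at 64KB transfers, but well over 200,000 IOPs at 512B.
print(round(iops_needed(113.16, 64 * 1024)))  # ~1810
print(round(iops_needed(113.16, 512)))        # ~231751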
Performance Configuration Details

Test Execution

Alacritech commissioned VeriTest, a division of Lionbridge Technologies, to compare the performance of a number of iSCSI initiator products. Performance reported in this note reflects the subset of testing specific to the Alacritech and McDATA configuration. The full test report is available from VeriTest at http://www.veritest.com/clients/reports/alacritech/

Equipment

The iSCSI server configuration used a SuperMicro X5DPE-G2 motherboard with dual Intel Xeon processors and an Alacritech SES1001T iSCSI Accelerator. Alacritech's acceleration solutions are based on the company's high-performance SLIC Technology architecture. Products based on SLIC Technology remove I/O bottlenecks for both storage and networking systems by offloading TCP/IP and iSCSI protocol processing. The server used the Microsoft iSCSI Software Initiator, version 1.02, to perform iSCSI connectivity. The iSCSI Accelerator was connected through a Dell PowerConnect 5224 Gigabit Ethernet switch to the McDATA switch.

Figure 1: iSCSI Performance Configuration (iSCSI SAN configuration for high performance: SuperMicro server with dual 3 GHz Xeon processors and Alacritech SES1001T iSCSI Accelerator; Dell PowerConnect 5224; McDATA IPS 3300 Multi-Protocol Storage Switch; four 1G Fibre Channel connections from the McDATA switch to the LSI ProFibre 4000 RAID with eight RAID 0 logical drives; Ethernet data and management interfaces)

A McDATA IPS 3300 Multi-Protocol Storage Switch acted as the iSCSI target device. The McDATA switch connected the incoming iSCSI traffic from the iSCSI server to an LSI ProFibre 4000 Fibre Channel RAID subsystem. The LSI RAID storage included high-performance 15K RPM disk drives from Seagate.

Traffic Generation

Iometer, a server performance analysis tool developed by Intel, drove traffic for the tests. The server ran an iSCSI/TCP/IP session across the Gigabit Ethernet link. Iometer version 2003.12.16 was used in the configuration. For more information on how Iometer was used during these performance tests, see Appendix A: Iometer Information.

Building Maximum Performance with a Software Initiator Solution

Initiators can be hardware-based or software-based. Major research firms such as International Data Corp. and Gartner Dataquest expect deployments of both types of solutions.

Network copies on receive with NICs penalize performance on servers

Software iSCSI initiators can run over standard NICs. Using a NIC is a sufficient connectivity solution because it allows standard Ethernet failover and link aggregation techniques, supports in-band management of other iSCSI SAN devices, and relies on the operating system vendor's implementation of the iSCSI protocol, leading to better protocol conformance and interoperability. NIC performance is generally acceptable for write operations, but underperforms significantly for read operations. This is because a NIC cannot perform direct memory access (DMA) operations directly into destination memory: protocol processing is required on the host before the destination memory address is known. Servers tend to favor storage read operations over write operations, so this limitation is significant.

Figure 2: NIC Data Flow (data path labels: SCSI/iSCSI, SCSI buffer, TCP/IP buffer, TCP/IP, NIC)
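The receive-copy penalty described above can be put in rough numbers. The sketch below is a back-of-the-envelope estimate only; the link rate and copy-pass count are assumptions for illustration, not measurements from this test:

# Extra host memory traffic caused by the receive-side copy on a plain NIC,
# compared with direct DMA placement. Assumed values, not measured results.
link_rate_mb_s = 113.0   # assumed unidirectional Gigabit Ethernet payload rate
copy_passes = 2          # a host copy reads the TCP/IP buffer and writes the SCSI buffer

extra_traffic_mb_s = link_rate_mb_s * copy_passes
print("Added memory-bus traffic from the receive copy: ~%d MB/s" % extra_traffic_mb_s)

# With direct placement into the SCSI buffer (the HBA and TOE/iSCSI data paths),
# this copy and its associated memory-bus traffic are eliminated.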

A hardware solution uses an HBA that implements the iSCSI protocol on the card and appears to the system as a SCSI device. HBAs do not have the read DMA limitation of NICs because protocol processing is performed on board the HBA.

Most HBAs lack expected Ethernet functionality

The HBA approach has a number of drawbacks:
- Standard Ethernet failover and port aggregation techniques cannot be used;
- Because the solution is dedicated to block transport only, it does not transport network traffic, including in-band management traffic with other iSCSI SAN and network switch devices;
- iSCSI protocol conformance and interoperability depend on the HBA vendor rather than the operating system vendor.

Figure 3: HBA Data Flow (data path labels: SCSI/iSCSI, SCSI buffer, iSCSI, TCP/IP, NIC)

Alacritech SES1001 preserves NIC benefits and HBA performance

An alternative approach uses TOE/iSCSI hardware with a software iSCSI initiator. This preserves the benefits of the NIC while sharing the same data flow as other HBAs. Alacritech's SLIC Technology architecture includes patented methods for receive processing that allow DMA operations directly to the destination SCSI buffer on the host, mirroring the HBA data path. The end result is the same as with a conventional iSCSI HBA: direct placement of the data in the desired host buffer without a host data copy. Alacritech also uses patented methods for port aggregation and failover while performing TCP offload, which HBA solutions cannot do.

Throughput Performance Configuration Results

Bi-directional throughput of over 210 MB/s

Figure 4: Bi-Directional Throughput Performance (chart: Throughput MB/s, CPU Utilization)

Results of the performance testing show that average bi-directional throughput of over 210 Megabytes per second was reached across a single Gigabit Ethernet link for frame sizes of 64KB and larger. * see Appendix B

Average unidirectional read throughput of 113 MB/s

Figure 5: Read Throughput Performance (chart: Throughput MB/s, CPU Utilization)

Results of the performance testing show that average unidirectional read throughput of 113 Megabytes per second was reached across a single Gigabit Ethernet link for frame sizes of 8KB and larger. * see Appendix B
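The charts report CPU utilization alongside throughput, and the transaction charts that follow normalize results as IO rate per % CPU. A small helper for that efficiency metric is sketched below; the CPU figures used in the examples are hypothetical, since the measured utilization values appear only in the original charts:

# Efficiency metric used in the charts: result per percent of host CPU consumed.
def per_cpu_percent(result, cpu_utilization_percent):
    """Normalize a result (MB/s or IOPs) by the CPU utilization it required."""
    if cpu_utilization_percent <= 0:
        raise ValueError("CPU utilization must be a positive percentage")
    return result / cpu_utilization_percent

print(per_cpu_percent(210.0, 20.0))   # e.g. 10.5 MB/s per % CPU (hypothetical CPU value)
print(per_cpu_percent(46000, 25.0))   # e.g. 1840 IOPs per % CPU (hypothetical CPU value)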

Throughput Performance Configuration Results (Continued)

Average unidirectional write throughput of 113 MB/s

Figure 6: Write Throughput Performance (chart: Throughput MB/s, CPU Utilization)

Results of the performance testing show that average unidirectional write throughput of 113 Megabytes per second was reached across a single Gigabit Ethernet link for frame sizes of 8KB and larger. * see Appendix B

Small Transaction Rate Performance Configuration Results

Bi-directional average transaction rate of over 47,600 operations/s

Figure 7: Bi-Directional Transaction Performance (chart: IO Per Second (IOPs), IO/CPU)

Results of the transaction performance testing show that a bi-directional average transaction rate of over 47,600 operations per second was reached across a single Gigabit Ethernet link for a frame size of 512B.

Unidirectional read average transaction rate of over 46,000 operations/s

Figure 8: Read Transaction Performance (chart: IO Per Second (IOPs), IO Rate Per % CPU)

Results of the performance testing show that a unidirectional read average transaction rate of over 46,000 operations per second was reached across a single Gigabit Ethernet link for a frame size of 512B.
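The 512-byte transaction rates convert to implied payload throughput with simple arithmetic, which makes clear that small transactions are bounded by per-operation overhead rather than link bandwidth. A short sketch using the reported IOPs figures:

# Payload throughput implied by the reported 512-byte transaction rates.
MB = 1024 ** 2  # 1 MB = 1,048,576 bytes

for label, iops in [("bi-directional", 47600), ("read", 46000)]:
    mb_s = iops * 512 / MB
    print("%-15s %6d IOPs -> ~%.1f MB/s payload" % (label, iops, mb_s))

# ~23.2 MB/s and ~22.5 MB/s: both far below the wire-speed limits (~113 MB/s
# per direction), so 512B transactions are limited by per-operation overhead.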

Small Transaction Rate Performance Configuration Results (Continued)

Unidirectional write average transaction rate of over 23,500 operations/s

Figure 9: Write Transaction Performance (chart: IO Per Second (IOPs), IO Rate Per % CPU)

Results of the performance testing show that a unidirectional write average transaction rate of over 23,500 operations per second was reached across a single Gigabit Ethernet link for a frame size of 512B.

Conclusions

High-Performance iSCSI

The results of the tests show that full-duplex throughput with iSCSI can reach line rates of over 210 megabytes per second (MB/s) with both IP storage switches from McDATA and HBAs such as Alacritech's iSCSI Accelerator. In throughput-tuned configurations, the configuration has been validated at closer to the 220 MB/s theoretical maximum iSCSI payload bandwidth in a number of tests subsequent to the initial January 2002 Alacritech, Hitachi, and Nishan Systems wire-speed demonstration [3]. McDATA subsequently acquired Nishan Systems in 2003, with the Nishan Systems products and technologies forming the foundation of the McDATA Multi-Protocol Storage Switch product family.

The capability of the McDATA Multi-Protocol Storage Switch to handle high-throughput wire-speed conversion and high-transaction-rate conversion between iSCSI and Fibre Channel is clearly demonstrated by these results. This performance confirmation with Alacritech iSCSI Accelerators helps facilitate the adoption of iSCSI in the enterprise and the emergence of large-scale, high-performance IP storage networks.

[3] http://www.alacritech.com/iscsi/pr/index.html

Appendices: Building High-Performance iSCSI SAN Configurations

Appendix A: Iometer Information

What were the Iometer settings?
Iometer was set for operations at block sizes from 512 bytes to 1MB.

Is the block size representative of applications?
Applications use block sizes anywhere from 512 bytes up to 16MB. Smaller block sizes mean that, at any given moment, relatively more time is spent awaiting receipt of a block than sending the block itself. Use of larger block sizes reduces this relative overhead and allows more data to be in the network pipe.

What was the setting for outstanding I/O operations?
Iometer was set to six (6) outstanding I/O operations. Typical applications range from one to sixteen outstanding I/O operations. One outstanding I/O operation is used when the application requires confirmation of each I/O before executing the next I/O operation. The one-outstanding-I/O setting typically is used to guarantee the highest integrity, with a tradeoff in performance; with only one outstanding I/O at a time, the recovery time is minimized. By allowing more outstanding I/O operations at any given time, more data can be placed in the network pipe and overall link utilization will increase. Recovery of multiple I/Os also is possible, but will take slightly longer. These effects are detailed in the McDATA white paper "Data Storage Anywhere, Any Time".

What version of Iometer was used?
Version 2003.12.16
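The effect of outstanding I/O depth on link utilization follows from keeping the network pipe full (Little's law). The sketch below illustrates the idea under an assumed, hypothetical round-trip latency; it is not a model of the measured configuration:

# How outstanding I/O depth keeps the network pipe full (Little's law):
#   achievable throughput ~= outstanding I/Os * block size / round-trip latency,
# capped at the link's payload rate. The latency used below is hypothetical.
MB = 1024 ** 2

def achievable_mb_s(outstanding, block_bytes, round_trip_s, link_limit_mb_s=113.16):
    uncapped = outstanding * block_bytes / round_trip_s / MB
    return min(uncapped, link_limit_mb_s)

# Assuming a 1 ms round trip and 64KB blocks:
for depth in (1, 2, 6, 16):
    print(depth, round(achievable_mb_s(depth, 64 * 1024, 0.001), 1), "MB/s")
# Depth 1 yields ~62.5 MB/s; depth 2 and above saturate the ~113 MB/s link,
# consistent with the six outstanding I/Os used in these tests.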

Appendix B: Storage Networking Bandwidth

To verify that full wire speed is achieved, one must first determine the theoretical limits of the technology under test. The signaling rate of 1000BASE-SX Gigabit Ethernet is 1.25 Gigabits per second in each direction. After accounting for 8B/10B coding overhead and the 1460-byte payload size, the usable bandwidth is 113.16 megabytes per second. Bi-directional bandwidth is approximately 220.3 megabytes per second. Note that the bi-directional bandwidth accounts for acknowledgements (see below). SCSI and iSCSI protocol overhead is not included in these calculations.

Because the usable link speed of Gigabit Ethernet is higher than that of one-gigabit Fibre Channel, the test configuration requires storage on multiple Fibre Channel ports to provide sufficient disk bandwidth to fully saturate the single Gigabit Ethernet connection between the server and the switch. Additional detail is provided in Figure 10. For the purposes of these calculations, 1 megabyte is defined as 1024^2 bytes, or 1,048,576 bytes. This correlates with the definition used in Iometer.

Figure 10: Storage Networking Bandwidth

Condition                                    Gigabit Ethernet
Raw Link Bandwidth                           1.250 Gbit/s
Link Bandwidth with 8B/10B Coding Overhead   1.0 Gbit/s
Unidirectional Payload Bandwidth             113.16 MB/s (1460-byte payloads)
Bi-directional Payload Bandwidth             220.31 MB/s (see detail on ACKs below)

The iSCSI / Gigabit Ethernet frame is assumed to have 78 bytes of overhead (Ethernet 14B, IP 20B, TCP 20B, CRC 4B, preamble and interframe gap 20B). These calculations ignore iSCSI Protocol Data Unit (PDU) and SCSI command overhead. In throughput configurations this overhead is insignificant, with SCSI commands and the 48-byte iSCSI PDU header occurring once per operation, i.e. one SCSI command in each direction and one iSCSI PDU header on the SCSI response per 64KB to 1MB transaction. SCSI and iSCSI overhead become much more significant in smaller transactions.

Acknowledgements (ACKs)

The TCP protocol used with iSCSI provides a reliable transport. All data sent over TCP is acknowledged by the receiving system. In the unidirectional read or write tests, the ACK traffic has no impact, since it travels on the otherwise idle direction of the bi-directional link. iSCSI traffic over TCP typically requires separate ACK frames rather than piggybacking the ACK with data. This occurs because the traffic is unidirectional in nature: a small SCSI command generates a large amount of data in one direction, with the other direction essentially idle. The number of ACK frames can vary depending on the ACK algorithm (e.g. ACK every frame, ACK every other frame, ACK at specific intervals, etc.); however, an ACK typically occurs for every other frame. This results in a bit over six megabytes per second of ACK traffic on a bi-directional link, reducing the payload to approximately 220.3 megabytes per second.
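The figures above can be reproduced with a few lines of arithmetic. The following sketch follows the assumptions stated in this appendix (1538-byte frames, 1460-byte payloads, one 84-byte ACK per two data frames):

# Reproduces the Gigabit Ethernet payload-bandwidth arithmetic of this appendix.
MB = 1024 ** 2                          # 1 MB = 1,048,576 bytes (as in Iometer)

raw_rate_bps  = 1.25e9                  # 1000BASE-SX signaling rate, bits/s
net_rate_bps  = raw_rate_bps * 8 / 10   # after 8B/10B coding: 1.0 Gbit/s
net_rate_Bps  = net_rate_bps / 8        # 125,000,000 bytes/s

frame_bytes   = 1538                    # 1500 MTU + 14 header + 4 CRC + 20 preamble/IFG
payload_bytes = 1460                    # MTU minus 20 IP and 20 TCP header bytes
ack_bytes     = 84                      # minimum 64-byte frame + 20 preamble/IFG

# Unidirectional: the link carries only data frames.
uni_mb_s = net_rate_Bps / frame_bytes * payload_bytes / MB
print("Unidirectional payload: %.2f MB/s" % uni_mb_s)    # ~113.16

# Bi-directional: each direction also carries one ACK for every two data frames
# flowing the other way, so budget half an ACK frame per data frame.
frames_per_s = net_rate_Bps / (frame_bytes + ack_bytes / 2.0)
bidir_mb_s = 2 * frames_per_s * payload_bytes / MB
print("Bi-directional payload: %.2f MB/s" % bidir_mb_s)  # ~220.3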

Additional Details for Figure 10

Calculation of Ethernet and iSCSI Bandwidth
Raw Link Bandwidth                      1.25 Gbit/s
Net Rate (8B/10B Coding)                1.00 Gbit/s
Net Rate, in megabits                   1,000 Mbit/s
Net Rate, in bits                       1,000,000,000 bit/s
Net Rate, in megabytes per second [4]   119.21 MB/s
Total Bytes per Frame [5]               1538
iSCSI Overhead                          78
Data Payload per Frame                  1460
Payload percentage                      94.93%
Actual Data Rate without ACKs           113.16 MB/s

Net Rate, in bytes per second [4]       125,000,000
Frames per second, no ACKs              81,274.4
Frames per ACK                          2
Frames per second, with ACKs            79,113.9
Data Rate with 2 frames per ACK         110.2 MB/s
Bi-directional Rate                     220.3 MB/s

Ethernet Frame (Bytes)
Payload (MTU)                           1500
Header                                  14
CRC                                     4
Preamble and Interframe Gap             20
Total Ethernet Frame                    1538

TCP Overhead (Bytes)
Ethernet                                14
IP                                      20
TCP                                     20
CRC                                     4
Preamble and Interframe Gap             20
Total TCP Overhead                      78

ACK Frame (Bytes)
Ethernet                                14
IP                                      20
TCP                                     20
Ethernet Pad [6]                        6
Ethernet CRC                            4
Total                                   64
Preamble and Interframe Gap             20
Total ACK Bytes                         84

Notes
[4] 1 Megabyte = 1024^2 bytes = 1,048,576 bytes
[5] Includes Ethernet preamble and interframe gap (20 bytes), but no VLAN tags (4 bytes)
[6] Required for Ethernet minimum frame size of 64 bytes

Alacritech, Inc.
234 East Gish Road, San Jose, CA 95112 USA
www.alacritech.com
toll free: 877.338.7542  tel: 408.287.9997  fax: 408.287.6142  email: info@alacritech.com

© Alacritech 2002-2004. All rights reserved. Alacritech, the Alacritech logo, SLIC Technology, the SLIC Technology logo, and Accelerating Data Delivery are trademarks and/or registered trademarks of Alacritech, Inc. McDATA, the McDATA logo, Storage over IP, SoIP, and all product names are trademarks of McDATA, Inc. Hitachi Freedom Storage and Hitachi Freedom Data Networks are registered trademarks of Hitachi Data Systems. All other marks are the property of their respective owners. One or more U.S. and international patents apply to Alacritech products, including without limitation: U.S. Patent Nos. 6,226,680, 6,247,060, 6,334,153, 6,389,479, 6,393,487, 6,427,171, 6,427,173, 6,434,620, 6,470,415, 6,591,302, 6,658,480, and 6,687,758. Alacritech, Inc. reserves the right, without notice, to make changes in product design or specifications. Product data is accurate as of initial publication. Performance data contained in this publication were obtained in a controlled environment based on the use of specific data. The results that may be obtained in other operating environments may vary significantly. Users of this information should verify the applicable data in their specific environment. Doc 00076-MKT