Online Remote Data Backup for iscsi-based Storage Systems



Dan Zhou, Li Ou, Xubin (Ben) He
Department of Electrical and Computer Engineering
Tennessee Technological University, Cookeville, TN 38505, USA
{dzhou21, lou21, Hexb}@tntech.edu

Stephen L. Scott
Computer Science & Mathematics Division
Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
scottsl@ornl.gov

Abstract

Data reliability is critical to many data-sensitive applications, especially to the emerging class of storage systems deployed over networks. In this paper, we propose a method to perform remote online data backup that improves reliability for iSCSI-based networked storage systems. The basic idea is to apply traditional RAID technology to the iSCSI environment. Our first technique improves reliability by mirroring data among several iSCSI targets; the second improves both reliability and performance by striping data and rotating parity over several iSCSI targets. Extensive measurements using Iozone show that both techniques provide comparable performance while improving reliability.

Key Words: iSCSI, RAID, Reliability, Online Backup

1. Introduction

With the increasing demand to deploy storage over the network [2,4], iSCSI [8] is a newly emerging technology whose goal is to implement storage area network (SAN) [9] technology over the Internet infrastructure. Since iSCSI was proposed, much research has been devoted to protocol development [5], performance evaluation and analysis [1,3], and implementation [7], but there has been little research on iSCSI reliability. Even though a server may be very robust and highly reliable, it is still vulnerable to an unexpected disaster such as a terrorist attack or a natural disaster. RAID (Redundant Array of Independent Disks) [6] is a well-known, mature technique that improves the reliability of disk I/O through redundancy.
This paper introduces a method to perform remote online data backup and disaster recovery for iSCSI-based storage systems by striping data among several iSCSI targets in a manner similar to RAID. Since it is a distributed RAID across several iSCSI targets, we name it iSCSI RAID, or iRAID for short. The difference between iRAID and traditional RAID is that in traditional RAID a disk is the unit, whereas in iRAID each iSCSI target is a unit. As in traditional RAID, different layouts/RAID levels are possible. In this paper we focus on two layouts: mirroring (M-iRAID) and rotated parity (P-iRAID). Both improve reliability and provide remote online data backup. The backup is remote because the iSCSI targets may be

physically in different locations, and it is online because the targets are connected through the Internet. M-iRAID achieves data backup by mirroring multiple iSCSI targets: if the primary target fails, data is still available from the backup target. P-iRAID improves reliability by calculating a parity block and rotating it among the iSCSI targets: if one of n iSCSI targets fails, its data can be reconstructed from the other n-1 targets. To quantitatively evaluate the performance potential of iRAID in a real-world network environment, we implemented a prototype of iRAID and measured system performance. Extensive measurement results show that M-iRAID and P-iRAID deliver comparable performance while improving reliability.

The rest of the paper is organized as follows. The next section presents the design and implementation of iRAID, including M-iRAID and P-iRAID, followed by our performance evaluation. We conclude the paper in Section 4.

2. Design of iSCSI RAID (iRAID)

We introduce iSCSI RAID, or iRAID, to address the reliability problems of iSCSI storage systems. The basic idea of iRAID is to organize iSCSI storage targets similarly to RAID, using mirroring and rotated-parity techniques. In iRAID, each iSCSI storage target is a basic storage unit in the array and serves as a storage node, as shown in Figure 1. All nodes in the array are connected to each other through a high-speed switch to form a local area network. iRAID provides a direct and immediate solution to boost iSCSI performance and improve reliability: parallelism in iRAID yields the performance gain, while the RAID parity technique improves reliability. This paper focuses on two iRAID configurations: mirroring iRAID (M-iRAID) and rotated-parity iRAID (P-iRAID).

Figure 1: iRAID architecture (system services and file system above an iSCSI initiator, connected over a TCP/IP network to Target 1 through Target N). Data are striped among the iSCSI targets.
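
As a concrete illustration of this striping, the block-to-target mapping can be sketched in a few lines of Python. This is an illustrative model only, not the prototype implementation; the target count and chunk size match the experimental setup described later.

```python
# Illustrative sketch (not the prototype code): RAID-0-style striping of
# logical blocks across N iSCSI targets, as in Figure 1.
BLOCK_SIZE = 64 * 1024   # 64KB, the default chunk size of Linux software RAID
N_TARGETS = 4            # four iSCSI targets, as in the experiments

def locate(logical_block: int, n_targets: int = N_TARGETS):
    """Return (target index, block offset within that target)."""
    return logical_block % n_targets, logical_block // n_targets

# Consecutive logical blocks land on consecutive targets, so a large
# sequential request is served by all targets in parallel.
print([locate(b) for b in range(6)])
# -> [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]
```
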
2.1 M-iRAID

In mirroring iRAID (M-iRAID), all data are striped and mirrored uniformly among all iRAID nodes, as illustrated in Figure 2. It borrows the concept of RAID level 1.
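A minimal sketch of this mirrored layout's read/write behavior follows. This is a hypothetical model for illustration (targets are modeled as dictionaries), not the prototype's code.

```python
# Hypothetical sketch: M-iRAID writes go to both a primary target and its
# mirror; reads may be served by either copy.
def write_block(primary: dict, mirror: dict, block_no: int, data: bytes):
    primary[block_no] = data
    mirror[block_no] = data          # remote backup copy

def read_block(primary: dict, mirror: dict, block_no: int) -> bytes:
    # Alternate between copies to balance read load; if one target has
    # failed (modeled here as a missing key), fall back to the other.
    src = primary if block_no % 2 == 0 else mirror
    alt = mirror if src is primary else primary
    return src.get(block_no, alt.get(block_no))

p, m = {}, {}
write_block(p, m, 0, b"data0")
del p[0]                             # simulate a primary-target failure
assert read_block(p, m, 0) == b"data0"   # still served by the mirror
```

Serving reads from both copies is also why, as the measurements below show, mirroring can improve rather than hurt read throughput.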

Figure 2 shows the data organization of each iRAID node for an M-iRAID system consisting of n iSCSI targets, where D_ij denotes data block i on iSCSI target j.

2.2 P-iRAID

M-iRAID increases the reliability of iSCSI through mirroring and provides remote data backup, but it also increases cost, since each data block is mirrored and stored on a backup iSCSI target. To improve reliability at a lower cost, we introduce parity iRAID (P-iRAID): in addition to the data being striped and distributed among the iSCSI targets, a parity code for each data stripe is calculated and stored on an iRAID node. The parity block is rotated among the n iSCSI targets, as shown in Figure 3, where the shaded blocks are parity blocks and the others are data blocks. Each bit in a parity block is the XOR of the corresponding bits of the data blocks in the same stripe. For example, P_1 = D_11 ⊕ D_12 ⊕ ... ⊕ D_1(n-1).

Figure 2: Data organization of M-iRAID. Figure 3: Data organization of P-iRAID.

3. Performance Evaluations

3.1 Experimental Setup

For performance evaluation, we implemented an iRAID prototype (for both M-iRAID and P-iRAID) based on Linux software RAID and the Intel iSCSI code (URL: http://sourceforge.net/projects/intel-iscsi). Our experimental setup is shown in Figure 4. Six PCs, named STAR1 through STAR6, are involved in our experiments. STAR1 serves as the iSCSI initiator, and STAR2-STAR5 are four iSCSI targets, which are organized as our iRAID. The data block size is set to 64KB, the default chunk size of Linux software RAID. All machines are interconnected through a Dell PowerConnect 5012 10-port managed Gigabit Ethernet switch to form an isolated LAN. Each machine runs Linux kernel 2.4.18 with a 3Com 3C996B-T server network interface card (NIC) and an Adaptec 39160 high-performance SCSI adaptor. STAR6 is used to

monitor the network traffic over the switch. The configurations of the machines are described in Table 1, and the characteristics of the disks are summarized in Table 2.

We use the popular file system benchmark tool Iozone (URL: http://www.iozone.org) to measure performance. The benchmark tests file I/O performance for a wide range of operations. We focus on the performance of sequential read/write and random read/write, because these are generally the primary concerns for any storage system. The average throughput reported here is the arithmetic average over these four I/O operations. We run Iozone for different request sizes and data sets under various scenarios, invoked as follows, where the dataset size and request size are configurable:

    iozone -Ra -s <dataset size> -r <request size> -p -i 0 -i 1 -i 2 -f /mnt/iraid/test

We reboot all machines after each round of tests.

Figure 4: Experimental setup with 4 iSCSI targets and 1 initiator.

Table 1: Machine configurations

Machine  | Processor              | RAM    | IDE disk    | SCSI controller                          | SCSI disk
STAR1    | PIII 1.4GHz/512K cache | 1024MB | N/A         | Adaptec 39160, Dell PERC RAID controller | 4x Seagate ST318406LC
STAR2-5  | P4 2.4GHz/512K cache   | 256MB  | WDC WB400BB | Adaptec 39160                            | IBM Ultrastar 73LZX
STAR6    | P4 2.4GHz/512K cache   | 256MB  | WDC WB400BB | N/A                                      | N/A

Table 2: Disk parameters

Disk model      | Interface     | Capacity | Data buffer | RPM   | Latency (ms) | Transfer rate (MB/s) | Average seek time (ms)
ST318406LC      | Ultra160 SCSI | 18GB     | 4MB         | 10000 | 2.99         | 63.2                 | 5.6
Ultrastar 73LZX | Ultra160 SCSI | 18GB     | 4MB         | 10000 | 3            | 29.2-57.0            | 4.9
WB400BB         | Ultra ATA     | 40GB     | 2MB         | 7200  | 4.2          | 33.3                 | 9.9
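For concreteness, the rotated-parity computation of Section 2.2 (P_1 = D_11 ⊕ D_12 ⊕ ... ⊕ D_1(n-1)) and the corresponding recovery of a failed target's block can be sketched as follows. This is an illustrative model, not the prototype code.

```python
# Sketch of P-iRAID parity: each parity block is the bytewise XOR of the
# other blocks in its stripe, so any single lost block can be rebuilt
# from the remaining n-1 targets.
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

stripe = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # data blocks D_11, D_12, D_13
parity = xor_blocks(stripe)                       # P_1 = D_11 ^ D_12 ^ D_13

# Recover a failed target's block from the survivors plus the parity:
rebuilt = xor_blocks([stripe[0], stripe[2], parity])
assert rebuilt == stripe[1]
```

Note that every small write must also update the stripe's parity block, which is the "small-write" overhead observed in the P-iRAID results below.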

3.2 Numerical Results

Our first experiment uses Iozone to measure the I/O throughput of iSCSI without redundancy and of M-iRAID with one backup iSCSI target over Gigabit Ethernet. The data set is 1GB, and I/O request sizes range from 4KB to 64KB. Figure 5 shows the average throughput for a single iSCSI target without redundancy and for a two-target M-iRAID. M-iRAID even outperforms the single-target iSCSI: by mirroring the iSCSI target, read performance improves, since read requests may be satisfied by the primary and secondary targets simultaneously.

Figure 5: Using one target as backup (average throughput in MB/s for request sizes of 4KB-64KB). Figure 6: M-iRAID and P-iRAID versus no redundancy (average throughput in MB/s for request sizes of 4KB-64KB).

Our next experiment uses 4 iSCSI targets to construct a 4-node M-iRAID, where 2 backup targets back up the other 2 primary targets. Similarly, we construct a 4-node P-iRAID, where a parity block is calculated and rotated among the 4 iSCSI targets. Figure 6 shows the performance: M-iRAID is comparable to the non-redundant configuration, while P-iRAID shows lower performance because of the small-write problem.

4. Conclusions

In this paper, we have proposed a method to perform remote online data backup to improve reliability for iSCSI-based storage systems. We introduced M-iRAID to improve reliability by mirroring data among several iSCSI targets, and P-iRAID to improve reliability and performance by striping data and rotating parity over several iSCSI targets. We have implemented prototypes of M-iRAID and P-iRAID under the Linux operating system.
Extensive measurement results using Iozone show that M-iRAID and P-iRAID deliver comparable performance while improving reliability, which is critical for disaster recovery.

Acknowledgments

This work is partially supported by the Research Office under a Faculty Research Award and the Center for Manufacturing Research at Tennessee Technological University, and by the Mathematics, Information and Computational Sciences Office, Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under contract No. DE-AC05-00OR22725 with UT-Battelle, LLC. The authors would also like to thank the anonymous reviewers for a number of insightful and helpful suggestions.

References

[1] S. Aiken, D. Grunwald, A. Pleszkun, and J. Willeke, "A Performance Analysis of the iSCSI Protocol," 20th IEEE Conference on Mass Storage Systems and Technologies, 2003.
[2] E. Gabber, et al., "StarFish: Highly-Available Block Storage," Proceedings of the FREENIX Track of the 2003 USENIX Annual Technical Conference, San Antonio, TX, June 2003, pp. 151-163.
[3] Y. Lu and D. Du, "Performance Study of iSCSI-based Storage Systems," IEEE Communications, Vol. 41, No. 8, 2003.
[4] R. V. Meter, G. G. Finn, and S. Hotz, "VISA: Netstation's Virtual Internet SCSI Adapter," Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VIII), pp. 71-80, October 4-7, 1998.
[5] K. Meth and J. Satran, "Features of the iSCSI Protocol," IEEE Communications, Vol. 41, No. 8, August 2003.
[6] D. A. Patterson, et al., "A Case for Redundant Arrays of Inexpensive Disks (RAID)," ACM International Conference on Management of Data (SIGMOD), pp. 109-116, 1988.
[7] P. Sarkar, S. Uttamchandani, and K. Voruganti, "Storage Over IP: When Does Hardware Support Help?" USENIX Conference on File and Storage Technologies, 2003.
[8] J. Satran, et al., "iSCSI Draft Standard," URL: http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-20.txt, Jan. 2003.
[9] P. Wang, et al., "IP SAN - From iSCSI to IP-Addressable Ethernet Disks," 20th IEEE Conference on Mass Storage Systems and Technologies, 2003.