1 High Performance Computing Course Notes: High Performance Storage

2 Storage devices
- Primary storage:
  - Registers (1 CPU cycle, a few ns)
  - Cache ( cycles, us)
  - Main memory: local main memory (0.2-4 us), NUMA (2-10x local memory)
- Secondary storage:
  - Magnetic disk (2-20 ms)
  - Solid state disk ( ms)
  - Cache in storage controller ( ms)
- Tertiary storage:
  - Removable media: tapes, floppies, CDs (ms to minutes)
  - Tape library (a few seconds to a few minutes)

3 Hard disk vs. solid state drive
[Figure: a) 2.5-inch hard disk; b) solid state drive]

4 Tape library

5 Disks

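6 Disk failure and metrics
- Mean time between failures (MTBF): the average time a disk operates between failures
  - $\mathrm{MTBF} = \sum(\text{start of downtime} - \text{start of uptime}) / \text{number of failures}$, i.e. total uptime divided by the number of failures
- Annual failure rate (AFR): the expected number of failures per year
  - $\mathrm{AFR} = \text{running hours per year} / \mathrm{MTBF}$
  - $\mathrm{AFR}_{\text{disks}} = N_{\text{disks}} \times \mathrm{AFR}_{\text{disk}}$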
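The two AFR formulas can be tried out with a few lines of Python. This is only a sketch: the MTBF of 1,000,000 hours, the array of 100 disks, and the assumption that each disk runs all 8760 hours of the year are illustrative values, not figures from these notes, and the function names are made up for the example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 running hours, assuming the disk is powered on all year


def annual_failure_rate(mtbf_hours, running_hours_per_year=HOURS_PER_YEAR):
    """AFR of one disk: running hours per year divided by MTBF (in hours)."""
    return running_hours_per_year / mtbf_hours


def array_failure_rate(n_disks, afr_disk):
    """Expected failures per year across n_disks identical, independent disks."""
    return n_disks * afr_disk


afr_disk = annual_failure_rate(1_000_000)      # hypothetical MTBF of 1,000,000 hours
afr_array = array_failure_rate(100, afr_disk)  # hypothetical array of 100 such disks

print(f"AFR per disk:  {afr_disk:.4%}")
print(f"AFR for array: {afr_array:.3f} failures/year")
```

With these assumed numbers a single disk fails with a rate a little under 1% per year, yet an array of 100 of them should expect close to one failure every year, which is why large arrays need redundancy.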

7 Solutions for disk failures
- Redundancy: replication (mirroring)
- Partial redundancy: parity information

8 RAID
- RAID: Redundant Arrays of Inexpensive Disks
- Goals: increased data reliability and increased I/O performance
- Main concepts in RAID: mirroring, striping, parity
- Advantages:
  - High capacity
  - High performance: data striping
  - Graceful degradation: when one disk fails, only that disk needs to be replaced

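9 RAID
- Disadvantage: more disks mean more failures, since $\mathrm{AFR}_{\text{disks}} = N_{\text{disks}} \times \mathrm{AFR}_{\text{disk}}$
- Solution: redundancy
  1) Replication/mirroring: needs more space
  2) Parity: recovers from a single disk failure, but needs extra operations to maintain the parity information and to recover data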

10 Parity
- Parity is calculated using XOR. The XOR of two operands is "true" if and only if exactly one of them is true.
- Property of XOR: if $D_p = D_1 \oplus \cdots \oplus D_k \oplus \cdots \oplus D_n$, then $D_k = D_p \oplus D_1 \oplus \cdots \oplus D_{k-1} \oplus D_{k+1} \oplus \cdots \oplus D_n$
- Therefore, if any one block of data is lost, it can be recovered from the parity and the remaining data
- Advantage: only one of the N+1 drives holds redundancy information
- Disadvantage: the parity information has to be recomputed every time the data is updated
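The XOR property can be checked directly on a few bytes. The sketch below is illustrative only: the three 4-byte blocks stand in for the contents of three data disks, and `xor_blocks` is a helper name invented for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"HPC!", b"RAID", b"disk"]   # hypothetical 4-byte data blocks D_1..D_3
parity = xor_blocks(data)            # D_p = D_1 XOR D_2 XOR D_3

lost = 1                             # pretend the disk holding D_2 has failed
survivors = [blk for i, blk in enumerate(data) if i != lost]
recovered = xor_blocks(survivors + [parity])  # D_2 = D_p XOR D_1 XOR D_3

assert recovered == data[lost]
```

The same function serves both to build the parity and to rebuild a lost block, because XOR-ing the parity with all surviving blocks cancels everything except the missing one.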

11 Disk arrays taxonomy
RAID levels:
- 0: striping without redundancy
- 1: full-copy mirroring
- 2: Hamming code
- 3: byte-level striping with a separate parity disk
- 4: block-level striping with a separate parity disk (the data of a small file can sit on a single disk)
- 5: rotated, distributed parity
- 6: double parity
The levels are classifications rather than an ordered list

12 RAID levels
- RAID0: striped without redundancy
  - Data can be read off in parallel
  - Any disk failure destroys the entire array
- RAID1: mirrored
  - The array continues to operate as long as at least one drive is functioning
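To see why striped data can be read in parallel, it helps to look at where consecutive blocks land. The sketch below assumes simple round-robin placement with one block per stripe unit; the notes do not specify a placement rule, and `raid0_locate` and the array size of 4 are hypothetical.

```python
def raid0_locate(logical_block, n_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % n_disks, logical_block // n_disks


N_DISKS = 4  # hypothetical array size
layout = [raid0_locate(block, N_DISKS) for block in range(8)]

for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Under this assumption, any run of four consecutive logical blocks touches all four disks once, so the four transfers can proceed at the same time. It also shows why one failed disk ruins the array: every large file has pieces on it.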

13 RAID3: striped set with dedicated parity
- The single parity disk is a bottleneck for writing
- Byte-level striping (typically under 1 KB)
RAID4: identical to RAID3, but uses block-level striping instead of byte-level striping
- The block can be of any size

14 RAID5: striped set with distributed parity
- The array is not destroyed by a single drive failure
- After a drive failure, reads of the lost data are reconstructed from the surviving data and the distributed parity
- A second drive failure causes data loss
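"Distributed parity" means the parity block moves to a different disk on each stripe, so no single disk absorbs every parity write. The sketch below shows one possible rotation, in which the parity starts on the last disk and shifts one disk to the left per stripe. Real implementations offer several layouts, so treat this rule and the name `raid5_stripe_layout` as assumptions made for illustration.

```python
def raid5_stripe_layout(stripe, n_disks):
    """Per-disk roles for one stripe: 'P' for parity, otherwise a data block number."""
    parity_disk = (n_disks - 1 - stripe) % n_disks  # assumed rotation rule
    next_block = stripe * (n_disks - 1)             # n_disks - 1 data blocks per stripe
    roles = []
    for disk in range(n_disks):
        if disk == parity_disk:
            roles.append("P")
        else:
            roles.append(next_block)
            next_block += 1
    return roles


for stripe in range(4):
    print(stripe, raid5_stripe_layout(stripe, 4))
```

Over four stripes on four disks, each disk holds the parity exactly once, which spreads the parity update load that makes the dedicated parity disk of RAID3 and RAID4 a write bottleneck.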

15 RAID6: striped set with dual parity
- Tolerates up to two drive failures
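The trade-off between the levels covered so far can be summarised as usable capacity versus the number of drive failures survived. The sketch below assumes identical disks and treats RAID1 as an n-way mirror in which every disk holds a full copy; the function name and the example sizes are hypothetical.

```python
def raid_summary(level, n_disks, disk_tb):
    """Return (usable capacity in TB, number of drive failures tolerated)."""
    if level == 0:
        return n_disks * disk_tb, 0          # all space usable, no redundancy
    if level == 1:
        return disk_tb, n_disks - 1          # every disk holds a full copy
    if level == 5:
        return (n_disks - 1) * disk_tb, 1    # one disk's worth of parity
    if level == 6:
        return (n_disks - 2) * disk_tb, 2    # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


for level in (0, 1, 5, 6):
    capacity, tolerated = raid_summary(level, n_disks=4, disk_tb=2)
    print(f"RAID{level}: {capacity} TB usable, survives {tolerated} failure(s)")
```

For four 2 TB disks this gives 8, 2, 6 and 4 TB of usable space for RAID0, RAID1, RAID5 and RAID6 respectively, which illustrates how parity buys fault tolerance at a much lower space cost than full mirroring.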

18 Network Attached Storage (NAS)
- Follows a client/server design
- A NAS head acts as the interface between the NAS and network clients
- The NAS appears on the network as a single "node", identified by the IP address of the head device
- Clients access a NAS over an Ethernet connection
- NAS devices require no monitor, keyboard or mouse and run an embedded OS
- NAS uses file-based application protocols such as NFS (Network File System) and CIFS (Common Internet File System)

19 Storage Area Networks (SANs)
- An architecture that attaches remote storage devices to servers in such a way that the devices appear to the OS as locally attached
- Data is accessed in blocks
- Uses the Fibre Channel protocol to access data

20 NAS vs. SAN


