RAID Storage, Network File Systems, and DropBox


George Porter, CSE 124, February 24, 2015. (Thanks to Dave Patterson and Hong Jiang.)

Announcements
- Project 2 due by end of today
- Office hour today 2-3pm in B275
- Project 3 out

Overview
- Networked file storage is really important
  - Used in companies, businesses, and education
  - Used in cloud computing environments
  - Used by people in their daily lives
- Challenges:
  - How to access storage over the network?
  - How to keep it reliable?
- We'll start with the single-node case first

The first HDD (1956)
- IBM 305 RAMAC
- 4 MB across 50 24-inch disks
- 1200 RPM, 100 ms access time
- $35k/year rent
- Included computer & accounting software (tubes, not transistors)

10 years later (photo: a disk drive cabinet about 1.6 meters tall)

Transportation of HDD

1-inch disk drive!
- 2000: IBM MicroDrive: 1.7" x 1.4" x 0.2"; 1 GB, 3600 RPM, 5 MB/s, 15 ms seek
  - For digital cameras, PalmPC?
- 2006: MicroDrive: 8 GB, 50 MB/s!

Inside an HDD (today)

Data access on an HDD
- Access Time = Seek Time + Rotational Delay + Transfer Time
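
The formula above can be checked with a quick calculation. The drive parameters below (7200 RPM, 4 ms seek, 100 MB/s transfer) are illustrative, not from the lecture:

```python
def access_time_ms(seek_ms, rpm, transfer_mb_s, request_kb):
    """Access time = seek time + average rotational delay + transfer time."""
    rotational_ms = 0.5 * 60_000 / rpm            # wait half a revolution on average
    transfer_ms = request_kb / 1024 / transfer_mb_s * 1000
    return seek_ms + rotational_ms + transfer_ms

# A hypothetical 7200 RPM drive: 4 ms seek, 100 MB/s transfer, 64 KB request.
print(round(access_time_ms(4.0, 7200, 100.0, 64), 2))  # 8.79 (ms)
```

Note that for small requests the seek and rotational terms dominate: the 64 KB transfer itself takes well under a millisecond.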

Redundant Array of Inexpensive Disks (RAID): 1987-1993
- UC Berkeley, Randy Katz and David Patterson: use many PC disks to build better storage?
- RAID I: built on the 1st SPARC, 28 disks
- RAID II: custom hardware, 144 disks
- Today, RAID is a ~$25B industry
- RAID students joined industry and academia, and started their own companies (VMware, Panasas)

The RAID paper
- D. A. Patterson, G. Gibson, and R. H. Katz, "A case for redundant arrays of inexpensive disks (RAID)," in Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data (SIGMOD '88), 1988, vol. 17, no. 3, pp. 109-116.
- One of the important publications in computer science: http://en.wikipedia.org/wiki/List_of_important_publications_in_computer_science
- EMC, HP, IBM, and NetApp have produced many RAID-related storage products.

Better Storage? Capacity? Performance? Availability?

RAID introduction
- A RAID is a Redundant Array of Inexpensive Disks. In industry, the "I" is for Independent.
- The alternative is a SLED: a Single Large Expensive Disk.
- Disks are small and cheap, so it's easy to put lots of disks (10s to 100s) in one box for increased storage, performance, and availability.
- A RAID box with a RAID controller looks just like a SLED to the computer.
- Data plus some redundant information is striped across the disks in some way.
- How that striping is done is key to performance and reliability: the different RAID levels 0-5, 6.

RAID0
- Level 0 is a non-redundant disk array
- Files are striped across disks; no redundant info
- High read throughput
- Best write throughput (no redundant info to write)
- Any disk failure results in data loss; reliability is worse than a SLED
(Diagram: stripes 0-11 laid out round-robin across the data disks)
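
A minimal sketch of how striping places blocks. The round-robin layout matches the diagram's stripe order; the function name is my own:

```python
def raid0_location(block, num_disks):
    """Map a logical block number to (disk, offset) by round-robin striping."""
    return block % num_disks, block // num_disks

# Stripes 0-5 across 3 disks: disks 0,1,2 at offset 0, then 0,1,2 at offset 1.
print([raid0_location(b, 3) for b in range(6)])
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Because consecutive blocks land on different disks, a large sequential read can be serviced by all disks in parallel, which is where the throughput gain comes from.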

Array Reliability
- Reliability of N disks = reliability of 1 disk / N
  - 50,000 hours / 70 disks = ~700 hours
- Disk system MTTF drops from 6 years to 1 month!
- Arrays without redundancy are too unreliable to be useful!
- Hot spares support reconstruction in parallel with access: very high media availability can be achieved
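
The slide's arithmetic, as a sketch (assuming independent failures, so array MTTF is roughly the single-disk MTTF divided by N):

```python
def array_mttf_hours(disk_mttf_hours, num_disks):
    """Approximate MTTF of an unprotected array, assuming independent failures."""
    return disk_mttf_hours / num_disks

mttf = array_mttf_hours(50_000, 70)   # the slide's 70-disk example
print(round(mttf))                    # 714 (hours)
print(round(mttf / (24 * 30), 1))     # 1.0 (about one month)
```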

RAID1
- Mirrored disks: data is written to two places
- On failure, just use the surviving disk
- On read, choose the faster disk to read from
- Write performance is the same as a single drive; read performance is 2x better
- Expensive
(Diagram: stripes 0-11 on the data disks, duplicated on the mirror copies)

RAID4
- Block-level parity with stripes
- A read accesses the appropriate data disk
- A write accesses all data disks plus the parity disk. Why?
- Heavy load on the parity disk
(Diagram: stripes 0-11 on the data disks; parity blocks P0-3, P4-7, P8-11 on a dedicated parity disk)

RAID5
- Block-interleaved distributed parity
- Like the parity scheme of RAID4, but distributes the parity info (as well as the data) over all disks
- Better read performance; better large-write performance
- What happens when a single disk fails?
(Diagram: stripes and parity blocks P0-3, P4-7, P8-11 rotated across all disks)

Problems of disk arrays: small writes
- RAID-5 small-write algorithm: 1 logical write = 2 physical reads + 2 physical writes
  1. Read old data D0
  2. Read old parity P
  - Compute new parity P' = (D0 XOR D0') XOR P
  3. Write new data D0'
  4. Write new parity P'
- The other data disks (D1, D2, D3) are not touched
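
The parity update above can be sketched with XOR over byte strings. Block names follow the slide; the code and example values are illustrative:

```python
from functools import reduce

def xor(*blocks):
    """Bytewise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

def raid5_small_write(old_d0, new_d0, old_parity):
    """One logical write = 2 reads (old data, old parity) + 2 writes.
    New parity P' = (D0 XOR D0') XOR P; D1-D3 are never touched."""
    return new_d0, xor(old_d0, new_d0, old_parity)

# A stripe of three data blocks and their parity.
d0, d1, d2 = b"\x0f", b"\xf0", b"\x33"
p = xor(d0, d1, d2)

new_d0, new_p = raid5_small_write(d0, b"\xaa", p)
assert new_p == xor(b"\xaa", d1, d2)  # parity still covers the whole stripe
```

The point of the trick is that the new parity is computed without reading D1-D3: XORing out the old data and XORing in the new data has the same effect as recomputing parity over the full stripe.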

RAID6
- Level 5 with an extra parity block (Q) per stripe
- Can tolerate two failures
- What are the odds of having two concurrent failures?
- May outperform Level 5 on reads, but is slower on writes
(Diagram: stripes plus parity blocks P and Q rotated across all disks)

Comparison of RAIDs

RAID Level | Capacity  | Storage Efficiency | Availability | Ran. Read | Ran. Write | Seq. Read | Seq. Write
0          | S * N     | 100%               | *            | ****      | ****       | ****      | ****
1          | S * N/2   | 50%                | ****         | ***       | ***        | **        | **
4          | S * (N-1) | (N-1) / N          | ***          | ****      | **         | ****      | **
5          | S * (N-1) | (N-1) / N          | ***          | ****      | **         | ****      | ***
6          | S * (N-2) | (N-2) / N          | ****         | ****      | *          | ****      | **

Note: S indicates the capacity of a single disk; N indicates the number of disks in a RAID set.

Distributed File Systems

Distributed File Systems
- Goal: transparent access to remote files, i.e., access remote files as if they were stored on the local hard drive
- Why would you want this? What are some of the hard issues?
- Examples:
  - NFS: Sun's Network File System
  - AFS: Andrew File System
  - Coda: CMU research project for mobile clients (now available in Linux)
  - xFS: Berkeley research project stressing serverless design

Distributed File Systems: Motivation
- Centralized administration
  - E.g., upgrades, backups, additional storage
- Same file system independent of physical machine
  - Important distributed-system mantra: location independence
- Incremental scalability
  - Do not give everyone a 20 GB disk if the average user needs 1 GB
  - Add disks to a central server rather than to desktops

Distributed File System Issues
- Semantic transparency and performance transparency:
- Naming: do not change file names when moving from machine to machine
- Caching: approximate local performance
- Availability: what happens when the remote server crashes (fate sharing)?
- Security: protect sensitive information
- Scale: how large can the system grow, in terms of storage and user base?

Simplified Access Model Example
(Diagram: an application issues read("/project/file") into a buffer; the client kernel's vnode layer forwards the request over NFS RPC to the server kernel, whose vnode layer reads /local/a/file from the local file system on the local disk.)

Performance How to make distributed file access approximate the performance of local file access?

Performance
- Network latency and limited bandwidth make it difficult to match local performance
- But network bandwidth is surpassing disk bandwidth
  - Storage area networks, iSCSI
- How to make distributed file access approximate the performance of local file access?
  - Caching: take advantage of locality, both spatial and temporal
  - What issues are introduced by caching?

Distributed File System Structure
(Diagram: a local namespace rooted at /, with subtrees local, project, and home; project contains proj1 and proj2, home contains usr1)
- Perform a mount operation to attach a remote file system into the local namespace
- E.g., /project/proj1 is actually a file on a remote machine (it maps to server.cs.ucsd.edu:/local/a/project/proj1)
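
The mount mapping can be sketched as a prefix lookup. The mount table contents mirror the slide's example; the code itself is illustrative:

```python
# Local mount points -> remote exports (the slide's example mapping).
MOUNTS = {"/project": "server.cs.ucsd.edu:/local/a/project"}

def resolve(path):
    """Translate a local path into its remote server:export path when it
    lies under a mount point; otherwise it stays on the local file system."""
    for prefix, remote in MOUNTS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return remote + path[len(prefix):]
    return path

print(resolve("/project/proj1"))  # server.cs.ucsd.edu:/local/a/project/proj1
print(resolve("/local/tmp"))      # /local/tmp  (served locally)
```

This is the "location independence" from the motivation slide: applications see one namespace, and only the mount table knows which subtrees are remote.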

UNIX File Usage
- Most files are small (< 10 KB)
- Reads outnumber writes (~6:1)
- Sequential access is common
- Files remain open for a short period of time: 75% < 0.5 s, 90% < 10 s
- Most files are accessed by exactly one user
- Most shared files are written by exactly one user
- Temporal locality: recently accessed files are likely to be accessed again in the near future
- Most bytes/files are short-lived

Building a Distributed File System
- Debate in the late 1980s and early 1990s: stateless vs. stateful file server
- NFS: stateless server
  - Only stores the contents of files, plus soft state (for performance)
  - Crash recovery is a simple operation
  - All RPCs are idempotent (no state); at-least-once RPC semantics are sufficient
  - Server is unaware of users accessing files
  - Clients have to check with the server periodically for the uncommon case where a directory/file has been modified

Server Caching
- Cache read results, writes, and directory operations
- Write-through vs. write-back cache? Pros/cons?

NFS Server Caching
- Cache read results, writes, and directory operations
- Write-through cache vs. write-back cache?
- Write-through: each update is written to disk immediately
  - When the write operation returns, the client is guaranteed a stable update
  - Pros: stateless (easy to implement); no data lost on a crash
  - Cons: slow, because the client must wait for the disk write

NFS Client Caching
- Clients cache reads, writes, and directory ops
- What if multiple people update the same file at the same time? Consistency problems
- NFS approach:
  - Server maintains a last-modification time per file
  - Client remembers the time it initially retrieved the data
  - On file access, the client checks its timestamp against the server's (every 3-30 seconds)
- Lots of unnecessary timestamp checking
- How long to set the timeout? What is the tradeoff?
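
The timestamp-validation scheme above can be sketched as follows. The class names, the fake server, and the 3-second default are all illustrative, not the actual NFS implementation:

```python
import time

class FakeServer:
    """Stand-in for an NFS server: files map path -> (data, mtime)."""
    def __init__(self):
        self.files = {"/f": (b"v1", 1)}
    def mtime(self, path):
        return self.files[path][1]
    def fetch(self, path):
        return self.files[path]

class ClientCache:
    """NFS-style client caching: a cached entry is trusted for `timeout`
    seconds, then revalidated against the server's last-modification time."""
    def __init__(self, server, timeout=3.0):
        self.server, self.timeout = server, timeout
        self.cache = {}  # path -> (data, server_mtime, last_checked)

    def read(self, path, now=None):
        now = time.monotonic() if now is None else now
        if path in self.cache:
            data, mtime, checked = self.cache[path]
            if now - checked < self.timeout:
                return data                       # trust the cache, no RPC
            if self.server.mtime(path) == mtime:  # cheap revalidation RPC
                self.cache[path] = (data, mtime, now)
                return data
        data, mtime = self.server.fetch(path)     # (re)fetch the full data
        self.cache[path] = (data, mtime, now)
        return data

srv = FakeServer()
c = ClientCache(srv)
assert c.read("/f", now=0.0) == b"v1"
srv.files["/f"] = (b"v2", 2)
assert c.read("/f", now=1.0) == b"v1"  # within timeout: possibly stale read
assert c.read("/f", now=5.0) == b"v2"  # timeout expired: revalidated, refetched
```

The middle assertion shows the tradeoff the slide asks about: a longer timeout means fewer revalidation RPCs but a longer window in which clients can read stale data.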

NFS Replication
- As originally specified, NFS did not support data replication
- More recent versions of NFS support replication via a mechanism called the Automounter
  - Allows remote mount points to be specified using a set of servers
  - However, modifications must be propagated to replicas manually
  - Intended primarily for read-only files
(Slide credit: Hong Ge)

NFS Security
- NFS uses the underlying Unix file protection on servers for access checks
- In early NFS, mutual trust was assumed among all participating machines
  - User identity was determined by the client machine and accepted without further server validation
- More recent versions of NFS use DES-based mutual authentication to provide a higher level of security
- File data in RPC packets is not encrypted, so NFS is still vulnerable