Storing Data: Disks and Files



Similar documents
Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

Storage in Database Systems. CMPSCI 445 Fall 2010

Database Management Systems

Lecture 36: Chapter 6

How To Write A Disk Array

Data Storage - II: Efficient Usage & Errors

Chapter 9: Peripheral Devices: Magnetic Disks

RAID Overview

Price/performance Modern Memory Hierarchy

RAID. Storage-centric computing, cloud computing. Benefits:

HARD DRIVE CHARACTERISTICS REFRESHER

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

Definition of RAID Levels

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Physical Data Organization

Chapter 2 Data Storage

Sistemas Operativos: Input/Output Disks

Storage and File Structure

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

RAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead

CS 153 Design of Operating Systems Spring 2015

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

1 Storage Devices Summary

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

Disk Storage & Dependability

Chapter 6 External Memory. Dr. Mohamed H. Al-Meer

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Why disk arrays? CPUs improving faster than disks

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

What is RAID? data reliability with performance

Outline. Principles of Database Management Systems. Memory Hierarchy: Capacities and access times. CPU vs. Disk Speed

6. Storage and File Structures

Filing Systems. Filing Systems

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

Chapter 10: Mass-Storage Systems

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

CS 6290 I/O and Storage. Milos Prvulovic

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Reliability and Fault Tolerance in Storage

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

Striped Set, Advantages and Disadvantages of Using RAID

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

CS161: Operating Systems

William Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

An Introduction to RAID. Giovanni Stracquadanio

How To Improve Performance On A Single Chip Computer

File System Design and Implementation

Introduction to I/O and Disk Management

William Stallings Computer Organization and Architecture 8 th Edition. External Memory

University of Dublin Trinity College. Storage Hardware.

HP Smart Array Controllers and basic RAID performance factors

Storage. The text highlighted in green in these slides contain external hyperlinks. 1 / 14

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

CS420: Operating Systems

Fault Tolerance & Reliability CDA Chapter 3 RAID & Sample Commercial FT Systems

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

Classification of Physical Storage Media. Chapter 11: Storage and File Structure. Physical Storage Media (Cont.) Physical Storage Media

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing

Lecture 1: Data Storage & Index

System Architecture. CS143: Disks and Files. Magnetic disk vs SSD. Structure of a Platter CPU. Disk Controller...

Data Storage Solutions

Hard Disk Drives and RAID

Oracle Database 10g: Performance Tuning 12-1

Timing of a Disk I/O Transfer

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

Chapter 10: Storage and File Structure

Chapter 11 I/O Management and Disk Scheduling

Data Storage - I: Memory Hierarchies & Disks

RAID Basics Training Guide

W4118: RAID. Instructor: Junfeng Yang

PIONEER RESEARCH & DEVELOPMENT GROUP

Database 2 Lecture I. Alessandro Artale

Outline. CS 245: Database System Principles. Notes 02: Hardware. Hardware DBMS Data Storage

RAID Overview: Identifying What RAID Levels Best Meet Customer Needs. Diamond Series RAID Storage Array

Intel RAID Controllers

RAID Performance Analysis

NAND Flash Media Management Through RAIN

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Chapter 12: Mass-Storage Systems

Lecture 18: Reliable Storage

How To Understand And Understand The Power Of Aird 6 On Clariion

DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

How To Create A Multi Disk Raid

Review. Lecture 21: Reliable, High Performance Storage. Overview. Basic Disk & File System properties CSC 468 / CSC /23/2006

CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC

CSE 120 Principles of Operating Systems

Transcription:

Storing Data: Disks and Files (From Chapter 9 of textbook) Storing and Retrieving Data Database Management Systems need to: Store large volumes of data Store data reliably (so that data is not lost!) Retrieve data efficiently Alternatives for storage Main memory Disks Tape Why Not Store Everything in Main Memory? Costs too much. Why Not Store Everything in Tapes? No random access. Main memory is volatile. Slow! 1

Disks Secondary storage device of choice Solution 1: Techniques for making disks faster Intelligent data layout on disk Main problem? Redundant Array of Inexpensive Disks (RAID) Solution 2: Buffer Management Outline Keep currently used data in main memory Typical (simplified) storage hierarchy: Disk technology and how to make disk read/writes faster Buffer management Storing database files on disk 2

Components of a Disk Disk head Spindle Tracks Accessing a Disk Page Time to access (read/write) a disk block: Seek time: 1 to 20msec v Only one head reads/writes at any one time. Sector Rotational delay: 0 to 10msec Transfer rate: ~ 1msec per 4KB page v Block size is a multiple of sector size (which is fixed). Arm movement Platters Key to lower I/O cost: reduce seek/rotation delays! Arm assembly Arranging Pages on Disk `Next block concept: Blocks in a file should be arranged sequentially on disk (by `next ), to minimize seek and rotational delay. In-Class Exercise Consider a disk with: average seek time of 15 milliseconds average rotational delay of 6 milliseconds transfer time of 0.5 milliseconds/page Page size = 1024 bytes Table: 200,000 rows of 100 bytes each, no row spans 2 pages Find: Number of pages needed to store the table Time to read all rows sequentially Time to read all rows in some random order 3

In-Class Exercise Solution RAID (Redundant Array of Independent Disks) Disk Array: Arrangement of several disks that gives abstraction of a single, large disk. Goals: Increase performance and reliability. Two main techniques: Data striping Redundancy Parity Add 1 redundant block for every n blocks of data XOR of the n blocks Example: D1, D2, D3, D4 are data blocks Compute DP as D1 XOR D2 XOR D3 XOR D4 Store D1, D2, D3, D4, DP on different disks Can recover any one of them from the other four by XORing them RAID Levels Level 0: No redundancy Striping without parity Level 1: Mirrored (two identical copies) Each disk has a mirror image (check disk) Parallel access: reduces positioning time, but transfer only from one disk. Maximum transfer rate = transfer rate of one disk Write involves two disks. 4

RAID Levels (Contd.) Level 0+1: Striping and Mirroring Parallel reads. Write involves two disks. Maximum transfer rate = aggregate bandwidth Combines performance of RAID 0 with redundancy of RAID 1. Example: 8 disks Divide into two sets of 4 disks Each set is a RAID 0 array One set mirrors the other RAID Levels (Contd.) Level 3: Bit-Interleaved Parity Striping Unit: One bit. One check disk. Each read and write request involves all disks; disk array can process one request at a time. RAID Levels (Contd.) Level 4: Block-Interleaved Parity Striping Unit: One disk block. One check disk. Parallel reads possible for small requests, large requests can utilize full bandwidth Writes involve modified block and check disk RAID Levels (Contd.) Level 5: Block-Interleaved Distributed Parity Similar to RAID Level 4, but parity blocks are distributed over all disks Eliminates check disk bottleneck, one more disk for higher read parallelism 5

In-Class Exercise In-Class Exercise Solution How does the striping granularity (size of a stripe) affect performance, e.g., RAID 3 vs. RAID 4? Which RAID to Choose? RAID 0: great performance at low cost, limited reliability RAID 0+1 (better than 1): small storage subsytems (cost of mirroring limited), or when write performance matters RAID 3 (better than 2): large transfer requests of contiguous blocks, bad for small requests of single blocks RAID 5 (better than 4): good general-purpose solution Which RAID to Choose? Corrected. RAID 0: great performance at low cost, limited reliability RAID 0+1 (better than 1): small storage subsytems (cost of mirroring limited), or when write performance matters RAID 5 (better than 3, 4): good generalpurpose solution 6

Disk Space Management Lowest layer of DBMS software manages space on disk. Higher levels call upon this layer to: allocate/de-allocate a page read/write a page Request for a sequence of pages must be satisfied by allocating the pages sequentially on disk! Higher levels don t need to know how this is done, or how free space is managed. Structure of a DBMS Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery 7