Outline. CS 245: Database System Principles. Notes 02: Hardware. Hardware DBMS ... ... Data Storage



Similar documents
Outline. Principles of Database Management Systems. Memory Hierarchy: Capacities and access times. CPU vs. Disk Speed

System Architecture. CS143: Disks and Files. Magnetic disk vs SSD. Structure of a Platter CPU. Disk Controller...

Price/performance Modern Memory Hierarchy

Solid State Drive Architecture

File System Management

Sistemas Operativos: Input/Output Disks

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Introduction to I/O and Disk Management

William Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory

Disk Storage & Dependability

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Data Storage - I: Memory Hierarchies & Disks

HARD DRIVE CHARACTERISTICS REFRESHER

CS 6290 I/O and Storage. Milos Prvulovic

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

Data Storage - II: Efficient Usage & Errors

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

Storing Data: Disks and Files

William Stallings Computer Organization and Architecture 8 th Edition. External Memory

Writing Assignment #2 due Today (5:00pm) - Post on your CSC101 webpage - Ask if you have questions! Lab #2 Today. Quiz #1 Tomorrow (Lectures 1-7)

Mass Storage Structure

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

Operating System Concepts. Operating System 資 訊 工 程 學 系 袁 賢 銘 老 師

Chapter 10: Mass-Storage Systems

RAID Performance Analysis

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

Best practices for Implementing Lotus Domino in a Storage Area Network (SAN) Environment

1 Storage Devices Summary

Chapter 8. Secondary Storage. McGraw-Hill/Irwin. Copyright 2008 by The McGraw-Hill Companies, Inc. All rights reserved.

Physical Data Organization

CSCA0201 FUNDAMENTALS OF COMPUTING. Chapter 5 Storage Devices

Lecture 16: Storage Devices

University of Dublin Trinity College. Storage Hardware.

Chapter 12: Secondary-Storage Structure

Tech Application Chapter 3 STUDY GUIDE

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

With respect to the way of data access we can classify memories as:

Types Of Storage Device

Using Synology SSD Technology to Enhance System Performance Synology Inc.

CSCA0102 IT & Business Applications. Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global

Managing Storage Space in a Flash and Disk Hybrid Storage System

N /150/151/160 RAID Controller. N MegaRAID CacheCade. Feature Overview

Storage in Database Systems. CMPSCI 445 Fall 2010

The IntelliMagic White Paper: Green Storage: Reduce Power not Performance. December 2010

Database Management Systems

Evaluation Report: Accelerating SQL Server Database Performance with the Lenovo Storage S3200 SAN Array

Accelerating Server Storage Performance on Lenovo ThinkServer

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Data Distribution Algorithms for Reliable. Reliable Parallel Storage on Flash Memories

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

Main Memory & Backing Store. Main memory backing storage devices

Chapter 12: Mass-Storage Systems

Nasir Memon Polytechnic Institute of NYU

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011

WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE

Q & A From Hitachi Data Systems WebTech Presentation:

Difference between Enterprise SATA HDDs and Desktop HDDs. Difference between Enterprise Class HDD & Desktop HDD

McGraw-Hill Technology Education McGraw-Hill Technology Education

Intel RAID Controllers

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

Indexing on Solid State Drives based on Flash Memory

Chapter 7 Types of Storage. Discovering Computers Your Interactive Guide to the Digital World

Measuring Interface Latencies for SAS, Fibre Channel and iscsi

DELL TM PowerEdge TM T Mailbox Resiliency Exchange 2010 Storage Solution

Chapter 9: Peripheral Devices: Magnetic Disks

Input/output (I/O) I/O devices. Performance aspects. CS/COE1541: Intro. to Computer Architecture. Input/output subsystem.

Quiz for Chapter 6 Storage and Other I/O Topics 3.10

Speeding Up Cloud/Server Applications Using Flash Memory

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

EXTERNAL STORAGE WHAT IS IT? WHY USE IT? WHAT TYPES ARE AVAILBLE? MY CLOUD OR THEIR CLOUD? PERSONAL CLOUD SETUP COMMENTS ABOUT APPLE PRODUCTS

Record Storage and Primary File Organization

A+ Guide to Managing and Maintaining Your PC, 7e. Chapter 1 Introducing Hardware

Hardware Configuration Guide

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Chapter 7. Disk subsystem

6. Storage and File Structures

Caching Mechanisms for Mobile and IOT Devices

How To Improve Performance On A Single Chip Computer

Database Hardware Selection Guidelines

Optimizing SQL Server Storage Performance with the PowerEdge R720

Storage Class Memory and the data center of the future

An Overview of Flash Storage for Databases

NAND Flash Architecture and Specification Trends

LSI MegaRAID FastPath Performance Evaluation in a Web Server Environment

To understand how data is processed, by a computer, we can draw a simple analogy between computers and humans.

Deep Dive: Maximizing EC2 & EBS Performance

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

Performance Report Modular RAID for PRIMERGY

& Data Processing 2. Exercise 2: File Systems. Dipl.-Ing. Bogdan Marin. Universität Duisburg-Essen

Transcription:

CS 245: Database System Principles Notes 02: Hardware Hector Garcia-Molina Outline Hardware: Disks Access Times Solid State Drives Optimizations Other Topics: Storage costs Using secondary storage Disk failures CS 245 Notes 2 1 CS 245 Notes 2 2 Hardware P Typical Computer DBMS... M C... Data Storage Secondary Storage CS 245 Notes 2 3 CS 245 Notes 2 4 Secondary storage Many flavors: - Disk: Floppy (hard, soft) Removable Packs Winchester SSD disks Optical, CD-ROM Arrays - Tape Reel, cartridge Robots CS 245 Notes 2 5 Focus on: Typical Disk Terms: Platter, Head, Actuator Cylinder, Track Sector (physical), Block (logical), Gap CS 245 Notes 2 6 1

Top View Disk Access Time I want block X? block x in memory CS 245 Notes 2 7 CS 245 Notes 2 8 Time = Seek Time + Rotational Delay + Transfer Time + Other Seek Time Time 3 or 5x x 1 N Cylinders Traveled CS 245 Notes 2 9 CS 245 Notes 2 10 Average Random Seek Time Average Random Seek Time S = SEEKTIME (i j) i=1 j=1 j i S = SEEKTIME (i j) i=1 j=1 j i N N N N N(N-1) N(N-1) CS 245 Notes 2 11 CS 245 Notes 2 12 2

Typical Seek Time Ranges from 4ms for high end drives 15ms for mobile devices Typical SSD: ranges from 0.08ms 0.16ms Rotational Delay Head Here Source: Wikipedia, "Hard disk drive performance characteristics" Block I Want CS 245 Notes 2 13 CS 245 Notes 2 14 Average Rotational Delay Typical HDD figures R = 1/2 revolution HDD Spindle [rpm] Average rotational latency [ms] 4,200 7.14 5,400 5.56 7,200 4.17 10,000 3.00 15,000 2.00 R=0 for SSDs Source: Wikipedia, "Hard disk drive performance characteristics" CS 245 Notes 2 15 Transfer Rate: t value of t ranges from up to 1000 Mbit/sec 432 Mbit/sec 12x Blu-Ray disk 1.23 Mbits/sec 1x CD for SSDs, limited by interface e.g., SATA 3000 Mbit/s transfer time: block size t CS 245 Notes 2 16 Other Delays CPU time to issue I/O Contention for controller Contention for bus, memory Other Delays CPU time to issue I/O Contention for controller Contention for bus, memory Typical Value: 0 CS 245 Notes 2 17 CS 245 Notes 2 18 3

So far: Random Block Access What about: Reading Next block? If we do things right (e.g., Double Buffer, Stagger Blocks ) Time to get = Block Size + Negligible block t - skip gap - switch track - once in a while, next cylinder CS 245 Notes 2 19 CS 245 Notes 2 20 Rule of Thumb Random I/O: Expensive Sequential I/O: Much less Cost for Writing similar to Reading. unless we want to verify! need to add (full) rotation + Block size t CS 245 Notes 2 21 CS 245 Notes 2 22 To Modify a Block? To Modify a Block? To Modify Block: (a) Read Block (b) Modify in Memory (c) Write Block [(d) Verify?] CS 245 Notes 2 23 CS 245 Notes 2 24 4

SSDs storage is block oriented (not random access) lots of errors e.g., write of one block may cause an error of nearby block e.g., a block can only be written a limited number of times logic masks most issues e.g., using log structure sequential writes improve throughput (less bookkeeping) latency for seq. writes = random writes performance seq. reads = random reads Source: Reza Sadri, STEC ("the SSD Company") interface on-device logic SSD HDD or other CS 245 Notes 2 25 SSD vs Hard Disk Comparison (from Wikipedia) Factors: start up time, random access time, read latency time, data transfer rate, read performance, fragmentation, noise, temperature control, environmental factors, installation and mounting, magnetic fields, weight and size, reliability, secure writing, cost, capacity, R/W symmetry, power consumption. CS 245 Notes 2 26 Random Access Time SSD: Typically under 0.1 ms. As data can be retrieved directly from various locations of the flash memory, access time is usually not a big performance bottleneck. Hand Drive: Ranges from 2.9 (high end server drive) to 12 ms (laptop HDD) due to the need to move the heads and wait for the data to rotate under the read/write head CS 245 Notes 2 27 Data Transfer Rate SSD: In consumer products the maximum transfer rate typically ranges from about 100 MB/s to 600 MB/s, depending on the disk. Enterprise market offers devices with multi-gigabyte per second throughput. Hard Disk: Once the head is positioned, an enterprise HDD can transfer data at about 140 MB/s. In practice transfer speeds are lower due to seeking. Data transfer rate depends also upon rotational speed, which can range from 4,200 to 15,000 rpm and also upon the track (reading from the outer tracks is faster due higher). CS 245 Notes 2 28 Reliability SSD: Reliability varies across manufacturers and models with return rates reaching 40% for specific drives. As of 2011 leading SSDs have lower return rates than mechanical drives. Many SSDs critically fail on power outages; a December 2013 survey found that only some of them are able to survive multiple power outages. Hard Disk: According to a study performed by CMU for both consumer and enterprise-grade HDDs, their average failure rate is 6 years, and life expectancy is 9 11 years. Leading SSDs have overtaken hard disks for reliability, however the risk of a sudden, catastrophic data loss can be lower for mechanical disks. CS 245 Notes 2 29 Cost and Capacity SSD: NAND flash SSDs have reached US$0.59 per GB. In 2013, SSDs were available in sizes up to 2 TB, but less costly 128 to 512 GB drives were more common. Hard Drive: HDDs cost about US$0.05 per GB for 3.5- inch and $0.10 per GB for 2.5-inch drives. In 2013, HDDs of up to 6 TB were available. CS 245 Notes 2 30 5

Kibibytes 1 kibibyte = 2 10 bytes = 1024 bytes. from Wikipedia Outline Hardware: Disks Access Times Solid State Drives Optimizations Other Topics Storage Costs Using Secondary Storage Disk Failures here CS 245 Notes 2 31 CS 245 Notes 2 32 Optimizations (in controller or O.S.) Disk Scheduling Algorithms e.g., elevator algorithm Track (or larger) Buffer Pre-fetch Arrays Mirrored Disks On Disk Cache Double Buffering Problem: Have a File» Sequence of Blocks B1, B2 Have a Program» Process B1» Process B2» Process B3... CS 245 Notes 2 33 CS 245 Notes 2 34 Single Buffer Solution (1) Read B1 Buffer (2) Process Data in Buffer (3) Read B2 Buffer (4) Process Data in Buffer... Say P = time to process/block R = time to read in 1 block n = # blocks Single buffer time = n(p+r) CS 245 Notes 2 35 CS 245 Notes 2 36 6

Double Buffering Memory: process Double Buffering process Memory: A B Disk: A B C D E F G Disk: A B C D E F G done CS 245 Notes 2 37 CS 245 Notes 2 38 Double Buffering Memory: AC process B Double Buffering process Memory: AC process B Disk: A B C D E F G done Disk: A B C D E F G done done CS 245 Notes 2 39 CS 245 Notes 2 40 Say P R P = Processing time/block R = IO time/block n = # blocks Say P R P = Processing time/block R = IO time/block n = # blocks What is processing time? What is processing time? Double buffering time = R + np Single buffering time = n(r+p) CS 245 Notes 2 41 CS 245 Notes 2 42 7

Disk Arrays RAIDs (various flavors) Block Striping Mirrored... P On Disk Cache M C... cache logically one disk cache CS 245 Notes 2 43 CS 245 Notes 2 44 Five Minute Rule THE 5 MINUTE RULE FOR TRADING MEMORY FOR DISC ACCESSES Jim Gray & Franco Putzolu May 1985 The Five Minute Rule, Ten Years Later Goetz Graefe & Jim Gray December 1997 CS 245 Notes 2 45 Five Minute Rule Say a page is accessed every X seconds CD = cost if we keep that page on disk $D = cost of disk unit I = numbers IOs that unit can perform In X seconds, unit can do XI IOs So CD = $D / XI CS 245 Notes 2 46 Five Minute Rule Say a page is accessed every X seconds CM = cost if we keep that page on RAM $M = cost of 1 MB of RAM P = numbers of pages in 1 MB RAM So CM = $M / P Five Minute Rule Say a page is accessed every X seconds If CD is smaller than CM, keep page on disk else keep in memory Break even point when CD = CM, or $D P X = I $M CS 245 Notes 2 47 CS 245 Notes 2 48 8

Using 97 Numbers P = 128 pages/mb (8KB pages) I = 64 accesses/sec/disk $D = 2000 dollars/disk (9GB + controller) $M = 15 dollars/mb of DRAM Disk Failures (Sec 2.5) Partial Total Intermittent Permanent X = 266 seconds (about 5 minutes) (did not change much from 85 to 97) CS 245 Notes 2 49 CS 245 Notes 2 50 Coping with Disk Failures Detection e.g. Checksum Correction Redundancy At what level do we cope? Single Disk e.g., Error Correcting Codes Disk Array Logical Physical CS 245 Notes 2 51 CS 245 Notes 2 52 Operating System e.g., Stable Storage Database System e.g., Logical Block Copy A Copy B Current DB Log Last week s DB CS 245 Notes 2 53 CS 245 Notes 2 54 9

Summary Secondary storage, mainly disks I/O times I/Os should be avoided, especially random ones.. Outline Hardware: Disks Access Times Example: Megatron 747 Optimizations Other Topics Storage Costs Using Secondary Storage Disk Failures here CS 245 Notes 2 55 CS 245 Notes 2 56 10