Outline. Principles of Database Management Systems. Memory Hierarchy: Capacities and access times. CPU vs. Disk Speed ... ...



Similar documents
Outline. CS 245: Database System Principles. Notes 02: Hardware. Hardware DBMS Data Storage

Chapter 2 Data Storage

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

Storing Data: Disks and Files

Data Storage - I: Memory Hierarchies & Disks

Database 2 Lecture I. Alessandro Artale

System Architecture. CS143: Disks and Files. Magnetic disk vs SSD. Structure of a Platter CPU. Disk Controller...

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Database Management Systems

Price/performance Modern Memory Hierarchy

Sistemas Operativos: Input/Output Disks

Introduction to I/O and Disk Management

Data Storage - II: Efficient Usage & Errors

Storage in Database Systems. CMPSCI 445 Fall 2010

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

CS 6290 I/O and Storage. Milos Prvulovic

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

William Stallings Computer Organization and Architecture 7 th Edition. Chapter 6 External Memory

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

CPS104 Computer Organization and Programming Lecture 18: Input-Output. Robert Wagner

1 Storage Devices Summary

The Classical Architecture. Storage 1 / 36

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

Q & A From Hitachi Data Systems WebTech Presentation:

Mass Storage Structure

Database Fundamentals

COS 318: Operating Systems. Storage Devices. Kai Li and Andy Bavier Computer Science Department Princeton University

Chapter 10: Mass-Storage Systems

WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

Disk Storage & Dependability

Lecture 16: Storage Devices

Chapter 9: Peripheral Devices: Magnetic Disks

Communicating with devices

Physical Data Organization

CSCA0102 IT & Business Applications. Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global

HARD DRIVE CHARACTERISTICS REFRESHER

Chapter 13. Disk Storage, Basic File Structures, and Hashing

lesson 1 An Overview of the Computer System

Chapter 8 Memory Units

RAID Performance Analysis

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

Performance Report Modular RAID for PRIMERGY

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

Here are my slides from lecture, along with my notes about each slide.

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Summer Student Project Report

What is RAID? data reliability with performance

University of Dublin Trinity College. Storage Hardware.

Principles of Database Management Systems. Overview. Principles of Data Layout. Topic for today. "Executive Summary": here.

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems

William Stallings Computer Organization and Architecture 8 th Edition. External Memory

Indexing on Solid State Drives based on Flash Memory

Chapter 11 I/O Management and Disk Scheduling

Outline. Failure Types

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

DATABASE. Pervasive PSQL Performance. Key Performance Features of Pervasive PSQL. Pervasive PSQL White Paper

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

File System Management

Configuring Apache Derby for Performance and Durability Olav Sandstå

RAID Overview

Main Memory & Backing Store. Main memory backing storage devices

RAID HARDWARE. On board SATA RAID controller. RAID drive caddy (hot swappable) SATA RAID controller card. Anne Watson 1

Outline. Database Management and Tuning. Overview. Hardware Tuning. Johann Gamper. Unit 12

Random-Access Memory (RAM) The Memory Hierarchy. SRAM vs DRAM Summary. Conventional DRAM Organization. Page 1

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

Flash for Databases. September 22, 2015 Peter Zaitsev Percona

Optimizing LTO Backup Performance

SQL Server Version. Supported for SC2012 RTM*** Not supported for SC2012 SP1*** SQL Server 2008 SP1, SP2, SP3

CS 525 Advanced Database Organization - Spring 2013 Mon + Wed 3:15-4:30 PM, Room: Wishnick Hall 113

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

Computer Systems Structure Main Memory Organization

Benchmarking Hadoop & HBase on Violin

Remote Copy Technology of ETERNUS6000 and ETERNUS3000 Disk Arrays

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

6. Storage and File Structures

DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

Fusionstor NAS Enterprise Server and Microsoft Windows Storage Server 2003 competitive performance comparison

Oracle Database Deployments with EMC CLARiiON AX4 Storage Systems

SIMULATION AND MODELLING OF RAID 0 SYSTEM PERFORMANCE

Solid State Drive Architecture

Computer-System Architecture

PowerVault MD1200/MD1220 Storage Solution Guide for Applications

Data Storage and Backup. Sanjay Goel School of Business University at Albany, SUNY

CSCA0201 FUNDAMENTALS OF COMPUTING. Chapter 5 Storage Devices

Record Storage and Primary File Organization

Platter. Track. Index Mark. Disk Storage. PHY 406F - Microprocessor Interfacing Techniques

1 File Management. 1.1 Naming. COMP 242 Class Notes Section 6: File Management

Transcription:

Outline Principles of Database Management Systems Pekka Kilpeläinen (after Stanford CS245 slide originals by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom) Hardware: Disks Access Times Example - Megatron 747 External sorting Other Topics: Optimizing disk usage DBMS 2001 Notes 2: Hardware 1 DBMS 2001 Notes 2: Hardware 2 P M Typical Computer C Bus Memory Hierarchy: Capacities and access times Terabytes Tertiary Storage 10-100 s P: processor M: main memory C: disk controller Secondary Storage Gigabytes Disk 10 ms = 10-2 s Megabytes Main memory (+ cache) 100 ns = 10-7 s DBMS 2001 Notes 2: Hardware 3 DBMS 2001 Notes 2: Hardware 4 CPU vs. Disk Speed CPU: 100 500 1000 MIPS Main memory access 10-6 10-9 sec. DISK read/write approx. 10 30 ms CPU executes roughly 10 6 instructions in the time of a disk access disk access times do not shrink in proportion to CPU speed-up disk I/O major DBMS concern DBMS 2001 Notes 2: Hardware 5 Typical Disk 1 n rotating platters (of two surfaces); Surfaces accessed by heads (moving together), divided into concentric tracks. Tracks under the heads at the same time form a cylinder. Tracks are divided into sectors separated by gaps. DBMS 2001 Notes 2: Hardware 6 1

Top View: tracks on a surface track gap sector Typical Numbers Diameter: 1 3.5 15 " Cylinders: 100 2000 (floppy:40) Surfaces: 1 (CD, old floppy) (=tracks/cyl) 2 (floppy, DVD) 30 Sector Size:0.5 4KB 50KB Capacity: 1.44 MB (floppy) several GBs DBMS 2001 Notes 2: Hardware 7 DBMS 2001 Notes 2: Hardware 8 Physical vs. Logical I/O Disk Access Time Sector: indivisible unit of disk I/O Blocks logical (DBMS/OS) units of disk storage transferred btw disk and main memory buffers typically 4 56 KB, consisting of one or more sectors I want block X? block X in memory DBMS 2001 Notes 2: Hardware 9 DBMS 2001 Notes 2: Hardware 10 Time = Seek Time + Rotational Delay + Transfer Time + Other Seek TimeS Time 3x 20x Real seek time x Simplified approximation 1 N Cylinders Traveled DBMS 2001 Notes 2: Hardware 11 DBMS 2001 Notes 2: Hardware 12 2

Average Random Seek TimeE(S) N N Rotational Delay R E(S) = 1/N 1/N SEEKTIME (i j) i=1 j=1 = (approx.) 1/3 SEEKTIME (1 N) Head Here Typical E(S): 3 15 ms Block I Want DBMS 2001 Notes 2: Hardware 13 DBMS 2001 Notes 2: Hardware 14 Average Rotational Delay E(R) E(R) = 1/2 x full rotation time Typical rotation speed 3600 RPM full rotation in 1/60 s E(R) = 1/120 s = 8.33 ms Transfer Rate: t typical t: 1 3 MB/second transfer time: (block size)/t Other Delays: Typical Value 0 CPU time to issue I/O Contention for controller Contention for bus, memory DBMS 2001 Notes 2: Hardware 15 DBMS 2001 Notes 2: Hardware 16 So far: Random Block Access What about: Reading Next block? If we do things right (e.g., Arrange by Cylinders, Double Buffer ) Time to get = Block Size + Negligible block t - skip gap - switch track - once in a while, next cylinder DBMS 2001 Notes 2: Hardware 17 DBMS 2001 Notes 2: Hardware 18 3

Rule of Thumb Ex: Random I/O: Expensive Sequential I/O: Much less 1 KB Block»Random I/O: 20 ms.»sequential I/O: 1 ms. Cost for Writing similar to Reading. unless we want to verify! need to add (full) rotation + Block size t DBMS 2001 Notes 2: Hardware 19 DBMS 2001 Notes 2: Hardware 20 To Modify a Block? To Modify Block: (a) Read Block (b) Modify in Memory (c) Write Block [(d) Verify?] An Example 4 platters, 8 surfaces Megatron 747 Disk (Example 2.1 in book) 2 13 = 8192 cylinders (or tracks/surface) 2 8 = 256 sectors/track 90% of track length for data, 10% for gaps 2 9 = 512 bytes/sector capacity 2 3 x 2 13 x 2 8 x 2 9 = 2 33 = 8 GB DBMS 2001 Notes 2: Hardware 21 DBMS 2001 Notes 2: Hardware 22 Timing for Megatron 747 (Ex 2.3) Time to read a 4 KB block? MIN: only transfer time: Read 8 sectors (a' 0.5 KB) and pass 7 gaps How much does the disk need to rotate? For data 8/256 x 90% = 72/2560 rotation, for gaps 7/256 x 10% = 7/2560 rotation 3840 RPM 1/64 s for one rotation time (79/2560)/64 s, or about 0.5 ms (i.e. transfer speed approx. 8 MB/s) Maximum block access time Megatron 474 head movements: 1 ms to start&stop + 1 ms/500 cylinders MAX block access time: SEEKTIME(1 8192) = 1 + 8191/500 ms = 17.4 ms full rotation: 1/64 s = 15.6 ms transfer: 0.5 ms 33.5 ms DBMS 2001 Notes 2: Hardware 23 DBMS 2001 Notes 2: Hardware 24 4

Megatron 474: Average block access time AVGblock access time: AVG SEEKTIME: Assuming random access, avg head movement is 1/3 of all tracks 1 ms + 1/3 x 8191/500 ms = 6.5 ms half rotation: 1/2 x 1/64 s = 7.8 ms transfer: 0.5 ms 14.8 ms Outline Hardware: Disks Access Times Example: Megatron 747 External sorting Other Topics Optimizing Disk Usage here DBMS 2001 Notes 2: Hardware 25 DBMS 2001 Notes 2: Hardware 26 Two-Phase Multiway Merge-Sort Sorting of files or relations is a recurrent task What if the data doesn't fit in main memory? Ex: Sort relation R of 10x10 6 100-byte tuples, using Megatron 747 disk and 50 MB main memory (M) for buffers (Ex. 2.5, 2.7-2.9, 2.14 in Book). Block size 4 KB 40 tuples/block R takes 10 7 /40 = 250,000 blocks M can hold 50 x 2 20 /2 12 = 12,800 blocks MB 4 KB DBMS 2001 Notes 2: Hardware 27 Merging Sorted Sublists is Easy Sorted file Memory Sorted sublists Always move the smallest head elemenent to the ouput buffer DBMS 2001 Notes 2: Hardware 28 One way to sort: Merge Sort (i) For each main-memory-sized chunk of R: - Read chunk - Sort in memory (say, using quicksort) - Write to disk L1 R sorted L2 chunks (sublists) Memory Ln (ii) Read all chunks + merge + write out Sorted file Memory Sorted sublists DBMS 2001 Notes 2: Hardware 29 DBMS 2001 Notes 2: Hardware 30 5

Example: Sorting R (cont.) blocks of R blocks fitting in R R fills memory 250,000/12,800=20 times (last time memory not full) Phase (i): 250,000 blocks first read, then written 500,000 block accesses. If in random order, time 500,000 x 15 ms = 125 minutes Phase (ii): Similarly 125 min; Total 250 min How Large Files Can We Sort? Let B = block size, M = size of main memory available for sorting, R = size of each record M/B buffers fit in memory; one for output Phase (ii) can merge at most (M/B - 1) sublists, each created by sorting at most M/R records M/R x (M/B - 1), roughly M 2 /(BxR) records DBMS 2001 Notes 2: Hardware 31 DBMS 2001 Notes 2: Hardware 32 M 2 /(BxR) in Practise? In our example, M = 50,000,000, B = 4096, R = 100 M 2 /(BxR) = 2500x10 12 /(4096 x 100) = 6.1 x 10 9 records, taking 0.61 terabytes of storage Pretty much! Additional merge-phases can be added to cover even larger data sets. Optimizations Arranging sequentially accessed blocks by cylinders Pre-fetching buffers Mirrored Disks DBMS 2001 Notes 2: Hardware 33 DBMS 2001 Notes 2: Hardware 34 Arranging Blocks by Cylinders Consider our merge-sort example. The blocks of R can be stored on consecutive cylinders, and the 20 sorted sublists can be written likewise Megatron 747 cylinder size 1 MB Time to fill main memory (by 50 cylinders): AVG seek 6.5 ms + 49 one-cylinder seeks a' 1 ms + transfer time for 12,800 blocks a' 0.5 ms transfer time to fill memory once 6.4 s; Seek times ignorable (if no other processes!) Arranging Blocks by Cylinders (cont.) Phase (i) of multiway merge-sort dominated by the transfer time of reading & writing a memory-full of blocks 20 times: 20 x 2 x 6.4 s = 256 s = 4.3 min (vs. previous 125 minutes!) DBMS 2001 Notes 2: Hardware 35 DBMS 2001 Notes 2: Hardware 36 6

What about Phase (ii)? In the merge phase, a new block is read when one gets exhausted blocks read in a somewhat random order But we wrote each sublist sequentially along cylinders! Wecan utilize this by using larger buffers, holding entire tracks or even cylinders To increase throughput, apply double buffering Double Buffering Problem: Have a File» Sequence of Blocks B1, B2, B3, Have a Program» Process B1» Process B2» Process B3 DBMS 2001 Notes 2: Hardware 37 DBMS 2001 Notes 2: Hardware 38 Single Buffer Solution (1) Read B1 Buffer (2) Process Data in Buffer (3) Read B2 Buffer (4) Process Data in Buffer Say P = time to process/block R = time to read in 1 block n = # blocks Single buffer time = n(p+r) DBMS 2001 Notes 2: Hardware 39 DBMS 2001 Notes 2: Hardware 40 Double Buffering process Memory: AC A process B Say R P What is processing time? P = Processing time/block R = IO time/block n = # blocks Disk: A B C D E F G done done Double buffering time = nr Single buffering time = n(r+p) DBMS 2001 Notes 2: Hardware 41 DBMS 2001 Notes 2: Hardware 42 7

Cylinder-Sized Double-Buffering Cylinder-Sized Double-Buffering (cont.) Consider implementing the merge-phase using two cylinder-sized (1 MB) buffers for each of our 20 sorted sublists Only one seek/cylinder (AVG 6.5 ms) + one full rotation for each of the 8 tracks of a cylinder (each 15.6 ms) 131.3 ms to read one buffer repeated 1000 times to read entire relation about 2.15 min to read all sublists DBMS 2001 Notes 2: Hardware 43 Writing the sorted relation can be arranged similarly Time to output the buffers for the result also 2.15 min Total time for multiway merge-sort 4.3 min (Phase i) + 4.3 min (Phase ii) = 8.6 minutes (vs. over 4 hours without these tricks!) DBMS 2001 Notes 2: Hardware 44 Mirroring Disks Two or more disks holding identical copies of data Optimizes: Can read blocks from whichever disk is faster. (Writing on each disk happens in parallel.) Enhances reliability: A single disk crash does not cause loss of data. (RAID level 1) Using redundant disks RAID (Redundant Arrays of Independent Disks) different versions, levels 1-6 Basic idea: One of the disks is redundant holding checksums for other disks contents of crashed disks can be recovered from the checksums DBMS 2001 Notes 2: Hardware 45 DBMS 2001 Notes 2: Hardware 46 Error Recovery in Databases Summary Current DB Log Last week s DB Secondary storage, mainly disks I/O times dominate DBMS processing I/Os should be avoided, especially random ones More about this later DBMS 2001 Notes 2: Hardware 47 DBMS 2001 Notes 2: Hardware 48 8