Chapter 2 Data Storage

2.1 The Memory Hierarchy

main memory, yet is essentially random-access, with relatively small differences

Figure 2.4: A typical disk

Figure 2.6: Schematic of a simple computer system

Rotation Speed of the Disk Assembly. 5400 RPM, i.e., one rotation every 11 milliseconds, is common, although higher and lower speeds are found.

Number of Platters per Unit. A typical disk drive has about five platters and therefore ten surfaces. However, the common diskette ("floppy" disk) and "zip" disk have a single platter with two surfaces, and disk drives with
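The rotation-speed figure quoted above (5400 RPM, about 11 milliseconds per rotation) is simple arithmetic. A minimal sketch follows; the function names are illustrative and not from the text, and the half-rotation average assumes the desired sector is equally likely to be anywhere around the track.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds (60,000 ms per minute)."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector at a random position: half a rotation.

    Assumption: the target sector is uniformly distributed around the track.
    """
    return rotation_time_ms(rpm) / 2


print(f"{rotation_time_ms(5400):.2f} ms per rotation")                # 11.11 ms per rotation
print(f"{average_rotational_latency_ms(5400):.2f} ms average wait")   # 5.56 ms average wait
```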

2.2 Disks

b) The sectors containing the block move under the disk head as the entire disk assembly rotates.

The time taken between the moment at which the command to read a block is issued and the time that the contents of the block appear in main memory is called the latency of the disk. It can be broken into the following components:

1. The time taken by the processor and disk controller to process the request, usually a fraction of a millisecond, which we shall neglect. We shall also neglect time due to contention for the disk controller (some other process might be reading or writing the disk at the same time) and other delays due to contention, such as for the bus.

2. The time to position the head assembly at the proper cylinder. This time, called seek time, can be 0 if the heads happen already to be at the proper cylinder. If not, then the heads require some minimum time to start moving and stop again, plus additional time that is roughly proportional to the distance traveled. Typical minimum times, the time to start, move by one track, and stop, are a few milliseconds, while maximum times to
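The seek-time rule in component 2 (zero if the heads are already in place, otherwise a fixed start/stop cost plus a term roughly proportional to distance) can be sketched as below. The two constants are hypothetical placeholders chosen only for illustration; the text gives no exact values, and real drives differ.

```python
def seek_time_ms(
    cylinders_moved: int,
    start_stop_ms: float = 1.0,      # hypothetical fixed cost to start and stop the heads
    ms_per_cylinder: float = 0.001,  # hypothetical cost per cylinder traveled
) -> float:
    """Approximate seek time under a linear model (an assumption, not a drive spec)."""
    if cylinders_moved < 0:
        raise ValueError("distance must be non-negative")
    if cylinders_moved == 0:
        # Heads are already at the proper cylinder: no seek is needed.
        return 0.0
    return start_stop_ms + ms_per_cylinder * cylinders_moved


for distance in (0, 1, 1000, 8000):
    print(distance, seek_time_ms(distance))
```

Under this model the cost of a one-cylinder move is dominated by the start/stop term, which is why short seeks are far from free.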

proper sector(s) to rotate under the head, but, instead of reading the data

2.3 Using Secondary Storage Effectively

of computation is often called the "RAM model" or random-access model of computation. However, when implementing a DBMS, one must assume that the data does not fit into main memory. One must therefore take into account the use of secondary, and perhaps even tertiary storage in designing efficient

Example 2.4: Suppose our database has a relation R and a query asks for the tuple of R that has a certain key value k. As we shall see, it is quite desirable

fastest. Moreover, we would use a strategy where we sort only the key fields with attached pointers to the full records. Only when the keys and their pointers were in sorted order, would we use the pointers to bring every record to its proper position. Unfortunately, these ideas do not work very well when secondary memory
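The key-and-pointer strategy described above, for data that fits in main memory, might be sketched as follows. This is only an illustration: list indices stand in for pointers, and the function name and sample records are invented for the example.

```python
def sort_by_key_pointers(records, key):
    """Sort records by sorting (key, pointer) pairs first.

    Here a "pointer" is simply the record's index in the input list.
    Only after the pairs are in order are the full records moved.
    """
    # Step 1: extract the key fields with attached pointers.
    pairs = [(key(rec), i) for i, rec in enumerate(records)]
    # Step 2: sort the small (key, pointer) pairs, not the full records.
    pairs.sort()
    # Step 3: follow each pointer to bring every record to its proper position.
    return [records[i] for _, i in pairs]


# Hypothetical records: (key, payload).
records = [(30, "carol"), (10, "alice"), (20, "bob")]
print(sort_by_key_pointers(records, key=lambda rec: rec[0]))
# [(10, 'alice'), (20, 'bob'), (30, 'carol')]
```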

Figure 2.11: Main-memory organization for multiway merging

queues" that takes time proportional to the logarithm of the number of sublists to find the smallest element.

2. Move the smallest element to the first available position of the output block.
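The merging step above, finding the smallest remaining element among the sublists in logarithmic time and moving it to the output, can be sketched with a binary heap used as the priority queue. This is a simplified in-memory illustration: it treats each sorted sublist as a Python list and ignores the block-at-a-time reading and writing that a disk-based merge would need.

```python
import heapq


def multiway_merge(sublists):
    """Merge already-sorted sublists into one sorted list using a priority queue."""
    # Each heap entry is (value, sublist index, position within that sublist).
    heap = [(sub[0], i, 0) for i, sub in enumerate(sublists) if sub]
    heapq.heapify(heap)

    output = []
    while heap:
        # Find the smallest element: O(log k) for k sublists.
        value, i, pos = heapq.heappop(heap)
        # Move it to the next available position of the output.
        output.append(value)
        # Refill the queue from the sublist the element came from, if any remain.
        if pos + 1 < len(sublists[i]):
            heapq.heappush(heap, (sublists[i][pos + 1], i, pos + 1))
    return output


print(multiway_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```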

How Big Should Blocks Be?

* c) The size of blocks is doubled, to 8192 bytes (again, as throughout this exercise, all other parameters are unchanged).

d) The size of available main memory is doubled to 100 megabytes.

2.4 Improving the Access Time of Secondary Storage

assumption may be appropriate for a system that is executing a large number

2.4.1 Organizing Data by Cylinders

Since seek time represents about half the average time to access a block, there

random, data-dependent way. If the core algorithm of phase 2 the selection

Cylinder of Request    First time available
1000                   0
3000                   0
7000                   0
2000                   20
8000                   30
5000                   40

Waiting for the Last of Two Blocks

Suppose there are two blocks at random positions around a cylinder. Let x1 and x2 be the positions, in fractions of the full circle, so these are
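The expected position of the later of the two blocks can be worked out directly. The derivation below is a sketch under the assumption that the two positions, written $x_1$ and $x_2$, are independent and uniformly distributed on $[0, 1)$; that assumption is supplied here and is not spelled out in the surviving text.

```latex
% Assumption: x_1, x_2 independent and uniform on [0,1).
\begin{align*}
\Pr[\max(x_1, x_2) \le x] &= \Pr[x_1 \le x]\,\Pr[x_2 \le x] = x^2, \\
f_{\max}(x) &= \frac{d}{dx}\,x^2 = 2x, \\
\mathrm{E}[\max(x_1, x_2)] &= \int_0^1 x \cdot 2x \, dx = \frac{2}{3}.
\end{align*}
```

Under that assumption, the heads must wait on average two-thirds of a rotation before the later of the two blocks has passed under them.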

Cylinder of Request    First time available
1000                   6
6000                   1
500                    10
5000                   20

2.5 Disk Failures

2.5.5 Exercises for Section 2.5

Exercise 2.5.1: Compute the parity bit for the following bit sequences:

* a) 00111011.

b) 00000000.

c) 10101101.

2.6 Recovery from Disk Crashes

on disk 2, to get 01100110. That tells us we must change positions 2, 3, 6, and

We shall see shortly that the particular choice of bits in this matrix gives us a simple rule by which we can recover from two simultaneous disk crashes.

Reading

We may read data from any data disk normally. The redundant disks can be ignored.

Writing

The idea is similar to the writing strategy outlined in Section 2.6.4, but now several redundant disks may be involved. To write a block of some data disk, we compute the modulo-2 sum of the new and old versions of that block. These bits are then added, in a modulo-2 sum, to the corresponding blocks of all those redundant disks that have 1 in a row in which the written disk also has 1.
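The writing rule above can be sketched in a few lines. Note the assumptions: the 3-by-7 matrix below is a hypothetical stand-in, since the matrix of Fig. 2.17 is not reproduced in this transcription; disks 1 through 4 are taken to be data disks and 5 through 7 redundant; each block is modeled as a single 8-bit integer, and the sample values are illustrative only.

```python
# Hypothetical parity-check matrix (an assumption; Fig. 2.17 is not shown here).
# Columns are disks 1..7; each row says which disks must sum to 0 modulo 2.
MATRIX = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]
REDUNDANT_DISKS = (5, 6, 7)  # assumed: disks 1-4 hold data


def parity_holds(disks):
    """True if every row's disks XOR (modulo-2 sum) to zero."""
    for row in MATRIX:
        acc = 0
        for col, bit in enumerate(row):
            if bit:
                acc ^= disks[col + 1]
        if acc != 0:
            return False
    return True


def write_block(disks, disk_no, new_block):
    """Write a data disk and update the affected redundant disks."""
    # Modulo-2 sum of the old and new versions of the block.
    delta = disks[disk_no] ^ new_block
    disks[disk_no] = new_block
    # Redundant disks that have 1 in some row where the written disk also has 1.
    affected = {
        r
        for row in MATRIX
        if row[disk_no - 1] == 1
        for r in REDUNDANT_DISKS
        if row[r - 1] == 1
    }
    for r in affected:
        disks[r] ^= delta


# Illustrative contents, keyed by disk number; redundant disks start consistent.
disks = {1: 0b11110000, 2: 0b00001111, 3: 0b00111000, 4: 0b01000001}
disks[5] = disks[1] ^ disks[2] ^ disks[3]
disks[6] = disks[1] ^ disks[2] ^ disks[4]
disks[7] = disks[1] ^ disks[3] ^ disks[4]

write_block(disks, 2, 0b00001100)
print(parity_holds(disks))  # True
```

Only the redundant disks that share a row with the written disk are touched, so a write to disk 2 under this matrix updates disks 5 and 6 and leaves disk 7 alone.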

Additional Observations About RAID Level 6

Disk    Contents
1)      11110000
2)      00001111
3)      00111000
4)      01000001
5)      ????????
6)      10111110
7)      10001001

Error-Correcting Codes and RAID Level 6

There is a broad theory that guides our selection of a suitable matrix, like that of Fig. 2.17, to determine the content of redundant disks. A code of length n is a set of bit-vectors (called code words) of length n. The Hamming distance between two code words is the number of positions in which they differ, and the minimum distance of a code is the smallest Hamming distance of any two different code words. If C is any code of length n, we can require that the corresponding bits on n disks have one of the sequences that are members of the code. As a very simple example, if we are using a disk and its mirror, then n = 2,
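The two definitions above, Hamming distance and minimum distance, translate directly into code. In this sketch code words are represented as strings of '0' and '1'; the function names are illustrative, and the mirrored-disk code is written as the two words 00 and 11 on the assumption that a disk and its mirror always hold the same bit.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equal-length code words differ."""
    if len(a) != len(b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def minimum_distance(code) -> int:
    """Smallest Hamming distance between any two different code words."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))


# Assumed mirrored-disk code with n = 2: both disks hold the same bit.
mirror_code = ["00", "11"]
print(hamming_distance("1011", "1001"))  # 1
print(minimum_distance(mirror_code))     # 2
```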

a) Express this situation by giving a parity check matrix analogous to Fig. 2.17.

!! b) It is possible to recover from some but not all situations where two disks fail at the same time. Determine for which pairs it is possible to recover and for which pairs it is not.

*! Exercise 2.6.12: Suppose we have eight data disks numbered 1 through 8, and three redundant disks: 9, 10, and 11. Disk 9 is a parity check on disks

2.7 Summary of Chapter 2

Disk
