Lecture 23: Multiprocessors



Similar documents
Lecture 36: Chapter 6

How To Create A Multi Disk Raid

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

Symmetric Multiprocessing

Introduction to Cloud Computing

Lecture 2 Parallel Programming Platforms

Dependable Systems. 9. Redundant arrays of. Prof. Dr. Miroslaw Malek. Wintersemester 2004/05

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

CMSC 611: Advanced Computer Architecture

Price/performance Modern Memory Hierarchy

Parallel Programming

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

Chapter 2 Parallel Architecture, Software And Performance

High Performance Computing. Course Notes High Performance Storage

Vorlesung Rechnerarchitektur 2 Seite 178 DASH

High Performance Computing. Course Notes HPC Fundamentals

Annotation to the assignments and the solution sheet. Note the following points

Non-Redundant (RAID Level 0)

RAID Overview

Computer Architecture TDTS10

UNIT 2 CLASSIFICATION OF PARALLEL COMPUTERS

Fault Tolerance & Reliability CDA Chapter 3 RAID & Sample Commercial FT Systems

DELL RAID PRIMER DELL PERC RAID CONTROLLERS. Joe H. Trickey III. Dell Storage RAID Product Marketing. John Seward. Dell Storage RAID Engineering

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

Storage. The text highlighted in green in these slides contain external hyperlinks. 1 / 14

RAID. Storage-centric computing, cloud computing. Benefits:

What is RAID? data reliability with performance

Chapter 12: Multiprocessor Architectures. Lesson 09: Cache Coherence Problem and Cache synchronization solutions Part 1

Definition of RAID Levels

Operating Systems. RAID Redundant Array of Independent Disks. Submitted by Ankur Niyogi 2003EE20367

Database Management Systems

Filing Systems. Filing Systems

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

Parallel Architectures and Interconnection

Scalability and Classifications

Principles and characteristics of distributed systems and environments

Parallel Programming Survey

Hard Disk Drives and RAID

Chapter 2 Parallel Computer Architecture

IncidentMonitor Server Specification Datasheet

How to choose the right RAID for your Dedicated Server

Why disk arrays? CPUs improving faster than disks

CS420: Operating Systems

HP Smart Array Controllers and basic RAID performance factors

ES-1 Elettronica dei Sistemi 1 Computer Architecture

VII ENCUENTRO IBÉRICO DE ELECTROMAGNETISMO COMPUTACIONAL, MONFRAGÜE, CÁCERES, MAYO

Storing Data: Disks and Files

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

CS161: Operating Systems

Module 6. RAID and Expansion Devices

IBM ^ xseries ServeRAID Technology

Middleware and Distributed Systems. Introduction. Dr. Martin v. Löwis

RAID. Tiffany Yu-Han Chen. # The performance of different RAID levels # read/write/reliability (fault-tolerant)/overhead

RAID Level Descriptions. RAID 0 (Striping)

How To Virtualize A Storage Area Network (San) With Virtualization

Striped Set, Advantages and Disadvantages of Using RAID

RAID Overview: Identifying What RAID Levels Best Meet Customer Needs. Diamond Series RAID Storage Array

High Performance Computing, an Introduction to

CS 153 Design of Operating Systems Spring 2015

Distributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1

CHAPTER 4 RAID. Section Goals. Upon completion of this section you should be able to:

Guide to SATA Hard Disks Installation and RAID Configuration

Chapter 9: Peripheral Devices: Magnetic Disks

Outline. Distributed DBMS

Computer Architecture Prof. Mainak Chaudhuri Department of Computer Science and Engineering Indian Institute of Technology, Kanpur

RAID Performance Analysis

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Data Storage - II: Efficient Usage & Errors

COSC 6374 Parallel Computation. Parallel I/O (I) I/O basics. Concept of a clusters

RAID. Contents. Definition and Use of the Different RAID Levels. The different RAID levels: Definition Cost / Efficiency Reliability Performance

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Why disk arrays? CPUs speeds increase faster than disks. - Time won t really help workloads where disk in bottleneck

1 Storage Devices Summary

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

How To Write A Disk Array

California Software Labs

Guide to SATA Hard Disks Installation and RAID Configuration

File System Design and Implementation

Enterprise Applications

Multiprocessor Cache Coherence

Principles of Operating Systems CS 446/646

Binary search tree with SIMD bandwidth optimization using SSE

StorTrends RAID Considerations

How To Understand The Concept Of A Distributed System

SoC Design Lecture 12: MPSoC Multi-Processor System-on-Chip. Shaahin Hessabi Department of Computer Engineering Sharif University of Technology

Designing a Cloud Storage System

COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

760 Veterans Circle, Warminster, PA Technical Proposal. Submitted by: ACT/Technico 760 Veterans Circle Warminster, PA

Virtual machine interface. Operating system. Physical machine interface

Multilevel Load Balancing in NUMA Computers

RAID Technology. RAID Overview

RAID: Redundant Arrays of Independent Disks

CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC

Chapter 18: Database System Architectures. Centralized Systems

Communicating with devices

Tools Page 1 of 13 ON PROGRAM TRANSLATION. A priori, we have two translation mechanisms available:

Deploying a File Server Lesson 2

ADVANCED COMPUTER ARCHITECTURE: Parallelism, Scalability, Programmability

Transcription:

Lecture 23: Multiprocessors Today s topics: RAID Multiprocessor taxonomy Snooping-based cache coherence protocol 1

RAID 0 and RAID 1 RAID 0 has no additional redundancy (misnomer) it uses an array of disks and stripes (interleaves) data across the arrays to improve parallelism and throughput RAID 1 mirrors or shadows every disk every write happens to two disks Reads to the mirror may happen only when the primary disk fails or, you may try to read both together and the quicker response is accepted Expensive solution: high reliability at twice the cost 2

RAID 3 Data is bit-interleaved across several disks and a separate disk maintains parity information for a set of bits For example: with 8 disks, bit 0 is in disk-0, bit 1 is in disk-1,, bit 7 is in disk-7; disk-8 maintains parity for all 8 bits For any read, 8 disks must be accessed (as we usually read more than a byte at a time) and for any write, 9 disks must be accessed as parity has to be re-calculated High throughput for a single request, low cost for redundancy (overhead: 12.5%), low task-level parallelism 3

RAID 4 and RAID 5 Data is block interleaved this allows us to get all our data from a single disk on a read in case of a disk error, read all 9 disks Block interleaving reduces thruput for a single request (as only a single disk drive is servicing the request), but improves task-level parallelism as other disk drives are free to service other requests On a write, we access the disk that stores the data and the parity disk parity information can be updated simply by checking if the new data differs from the old data 4

RAID 5 If we have a single disk for parity, multiple writes can not happen in parallel (as all writes must update parity info) RAID 5 distributes the parity block to allow simultaneous writes 5

RAID Summary RAID 1-5 can tolerate a single fault mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead Can tolerate multiple faults by having multiple check functions each additional check can cost an additional disk (RAID 6) RAID 6 and RAID 2 (memory-style ECC) are not commercially employed 6

Multiprocessor Taxonomy SISD: single instruction and single data stream: uniprocessor MISD: no commercial multiprocessor: imagine data going through a pipeline of execution engines SIMD: vector architectures: lower flexibility MIMD: most multiprocessors today: easy to construct with off-the-shelf computers, most flexibility 7

Memory Organization - I Centralized shared-memory multiprocessor or Symmetric shared-memory multiprocessor (SMP) Multiple processors connected to a single centralized memory since all processors see the same memory organization uniform memory access (UMA) Shared-memory because all processors can access the entire memory address space Can centralized memory emerge as a bandwidth bottleneck? not if you have large caches and employ fewer than a dozen processors 8

SMPs or Centralized Shared-Memory Caches Caches Caches Caches Main Memory I/O System 9

Memory Organization - II For higher scalability, memory is distributed among processors distributed memory multiprocessors If one processor can directly address the memory local to another processor, the address space is shared distributed shared-memory (DSM) multiprocessor If memories are strictly local, we need messages to communicate data cluster of computers or multicomputers Non-uniform memory architecture (NUMA) since local memory has lower latency than remote memory 10

Distributed Memory Multiprocessors & Caches & Caches & Caches & Caches Memory I/O Memory I/O Memory I/O Memory I/O Interconnection network 11

SMPs Centralized main memory and many caches many copies of the same data A system is cache coherent if a read returns the most recently written value for that word Time Event Value of X in Cache-A Cache-B Memory 0 - - 1 1 CPU-A reads X 1-1 2 CPU-B reads X 1 1 1 3 CPU-A stores 0 in X 0 1 0 12

Cache Coherence A memory system is coherent if: P writes to X; no other processor writes to X; P reads X and receives the value previously written by P P1 writes to X; no other processor writes to X; sufficient time elapses; P2 reads X and receives value written by P1 Two writes to the same location by two processors are seen in the same order by all processors write serialization The memory consistency model defines time elapsed before the effect of a processor is seen by others 13

Cache Coherence Protocols Directory-based: A single location (directory) keeps track of the sharing status of a block of memory Snooping: Every cache block is accompanied by the sharing status of that block all cache controllers monitor the shared bus so they can update the sharing status of the block, if necessary Write-invalidate: a processor gains exclusive access of a block before writing by invalidating all other copies Write-update: when a processor writes, it updates other shared copies of that block 14

Design Issues Three states for a block: invalid, shared, modified A write is placed on the bus and sharers invalidate themselves Caches Caches Caches Caches Main Memory I/O System 15

Title Bullet 16