Chapter 12: Multiprocessor Architectures. Lesson 09: Cache Coherence Problem and Cache synchronization solutions Part 1



Similar documents
Multiprocessor Cache Coherence

Vorlesung Rechnerarchitektur 2 Seite 178 DASH

Chapter 12: Multiprocessor Architectures. Lesson 04: Interconnect Networks

Lecture 23: Multiprocessors

Supporting Cache Coherence in Heterogeneous Multiprocessor Systems

Interconnection Networks

Multi-core architectures. Jernej Barbic , Spring 2007 May 3, 2007

Distributed Systems. REK s adaptation of Prof. Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 1

A Locally Cache-Coherent Multiprocessor Architecture

Chapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup

Embedded Parallel Computing

Measuring and Characterizing Cache Coherence Traffic

CS550. Distributed Operating Systems (Advanced Operating Systems) Instructor: Xian-He Sun

An Interconnection Network for a Cache Coherent System on FPGAs. Vincent Mirian

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Chapter 11: Input/Output Organisation. Lesson 06: Programmed IO

Lecture 2 Parallel Programming Platforms

Distributed Operating Systems

Shared Memory Consistency Models: A Tutorial

Chapter 2. Multiprocessors Interconnection Networks

Centralized Systems. A Centralized Computer System. Chapter 18: Database System Architectures

Virtual Shared Memory (VSM)

COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)

OPERATING SYSTEMS Internais and Design Principles

Chapter 02: Computer Organization. Lesson 04: Functional units and components in a computer organization Part 3 Bus Structures

AN OVERVIEW OF DISTRIBUTED DATABASE MANAGEMENT

Lesson Objectives. To provide a grand tour of the major operating systems components To provide coverage of basic computer system organization

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

Introduction to Parallel Computing. George Karypis Parallel Programming Platforms

B) Using Processor-Cache Affinity Information in Shared Memory Multiprocessor Scheduling

What is a bus? A Bus is: Advantages of Buses. Disadvantage of Buses. Master versus Slave. The General Organization of a Bus

Chapter 10: Virtual Memory. Lesson 08: Demand Paging and Page Swapping

Synchronization. Todd C. Mowry CS 740 November 24, Topics. Locks Barriers

ADVANCED COMPUTER ARCHITECTURE: Parallelism, Scalability, Programmability

MCSE Objectives. Exam : TS:Exchange Server 2007, Configuring

Computer Science 4302 Operating Systems. Student Learning Outcomes

DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing

Annotation to the assignments and the solution sheet. Note the following points

Embedded Software development Process and Tools:

Virtual machine interface. Operating system. Physical machine interface

How To Make A Distributed System Transparent

Chapter 13. PIC Family Microcontroller

Router Architectures

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

Operating system Dr. Shroouq J.

An Introduction to the Intel QuickPath Interconnect January 2009

High Performance Computer Architecture

5054A: Designing a High Availability Messaging Solution Using Microsoft Exchange Server 2007

LSN 2 Computer Processors

Common Operating-System Components

Design and Implementation of the Heterogeneous Multikernel Operating System

Introduction to AMBA 4 ACE and big.little Processing Technology

Distributed Systems. Examples. Advantages and disadvantages. CIS 505: Software Systems. Introduction to Distributed Systems

Understand Troubleshooting Methodology

Study Plan Masters of Science in Computer Engineering and Networks (Thesis Track)

BBM467 Data Intensive ApplicaAons

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers

PikeOS: Multi-Core RTOS for IMA. Dr. Sergey Tverdyshev SYSGO AG , Moscow

Microkernels, virtualization, exokernels. Tutorial 1 CSC469

Multicore Processor and GPU. Jia Rao Assistant Professor in CS

Distributed Systems. Virtualization. Paul Krzyzanowski

21152 PCI-to-PCI Bridge

High Performance Computing. Course Notes HPC Fundamentals

Multifacet s General Execution-driven Multiprocessor Simulator (GEMS) Toolset

Distributed Systems LEEC (2005/06 2º Sem.)

5053A: Designing a Messaging Infrastructure Using Microsoft Exchange Server 2007

Distributed System: Definition

Developing Microsoft Azure Solutions 20532B; 5 Days, Instructor-led

2.1 What are distributed systems? What are systems? Different kind of systems How to distribute systems? 2.2 Communication concepts

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 5 - DBMS Architecture

Multi-Threading Performance on Commodity Multi-Core Processors

Homework # 2. Solutions. 4.1 What are the differences among sequential access, direct access, and random access?

BLM 413E - Parallel Programming Lecture 3

Comprehensive Hardware and Software Support for Operating Systems to Exploit MP Memory Hierarchies

CHAPTER 4 MARIE: An Introduction to a Simple Computer

Real-Time Operating Systems for MPSoCs

DISK: A Distributed Java Virtual Machine For DSP Architectures

Weighted Total Mark. Weighted Exam Mark

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

The SGI Origin: A ccnuma Highly Scalable Server

Lecture 1 Operating System Overview

OPERATING SYSTEM - VIRTUAL MEMORY

Precise and Accurate Processor Simulation

A Lab Course on Computer Architecture

Memory Coherence in Shared Virtual Memory Systems

Technical Report. Communication centric, multi-core, fine-grained processor architecture. Gregory A. Chadwick. Number 832.

Transcription:

Chapter 12: Multiprocessor Architectures Lesson 09: Cache Coherence Problem and Cache synchronization solutions Part 1

Objective To understand cache coherence problem To learn the methods used to solve it To understand the synchronization mechanisms for maintaining cache coherence Cache-coherence problem protocol MESI 2

Disadvantage of Bus shared systems 3

Disadvantage of Bus shared systems Only some of the data in the memory directly accessible by each processor, since a processor can only read and write its local memory system Accessing data in another processor s memory requires communication through the network 4

Possibility that two or more copies of a given datum Two or more copies could exist in different processors memories The copies may result because of data sharing, because of process migration from one processor to another, or because of an I/O operation 5

Cache coherence problem 6

Cache coherence problem When two or more copies of a given datum exist in different processors memories, it may lead to different processors having different values for the same variable Major source of complexity in shared-memory systems 7

Cache coherence problem Problem of inconsistency between a cached copy and the shared memory or between cached copies themselves due to the existence of multiple cached copies of data 8

Cache Coherence Protocols snooping bus protocols 9

Maintaining cache coherence using cache snooping Cache snooping easy in a bus-based multiprocessor Each processor in the system can observe the state of the memory bus, called cache snooping Cache snooping allows each processor to see any requests that other processors make to the main memory 10

Cache coherence protocol of a shared memory multiprocessor Defines how the data may be shared and replicated across processors 11

Memory-coupling (consistency) model Defines when the programs running on the processors will see operations executed on other processors 12

Cache coherence protocol Defines the specific set of actions that are executed to keep each processor s view of the memory system consistent Operates on cache lines of data at a time, communicating an entire line between processors when necessary, rather than just sending a single word 13

Two categories of Cache-coherence protocols Snoopy bus protocols for shared buses Directory based protocols for the multistage networks 14

Two categories of snooping bus protocols Invalidation-based protocols Update-based protocols 15

Invalidation based cache protocol 16

Invalidation based protocols MESI Protocol 17

MESI Protocol Commonly used invalidation-based cachecoherence protocol Each line in a processor s cache assigned one of four states to track which caches have copies of the line 18

Four possible tates of a cache line in MESI Protocol Cache line state M, E or S or I (Modified, Exclusive, Shared, or Invalid) 19

Invalidation based cache protocol 20

State I The invalid state means that the processor does not have a copy of the line Any access to the line will require that the shared-memory system send a request message to the memory that contains the line to get a copy of the line 21

State S The shared state means that the processor has a copy of the line, and that one or more other processors also have copies The processor may read from the line, but any attempt to write the line requires that the other copies of the line be invalidated 22

State E If a line is in the exclusive state, the processor is the only one that has a copy of the line, but it has not written the line since it acquired the copy 23

Advantage of State E The definition of the exclusive unmodified state of data entirely avoids the transition to invalidations on write operations to unmodified non-shared blocks 24

State M The modified state means that the processor is the only one with a copy of the line, and it has written the line since it acquired the copy In both the exclusive and modified states, the processor may read and write the line freely 25

Different snoopy bus protocols 26

The states in different snoopy bus protocols 27

The meaning of terms dirty and invalid dirty 28

Summary 29

We Learnt Cache coherence problem Maintenance of identical copies when a copy is written and two or more copies already exist Snoopy bus protocols Invalidation based snoopy bus protocols 30

End of Lesson 04 on Cache Coherence Problem and Cache synchronization solutions Part 1 31