# Lecture: Pipelining Extensions. Topics: control hazards, multi-cycle instructions, pipelining equations

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations 1

2 Problem 6 Show the instruction occupying each stage in each cycle (with bypassing) if I1 is R1+R2 R3 and I2 is R3+R4 R5 and I3 is R3+R8 R9. Identify the input latch for each input operand. CYC-1 CYC-2 CYC-3 CYC-4 CYC-5 CYC-6 CYC-7 CYC-8 IF I1 IF I2 IF I3 IF I4 IF I5 IF IF IF D/R ALU D/R I1 ALU D/R I2 ALU I1 D/R I3 ALU I2 D/R I4 L3 L3 L4 L3 L5 L3 ALU I3 D/R ALU D/R ALU D/R ALU DM DM DM DM I1 DM I2 DM I3 DM DM RW RW RW RW RW I1 RW I2 RW I3 RW

3 Problem 7 Consider this 8-stage pipeline (RR and RW take a full cycle) IF DE RR AL AL DM DM RW For the following pairs of instructions, how many stalls will the 2 nd instruction experience (with and without bypassing)? ADD R1+R2 R3 ADD R3+R4 R5 without: 5 with: 1 LD [R1] R2 ADD R2+R3 R4 without: 5 with: 3 LD [R1] R2 SD [R2] R3 without: 5 with: 3 LD [R1] R2 SD [R3] R2 without: 5 with: 1 3

4 Hazards Structural Hazards Data Hazards Control Hazards 4

5 Control Hazards Simple techniques to handle control hazard stalls: for every branch, introduce a stall cycle (note: every 6 th instruction is a branch on average!) assume the branch is not taken and start fetching the next instruction if the branch is taken, need hardware to cancel the effect of the wrong-path instructions predict the next PC and fetch that instr if the prediction is wrong, cancel the effect of the wrong-path instructions fetch the next instruction (branch delay slot) and execute it anyway if the instruction turns out to be on the correct path, useful work was done if the instruction turns out to be on the wrong path, hopefully program state is not lost 5

6 Branch Delay Slots 6

7 Problem 1 Consider a branch that is taken 80% of the time. On average, how many stalls are introduced for this branch for each approach below: Stall fetch until branch outcome is known Assume not-taken and squash if the branch is taken Assume a branch delay slot o You can t find anything to put in the delay slot o An instr before the branch is put in the delay slot o An instr from the taken side is put in the delay slot o An instr from the not-taken side is put in the slot 7

8 Problem 1 Consider a branch that is taken 80% of the time. On average, how many stalls are introduced for this branch for each approach below: Stall fetch until branch outcome is known 1 Assume not-taken and squash if the branch is taken 0.8 Assume a branch delay slot o You can t find anything to put in the delay slot 1 o An instr before the branch is put in the delay slot 0 o An instr from the taken side is put in the slot 0.2 o An instr from the not-taken side is put in the slot 0.8 8

9 Multicycle Instructions 9

10 Effects of Multicycle Instructions Potentially multiple writes to the register file in a cycle Frequent RAW hazards WAW hazards (WAR hazards not possible) Imprecise exceptions because of o-o-o instr completion Note: Can also increase the width of the processor: handle multiple instructions at the same time: for example, fetch two instructions, read registers for both, execute both, etc. 10

11 Precise Exceptions On an exception: must save PC of instruction where program must resume all instructions after that PC that might be in the pipeline must be converted to NOPs (other instructions continue to execute and may raise exceptions of their own) temporary program state not in memory (in other words, registers) has to be stored in memory potential problems if a later instruction has already modified memory or registers A processor that fulfils all the above conditions is said to provide precise exceptions (useful for debugging and of course, correctness) 11

12 Dealing with these Effects Multiple writes to the register file: increase the number of ports, stall one of the writers during ID, stall one of the writers during WB (the stall will propagate) WAW hazards: detect the hazard during ID and stall the later instruction Imprecise exceptions: buffer the results if they complete early or save more pipeline state so that you can return to exactly the same state that you left at 12

13 Slowdowns from Stalls Perfect pipelining with no hazards an instruction completes every cycle (total cycles ~ num instructions) speedup = increase in clock speed = num pipeline stages With hazards and stalls, some cycles (= stall time) go by during which no instruction completes, and then the stalled instruction completes Total cycles = number of instructions + stall cycles Slowdown because of stalls = 1/ (1 + stall cycles per instr) 13

14 Pipelining Limits Gap between indep instrs: T + Tovh Gap between dep instrs: T + Tovh A B C A B C Gap between indep instrs: T/3 + Tovh Gap between dep instrs: T + 3Tovh A B C D E F A B C D E F Gap between indep instrs: T/6 + Tovh Gap between dep instrs: T + 6Tovh Assume that there is a dependence where the final result of the first instruction is required before starting the second instruction 14

15 Problem 2 Assume an unpipelined processor where it takes 5ns to go through the circuits and 0.1ns for the latch overhead. What is the throughput for 20-stage and 40-stage pipelines? Assume that the P.O.P and P.O.C in the unpipelined processor are separated by 2ns. Assume that half the instructions do not introduce a data hazard and half the instructions depend on their preceding instruction. 15

16 Problem 2 Assume an unpipelined processor where it takes 5ns to go through the circuits and 0.1ns for the latch overhead. What is the throughput for 1-stage, 20-stage and 50-stage pipelines? Assume that the P.O.P and P.O.C in the unpipelined processor are separated by 2ns. Assume that half the instructions do not introduce a data hazard and half the instructions depend on their preceding instruction. 1-stage: 1 instr every 5.1ns 20-stage: first instr takes 0.35ns, the second takes 2.8ns 50-stage: first instr takes 0.2ns, the second takes 4ns 16

17 Title Bullet 17

### Lecture: Pipelining Basics

Lecture: Pipelining Basics Topics: Basic pipelining implementation Video 1: What is pipelining? Video 2: Clocks and latches Video 3: An example 5-stage pipeline Video 4: Loads/Stores and RISC/CISC 1 Building

### Page # Outline. Overview of Pipelining. What is pipelining? Classic 5 Stage MIPS Pipeline

Outline Overview of Venkatesh Akella EEC 270 Winter 2005 Overview Hazards and how to eliminate them? Performance Evaluation with Hazards Precise Interrupts Multicycle Operations MIPS R4000-8-stage pipelined

### Hiroaki Kobayashi 10/08/2002

Hiroaki Kobayashi 10/08/2002 1 Multiple instructions are overlapped in execution Process of instruction execution is divided into two or more steps, called pipe stages or pipe segments, and Different stage

### What is Pipelining? Time per instruction on unpipelined machine Number of pipe stages

What is Pipelining? Is a key implementation techniques used to make fast CPUs Is an implementation techniques whereby multiple instructions are overlapped in execution It takes advantage of parallelism

### Lecture 3 Pipeline Hazards

CS 203A Advanced Computer Architecture Lecture 3 Pipeline Hazards 1 Pipeline Hazards Hazards are caused by conflicts between instructions. Will lead to incorrect behavior if not fixed. Three types: Structural:

### Outline. 1 Reiteration 2 MIPS R Instruction Level Parallelism. 4 Branch Prediction. 5 Dependencies. 6 Instruction Scheduling.

Outline 1 Reiteration Lecture 4: EITF20 Computer Architecture Anders Ardö EIT Electrical and Information Technology, Lund University November 6, 2013 2 MIPS R4000 3 Instruction Level Parallelism 4 Branch

### CS4617 Computer Architecture

1/39 CS4617 Computer Architecture Lecture 20: Pipelining Reference: Appendix C, Hennessy & Patterson Reference: Hamacher et al. Dr J Vaughan November 2013 2/39 5-stage pipeline Clock cycle Instr No 1 2

### CSCI 6461 Computer System Architecture Spring 2015 The George Washington University

CSCI 6461 Computer System Architecture Spring 2015 The George Washington University Homework 2 (To be submitted using Blackboard by 2/25/15 at 11.59pm) Note: Make reasonable assumptions if necessary and

### Appendix C: Pipelining: Basic and Intermediate Concepts

Appendix C: Pipelining: Basic and Intermediate Concepts Key ideas and simple pipeline (Section C.1) Hazards (Sections C.2 and C.3) Structural hazards Data hazards Control hazards Exceptions (Section C.4)

### Datapath vs Control. Appendix A: Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution

Appendix A: Pipelining Pipelining is an implementation technique whereby multiple instructions are overlapped in execution Takes advantage of parallelism that exists among the actions needed to execute

### Pipelining: Its Natural!

Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes A B C D Dryer takes 40 minutes Folder takes 20 minutes Sequential

### Solutions for the Sample of Midterm Test

s for the Sample of Midterm Test 1 Section: Simple pipeline for integer operations For all following questions we assume that: a) Pipeline contains 5 stages: IF, ID, EX, M and W; b) Each stage requires

### CMCS Mohamed Younis CMCS 611, Advanced Computer Architecture 1

CMCS 611-101 Advanced Computer Architecture Lecture 7 Pipeline Hazards September 28, 2009 www.csee.umbc.edu/~younis/cmsc611/cmsc611.htm Mohamed Younis CMCS 611, Advanced Computer Architecture 1 Previous

### Computer Science 146. Computer Architecture

Computer Architecture Spring 2004 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture 5: Exceptions, Multicycle Ops, Dynamic Scheduling Lecture Outline Homework 1 Questions? Finish up

### Basic Pipelining. B.Ramamurthy CS506

Basic Pipelining CS506 1 Introduction In a typical system speedup is achieved through parallelism at all levels: Multi-user, multitasking, multi-processing, multi-programming, multi-threading, compiler

### COSC4201. Prof. Mokhtar Aboelaze York University

COSC4201 Pipelining (Review) Prof. Mokhtar Aboelaze York University Based on Slides by Prof. L. Bhuyan (UCR) Prof. M. Shaaban (RIT) 1 Instructions: Fetch Every DLX instruction could be executed in 5 cycles,

### Pipelining. Pipelining in CPUs

Pipelining Chapter 2 1 Pipelining in CPUs Pipelining is an CPU implementation technique whereby multiple instructions are overlapped in execution. Think of a factory assembly line each step in the pipeline

### Computer Architecture. Pipeline: Hazards. Prepared By: Arun Kumar

Computer Architecture Pipeline: Hazards Prepared By: Arun Kumar Pipelining Outline Introduction Defining Pipelining Pipelining Instructions Hazards Structural hazards \ Data Hazards Control Hazards Performance

### Pipelining: Its Natural!

Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes, Dryer takes 40 minutes, Folder takes 20 minutes T a s k O

### Computer Architecture-I

Computer Architecture-I Assignment 2 Solution Assumptions: a) We assume that accesses to data memory and code memory can happen in parallel due to dedicated instruction and data bus. b) The STORE instruction

### Lecture Topics. Pipelining in Processors. Pipelining and ISA ECE 486/586. Computer Architecture. Lecture # 10. Reference:

Lecture Topics ECE 486/586 Computer Architecture Lecture # 10 Spring 2015 Pipelining Hazards and Stalls Reference: Effect of Stalls on Pipeline Performance Structural hazards Data Hazards Appendix C: Sections

### Material from Chapter 3 of H&P (for DLX). Material from Chapter 6 of P&H (for MIPS). Outline: (In this set.)

06 1 MIPS Implementation 06 1 Material from Chapter 3 of H&P (for DLX). Material from Chapter 6 of P&H (for MIPS). line: (In this set.) Unpipelined DLX Implementation. (Diagram only.) Pipelined MIPS Implementations:

### Processor: Superscalars

Processor: Superscalars Pipeline Organization Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475

### Appendix C Solutions. 2 Solutions to Case Studies and Exercises. C.1 a.

2 Solutions to Case Studies and Exercises C.1 a. Appendix C Solutions R1 LD DADDI R1 DADDI SD R2 LD DADDI R2 SD DADDI R2 DSUB DADDI R4 BNEZ DSUB b. Forwarding is performed only via the register file. Branch

WAR: Write After Read write-after-read (WAR) = artificial (name) dependence add R1, R2, R3 sub R2, R4, R1 or R1, R6, R3 problem: add could use wrong value for R2 can t happen in vanilla pipeline (reads

### Outline. G High Performance Computer Architecture. (Review) Pipeline Hazards (A): Structural Hazards. (Review) Pipeline Hazards

Outline G.43-1 High Performance Computer Architecture Lecture 4 Pipeline Hazards Branch Prediction Pipeline Hazards Structural Data Control Reducing the impact of control hazards Delay slots Branch prediction

### What is Pipelining? Pipelining Part I. Ideal Pipeline Performance. Ideal Pipeline Performance. Like an Automobile Assembly Line for Instructions

What is Pipelining? Pipelining Part I CS448 Like an Automobile Assembly Line for Instructions Each step does a little job of processing the instruction Ideally each step operates in parallel Simple Model

### Pipelining Appendix A

Pipelining Appendix A CS A448 1 What is Pipelining? Like an Automobile Assembly Line for Instructions Each step does a little job of processing the instruction Ideally each step operates in parallel Simple

### Instruction Level Parallelism Part I - Introduction

Course on: Advanced Computer Architectures Instruction Level Parallelism Part I - Introduction Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Outline of Part I Introduction

### Instruction Level Parallelism Part I

Course on: Advanced Computer Architectures Instruction Level Parallelism Part I Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Outline of Part I Introduction to ILP Dependences

### Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

Multiple-Issue Processors Pipelining can achieve CPI close to 1 Mechanisms for handling hazards Static or dynamic scheduling Static or dynamic branch handling Increase in transistor counts (Moore s Law):

### INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER

Course on: Advanced Computer Architectures INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER Prof. Cristina Silvano Politecnico di Milano cristina.silvano@polimi.it Prof. Silvano, Politecnico di Milano

### Computer Architecture Pipelines. Diagrams are from Computer Architecture: A Quantitative Approach, 2nd, Hennessy and Patterson.

Computer Architecture Pipelines Diagrams are from Computer Architecture: A Quantitative Approach, 2nd, Hennessy and Patterson. Instruction Execution Instruction Execution 1. Instruction Fetch (IF) - Get

### PROBLEMS #20,R0,R1 #\$3A,R2,R4

506 CHAPTER 8 PIPELINING (Corrisponde al cap. 11 - Introduzione al pipelining) PROBLEMS 8.1 Consider the following sequence of instructions Mul And #20,R0,R1 #3,R2,R3 #\$3A,R2,R4 R0,R2,R5 In all instructions,

### Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

Pipeline Hazards Structure hazard Data hazard Pipeline hazard: the major hurdle A hazard is a condition that prevents an instruction in the pipe from executing its next scheduled pipe stage Taxonomy of

### Overview. Chapter 2. CPI Equation. Instruction Level Parallelism. Basic Pipeline Scheduling. Sample Pipeline

Overview Chapter 2 Instruction-Level Parallelism and Its Exploitation Instruction level parallelism Dynamic Scheduling Techniques Scoreboarding Tomasulo s Algorithm Reducing Branch Cost with Dynamic Hardware

### How to improve (decrease) CPI

How to improve (decrease) CPI Recall: CPI = Ideal CPI + CPI contributed by stalls Ideal CPI =1 for single issue machine even with multiple execution units Ideal CPI will be less than 1 if we have several

### Chapter 2. Instruction-Level Parallelism and Its Exploitation. Overview. Instruction level parallelism Dynamic Scheduling Techniques

Chapter 2 Instruction-Level Parallelism and Its Exploitation ti 1 Overview Instruction level parallelism Dynamic Scheduling Techniques Scoreboarding Tomasulo s Algorithm Reducing Branch Cost with Dynamic

Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi 1 Dr. Shadrokh Samavi 2 Some slides are from the instructors resources which accompany the 5 th and previous editions. Some slides are from

### Five instruction execution steps. Single-cycle vs. multi-cycle. Goal of pipelining is. CS/COE1541: Introduction to Computer Architecture.

Five instruction execution steps CS/COE1541: Introduction to Computer Architecture Pipelining Sangyeun Cho Instruction fetch Instruction decode and register read Execution, memory address calculation,

### response (or execution) time -- the time between the start and the finish of a task throughput -- total amount of work done in a given time

Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. response (or execution) time -- the time between the start and the finish of a task 2. Define throughput. throughput

### Spring 2015 :: CSE 502 Computer Architecture. Processor Pipeline. Instructor: Nima Honarmand

Processor Pipeline Instructor: Nima Honarmand Processor Datapath The Generic Instruction Cycle Steps in processing an instruction: Instruction Fetch (IF_STEP) Instruction Decode (ID_STEP) Operand Fetch

### A B C D. Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold. Time

Pipelining Readings: 4.5-4.8 Example: Doing the laundry A B C D Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes Folder takes

### Lecture 4: Pipeline Complications: Data and Control Hazards. Administrative

Lecture 4: Pipeline Complications: Data and Control Hazards Professor Alvin R. Lebeck Computer Science 220 Fall 2001 Administrative Homework #1 Due Tuesday, September 11 Start Reading Chapter 4 Projects

### CS 152 Computer Architecture and Engineering. Lecture 10 - Complex Pipelines, Out-of-Order Issue, Register Renaming

CS 152 Computer Architecture and Engineering Lecture 10 - Complex Pipelines, Out-of-Order Issue, Register Renaming Krste Asanovic Electrical Engineering and Computer Sciences University of California at

### COMP4211 (Seminar) Intro to Instruction-Level Parallelism

Overview COMP4211 (Seminar) Intro to Instruction-Level Parallelism 05S1 Week 02 Oliver Diessel Review pipelining from COMP3211 Look at integrating FP hardware into the pipeline Example: MIPS R4000 Goal:

### 1. Fetch 2. Decode 3. EXecute (or address calculation) 4. Memory access 5. Result Write back

Pipelining Five stages in the datapath of MIPS: 1. Fetch 2. Decode 3. EXecute (or address calculation) 4. Memory access 5. Result Write back Instructions take 3-5 cycles to complete execution. How to overlap

### CS 152 Computer Architecture and Engineering

CS 152 Computer Architecture and Engineering Lecture 21 Advanced Processors II 2004-11-16 Dave Patterson (www.cs.berkeley.edu/~patterson) John Lazzaro (www.cs.berkeley.edu/~lazzaro) Thanks to Krste Asanovic...

### HW 5 Solutions. Manoj Mardithaya

HW 5 Solutions Manoj Mardithaya Question 1: Processor Performance The critical path latencies for the 7 major blocks in a simple processor are given below. IMem Add Mux ALU Regs DMem Control a 400ps 100ps

### Spring 2016 :: CSE 502 Computer Architecture. Processor Pipeline. Nima Honarmand

Processor Pipeline Nima Honarmand Generic Instruction Cycle Steps in processing an instruction: Instruction Fetch (IF_STEP) Instruction Decode (ID_STEP) Operand Fetch (OF_STEP) Might be from registers

### The MIPS R4000 Pipeline (circa 1991)

The MIPS R4000 Pipeline (circa 1991) Microprocessor without Interlocked Pipeline Stages (MIPS) Challenge S Challenge M Crimson Indigo Indigo 2 Indy R4300 (embedded) Nintendo-64 Superpipelining: pipelining

### Basic Pipelining. Basic Pipelining

Basic Pipelining * In a typical system, speedup is achieved through parallelism at all levels: Multi -user, multi-tasking, multi-processing, multi-programming, multi-threading, compiler optimizations and

### UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering EEC180B Lab 7: MISP Processor Design Spring 1995 Objective: In this lab, you will complete the design of the MISP processor,

### Pipelining Review and Its Limitations

Pipelining Review and Its Limitations Yuri Baida yuri.baida@gmail.com yuriy.v.baida@intel.com October 16, 2010 Moscow Institute of Physics and Technology Agenda Review Instruction set architecture Basic

### Floating Point/Multicycle Pipelining in MIPS

Floating Point/Multicycle Pipelining in MIPS Completion of MIPS EX stage floating point arithmetic operations in one or two cycles is impractical since it requires: A much longer CPU clock cycle, and/or

ECE 563 Advanced Computer Architecture Fall 2010 Lecture 4: Instruction-level Parallelism 563 L04.1 Fall 2010 In the News AMD Quad-Core Released 9/10/2007 First single die quad-core x86 processor Major

Advanced Pipelining Techniques 1. Dynamic Scheduling 2. Loop Unrolling 3. Software Pipelining 4. Dynamic Branch Prediction Units 5. Register Renaming 6. Superscalar Processors 7. VLIW (Very Large Instruction

### Pipelined Processor Design

Pipelined Processor Design Instructor: Huzefa Rangwala, PhD CS 465 Fall 2014 Review on Single Cycle Datapath Subset of the core MIPS ISA Arithmetic/Logic instructions: AND, OR, ADD, SUB, SLT Data flow

### CIT 595 Spring Real-World Pipelines: Car Washes. Sequential Laundry. Laundry Example

Real-World Pipelines: Car Washes Sequential Parallel Pipelining CI 595 Spring 2010 Additional material 2004/2005 Lewis/Martin Modified by Diana Palsetia Pipelined Idea Divide process into independent stages

### ECE 4750 Computer Architecture Topic 3: Pipelining Structural & Data Hazards

ECE 4750 Computer Architecture Topic 3: Pipelining Structural & Data Hazards Christopher Batten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece4750

### Instruction-Level Parallelism

Instruction-Level Parallelism Chapter 3 Dynamic Scheduling 1 ILP basic concepts Basic idea of pipelining overlap instructions in execution to improve performance ILP exploits parallelism between instructions

### CS352H: Computer Systems Architecture

CS352H: Computer Systems Architecture Topic 9: MIPS Pipeline - Hazards October 1, 2009 University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell Data Hazards in ALU Instructions

### Solutions. Solution 4.1. 4.1.1 The values of the signals are as follows:

4 Solutions Solution 4.1 4.1.1 The values of the signals are as follows: RegWrite MemRead ALUMux MemWrite ALUOp RegMux Branch a. 1 0 0 (Reg) 0 Add 1 (ALU) 0 b. 1 1 1 (Imm) 0 Add 1 (Mem) 0 ALUMux is the

### Computer Architecture Lecture 4: Pipelining (Appendix C) Chih Wei Liu 劉志尉 National Chiao Tung University

Computer Architecture Lecture 4: Pipelining (Appendix C) Chih Wei Liu 劉志尉 National Chiao Tung University cwliu@twins.ee.nctu.edu.tw Why Pipeline? T I 2 I 1 I 2 I 1 Throughput = 1/T I 4 I 3 I 2 I 1 I 4

### Instruction Level Parallelism 1 (Compiler Techniques)

Instruction Level Parallelism 1 (Compiler Techniques) Outline ILP Compiler techniques to increase ILP Loop Unrolling Static Branch Prediction Dynamic Branch Prediction Overcoming Data Hazards with Dynamic

### Pipelining: Performance and Limits

Pipelining: Performance and Limits Arturo Díaz D PérezP Centro de Investigación n y de Estudios Avanzados del IPN adiaz@cinvestav.mx PipeliningPerformance- 1 Review: Summary of Pipelining Basics Pipelines

### EXPLOITING ILP. B649 Parallel Architectures and Pipelining

EXPLOITING ILP B649 Parallel Architectures and Pipelining What is ILP? Pipeline CPI = Ideal pipeline CPI + Structural stalls + Data hazard stalls + Control stalls Use hardware techniques to minimize stalls

### CPU Performance Equation

CPU Performance Equation C T I T ime for task = C T I =Average # Cycles per instruction =Time per cycle =Instructions per task Pipelining e.g. 3-5 pipeline steps (ARM, SA, R3000) Attempt to get C down

### CSE Computer Architecture I Fall 2010 Final Exam December 13, 2010

CSE 30321 Computer Architecture I Fall 2010 Final Exam December 13, 2010 Test Guidelines: 1. Place your name on EACH page of the test in the space provided. 2. Answer every question in the space provided.

### Computer Science 146. Computer Architecture

Computer Architecture Spring 2004 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture 4: Basic Implementation and Pipelining Lecture Outline Finish ISAs (Compiler Impact) Basic Implementation

### PIPELINING CHAPTER OBJECTIVES

CHAPTER 8 PIPELINING CHAPTER OBJECTIVES In this chapter you will learn about: Pipelining as a means for executing machine instructions concurrently Various hazards that cause performance degradation in

### Lecture 7 ARM Processor Organization. Internal Organization of ARM

Lecture 7 AM Processor Organization First AM processor developed on 3 micron technology in 83-85 his course is mainly based on the AM6/7 architecture developed between 90-95. Digital quipment Corporation

### Design of Pipelined MIPS Processor. Sept. 24 & 26, 1997

Design of Pipelined MIPS Processor Sept. 24 & 26, 1997 Topics Instruction processing Principles of pipelining Inserting pipe registers Data Hazards Control Hazards Exceptions MIPS architecture subset R-type

### CS429: Computer Organization and Architecture

CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: October 31, 2016 at 12:07 CS429 Slideset 14: 1 Overview What s wrong

### Computer Architecture

Computer Architecture Having studied numbers, combinational and sequential logic, and assembly language programming, we begin the study of the overall computer system. The term computer architecture is

### Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

Pipelining HW Q. Can a MIPS SW instruction executing in a simple 5-stage pipelined implementation have a data dependency hazard of any type resulting in a nop bubble? If so, show an example; if not, prove

### Lecture 8: Pipeline Complications Control Hazards, Branches and Interrupts Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 8: Pipeline Complications Control Hazards, Branches and Interrupts Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.SP96 1 T a s k O r d e r A B C D Review From Last Time Pipelining

### Data Dependences. A data dependence occurs whenever one instruction needs a value produced by another.

Data Hazards 1 Hazards: Key Points Hazards cause imperfect pipelining They prevent us from achieving CPI = 1 They are generally causes by counter flow data pennces in the pipeline Three kinds Structural

### A Brief Review of Processor Architecture. Why are Modern Processors so Complicated? Basic Structure

A Brief Review of Processor Architecture Why are Modern Processors so Complicated? Basic Structure CPU PC IR Regs ALU Memory Fetch PC -> Mem addr [addr] > IR PC ++ Decode Select regs Execute Perform op

### Pipelined CPUs. Amdahl's Law

Comp, Spring 5 3/ Lecture page Pipelined CPUs Where are the registers? Study Chapter 6 of Text L6 Pipelined CPU I mdahl's Law t affected t improved = + rspeedup t unaffected Example: "Suppose a program

### Lecture 5: ISA Implementation. Basic Pipelining

Lecture 5: IS Implementation Kunle Olukotun Gates 302 kunle@ogun.stanford.edu http://www-leland.stanford.edu/class/ee282h/ 1 Basic Pipelining Pipelining is the organizational implementation technique that

### PROBLEMS. which was discussed in Section 1.6.3.

22 CHAPTER 1 BASIC STRUCTURE OF COMPUTERS (Corrisponde al cap. 1 - Introduzione al calcolatore) PROBLEMS 1.1 List the steps needed to execute the machine instruction LOCA,R0 in terms of transfers between

### 361 Computer Architecture Lecture 12: Designing a Pipeline Processor. Overview of a Multiple Cycle Implementation

36 Computer rchitecture Lecture 2: Designing a Pipeline Processor pipeline. Overview of a Multiple Cycle mplementation The root of the single cycle processor s problems: The cycle time has to be long enough

### ECE 6100 Advanced Computer Architecture Sample Final Exam Solns

Problem 1 (3 parts, 20points) Pipelining Speedup Suppose a program running on a RISC machine performs 16,000,000 instructions during its execution. The total time it takes to execute an instruction is

### Ľudmila Jánošíková. Department of Transportation Networks Faculty of Management Science and Informatics University of Žilina

Assembly Language Ľudmila Jánošíková Department of Transportation Networks Faculty of Management Science and Informatics University of Žilina Ludmila.Janosikova@fri.uniza.sk 041/5134 220 Recommended texts

More on Pipelining and Pipelines in Real Machines CS 333 Fall 2006 Main Ideas Data Hazards RAW WAR WAW More pipeline stall reduction techniques Branch prediction» static» dynamic bimodal branch prediction

### Using Graphics and Animation to Visualize Instruction Pipelining and its Hazards

Using Graphics and Animation to Visualize Instruction Pipelining and its Hazards Per Stenström, Håkan Nilsson, and Jonas Skeppstedt Department of Computer Engineering, Lund University P.O. Box 118, S-221

### Computer Organization and Components

Computer Organization and Components IS5, fall 25 Lecture : Pipelined Processors ssociate Professor, KTH Royal Institute of Technology ssistant Research ngineer, University of California, Berkeley Slides

### Computer Systems Architecture I. CSE 560M Lecture 7 Prof. Patrick Crowley

Computer Systems Architecture I CSE 560M Lecture 7 Prof. Patrick Crowley Announcement Plan for Today HW1 due Wednesday Commentary due next Monday Questions Pipelining III discussion The Story with ILP

### Advanced Topics in Computer Architecture

Advanced Topics in Computer Architecture Lecture 3 Instruction-Level Parallelism and Its Exploitation Marenglen Biba Department of Computer Science University of New York Tirana Outline Instruction-Level

### Pipelining An Instruction Assembly Line

Processor Designs So Far Pipelining An Assembly Line EEC7 Computer Architecture FQ 5 We have Single Cycle Design with low CPI but high CC We have ulticycle Design with low CC but high CPI We want best

### COMP303 Computer Architecture Lecture 15 Calculating and Improving Cache Performance

COMP303 Computer Architecture Lecture 15 Calculating and Improving Cache Performance Cache Performance Measures Hit rate : fraction found in the cache So high that we usually talk about Miss rate = 1 -

### 2013 Advanced Computer Architecrture Mid-term Exam

2013 Advanced Computer Architecrture Mid-term Exam 1. Amdahl s law When making changes to optimize part of a processor, it is often the case that speeding up one type of instruction comes at the cost of

### 3 Pipelining. It is quite a three-pipe problem. The Adventures of Sherlock Holmes. Sir Arthur Conan Doyle

3 Pipelining It is quite a three-pipe problem. Sir Arthur Conan Doyle The Adventures of Sherlock Holmes 3.1 What Is Pipelining? 125 3.2 The Basic Pipeline for DLX 132 3.3 The Major Hurdle of Pipelining

### Branch Prediction Techniques

Course on: Advanced Computer Architectures Branch Prediction Techniques Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it Outline Dealing with Branches in the Processor Pipeline

Computer Architecture A Quantitative Approach, Fifth Edition Chapter 3 Instruction-Level Parallelism and Its Exploitation 1 Introduction Pipelining become universal technique in 1985 Overlaps execution

### Instruction Level Parallelism

Instruction Level Parallelism Pipelining achieves Instruction Level Parallelism (ILP) Multiple instructions in parallel But, problems with pipeline hazards CPI = Ideal CPI + stalls/instruction Stalls =

### Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of