Processor Basic steps to process an instruction

Similar documents
Lecture: Pipelining Extensions. Topics: control hazards, multi-cycle instructions, pipelining equations

EE282 Computer Architecture and Organization Midterm Exam February 13, (Total Time = 120 minutes, Total Points = 100)

PROBLEMS #20,R0,R1 #$3A,R2,R4

WAR: Write After Read

Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

VLIW Processors. VLIW Processors

Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

Review: MIPS Addressing Modes/Instruction Formats

Computer organization

Computer Organization and Components


Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

CS352H: Computer Systems Architecture

Execution Cycle. Pipelining. IF and ID Stages. Simple MIPS Instruction Formats

Central Processing Unit (CPU)

Using Graphics and Animation to Visualize Instruction Pipelining and its Hazards

PROBLEMS. which was discussed in Section

INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER

Solutions. Solution The values of the signals are as follows:

Pipeline Hazards. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by Arvind and Krste Asanovic

Addressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)

Chapter 9 Computer Design Basics!

Software Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x

Data Dependences. A data dependence occurs whenever one instruction needs a value produced by another.

CPU Performance Equation

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Instruction Set Architecture

CPU Organization and Assembly Language

LSN 2 Computer Processors

l C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language

PART B QUESTIONS AND ANSWERS UNIT I

In the Beginning The first ISA appears on the IBM System 360 In the good old days

Reduced Instruction Set Computer (RISC)

Design of Pipelined MIPS Processor. Sept. 24 & 26, 1997

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Computer Architecture TDTS10

Preface. Any questions from last time? A bit more motivation, information about me. A bit more about this class. Later: Will review 1st 22 slides

Course on Advanced Computer Architectures

StrongARM** SA-110 Microprocessor Instruction Timing

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

COMPUTER ORGANIZATION ARCHITECTURES FOR EMBEDDED COMPUTING

Chapter 01: Introduction. Lesson 02 Evolution of Computers Part 2 First generation Computers

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

CPU Organisation and Operation

Instruction Set Architecture (ISA) Design. Classification Categories

COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December :59

BASIC COMPUTER ORGANIZATION AND DESIGN

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

(Refer Slide Time: 00:01:16 min)

Let s put together a Manual Processor

Week 1 out-of-class notes, discussions and sample problems

A s we saw in Chapter 4, a CPU contains three main sections: the register section,

EECS 427 RISC PROCESSOR

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

Instruction Set Architecture (ISA)

Introduction to Cloud Computing

Module: Software Instruction Scheduling Part I

CS521 CSE IITG 11/23/2012

PROBLEMS (Cap. 4 - Istruzioni macchina)

CHAPTER 7: The CPU and Memory

Pipelining Review and Its Limitations

The Microarchitecture of Superscalar Processors

Chapter 4 Lecture 5 The Microarchitecture Level Integer JAVA Virtual Machine

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

MICROPROCESSOR AND MICROCOMPUTER BASICS


COMPUTERS ORGANIZATION 2ND YEAR COMPUTE SCIENCE MANAGEMENT ENGINEERING JOSÉ GARCÍA RODRÍGUEZ JOSÉ ANTONIO SERRA PÉREZ

A SystemC Transaction Level Model for the MIPS R3000 Processor

Systems I: Computer Organization and Architecture

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Microprocessor & Assembly Language

An Introduction to the ARM 7 Architecture

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

TIMING DIAGRAM O 8085

Operating Systems. Virtual Memory

Microprocessor/Microcontroller. Introduction

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

8 9 +

Chapter 07: Instruction Level Parallelism VLIW, Vector, Array and Multithreaded Processors. Lesson 05: Array Processors

CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of

Instruction Set Design

OAMulator. Online One Address Machine emulator and OAMPL compiler.

The ARM Architecture. With a focus on v7a and Cortex-A8

Z80 Instruction Set. Z80 Assembly Language

A Lab Course on Computer Architecture

Levels of Programming Languages. Gerald Penn CSC 324

GPU Architecture. An OpenCL Programmer s Introduction. Lee Howes November 3, 2010

Introducción. Diseño de sistemas digitales.1

CSE 141L Computer Architecture Lab Fall Lecture 2

Giving credit where credit is due

Microprocessor and Microcontroller Architecture

Z80 Microprocessors Z80 CPU. User Manual UM Copyright 2014 Zilog, Inc. All rights reserved.

Administrative Issues

Notes on Assembly Language

CSE 141 Introduction to Computer Architecture Summer Session I, Lecture 1 Introduction. Pramod V. Argade June 27, 2005

The WIMP51: A Simple Processor and Visualization Tool to Introduce Undergraduates to Computer Organization

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Transcription:

Processor Basic steps to process an instruction Execute Write Back Instruction Fetch ory ccess Instruction Decode Operand Fetch EE524/CptS561 Jose G. Delgado-Frias 1 path +4 N Inst.. IR B L U. imm Instruction Fetch Inst. Dec. Op. Fetch Execute ory ccess Write Back IR [ N + 4 [IR 6..10 B [IR 11..15 Imm ((IR 16 ) 16 ## IR 11..15 EE524/CptS561 Jose G. Delgado-Frias 2 1

path (arith/logic inst) +4 N Inst.. IR B L U. imm Instruction Fetch Inst. Dec. Op. Fetch Execute ory ccess Write Back IR [ N + 4 [IR 6..10 B [IR 11..15 Imm ((IR 16 ) 16 ## IR 11..15 EE524/CptS561 Jose G. Delgado-Frias 3 path (load Inst) +4 N Inst.. IR B L U. imm Instruction Fetch Inst. Dec. Op. Fetch Execute ory ccess Write Back IR [ N + 4 [IR 6..10 B [IR 11..15 Imm ((IR 16 ) 16 ## IR 11..15 EE524/CptS561 Jose G. Delgado-Frias 4 2

path (store Inst) +4 N Inst.. IR B L U. imm Instruction Fetch Inst. Dec. Op. Fetch Execute ory ccess Write Back IR [ N + 4 [IR 6..10 B [IR 11..15 Imm ((IR 16 ) 16 ## IR 11..15 EE524/CptS561 Jose G. Delgado-Frias 5 path (branch Inst) +4 N Inst.. IR B L U. imm Instruction Fetch Inst. Dec. Op. Fetch Execute ory ccess Write Back IR [ N + 4 [IR 6..10 B [IR 11..15 Imm ((IR 16 ) 16 ## IR 11..15 EE524/CptS561 Jose G. Delgado-Frias 6 3

path w/ pipeline +4 Inst.. L U. EE524/CptS561 Jose G. Delgado-Frias 7 Instruction Fetch () stage 0 /ID 4 dd +4 m u x +4 +4 Instruction memory [ [ IR DD R7,R2,R5 clock EE524/CptS561 Jose G. Delgado-Frias 8 4

Instruction Decode (ID) stage 1 /ID ID/ IR IR 6..10 IR 11..15 rs1 rs2 isters [R2 [R5 IR 16..31 R7 Sign extend EE524/CptS561 Jose G. Delgado-Frias 9 Execution () stage 2 ID/ / [R2 [R5 SUM R7 EE524/CptS561 Jose G. Delgado-Frias 10 5

ory () stage 3 / / SUM ory R7 SUM EE524/CptS561 Jose G. Delgado-Frias 11 Write Back () stage 4 / SUM R7 SUM EE524/CptS561 Jose G. Delgado-Frias 12 6

Instruction Decode (ID) stage 4 /ID ID/ IR IR 6..10 IR 11..15 R7 SUM rs1 rs2 isters IR 16..31 Sign extend EE524/CptS561 Jose G. Delgado-Frias 13 Instruction Fetch () stage 100 /ID 4 dd +4 m u x +4 +4 Instruction memory [ [ IR BNZ R5,-32 clock EE524/CptS561 Jose G. Delgado-Frias 14 7

Instruction Decode (ID) stage 101 /ID +4 ID/ IR IR 6..10 IR 11..15 rs1 rs2 isters [R5 [RX IR 16..31 RY -32-32 Sign extend EE524/CptS561 Jose G. Delgado-Frias 15 Execution () stage 102 ID/ T / +4 [R5 [RX -28-32 RY EE524/CptS561 Jose G. Delgado-Frias 16 8

ory () stage 103 T / / -28 ory RY EE524/CptS561 Jose G. Delgado-Frias 17 Instruction Fetch () stage -28 /ID T 103 4 dd +16 m u x +4-28 Instruction memory [ IR EE524/CptS561 Jose G. Delgado-Frias 18 9

path w/ pipeline +4 Inst.. L U. EE524/CptS561 Jose G. Delgado-Frias 19 Pipeline 1 INSTRUCTIONS 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 CLOCK CYCLE EE524/CptS561 Jose G. Delgado-Frias 20 10

Pipeline Clock cycle 1 2 3 4 5 6 7 8 EE524/CptS561 Jose G. Delgado-Frias 21 Pipeline Hazards Structural Hazards two or more instructions use same hardware at the same time. Hazards dependencies Result from inst. j is needed by inst. k Control Hazards Branch changes flow, what happen with the following instruction(s) EE524/CptS561 Jose G. Delgado-Frias 22 11

Resources EE524/CptS561 Jose G. Delgado-Frias 23 Hazards R1 R2+R3 R5 R1+R3 R8 R1-R6 EE524/CptS561 Jose G. Delgado-Frias 24 12

Forwarding R1 R2+R3 R5 R1+R3 R8 R1-R6 EE524/CptS561 Jose G. Delgado-Frias 25 path w/ pipeline +4 Inst.. L U. Forwarding unit EE524/CptS561 Jose G. Delgado-Frias 26 13

Example DD XOR SUB R1,R2,R3 R7,R8,R1 R4,R3,R1 XOR SUB DD R7,R8,R1 R4,R3,R1 R1,R2,R3 SUB DD R4,R3,R1 R1,R2,R3 DD R1,R2,R3 +4 Inst.. L U. Forwarding unit EE524/CptS561 Jose G. Delgado-Frias 27 Example XOR R7,R8,R1 SUB R4,R3,R1 DD R1.. +4 Inst.. L U. Forwarding unit EE524/CptS561 Jose G. Delgado-Frias 28 14

Hazard Classification RW (Read fter Write) w/ forward only load presents a problem WW WR RR j: R1 k: RY R1 j: R1 k: R1 j: R1 k: R1 j: R1 k: R1 EE524/CptS561 Jose G. Delgado-Frias 29 Forwarding (load) R1 LD[ R5 R1+R3 R8 R1-R6 EE524/CptS561 Jose G. Delgado-Frias 30 15

hazard (load) R1 LW R1,0(R1) ID SUB R4,R1,R5 ID stall ND R6,R1,R7 OR R8,R1,R9 stall stall ID ID EE524/CptS561 Jose G. Delgado-Frias 31 Branch LBEL_: BR R1, LBEL_ DD R2,R3,R7 ND R5,R7,R11 : : LD R4,R2,005 EE524/CptS561 Jose G. Delgado-Frias 32 16

Branch BR R1, LBEL_ DD R2,R3,R7 ND R5,R7,R11 LD R4,R2,005 EE524/CptS561 Jose G. Delgado-Frias 33 path w/ pipeline +4 Inst.. L U. Forwarding unit EE524/CptS561 Jose G. Delgado-Frias 34 17

What to do w/ branch Reduce the number of cycles to decide on a branch. Delayed branch (Software Solutions) NO-OP move instructions from before from target from fall through EE524/CptS561 Jose G. Delgado-Frias 35 Branch BR R1, LBEL_ DD R2,R3,R7 LD R4,R2,005 EE524/CptS561 Jose G. Delgado-Frias 36 18

NO-OP Branch NO-OP EE524/CptS561 Jose G. Delgado-Frias 37 From Before Branch EE524/CptS561 Jose G. Delgado-Frias 38 19

From Target Branch EE524/CptS561 Jose G. Delgado-Frias 39 From Fall Through Branch EE524/CptS561 Jose G. Delgado-Frias 40 20

Example loop: LW R1,0(R2) R1 [R2+0 DDI SW DDI SUB R1,R1,#1 0(R2),R1 R2,R2,#4 R4,R3,R2 R1 R1+1 [R2+0 R1 R2 R2+4 R4 R3-R2 BNEZ R4,loop R4 = 0 GOTO loop Initial value: R3 = R2+396 EE524/CptS561 Jose G. Delgado-Frias 41 Dynamic Branch Prediction Hardware approach Based on past history 2-bit counter EE524/CptS561 Jose G. Delgado-Frias 42 21

2-bit prediction scheme Taken Predict taken Not Taken Taken Predict taken Taken Not Taken Predict not taken Not Taken Taken Predict not taken Not Taken EE524/CptS561 Jose G. Delgado-Frias 43 Branch Target Buffer (BTB) Current Predicted Branch Prediction Taken / Untaken EE524/CptS561 Jose G. Delgado-Frias 44 22