Outline. EEL-4713 Computer Architecture Designing a Single Cycle Datapath

Similar documents
Review: MIPS Addressing Modes/Instruction Formats

Computer organization

Reduced Instruction Set Computer (RISC)

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

Computer Organization and Components

Solutions. Solution The values of the signals are as follows:

Instruction Set Architecture

Pipeline Hazards. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by Arvind and Krste Asanovic

Introducción. Diseño de sistemas digitales.1

Design of Pipelined MIPS Processor. Sept. 24 & 26, 1997

Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

Instruction Set Design

Execution Cycle. Pipelining. IF and ID Stages. Simple MIPS Instruction Formats

COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December :59

Design of Digital Circuits (SS16)

l C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language

Microprocessor & Assembly Language

(Refer Slide Time: 00:01:16 min)

Addressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)

Chapter 9 Computer Design Basics!

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

In the Beginning The first ISA appears on the IBM System 360 In the good old days

MICROPROCESSOR AND MICROCOMPUTER BASICS

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

CS352H: Computer Systems Architecture

The 104 Duke_ACC Machine

CSE 141 Introduction to Computer Architecture Summer Session I, Lecture 1 Introduction. Pramod V. Argade June 27, 2005

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

A s we saw in Chapter 4, a CPU contains three main sections: the register section,

A SystemC Transaction Level Model for the MIPS R3000 Processor

CS 61C: Great Ideas in Computer Architecture Finite State Machines. Machine Interpreta4on

EC 362 Problem Set #2

Preface. Any questions from last time? A bit more motivation, information about me. A bit more about this class. Later: Will review 1st 22 slides

Sequential Logic. (Materials taken from: Principles of Computer Hardware by Alan Clements )

EE282 Computer Architecture and Organization Midterm Exam February 13, (Total Time = 120 minutes, Total Points = 100)

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions

CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of

CPU Organization and Assembly Language

LSN 2 Computer Processors

İSTANBUL AYDIN UNIVERSITY

MACHINE ARCHITECTURE & LANGUAGE

Let s put together a Manual Processor

Chapter 2 Logic Gates and Introduction to Computer Architecture

EECS 427 RISC PROCESSOR

16-bit ALU, Register File and Memory Write Interface

PART B QUESTIONS AND ANSWERS UNIT I

CPU Organisation and Operation

A New Paradigm for Synchronous State Machine Design in Verilog

MICROPROCESSOR. Exclusive for IACE Students iacehyd.blogspot.in Ph: /422 Page 1

Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

WAR: Write After Read

Systems I: Computer Organization and Architecture

Central Processing Unit (CPU)

Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

To design digital counter circuits using JK-Flip-Flop. To implement counter using 74LS193 IC.

MIPS Assembler and Simulator

WEEK 8.1 Registers and Counters. ECE124 Digital Circuits and Systems Page 1

CHAPTER 4 MARIE: An Introduction to a Simple Computer

Assembly Language Programming

Computer Organization and Architecture

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Software Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x

Translating C code to MIPS

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies

Chapter 5 Instructor's Manual

Giving credit where credit is due

361 Computer Architecture Lecture 14: Cache Memory

PROBLEMS #20,R0,R1 #$3A,R2,R4

ARM. Architecture and Assembly. Modest Goal: Turn on an LED

Winter 2002 MID-SESSION TEST Friday, March 1 6:30 to 8:00pm

Counters and Decoders

Typy danych. Data types: Literals:

TIMING DIAGRAM O 8085

Instruction Set Architecture (ISA)

Memory Elements. Combinational logic cannot remember

Reusable Application-Dependent Machine Descriptions

Intel 8086 architecture

Digital Systems Based on Principles and Applications of Electrical Engineering/Rizzoni (McGraw Hill

Techniques for Accurate Performance Evaluation in Architecture Exploration

EE361: Digital Computer Organization Course Syllabus

Summary of the MARIE Assembly Language

Chapter 01: Introduction. Lesson 02 Evolution of Computers Part 2 First generation Computers

CHAPTER 7: The CPU and Memory

Today. Binary addition Representing negative numbers. Andrew H. Fagg: Embedded Real- Time Systems: Binary Arithmetic

Lecture-3 MEMORY: Development of Memory:

1 Classical Universal Computer 3

ASYNCHRONOUS COUNTERS

Software based Finite State Machine (FSM) with general purpose processors

Computer Organization. and Instruction Execution. August 22

Central Processing Unit

CSE 141L Computer Architecture Lab Fall Lecture 2

Systems I: Computer Organization and Architecture

Notes on Assembly Language

Generating MIF files

Transcription:

Outline EEL-473 Computer Architecture Designing a Single Cycle path Introduction The steps of designing a processor path and timing for register-register operations path for logical operations with immediates path for load and store operations path for branch and jump operations Big Picture The five classic components of a computer Processor Control path Input Output Today s topic: design of a single cycle processor The Big Picture: The Performance Perspective Performance of a machine is determined by: CPI count Clock cycle time Clock cycles per instruction Inst Count - CPI will discuss later Processor design determines: Clock cycle time Clock cycles per instruction Cycle Time Single cycle processor: - Advantage: One clock cycle per instruction - Disadvantage: long cycle time

How to Design a Processor: step-by-step Analyze instruction set => datapath requirements The meaning of each instruction is given by the register transfers The datapath must include storage element for ISA registers - And possibly more The datapath must support each register transfer 2 Select set of datapath components and establish clocking methodology 3 Assemble datapath meeting the requirements 4 Analyze implementation of each instruction to determine setting of control points that effects the register transfer Assemble the control logic MIPS ISA: instruction formats All MIPS instructions are bits long There are 3 instruction formats: 6 R-type op rs rt rd shamt funct bits bits 6 bits I-type bits J-type op target address 6 bits bits The different fields are: op: operation of the instruction rs, rt, rd: the source(s) and destination register specifiers shamt: shift amount funct: selects the variant of the operation in the op field address / immediate: address offset or immediate value target address: target address of the jump instruction *Step a: The MIPS lite subset for today Logical Register Transfers ADD and SUB addu rd, rs, rt subu rd, rs, rt OR Immediate: ori rt, rs, imm LOAD and STORE Word lw rt, rs, imm sw rt, rs, imm BRANCH: beq rs, rt, imm op rs rt rd shamt funct bits bits 6 bits bits bits bits 6 RTL gives the meaning of the instructions All start by fetching the instruction op rs rt rd shamt funct = MEM[ ] op rs rt Imm = MEM[ ] inst Register Transfers ADDU R[rd] < R[rs] + R[rt]; < + 4 SUBU R[rd] < R[rs] R[rt]; < + 4 ORi R[rt] < R[rs] + zero_ext(imm); < + 4 LOAD R[rt] < MEM[ R[rs] + sign_ext(imm)]; < + 4 STORE MEM[ R[rs] + sign_ext(imm) ] < R[rt]; < + 4 BEQ if ( R[rs] == R[rt] ) then < + sign_ext(imm)] else < + 4

Step : Requirements of the Set instruction & data ( x ) read RS read RT Write RT or RD Step 2: Components of the path Combinational Elements Storage Elements Clocking methodology Extender Add and Sub register or extended immediate Add 4 or extended immediate to Combinational Logic Elements (Basic Building Blocks) MUX A B Select A B OP A B MUX CarryIn Sum Carry Y Result Storage Element: Register (Basic Building Block) Register Similar to the D Flip Flop except - N-bit input and output - Write Enable input Write Enable: - negated () (not asserted): Out will not change - asserted (): Out will become In Write Enable In N Out N

Storage Element: Register File Register File consists of registers: Two -bit output busses: Write Enable and One -bit input bus: Register is selected by: Ra (number) selects the register to put on (data) Rb (number) selects the register to put on (data) Rw (number) selects the register to be written via (data) when Write Enable is Clock input (CLK) The CLK input is a factor ONLY during write operation During read operation, behaves as a combinational logic block (ie, reads are not clocked): -bit - RA or RB valid => or valid after access time Storage Element: Idealized (idealized) One input bus: In One output bus: Out word is selected by: selects the word to put on Out Write Enable = : address selects the memory word to be written via the In bus Clock input (CLK) The CLK input is a factor ONLY during write operation During read operation, behaves as a combinational logic block (ie, reads are not clocked): Write Enable In Out - valid => Out valid after access time Clocking Methodology Step 3 Setup Hold Don t Care Setup Hold Register Transfer Requirements > path Assembly Fetch Read Operands and Execute Operation All storage elements are clocked by the same clock edge Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew

3a: Overview of the Fetch Unit The common RTL operations Fetch the : mem[] Update the program counter: - Sequential Code: <- + 4 - Branch and Jump: <- something else RTL: The ADD 6 op rs rt rd shamt funct bits bits 6 bits add rd, rs, rt op rs rt rd shamt funct <- mem[] Next Logic R[rd] <- R[rs] + R[rt] The actual operation <- + 4 Calculate the next instruction s address Word RTL: The Subtract 6 op rs rt rd shamt funct bits bits 6 bits sub rd, rs, rt op rs rt rd shamt funct <- mem[] R[rd] <- R[rs] - R[rt] The actual operation <- + 4 Calculate the next instruction s address 3b: Add & Subtract R[rd] <- R[rs] op R[rt] Example: addu rd, rs, rt Ra, Rb, and Rw come from instruction s rs, rt, and rd fields ctr and RegWr: control logic after decoding the instruction 6 op rs rt rd shamt funct bits bits 6 bits ctrl RegWr -bit ctr Result

,,, Op, Func ctr Register-Register Timing -to-q New Value Old Value RegWr Access Time New Value -bit Delay through Control Logic New Value RegWr Old Value New Value, B Old Value Old Value Old Value Old Value Register File Access Time New Value ctr Delay New Value Result Register Write Occurs Here RTL: The OR Immediate ori rt, rs, imm op rs rt Imm <- mem[] R[rt] <- R[rs] OR ZeroExt(imm) bits The OR operation <- + 4 Calculate the next instruction s address *3c: Logical Operations with Immediate R[rt] <- R[rs] op ZeroExt[imm] ] RegDst RegWr imm -bit ZeroExt rd? bits Src ctr Result RTL: The Load lw rt, rs, imm op rs rt Imm <- mem[] Addr <- R[rs] + SignExt(imm) R[rt] <- Mem[Addr] Calculate the memory address Load the data into the register <- + 4 Calculate the next instruction s address bits bits bits immediate bits immediate bits

3d: Load Operations R[rt] <- Mem[R[rs] + SignExt[imm]] Example: lw rt, rs, imm 3e: Store Operations Mem[ R[rs] + SignExt[imm] <- R[rt] ] Example: sw rt, rs, imm rd bits bits RegDst RegWr imm -bit Extender Src ctr In MemWr WrEn Adr W_Src RegDst RegWr imm -bit Extender ctr In MemWr WrEn Adr W_Src ExtOp ExtOp Src 3f: The Branch beq rs, rt, imm op rs rt Imm <- mem[] Equal <- R[rs] == R[rt] Calculate the branch condition if (COND eq ) Calculate the next instruction s address - <- + 4 + ( SignExt(imm) x 4 ) else - <- + 4 bits path for Branch Operations beq rs, rt, imm path generates condition (equal) bits Branch Next Logic Word Inst RegWr -bit Equal Equal?

Putting it All Together: A Single Cycle path <:> RegDst RegWr Branch Next Logic imm -bit <:2> Extender <:2> Equal <:> <:> Imm ctr = <:> MemWr WrEn Adr In MemtoReg An Abstract View of the Critical Path Register file and ideal memory: Next The CLK input is a factor ONLY during write operation During read operation, behave as combinational logic: - valid => Output valid after access time Ideal -bit Imm A B Critical Path (Load Operation) = s -to-q + s Access Time + Register File s Access Time + to Perform a -bit Add + Access Time + Setup Time for Register File Write + Clock Skew In Ideal ExtOp Src Binary arithmetic for the next address In theory, the is a -bit byte address into the instruction memory: Sequential operation: <:> = <:> + 4 Branch operation: <:> = <:> + 4 + SignExt[Imm] * 4 Next Logic: Expensive and Fast Solution Using a 3-bit : Sequential operation: <:2> = <:2> + Branch operation: <:2> = <:2> + + SignExt[Imm] In either case: = <:2> concat The magic number 4 always comes up because: The -bit is a byte address And all our instructions are 4 bytes ( bits) long In other words: The 2 LSBs of the -bit are always zeros There is no reason to have hardware to keep the 2 LSBs In practice, we can simplify the hardware by using a 3-bit <:2>: Sequential operation: <:2> = <:2> + Branch operation: <:2> = <:2> + + SignExt[Imm] In either case: = <:2> concat 3 imm <:> SignExt 3 3 3 3 3 Branch Zero Addr<:2> Addr<:> <:>

Next Logic: Cheap and Slow Solution Why is this slow? Cannot start the address add until Zero (output of ) is valid Does it matter that this is slow in the overall scheme of things? Probably not here Critical path is the load operation imm <:> 3 SignExt 3 3 3 Carry In 3 Addr<:2> Addr<:> <:> RTL: The Jump op target address 6 bits bits j target mem[] <:2> <- <:28> concat target<2:> Calculate the next instruction s address Branch Zero Fetch Unit j target <:2> <- <:28> concat target<2:> Putting it All Together: A Single Cycle path We have everything except control signals (underline) 3 imm <:> <:28> Target <2:> SignExt 3 3 4 3 3 3 3 Branch Zero Jump Addr<:2> Addr<:> <:> RegDst RegWr imm -bit Extender Branch ExtOp Jump Src Fetch Unit ctr In Zero <:> <:2> WrEn <:2> MemWr Adr <:> <:> Imm MemtoReg

An Abstract View of the Implementation Next Ideal Logical vs Physical Structure -bit A B Control Control Signals path Conditions In Ideal Out Summary steps to design a processor Analyze instruction set => datapath requirements 2 Select set of datapath components & establish clock methodology 3 Assemble datapath meeting the requirements 4 Analyze implementation of each instruction to determine setting of control points that effects the register transfer Assemble the control logic MIPS makes it easier s same size Source registers always in same place Immediates same size, location Operations always on registers/immediates Single cycle datapath => CPI=, CCT => long Next time: implementing control (Steps 4 and )