Review: Addressing Modes Addressing mode Example Meaning Register Add R4,R3 R4 R4+R3 Immediate Add R4,#3 R4 R4+3 Displacement Add R4,1(R1) R4 R4+Mem[1+R1] Register indirect Add R4,(R1) R4 R4+Mem[R1] Indexed / Base Add R3,(R1+R2) R3 R3+Mem[R1+R2] Direct or absolute Add R1,(11) R1 R1+Mem[11] indirect Add R1,@(R3) R1 R1+Mem[Mem[R3]] Auto-increment Add R1,(R2)+ R1 R1+Mem[R2]; R2 R2+d Auto-decrement Add R1, (R2) R2 R2 d; R1 R1+Mem[R2] Scaled Add R1,1(R2)[R3] R1 R1+Mem[1+R2+R3*d] ECE 2 L3 InstructionSet4 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Review: MIPS Addressing Modes/Instruction Formats All instructions are -bit wide Register (direct) op rs rt rd register Immediate op rs rt immed Base+index op rs rt immed register + PC-relative op rs rt immed PC + Register Indirect? ECE 2 L3 InstructionSet6 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
MIPS: Software conventions for registers R $zero constant R1 $at reserved for assembler R2 $v value registers & R3 $v1 function results R4 $a arguments R5 $a1 R6 $a2 R7 $a3 R8 $t temporary: caller saves (callee can clobber) R15 $t7 R $s callee saves R23 $s7 R24 $t8 temporary (cont d) R25 $t9 R26 $k reserved for OS kernel R27 $k1 R28 $gp pointer to global area R29 $sp Stack pointer R3 $fp Frame pointer R $ra return Address ECE 2 L3 InstructionSet19 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers MIPS data transfer instructions Instruction Comment sw 5($r4), $r3 Store word sh 52($r2), $r3 Store half sb 41($r3), $r2 Store byte lw $r1, 3($r2) Load word lh $r1, 4($r3) Load halfword lhu $r1, 4($r3) Load halfword unsigned lb $r1, 4($r3) Load byte lbu $r1, 4($r3) Load byte unsigned lui $r1, 4 Load Upper Immediate ( bits shifted left by ) LUI $r5 $r5 ECE 2 L3 InstructionSet2 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Loading large numbers Pseudo-instruction li $t, big: load -bit constant lui $t, upper ori $t, $t, lower # $t[:] upper # $t ($t Or [extlower]) Or upper 15 lower $to: upper lower -bit constant ECE 2 L3 InstructionSet21 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Processor Design Basics: Outline Design a processor: step-by-step Requirements of the Instruction Set Components and clocking Assembling an adequate path Controlling the path ECE 2 L3 InstructionSet Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
The Big Picture: The Performance Perspective Performance of a machine is determined by: Instruction count Clock cycle time Clock cycles per instruction Processor design (datapath and control) will determine: Clock cycle time CPI Clock cycles per instruction Today: Inst Count Single cycle processor: - Advantage: One clock cycle per instruction - Disadvantage: long cycle time Cycle Time ECE 2 L3 InstructionSet Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers How to Design a Processor: step-by-step 1 Analyze instruction set => datapath requirements the meaning of each instruction is given by the register transfers datapath must include storage element for ISA registers - possibly more datapath must support each register transfer 2 Select set of datapath components and establish clocking methodology 3 Assemble datapath meeting the requirements 4 Analyze implementation of each instruction to determine setting of control points that effects the register transfer 5 Assemble the control logic ECE 2 L3 InstructionSet33 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
The MIPS Instruction Formats All MIPS instructions are bits long The three instruction formats: 26 21 11 6 R-type op 6 bits rs 5 bits rt 5 bits rd 5 bits shamt 5 bits funct 6 bits I-type 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits J-type 26 op target address 6 bits 26 bits The different fields are: op: operation of the instruction rs, rt, rd: the source and destination register specifiers shamt: shift amount funct: selects the variant of the operation in the op field address / immediate: address offset or immediate value target address: target address of the jump instruction ECE 2 L3 InstructionSet34 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Step 1a: The MIPS-light Subset ADD and SUB addu rd, rs, rt subu rd, rs, rt OR Immediate: ori rt, rs, imm LOAD and STORE Word lw rt, rs, imm sw rt, rs, imm BRANCH: beq rs, rt, imm 26 21 11 6 op rs rt rd shamt funct 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits ECE 2 L3 InstructionSet35 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Logical Register Transfers RTL gives the meaning of the instructions All start by fetching the instruction op rs rt rd shamt funct = MEM[ PC ] op rs rt Imm = MEM[ PC ] inst Register Transfers ADDU R[rd] < R[rs] + R[rt]; PC < PC + 4 SUBU R[rd] < R[rs] R[rt]; PC < PC + 4 ORi R[rt] < R[rs] + zero_ext(imm); PC < PC + 4 LOAD R[rt] < MEM[ R[rs] + sign_ext(imm)]; PC < PC + 4 STORE MEM[ R[rs] + sign_ext(imm) ] < R[rt]; PC < PC + 4 BEQ if ( R[rs] == R[rt] ) then PC < PC + sign_ext(imm)] else PC < PC + 4 ECE 2 L3 InstructionSet36 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Step 1: Requirements of the Instruction Set instruction & data Registers ( x ) read RS read RT Write RT or RD PC Extender Add and Sub register or extended immediate Add 4 or extended immediate to PC ECE 2 L3 InstructionSet37 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Step 2: Components of the path Combinational Elements Storage Elements Clocking methodology ECE 2 L3 InstructionSet38 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Combinational Logic Elements (Basic Building Blocks) Adder A B CarryIn Adder Sum Carry MUX Selec t A Y B MUX A B O P Result ECE 2 L3 InstructionSet39 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Storage Element: Register (Basic Building Block) Register Write Enable Similar to the D Flip Flop except In Out - N-bit input and output N N - Write Enable input Write Enable: - negated (): Out will not change - asserted (1): Out will become In ECE 2 L3 InstructionSet4 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Storage Element: Register File Register File consists of registers: Two -bit output buses: and busb One -bit input bus: Register is selected by: RA (number) selects the register to put on (data) RB (number) selects the register to put on busb (data) RW (number) selects the register to be written via (data) when Write Enable is 1 Clock input (CLK) Write Enable RWRARB 5 5 5 -bit Registers The CLK input is a factor ONLY during write operation During read operation, behaves as a combinational logic block: - RA or RB valid => or busb valid after access time busb ECE 2 L3 InstructionSet41 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Storage Element: Idealized (idealized) One input bus: In One output bus: Out word is selected by: Address selects the word to put on Out Write Enable = 1: address selects the memory word to be written via the In bus Clock input (CLK) Write Enable Address In Out The CLK input is a factor ONLY during write operation During read operation, behaves as a combinational logic block: - Address valid => Out valid after access time ECE 2 L3 InstructionSet42 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Clocking Methodology Setup Hold Don t Care Setup Hold All storage elements are clocked by the same clock edge Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew (CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time ECE 2 L3 InstructionSet43 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Step 3 Register Transfer Requirements > path Assembly Instruction Fetch Read Operands and Execute Operation ECE 2 L3 InstructionSet44 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers 3a: Overview of the Instruction Fetch Unit The common RTL operations Fetch the Instruction: mem[pc] Update the program counter: - Sequential Code: PC <- PC + 4 - Branch and Jump: PC <- something else PC Next Address Logic Address Instruction Instruction Word ECE 2 L3 InstructionSet45 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
3b: Add & Subtract R[rd] <- R[rs] op R[rt] Example: addu rd, rs, rt Ra, Rb, and Rw come from instruction s rs, rt, and rd fields ctr and RegWr: control logic after decoding the instruction 26 21 11 6 op rs rt rd shamt funct 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits RegWr Rd Rs 5 5 5 Rt -bit Registers busb ctr Result ECE 2 L3 InstructionSet46 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Register-Register Timing PC Rs, Rt, Rd, Op, Func ctr RegWr, B Old Value -to-q New Value Old Value Old Value Old Value Old Value Old Value Instruction Access Time New Value Delay through Control Logic New Value New Value Register File Access Time New Value Delay New Value RegWr Rd Rs Rt 5 5 5 -bit Registers busb ctr Result Register Write Occurs Here ECE 2 L3 InstructionSet47 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
3c: Logical Operations with Immediate R[rt] <- R[rs] op ZeroExt[imm] ] bits Rd Rt RegDst Rs RegWr ctr 5 5 5 imm -bit Registers 26 op rs rt immediate busb ZeroExt 21 6 bits 5 bits 5 bits rd? bits 15 Src Result 11 immediate bits ECE 2 L3 InstructionSet48 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers 3d: Load Operations R[rt] <- Mem[R[rs] + SignExt[imm]] Example: lw rt, rs, imm 26 21 11 op rs rt immediate 6 bits 5 bits 5 bits rd bits Rd Rt RegDst Rs RegWr 5 5 5 imm -bit Registers busb Extender Src ctr In MemWr WrEn Adr W_Src ExtOp ECE 2 L3 InstructionSet49 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
3e: Store Operations Mem[ R[rs] + SignExt[imm] <- R[rt] ] Example: sw rt, rs, imm 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits RegDst Rd Rt RegWr 5 5 imm Rs Rt 5 -bit Registers busb Extender ctr In MemWr WrEn Adr W_Src ExtOp Src ECE 2 L3 InstructionSet5 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers 3f: The Branch Instruction 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits beq rs, rt, imm mem[pc] Fetch the instruction from memory Equal <- R[rs] == R[rt] Calculate the branch condition if (COND eq ) Calculate the next instruction s address - PC <- PC + 4 + ( SignExt(imm) x 4 ) else - PC <- PC + 4 ECE 2 L3 InstructionSet51 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
path for Branch Operations beq rs, rt, imm path generates condition (equal) 26 21 op rs rt immediate 6 bits 5 bits 5 bits bits Inst Address Cond 4 imm PC Ext npc_sel Adder Adder PC Rs Rt RegWr 5 5 5 -bit Registers busb Equal? ECE 2 L3 InstructionSet52 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Putting it All Together: A Single Cycle path Inst Adr <:15> <11:15> <:2> <21:25> Rs Rt Rd Imm Instruction<:> imm 4 PC Ext Adder Adder npc_sel PC RegDst Rd Rt Equal 1 RegWr Rs Rt 5 5 5 -bit Registers busb imm Extender 1 ctr = MemWr WrEn Adr In MemtoReg 1 ExtOp Src ECE 2 L3 InstructionSet53 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
An Abstract View of the Critical Path Register file and ideal memory: Next Address The CLK input is a factor ONLY during write operation During read operation, behave as combinational logic: - Address valid => Output valid after access time Ideal Instruction Instruction Address PC Rd 5 Instruction Rs 5 Rt 5 -bit Registers Imm A B Critical Path (Load Operation) = PC s -to-q + Instruction s Access Time + Register File s Access Time + to Perform a -bit Add + Access Time + Setup Time for Register File Write + Clock Skew Address In Ideal ECE 2 L3 InstructionSet54 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Step 4: Given path: RTL -> Control Instruction<:> Inst Adr Op <21:25> <21:25> Fun Rt <:15> <11:15> <:2> Rs Rd Imm Control npc_selregwr RegDst ExtOp Src ctr MemWr MemtoReg Equal DATA PATH ECE 2 L3 InstructionSet55 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Meaning of the Control Signals (see slide 28) Rs, Rt, Rd and Imed hardwired into datapath npc_sel: => PC < PC + 4; 1 => PC < PC + 4 + SignExt(Im) npc_sel Inst Adr 4 imm PC Ext Adder Adder PC ECE 2 L3 InstructionSet56 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Meaning of the Control Signals (see slide 28) ExtOp: zero, sign src: immed => regb; 1 => ctr: add, sub, or RegDst Rd Rt Equal 1 RegWr Rs Rt 5 5 5 -bit Registers busb imm Extender MemWr: write memory MemtoReg: 1 => Mem RegDst: => rt ; 1 => rd RegWr: write dest register ctr MemWr MemtoReg 1 = WrEn Adr In 1 ExtOp Src ECE 2 L3 InstructionSet57 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Control Signals inst Register Transfer ADD R[rd] < R[rs] + R[rt]; PC < PC + 4 src = RegB, ctr = add, RegDst = rd, RegWr, npc_sel = +4 SUB R[rd] < R[rs] R[rt]; PC < PC + 4 src =, Extop =, ctr =, RegDst =, RegWr(?), MemtoReg(?), MemWr(?), npc_sel = ORi R[rt] < R[rs] + zero_ext(imm); PC < PC + 4 src =, Extop =, ctr =, RegDst =, RegWr(?), MemtoReg(?), MemWr(?), npc_sel = LOAD R[rt] < MEM[ R[rs] + sign_ext(imm)]; PC < PC + 4 src =, Extop =, ctr =, RegDst =, RegWr(?), MemtoReg(?), MemWr(?), npc_sel = STORE MEM[ R[rs] + sign_ext(imm)] < R[rs]; PC < PC + 4 src =, Extop =, ctr =, RegDst =, RegWr(?), MemtoReg(?), MemWr(?), npc_sel = BEQ if ( R[rs] == R[rt] ) then PC < PC + sign_ext(imm)] else PC < PC + 4 src =, Extop =, ctr =, RegDst =, RegWr(?), MemtoReg(?), MemWr(?), npc_sel = ECE 2 L3 InstructionSet58 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Step 5: Logic for each control signal npc_sel <= if (OP == BEQ) then EQUAL else src <= if (OP == ) then regb else immed ctr <= if (OP == ) then funct elseif (OP == ORi) then OR elseif (OP == BEQ) then sub else add ExtOp <= MemWr <= MemtoReg<= RegWr: <= RegDst: <= ECE 2 L3 InstructionSet6 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
Example: Load Instruction Inst Adr <:15> <11:15> <:2> <21:25> Rs Rt Rd Imm Instruction<:> imm 4 PC Ext Adder Adder npc_sel +4 PC RegDst ct MemWr MemtoReg rt Rd Rt Equal radd 1 Rs Rt RegWr 5 5 5 = -bit Registers busb WrEn Adr 1 1 In imm sign ext ECE 2 L3 InstructionSet62 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Extender ExtOp Src An Abstract View of the Implementation Next Address Ideal Instruction Instruction Address PC Rd 5 Instruction Rs 5 Rt 5 -bit Registers Logical vs Physical Structure A B Control Control Signals path Conditions Address In Ideal Out ECE 2 L3 InstructionSet63 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers
A Real MIPS path (CNS T) ECE 2 L3 InstructionSet64 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers Summary 5 steps to design a processor 1 Analyze instruction set => datapath requirements 2 Select set of datapath components & establish clock methodology 3 Assemble datapath meeting the requirements 4 Analyze implementation of each instruction to determine setting of control points that effects the register transfer 5 Assemble the control logic MIPS makes it easier Instructions same size Source registers always in same place Immediates same size, location Operations always on registers/immediates Single cycle datapath => CPI=1, CCT => long Next time: implementing control ECE 2 L3 InstructionSet65 Adapted from Patterson 97 UCB Copyright 1998 Morgan Kaufmann Publishers