Instruction Set Architecture
Micro-architecture
Datapath & Control
CIT 595 Spring 2010

ISA = Programmer-visible components & operations
  - Memory organization
      - Address space: how many locations can be addressed?
      - Addressability: how many bits per location?
  - Register set
      - How many? What size?
  - Instruction set
      - Opcodes
      - Data types
      - Addressing modes
  - All information needed to write/generate a machine language program

Instruction
  - Fundamental unit of work
  - Constituents
      - Opcode: operation to be performed (e.g. ADD, LD)
      - Operands: data/locations to be used for the operation
          - Source: location that contains the data/instruction
          - Destination: location that will store the result of the computation
          - Immediate: data value carried in the instruction itself, not stored at a separate location

LC-3 Overview: Memory and Registers
  - Memory
      - Address space: 2^16 locations (16-bit addresses)
      - Addressability: 16 bits
  - Registers
      - Temporary storage (memory access takes longer)
      - Eight general-purpose registers: R0-R7 (each 16 bits wide)
      - Other registers: not directly addressable, but used by and affected by instructions
          - E.g. PC (program counter), condition codes (NZP)
  - Word size
      - Number of bits normally processed by the ALU in one instruction
      - Also the width of the registers
      - LC-3 word size is 16 bits
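
The address space and addressability figures above fix the total size of the LC-3 memory. A minimal sketch of that arithmetic follows; the function and variable names are illustrative only and do not come from the slides.

```python
# Sketch: size of the LC-3 memory from its address space and addressability.
ADDRESS_BITS = 16      # 16-bit addresses (from the slide)
ADDRESSABILITY = 16    # bits stored at each location (from the slide)


def memory_capacity_bits(address_bits, bits_per_location):
    """Total storage = number of locations x bits per location."""
    return (2 ** address_bits) * bits_per_location


locations = 2 ** ADDRESS_BITS
total_bits = memory_capacity_bits(ADDRESS_BITS, ADDRESSABILITY)

print(locations)        # 65536 locations
print(total_bits // 8)  # 131072 bytes, i.e. 128 KiB
```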
LC-3 ISA: Overview
  - Opcodes
      - 16 opcodes (bits [15:12] of the instruction: 2^4 = 16 possible values)
      - Types of instructions:
          - Operate instructions: e.g. ADD
          - Data movement instructions: e.g. LD, LEA, ST
          - Control instructions: e.g. BR, JMP, TRAP, JSR, RTI
      - Operate and data movement instructions (except stores) set/clear the condition codes based on the result
          - N = negative (< 0), Z = zero (= 0), P = positive (> 0)
  - Addressing modes
      - How is the location of an operand (the data to be acted upon) specified?
      - Non-memory addresses: register, immediate (literal)
      - Memory addresses: base+offset, PC-relative, indirect
  - Data types
      - 16-bit 2's complement integer

Example: ADD Instruction Format
  - LC-3 ADD: add the contents of R2 to the contents of R6, and store the result in R6.

Example: LDR Instruction Format
  - Add the value 6 to the contents of R3 to form a memory address. Load the contents of memory at that address and place the resulting data in R2.

Microarchitecture (Machine Internals)
  - Describes a large number of details that are hidden in the programming model
      - Constituent parts of the processor, and
      - How these interconnect and interoperate to implement the architectural specification
  - Computer = processing unit + memory system + I/O
  - Processing unit = control + datapath

  - Control = finite state machine
      - Inputs = machine instruction, datapath conditions
      - Outputs = register transfer control signals, ALU operation codes
      - Instruction interpretation = instruction fetch, decode, execute, write
  - Datapath = functional units + registers
      - All the logic used to process information
      - Functional units = ALU, multipliers, dividers, etc.
      - Registers = program counter (PC), instruction register (IR), storage registers
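
The two format examples above can be checked by pulling the fields out of the 16-bit instruction words. The sketch below handles only ADD and LDR. The helper names (`bits`, `sext`, `decode`, `condition_codes`) are made up for illustration, and the two hexadecimal words are hand-assembled encodings of the examples, assuming the standard LC-3 field layout.

```python
def bits(word, hi, lo):
    """Return bits [hi:lo] of a 16-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sext(value, width):
    """Sign-extend a width-bit two's complement field to a Python int."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def decode(word):
    """Decode ADD and LDR only; other opcodes are left out of this sketch."""
    opcode = bits(word, 15, 12)
    if opcode == 0b0001:  # ADD
        fields = {"op": "ADD", "dr": bits(word, 11, 9), "sr1": bits(word, 8, 6)}
        if bits(word, 5, 5):   # bit 5 set: second operand is imm5
            fields["imm5"] = sext(bits(word, 4, 0), 5)
        else:                  # bit 5 clear: second operand is a register
            fields["sr2"] = bits(word, 2, 0)
        return fields
    if opcode == 0b0110:  # LDR (base+offset addressing)
        return {"op": "LDR", "dr": bits(word, 11, 9),
                "base": bits(word, 8, 6), "offset6": sext(bits(word, 5, 0), 6)}
    raise ValueError("opcode not covered by this sketch")


def condition_codes(result):
    """Return 'N', 'Z' or 'P' for a 16-bit two's complement result."""
    result &= 0xFFFF
    if result == 0:
        return "Z"
    return "N" if result & 0x8000 else "P"


print(decode(0x1C86))  # ADD R6, R2, R6
print(decode(0x64C6))  # LDR R2, R3, #6
```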
Control Unit
  - Circuitry that
      - controls the flow of information through the processor, and
      - coordinates the activities of the other units within it
  - Is a FSM
      - States enumerate all possible configurations the machine can be in
      - Using the opcode information and some other inputs (e.g. condition codes, interrupt signal), it determines the next state and output
  - Decides for each stage in the instruction processing cycle:
      - Which registers/memory locations are enabled?
      - Which operation should the ALU perform?
      - Choose ALU output or memory output?

Instruction Processing Cycle
  - FETCH instruction from memory
  - DECODE instruction
  - EVALUATE ADDRESS
  - FETCH OPERANDS
  - EXECUTE operation
  - STORE result

E.g. LC3 FSM diagram
  [Figure]

Variations in Processing Cycle
  - Example in LC3
      - Evaluate Address and Execute are combined, as they both use the ALU (adder)
      - Operand Fetch is separated into Register Fetch and Memory Access
      - Store consists of only register writes
          - Memory Write is part of Memory Access
      - Thus we have a total of 6 stages
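
The six-stage variation described above can be pictured as a small state machine that steps through the stages in order. This is a simplified sketch. The stage names are an interpretation of the slide, not taken from it, and a real control unit would also use the opcode and other inputs to choose the next state.

```python
from enum import Enum


class Stage(Enum):
    FETCH = 0
    DECODE = 1
    REGISTER_FETCH = 2
    EXECUTE = 3        # also evaluates addresses, since both use the adder
    MEMORY_ACCESS = 4  # includes memory writes
    WRITE_BACK = 5     # register writes only


def next_stage(stage):
    """Advance to the following stage, wrapping back to FETCH at the end."""
    return Stage((stage.value + 1) % len(Stage))


def run_one_instruction():
    """Return the sequence of stages visited for a single instruction."""
    trace = []
    stage = Stage.FETCH
    while True:
        trace.append(stage.name)
        stage = next_stage(stage)
        if stage is Stage.FETCH:
            break
    return trace


print(run_one_instruction())
```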
Simple LC3 Datapath
  [Figure]

Memory
  - Harvard architecture (physically separate storage for instructions and data)
      - As opposed to the Von Neumann model
      - Dominant in RISC-style architectures, e.g. ARM processors
  - Instruction Memory is set to read by default
  - Data Memory can be read or written based on WE
      - WE = 1 means write

Registers/Register File
  - 2 Read Ports
      - Default read (no need for enable)
      - Rd1: SR1, BaseR
      - Rd2: SR2
  - 1 Write Port
      - Needs Enable (WE)
      - Wr: DR
  Source: Prof. Milo Martin at UPenn

ALU
  - ADD
      - Reg-Reg
      - Immediate: bits [4:0]
      - Default control unit signal set to 0 for ADD
  - LDR, STR
      - Used to evaluate the address of the load and store
      - Bits [5:0]
  - Need to expand to accommodate other instructions
      - E.g. NOT instruction
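
The register file described above has two read ports that are always readable and one write port gated by WE. A minimal behavioural sketch follows; the class and method names are hypothetical and not from the slides.

```python
class RegisterFile:
    """Eight 16-bit registers with two read ports and one write port."""

    def __init__(self, count=8, width=16):
        self.mask = (1 << width) - 1
        self.regs = [0] * count

    def read(self, rd1, rd2):
        """Two read ports: always enabled, so no control signal is needed."""
        return self.regs[rd1], self.regs[rd2]

    def write(self, wr, value, we):
        """One write port: the register changes only when WE is 1."""
        if we:
            self.regs[wr] = value & self.mask


rf = RegisterFile()
rf.write(2, 7, we=1)
rf.write(6, 5, we=1)
rf.write(3, 9, we=0)   # ignored because WE is 0
print(rf.read(2, 6))   # (7, 5)
print(rf.read(3, 3))   # (0, 0)
```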
MA (Memory Address) MUX (for Data Memory)
  [Figure]

PC and PC MUX
  - PC (16-bit register) update is based on the PC MUX
      - PC + 1 (default)
      - Address based on BR, JMP, TRAP
  - For BR to work in this implementation, the CC registers are needed

DRval MUX
  - Selects the appropriate DR (destination register) value

Control Signals
   0: 3-bit Rd1 (Source 1: SR1/BaseR)
   1: 3-bit Rd2 (Source 2: SR2)
   2: 3-bit Wr (Destination Register: DR)
   3: 1-bit WE for Register (to control register update)
   4: 2-bit MUX to select the 2nd operand to the ALU
   5: 2-bit MA (Memory Address) MUX to select the address to the Data Memory
   6: 1-bit PC enable (to update PC)
   7: 1-bit WE for Data Memory (to control memory update)
   8: 1-bit ALUMEM MUX (to select the output of Memory or ALU)
   9: 1-bit DRval MUX (to select the value written to the destination register)
  10: 1-bit PC MUX (to select the value of PC)
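
The PC update controlled by signals 6 and 10 above can be sketched as a two-input multiplexer in front of an enabled register. The function name and the select encoding (1 chooses PC + 1, 0 chooses the computed target) are assumptions made for this sketch.

```python
def next_pc(pc, pc_enable, pc_mux, target):
    """Return the PC value after one update.

    pc_enable: 1 lets the PC register load a new value, 0 holds it.
    pc_mux:    assumed encoding, 1 selects PC + 1 and 0 selects `target`
               (the address produced for BR, JMP or TRAP).
    """
    if not pc_enable:
        return pc
    chosen = pc + 1 if pc_mux == 1 else target
    return chosen & 0xFFFF  # the PC is a 16-bit register


print(hex(next_pc(0x3000, 1, 1, 0x4000)))  # 0x3001, sequential execution
print(hex(next_pc(0x3000, 1, 0, 0x4000)))  # 0x4000, control transfer
```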
ADD Instruction

  Instr  Opcode I[15:12]  I[5]  SR1     SR2     DR       Remaining control signals (3 to 10, in list order)
  ADD    0001             0     I[8:6]  I[2:0]  I[11:9]  1 00 00 1 0 1 0 1

LDR Instruction

  Instr  Opcode I[15:12]  I[5]  SR1     SR2     DR       Remaining control signals (3 to 10, in list order)
  LDR    0110             x     I[8:6]  xxx     I[11:9]  1 10 00 1 0 0 0 1

JMP Instruction

  Instr  Opcode I[15:12]  I[5]  SR1     SR2     DR       Remaining control signals (3 to 10, in list order)
  JMP    1100             x     I[8:6]  xxx     xxx      0 10 00 1 0 1 x 0

TRAP

  Bits   15 14 13 12 | 11 10  9  8 | 7  6  5  4  3  2  1  0
  TRAP    1  1  1  1 |  0  0  0  0 | trapvect8

  - Used to get data in and out of the computer
  - Calls an operating system service routine
      - Identified by the 8-bit trap vector
  - Execution resumes after the OS code executes
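
The rows above amount to a lookup from opcode to control word. The sketch below records the ADD, LDR and JMP rows in a dictionary. The field names are hypothetical labels for control signals 3 to 10, the ADD entry covers only the register form (I[5] = 0), and `None` stands for a don't-care (x).

```python
from collections import namedtuple

# Hypothetical field names for control signals 3..10, in list order.
Control = namedtuple(
    "Control",
    "reg_we alu_mux ma_mux pc_enable mem_we alumem_mux drval_mux pc_mux",
)

X = None  # don't-care value

CONTROL_TABLE = {
    0b0001: Control(1, 0b00, 0b00, 1, 0, 1, 0, 1),  # ADD (I[5] = 0)
    0b0110: Control(1, 0b10, 0b00, 1, 0, 0, 0, 1),  # LDR
    0b1100: Control(0, 0b10, 0b00, 1, 0, 1, X, 0),  # JMP
}


def control_for(instruction):
    """Look up the control word using the opcode in bits [15:12]."""
    opcode = (instruction >> 12) & 0xF
    return CONTROL_TABLE[opcode]


print(control_for(0x1C86))  # ADD R6, R2, R6
print(control_for(0xC1C0))  # JMP R7
```

Keeping the table as data rather than as nested conditionals mirrors how the slides present the control unit: one row of signal values per instruction.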
TRAP Instruction
  - R7 = PC + 1

  Instr  Opcode I[15:12]  I[5]  SR1     SR2     DR        Remaining control signals (3 to 10, in list order)
  TRAP   1111             x     xxx     xxx     111 (R7)  1 xx 01 1 0 0 1 0

Sequencing an instruction
  - For appropriate sequencing of the instruction processing cycle, i.e. F->D->EA->OP->EX->S
      - Updating the PC and condition code registers
      - Reading/updating registers and memory
  - Use the clock to sequence each phase of an instruction by raising the right signals at the right time
  - It takes a fixed number of clock ticks/cycles (repetitions of the rising or falling edge) to execute each instruction
  - How is this done?

Sequencing an instruction (contd..)
  - We connect the clock to a synchronous counter and the counter to the decoder
  - Which decoder output is enabled depends on the counter outputs (i.e. which cycle you are in)
  - The control signals are a combination of the decoder output, the opcode and some other inputs
  [Diagram: Clock -> n-bit counter -> (n lines) -> n x 2^n decoder -> (2^n lines)]

Hardwired Control Unit
  - Combinational circuit
  - Control signals are a combination of
      - Opcode bits
      - Other signals such as interrupts or condition codes (NZP)
      - Timing info (T1 to Tn): these signals are essential for proper sequencing through the instruction cycle
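
The counter-plus-decoder arrangement above produces one-hot timing lines, one per clock cycle. A behavioural sketch follows, with hypothetical function names. It models only the timing signals, not how they are combined with the opcode.

```python
def decoder(count, n_bits):
    """n-to-2^n decoder: exactly one output line is 1, selected by count."""
    return [1 if i == count else 0 for i in range(2 ** n_bits)]


def timing_signals(n_bits, ticks):
    """Return the decoder outputs seen on each of `ticks` clock ticks."""
    count = 0
    outputs = []
    for _ in range(ticks):
        outputs.append(decoder(count, n_bits))
        count = (count + 1) % (2 ** n_bits)  # the counter wraps around
    return outputs


for lines in timing_signals(2, 5):
    print(lines)
```

With a 2-bit counter the active line moves through four positions and then returns to the first, which is what lets each phase of the instruction cycle be tied to a specific timing signal.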
Clocking Methodology
  - How long should the clock cycle be so that we complete one phase of the instruction cycle?
  - When is data valid or stable?
      - So that it can be read or written
      - We do not want to end up with a mix of old and new data
  - In a processor, only memory elements can store values
      - This means any collection of combinational logic must have its
          - Inputs coming from a set of memory elements, and
          - Outputs written into a set of memory elements

Clocking Methodology (contd..)
  - The length of the clock cycle is determined as follows:
      - The time necessary for the signals to reach memory element 2 defines the length of the clock cycle
      - i.e. the clock cycle time must be at least as great as the maximum propagation delay of the circuit

Example of Clock Cycle Length
  [Figure]

Programmable Control Unit
  [Figure]
  Source: http://fourier.eng.hmc.edu/e85/lectures/processor/node11.html
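
The rule above, that the clock cycle must be at least as long as the slowest path between memory elements, reduces to taking a maximum. The path names and delays below are invented purely for illustration.

```python
def min_clock_period(path_delays_ns):
    """The cycle must cover the slowest register-to-register path."""
    return max(path_delays_ns.values())


def max_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


# Hypothetical propagation delays between memory elements, in nanoseconds.
delays = {"register read + ALU": 4.0, "address + data memory": 5.0, "PC + 1": 2.0}

period = min_clock_period(delays)
print(period)                     # 5.0
print(max_frequency_mhz(period))  # 200.0
```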
Programmable Control Unit (contd..)
  - Each machine instruction is in turn implemented by a series of instructions called microinstructions
  - Microinstructions encode
      - Control signals for carrying out a particular stage in the instruction cycle
      - The address of the most likely next microinstruction
  - The microinstructions form a microprogram, which is stored in programmable memory
      - Sometimes called the Control Store
      - E.g. Flash memory is non-volatile and reprogrammable

E.g. LC3 Implemented using Program Control
  - The behavior of the LC-3 during a given clock cycle is completely described by the 49-bit microinstruction
      - 39 bits for control signals
      - 10 bits for the possible next state of the machine
  - Each phase of the instruction cycle may require more than one microinstruction
      - E.g. the Fetch stage takes 3 microinstructions
  - A 6-bit address is used to look up the memory
      - There are 52 possible microinstructions (states) that can describe the LC3's behavior
      - Memory size: 2^6 x 49
  [Figure: input IR[15:12]]

E.g. LC3 Implemented using Program Control
  - The microsequencer produces the 6-bit address
      - Corresponds to the next behavior of the processor
  - Combinational circuit based on
      - 10 bits of the microinstruction
      - 8 bits of additional info based on other events
  [Figure: Microprogram Control; input IR[15:12]]

Appendix C of Patt & Patel, Fig. C.2
  [Figure: LC-3 state machine diagram. Fetch and decode states recoverable from the diagram:]
    18: MAR <- PC; PC <- PC + 1; test [INT]
    33: MDR <- M
    35: IR <- MDR
    32: BEN <- IR[11] & N + IR[10] & Z + IR[9] & P; dispatch on [IR[15:12]]
  NOTES
    B+off6:   BaseR + SEXT[offset6]
    PC+off9:  PC + SEXT[offset9]
    PC+off11: PC + SEXT[offset11]
    *OP2 may be SR2 or SEXT[imm5]
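
The control store figures above (52 states, a 6-bit address, 49-bit microinstructions) can be checked with a little arithmetic. The sketch also splits a microinstruction into its 39 control bits and 10 sequencing bits. Placing the sequencing bits in the low-order positions is an assumption made here for illustration, not something the slides specify.

```python
STATES = 52            # microinstructions (states), from the slide
CONTROL_BITS = 39      # control signal bits per microinstruction
SEQUENCING_BITS = 10   # next-state bits per microinstruction

# Smallest address width that can name every state: 6 bits for 52 states.
address_bits = (STATES - 1).bit_length()
word_bits = CONTROL_BITS + SEQUENCING_BITS
control_store_bits = (2 ** address_bits) * word_bits


def split_microinstruction(word):
    """Split a 49-bit word into (control, sequencing) fields.

    Assumed layout: sequencing bits occupy the low-order positions.
    """
    sequencing = word & ((1 << SEQUENCING_BITS) - 1)
    control = word >> SEQUENCING_BITS
    return control, sequencing


print(address_bits)        # 6
print(word_bits)           # 49
print(control_store_bits)  # 3136 bits, i.e. 2^6 x 49
```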
Hardwired vs. Programmable Control
  - Complexity
      - There is an extra level of instruction interpretation in microprogrammed control, which makes it slower than hardwired control
  - Instruction flexibility
      - Instructions and control logic are tied together in hardwired control, which makes it difficult to modify
      - New instructions can easily be added in a programmed control implementation by changing only the microprogram

  [Blank control signal table: Instr | Opcode I[15:12] | I[5] | SR1 | SR2 | DR | remaining control signals]