Computer Architecture and Organization
Instruction Set Architecture
Lecturer: Prof. Yifeng Zhu
Fall 2015
Portions of these slides are derived from: Dave Patterson, UCB

The View from Ten Thousand Feet
  What is Computer Architecture?
  Computer Architecture = Instruction Set Architecture + Machine Organization
Instruction Set Architecture
  Specification of a microprocessor design
  Interface between human and the machine's functionality
  [Figure: layer stack HLL / IR / ISA / HW, with regions labeled "Exposed to software" and "Transparent to software". Stages shown: front-end compiler; back-end compiler (e.g. code generator, scheduler, IR optimizer); binary execution, microarchitecture, dynamic optimizer; packaging, cooling]

Instruction Set Architecture: Critical Interface
  [Figure: software / instruction set / hardware]
  Properties of a good abstraction
    » Lasts through many generations (portability)
    » Used in many different ways (generality)
    » Provides convenient functionality to higher levels
    » Permits an efficient implementation at lower levels
Instruction Set Architecture
  "... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." Amdahl, Blaauw, and Brooks, 1964
  -- Organization of Programmable Storage
  -- Data Types & Data Structures: Encodings & Representations
  -- Instruction Formats
  -- Instruction (or Operation Code) Set
  -- Modes of Addressing and Accessing Data Items and Instructions
  -- Exceptional Conditions
Course Focus
  Understanding the whole picture
  Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of computers in the 21st century
  [Figure: "Computer Architecture: Organization, Hardware/Software Boundary" at the center, linked to Applications, Technology, Parallelism, Programming Languages, Interface Design (ISA), Compilers, Operating Systems, Measurement & Evaluation, History]
Instruction Set Architecture
  Defines what state exists
  Defines what operations exist on that state
  Typically implies a sequential ordering
    » Classic serial fetch-execute (von Neumann)
ISA State
  What states are there?
  User-level
    » Registers: general purpose, program counter (PC)
    » Memory: addressing/indexing
  Other
    » Virtual memory support (TLB, page descriptors, ...)
    » Kernel (special registers, memory)
    » I/O

Manipulating the State
  Must have instructions that
    » Access state (read and write)
    » Implement control flow (jump, branch, etc.)
    » Perform ALU operations (add, multiply, etc.)
  Largest difference among instructions is in how you access your state
    » Operand location: stack, memory, register
    » Addressing modes: computing the location (addresses) of the state
Operand Locations in Four ISA Classes
  [Figure: operand locations for the four ISA classes, including GPR (general-purpose register) machines]
Code Sequence C = A + B for Four Instruction Sets

  Stack     Accumulator   Register (register-memory)   Register (load-store)
  Push A    Load A        Load R1, A                   Load R1, A
  Push B    Add B         Add R1, B                    Load R2, B
  Add       Store C       Store C, R1                  Add R3, R1, R2
  Pop C                                                Store C, R3

  What the Add step computes:
    » Stack: operands are implicit on the stack; Push and Pop move them from and to memory
    » Accumulator: acc = acc + mem[B]
    » Register-memory: R1 = R1 + mem[B]
    » Load-store: R3 = R1 + R2
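The four code sequences can be traced by hand with a few lines of Python. This is only a sketch: memory is modeled as a dictionary, the function names are made up for this note, and the values A = 2 and B = 3 are arbitrary.

```python
# Toy models of the four ISA classes computing C = A + B.
# Memory is a dict; the contents (A=2, B=3) are arbitrary example values.

def run_stack(mem):
    stack = []
    stack.append(mem["A"])                   # Push A
    stack.append(mem["B"])                   # Push B
    stack.append(stack.pop() + stack.pop())  # Add (operands implicit on the stack)
    mem["C"] = stack.pop()                   # Pop C
    return mem["C"]

def run_accumulator(mem):
    acc = mem["A"]         # Load A
    acc = acc + mem["B"]   # Add B
    mem["C"] = acc         # Store C
    return mem["C"]

def run_register_memory(mem):
    r1 = mem["A"]          # Load R1, A
    r1 = r1 + mem["B"]     # Add R1, B (one operand comes straight from memory)
    mem["C"] = r1          # Store C, R1
    return mem["C"]

def run_load_store(mem):
    r1 = mem["A"]          # Load R1, A
    r2 = mem["B"]          # Load R2, B
    r3 = r1 + r2           # Add R3, R1, R2 (registers only)
    mem["C"] = r3          # Store C, R3
    return mem["C"]

models = (run_stack, run_accumulator, run_register_memory, run_load_store)
results = [model({"A": 2, "B": 3}) for model in models]
print(results)  # [5, 5, 5, 5]
```

All four classes produce the same result; they differ only in where the operands of Add live.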
RISC vs CISC: Example
  Complex Instruction Set Computer (CISC)
    Example: auto-increment: add R1, (R2)+
    » R2 register holds the memory pointer
    » R1 <- R1 + Mem[R2]
    » R2 <- R2 + d
  Reduced Instruction Set Computer (RISC)
    Equivalent RISC instructions for the auto-increment:
      Load R3, (R2)     ; load memory contents
      Add  R1, R1, R3   ; add the content of R3
      Add  R2, R2, d    ; increment memory address

RISC vs. CISC: Another Example
  CISC instruction:
    MUL <addr1>, <addr2>
  RISC instructions:
    LOAD  A, <addr1>
    LOAD  B, <addr2>
    MUL   A, A, B
    STORE A, <addr1>
  RISC is dependent on optimizing compilers
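A quick way to check that the three RISC instructions really are equivalent to the single auto-increment instruction is to model both. The sketch below assumes a dictionary for memory, an increment d = 4, and arbitrary starting values; none of these come from the slide.

```python
def cisc_add_autoinc(regs, mem, d):
    """add R1, (R2)+ modeled as one instruction."""
    regs["R1"] = regs["R1"] + mem[regs["R2"]]  # R1 <- R1 + Mem[R2]
    regs["R2"] = regs["R2"] + d                # R2 <- R2 + d
    return regs

def risc_sequence(regs, mem, d):
    """The three-instruction RISC equivalent."""
    regs["R3"] = mem[regs["R2"]]               # Load R3, (R2)
    regs["R1"] = regs["R1"] + regs["R3"]       # Add R1, R1, R3
    regs["R2"] = regs["R2"] + d                # Add R2, R2, d
    return regs

# Arbitrary example state: a pointer at address 100, 4-byte elements.
mem = {100: 7, 104: 9}
start = {"R1": 1, "R2": 100, "R3": 0}

cisc = cisc_add_autoinc(dict(start), mem, d=4)
risc = risc_sequence(dict(start), mem, d=4)
print(cisc["R1"], cisc["R2"])              # 8 104
print(risc["R1"], risc["R2"], risc["R3"])  # 8 104 7
```

R1 and R2 end up identical; the RISC version needs one extra register (R3) and three instruction fetches instead of one.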
Example CISC ISA: Intel x86
  12 addressing modes:
    » Register
    » Immediate
    » Direct
    » Base
    » Base + Displacement
    » Index + Displacement
    » Scaled Index + Displacement
    » Based Index
    » Based Scaled Index
    » Based Index + Displacement
    » Based Scaled Index + Displacement
    » Relative
  Operand sizes:
    » Can be 8, 16, 32, 48, 64, or 80 bits long
    » Also supports string operations
  Instruction encoding:
    » The smallest instruction is one byte
    » The longest instruction is 15 bytes (the limit enforced on the 80386 and later; some older texts quote 17 bytes)
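Most of the memory addressing modes in this list (Direct through Based Scaled Index + Displacement) are special cases of one computation: base + index * scale + displacement. A minimal sketch, assuming 32-bit addresses; the function name and the register values are hypothetical.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):  # the scale factors x86 can encode
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF

# Example register contents (made up for illustration).
ebx, esi = 0x1000, 3

direct      = effective_address(disp=0x2000)                           # Direct
base_disp   = effective_address(base=ebx, disp=8)                      # Base + Displacement
based_index = effective_address(base=ebx, index=esi)                   # Based Index
full_form   = effective_address(base=ebx, index=esi, scale=4, disp=8)  # Based Scaled Index + Displacement

print(hex(direct), hex(base_disp), hex(based_index), hex(full_form))
# 0x2000 0x1008 0x1003 0x1014
```

Leaving a term out (zero base, zero index, scale 1, zero displacement) yields the simpler modes, which is why one address-generation unit can serve all of them.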
Example RISC ISA: SPARC
  5 addressing modes:
    » Register indirect with immediate displacement
    » Register indirect indexed by another register
    » Register direct
    » Immediate
    » PC relative
  Operand sizes:
    » Four operand sizes: 1, 2, 4 or 8 bytes
  Instruction encoding:
    » Instruction set has 3 basic instruction formats with 3 minor variations
    » All are 32 bits in length
CISC's Roots
  Back in the '70s, memory & software = $$$
  Hardware, not so much $
  Move burden of code from software & memory to hardware
  Closing the semantic gap
  When do we prefer CISC?
    Storage and memory
      » High cost of memory
      » Need for compact code
    Simple compilers
    Support for high-level languages

CISC Effects
  Moved complexity from s/w to h/w
  Ease of compiler design
  Easier to debug
  Lengthened design times
  Increased design errors
Exempli Gratia = E.G. != I.E.
  Let's pretend that H is the name of a high-level language
  This language has a function Cube() which cubes an integer
  The H compiler translates code into assembly language for the A-1 computer, which has only two instructions

A-1 Computer Instructions
  Move [destination register, integer or source register]
    » This instruction takes a value, either an integer or the contents of another register, and places it in the destination register
    » Move [D, 5] would place the number 5 in register D
    » Move [D, E] would take whatever number is stored in E and place it in D
  Mult [destination register, integer or source register]
    » This instruction takes the contents of the destination register, multiplies it by either an integer or the contents of the source register, and places the result in the destination register
    » Mult [D, 70] would multiply the contents of D by 70 and place the result in D
    » Mult [D, E] would multiply the contents of D by the contents of E, and place the result in D
Pre-CISC example
  Statements in H:
    1. A = 20;
    2. B = Cube(A);
  Statements in assembly for the A-1 computer:
    1. Move [A, 20]
    2. Move [B, A]
    3. Mult [B, A]
    4. Mult [B, A]
  Here it takes four statements in A-1 assembly to do the work of two statements in H, since the A-1 computer has no instruction for taking the cube of a number
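The A-1 semantics are simple enough to simulate. The sketch below is an illustration only: the tuple encoding of instructions and the function name run_a1 are made up here. It runs a four-instruction sequence that cubes 20, and also shows why the copy into B has to happen before the multiplications: multiplying a register by itself twice gives the fourth power, not the cube.

```python
def run_a1(program):
    """Execute a list of (op, dest, src) tuples on a toy A-1 machine."""
    regs = {}
    for op, dest, src in program:
        # src is either a register name (str) or an integer literal
        value = regs[src] if isinstance(src, str) else src
        if op == "Move":
            regs[dest] = value
        elif op == "Mult":
            regs[dest] = regs[dest] * value
        else:
            raise ValueError(f"A-1 has no instruction {op!r}")
    return regs

cube_program = [
    ("Move", "A", 20),   # A = 20
    ("Move", "B", "A"),  # B = A
    ("Mult", "B", "A"),  # B = A * A
    ("Mult", "B", "A"),  # B = A * A * A
]
regs = run_a1(cube_program)
print(regs["A"], regs["B"])  # 20 8000

# Squaring a register in place twice yields A**4 and destroys A.
squared_twice = run_a1([("Move", "A", 20), ("Mult", "A", "A"), ("Mult", "A", "A")])
print(squared_twice["A"])  # 160000
```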
Three possible problems
  1. If the program in H uses Cube() many times, the assembly code becomes much larger, which is bad for the A-1 computer, which has very little memory
  2. With computers being so slow, the compiler takes a long time to translate all of the Cube() statements into multiple Mult[] instructions
  3. Programming in assembly language would be time consuming, tedious, and difficult to debug

How does CISC solve this problem?
  Include a Cube instruction in the instruction set of the next generation of computers, A-2
  Cube [destination register, source register]
    » This instruction takes the contents of the source register and cubes it, then places the result in the destination register
    » Cube [D, E] takes whatever value is in E, cubes it, and places the result in D
Post-CISC example
  Statements in H:
    1. A = 20;
    2. B = Cube(A);
  Statements in assembly for the A-2 computer:
    1. Move [A, 20]
    2. Cube [B, A]
  One-to-one correspondence between H and assembly code
  Semantic gap is closed
  Complexity has moved from the software level to the hardware level

Results
  Compiler does less work to translate
  Less memory needed
  Easier to debug
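The trade can be made concrete by extending a toy simulator with the Cube instruction and comparing program lengths. As before, this is only a sketch: the tuple encoding, the function name run, and the has_cube flag are assumptions made for this note.

```python
def run(program, has_cube):
    """Toy A-1/A-2 machine; has_cube=True models the A-2 with its Cube instruction."""
    regs = {}
    for op, dest, src in program:
        value = regs[src] if isinstance(src, str) else src
        if op == "Move":
            regs[dest] = value
        elif op == "Mult":
            regs[dest] = regs[dest] * value
        elif op == "Cube" and has_cube:
            regs[dest] = value ** 3  # the complexity now lives in the "hardware"
        else:
            raise ValueError(f"unsupported instruction {op!r}")
    return regs

a1_program = [("Move", "A", 20), ("Move", "B", "A"), ("Mult", "B", "A"), ("Mult", "B", "A")]
a2_program = [("Move", "A", 20), ("Cube", "B", "A")]

a1_result = run(a1_program, has_cube=False)["B"]
a2_result = run(a2_program, has_cube=True)["B"]
print(a1_result, a2_result)              # 8000 8000
print(len(a1_program), len(a2_program))  # 4 2
```

The A-2 program is half as long, at the cost of an extra case in the machine itself. That extra case is exactly the decode and execution complexity that CISC moves into hardware.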
Drawbacks
  When using an instruction set with so many instructions, the decode function of the computer system must be able to recognize a wide variety of functions. As a result, the decode logic, while time critical for performance, grows quite complex
  Not every complex instruction is used by each program, so much of the decode logic is seldom used during operation
  Another problem is that complex instructions are often of different lengths, i.e., each instruction could consist of any number of operands and take any number of cycles to execute
Here come the '80s and a...
Birth of RISC & CISC???
  RISC = Reduced Instruction Set Computer
  Before RISC, CISC was not called CISC; it was just the "really good way to do things computer", or RGWTDTC (just kidding)
  The term "complex instruction set computer" was forced upon anything that was not RISC

"Bloody hell, it's obvious that RISC and pointy hats are the future, mate!!!"
  Pipelining
  BRILLIANT!!
Birth of RISC
  Roots can be traced to two research projects
    » Berkeley RISC processor (~1980, D. Patterson)
    » Stanford MIPS processor (~1981, J. Hennessy)
  Stanford & Berkeley projects were driven by interest in building a simple chip that could be made in a university environment
  Commercialization benefited from these two independent projects
    » Berkeley project -> basis for Sun Microsystems' SPARC
    » Stanford project -> began MIPS (used by SGI)
Reduced Instruction Set Computing (RISC)
  RISC is a newer concept than CISC (but still old)
  MIPS, PowerPC, SPARC: all RISC designs
  Small instruction set; a CISC-type operation becomes a chain of RISC operations
  Upside: easier to design the CPU
  Upside: smaller instruction set => higher clock speed
  Downside: assembly language programs are typically longer (compiler design is tough)

RISC Effect
  Move complexity from h/w to s/w
  Provided a single-chip solution
  Better use of chip area
  Better speed
  Feasibility of pipelining
    » Single-cycle execution stages
    » Uniform instruction format
  Patterson: "Make the common case fast"
CISC vs. RISC (1970s-80s)

                    CISC                                       RISC
                    IBM 370/168   VAX 11/780    Xerox Dorado   IBM 801   Berkeley RISC1   Stanford MIPS
  Year introduced   1973          1978          1978           1980      1981             1983
  # instructions    208           303           270            120       39               55
  Microcode         54KB          61KB          17KB           0         0                0
  Instruction size  2 to 6B       2 to 57B      1 to 3B        4B        4B               4B
  Execution model   Reg-reg,      Reg-reg,      Stack          Reg-reg   Reg-reg          Reg-reg
                    Reg-mem,      Reg-mem,
                    Mem-mem       Mem-mem

  Why was CISC popular in the 1970s?
  Source: Andy Tanenbaum's Structured Computer Organization
RISC vs CISC
  CISC: Complex Instruction Set Computer
    » Each instruction can execute several low-level operations, such as a load from memory, an arithmetic operation, and a memory store, all in a single instruction
  CISC philosophy
    » Use microcode
    » Build rich instruction sets
    » Build high-level instruction sets
  RISC: Reduced Instruction Set Computer
    » Simplified instructions which "do less" may still provide higher performance if this simplicity can be used to make instructions execute very quickly
  RISC philosophy
    » Fixed instruction lengths
    » Load-store instruction sets
    » Limited addressing modes
    » Limited operations
  CISC tries to reduce the number of instructions for a program, and RISC tries to reduce the cycles per instruction
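The closing claim can be read against the standard CPU performance equation, CPU time = instruction count x cycles per instruction x clock cycle time: CISC attacks the first factor, RISC the second. The sketch below plugs in purely hypothetical numbers (not measurements of any real machine) to show how a longer program can still finish sooner.

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds = instructions * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_hz

# Hypothetical workload: the RISC version needs 50% more instructions,
# but each one takes far fewer cycles on average. Same clock for both.
cisc_seconds = cpu_time(instruction_count=1_000_000, cpi=4.0, clock_hz=100e6)
risc_seconds = cpu_time(instruction_count=1_500_000, cpi=1.2, clock_hz=100e6)

print(f"CISC: {cisc_seconds * 1e3:.1f} ms")  # CISC: 40.0 ms
print(f"RISC: {risc_seconds * 1e3:.1f} ms")  # RISC: 18.0 ms
```

With different assumed numbers the comparison can go the other way, which is the point: neither factor alone decides performance.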
From CISC to RISC
  For CISC
    » Memory was expensive and slow back then
    » Reduce the semantic gap by cramming more functions into one instruction
    » Use microcode ROM (μROM) for complex operations
  Justification for RISC
    » Complex apps are mostly composed of simple assignments
    » RAM speed catching up
    » Compiler (human) getting smarter and more sophisticated
    » Frequency: shorter pipe stages, more pipelineable
  CISC
    » Variable-length instructions
    » Abundant instructions and addressing modes
    » Longer decoding
    » Contains mem-to-mem operations
    » Uses on-core microcode
    » Narrower semantic gap (shifts complexity to microcode)
    » IBM 360, DEC VAX, Intel IA32, Mot 68030
  Defined by Colwell et al. '85 to weed out misleading
  RISC
    » Fixed-length instructions, single-cycle operation
    » Fewer instructions and addressing modes
    » Easier decoding
    » Load/store architecture
    » No microinstructions, directly executed by HW logic
    » Needs smart compilers, more compiler effort
    » IBM 801, MIPS, RISC I, IBM RS6000, Sun SPARC
RISC argument
  Key arguments
    » For a given technology, a RISC implementation will be faster
    » Current VLSI technology enables single-chip RISC
    » When technology enables single-chip CISC, RISC will be pipelined
    » When technology enables pipelined CISC, RISC will have caches
    » When CISC has caches, RISC will have multiple cores
CISC argument
  CISC flaws are not fundamental (fixed with more transistors)
  Moore's Law will narrow the RISC/CISC gap (true)
  Software costs will dominate (very true)

Are you ready to rumble?
  RISC vs. CISC
Warning, Geeky Conspiracy Theory Ahead
CISC + Pipelining = i486
  CISC chips started using pipelining with the Intel i486 processor in 1989. Now what, RISC?!?
  Several years later, in 1997, Apple starts using the G3 (third-generation PowerPC processors)
  This was a RISC chip which actually had more instructions than Intel's Pentium II CISC processor!
  Hold up, isn't RISC supposed to have a reduced number of instructions? Isn't that why RISC is so much better than CISC?
RISC = GOOD!!!! CISC = BAD!!!
  RISC > CISC

10 Years
  RISC (PowerPC) over CISC (x86)?
  What announcement did Apple make in 2005?

2005
  No more PowerPC for Apple, now it's all about Intel!
  Brilliant!!
  Just for fun!
Who won?
  Modern x86 processors are RISC-CISC hybrids
  CISCy skin (x86 ISA)
  RISCy heart
    » Each x86 instruction is translated into a micro-op (μop), MacroOP, or RISC-op on the fly
    » Internal microarchitecture resembles the RISC design philosophy
    » Processor dynamically reschedules based on μops
  MIPS, Sun SPARC, DEC Alpha are typical implementations of the RISC ideal
  Modern metric for determining RISCkyness of a design: does the ISA have LOAD/STORE instructions to memory?
  Modern CPUs utilize features of both!
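The "RISCy heart" idea can be illustrated with a toy decoder that splits a register-memory ALU instruction into load, compute, and store micro-ops. This is a conceptual sketch only: real micro-op formats are implementation specific, and the tuple encoding, the temporary names tmp_src and tmp_dst, and the function crack are all made up for this note.

```python
ALU_OPS = {"add", "sub", "and", "or", "xor"}

def is_mem(operand):
    """Memory operands are written in brackets, e.g. "[rbx]"."""
    return operand.startswith("[")

def crack(instr):
    """Split one x86-style ALU instruction (op, dest, src) into toy micro-ops."""
    op, dest, src = instr
    if op not in ALU_OPS:
        raise ValueError(f"toy decoder only handles {sorted(ALU_OPS)}")
    uops = []
    if is_mem(src):
        uops.append(("load", "tmp_src", src))    # bring the source into a temporary
        src = "tmp_src"
    if is_mem(dest):
        uops.append(("load", "tmp_dst", dest))   # read-modify-write destination
        uops.append((op, "tmp_dst", src))
        uops.append(("store", dest, "tmp_dst"))
    else:
        uops.append((op, dest, src))             # already register-register
    return uops

print(crack(("add", "rax", "rcx")))
# [('add', 'rax', 'rcx')]
print(crack(("add", "[rbx]", "rax")))
# [('load', 'tmp_dst', '[rbx]'), ('add', 'tmp_dst', 'rax'), ('store', '[rbx]', 'tmp_dst')]
```

After this step every micro-op either touches memory or computes on registers, never both, which is the load/store discipline the slide uses as its RISC test.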
Summary: RISC vs CISC
  CISC
    » Effectively realizes one particular High Level Language Computer System in HW
    » HW development costs recur when a change is needed
  RISC
    » Allows effective realization of any High Level Language Computer System in SW
    » SW development costs recur when a change is needed
  Modern CISC and RISC implementations converge
    » Many of today's RISC chips support as many instructions as yesterday's CISC chips
    » Today's CISC chips use many techniques formerly associated with RISC chips
References
  http://cse.stanford.edu/class/sophomore-college/projects-00/risc/
  http://www.visionengineer.com/comp/why_cisc.shtml
  http://www.visionengineer.com/comp/why_risc.shtml
  http://www.embedded.com/story/oeg20030205s0025
  http://encyclopedia.laborlawtalk.com/powerpc
  http://www.sunderland.ac.uk/~ts0jti/comparch/ciscrisc.htm
  http://www.heyrick.co.uk/assembler/riscvcisc.html
  http://www.aallison.com/history.htm
  Lecture slides of Javier Arboleda, http://www.cs.sjsu.edu/~lee/cs147/cisc...again.ppt