CSE 141 Introduction to Computer Architecture Summer Session I, 2005. Lecture 1 Introduction. Pramod V. Argade June 27, 2005



Similar documents
Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Introducción. Diseño de sistemas digitales.1

Instruction Set Architecture

Reduced Instruction Set Computer (RISC)

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

Instruction Set Architecture (ISA)

Instruction Set Design

EE361: Digital Computer Organization Course Syllabus

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

CPU Organization and Assembly Language

Intel 8086 architecture

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Computer Architecture Lecture 3: ISA Tradeoffs. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013

Review: MIPS Addressing Modes/Instruction Formats

Computer Architectures

Processor Architectures

İSTANBUL AYDIN UNIVERSITY

LSN 2 Computer Processors

Chapter 2 Logic Gates and Introduction to Computer Architecture

More MIPS: Recursion. Computer Science 104 Lecture 9

MICROPROCESSOR AND MICROCOMPUTER BASICS

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Instruction Set Architecture (ISA) Design. Classification Categories

Computer organization

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December :59

Design Cycle for Microprocessors

Computer System: User s View. Computer System Components: High Level View. Input. Output. Computer. Computer System: Motherboard Level

CISC, RISC, and DSP Microprocessors

Introduction to MIPS Assembly Programming

Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.

Basic Computer Organization

Property of ISA vs. Uarch?


In the Beginning The first ISA appears on the IBM System 360 In the good old days

Computer Organization and Architecture

CHAPTER 4 MARIE: An Introduction to a Simple Computer

Management Challenge. Managing Hardware Assets. Central Processing Unit. What is a Computer System?

MICROPROCESSOR. Exclusive for IACE Students iacehyd.blogspot.in Ph: /422 Page 1

Central Processing Unit (CPU)

Design of Digital Circuits (SS16)

Computer Organization

IA-64 Application Developer s Architecture Guide

612 CHAPTER 11 PROCESSOR FAMILIES (Corrisponde al cap Famiglie di processori) PROBLEMS

An Overview of Stack Architecture and the PSC 1000 Microprocessor

Pentium vs. Power PC Computer Architecture and PCI Bus Interface

An Introduction to the ARM 7 Architecture

Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

Computer Organization and Components

ARM Architecture. ARM history. Why ARM? ARM Ltd developed by Acorn computers. Computer Organization and Assembly Languages Yung-Yu Chuang

Administrative Issues

MIPS Assembly Code Layout

EC 362 Problem Set #2

CS352H: Computer Systems Architecture

Computer Organization and Components

Here is a diagram of a simple computer system: (this diagram will be the one needed for exams) CPU. cache

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

Computer Architecture TDTS10

Chapter 5 Instructor's Manual

a storage location directly on the CPU, used for temporary storage of small amounts of data during processing.

ARM Microprocessor and ARM-Based Microcontrollers

Exceptions in MIPS. know the exception mechanism in MIPS be able to write a simple exception handler for a MIPS machine

Operating System Overview. Otto J. Anshus

Syscall 5. Erik Jonsson School of Engineering and Computer Science. The University of Texas at Dallas

Architectures and Platforms

Assembly Language Programming

Chapter 7D The Java Virtual Machine

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Computer Systems Design and Architecture by V. Heuring and H. Jordan

Computer Systems Structure Main Memory Organization

VLIW Processors. VLIW Processors

Week 1 out-of-class notes, discussions and sample problems

Pipelining Review and Its Limitations

CHAPTER 7: The CPU and Memory

Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Computer Architecture Basics

INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER

RISC AND CISC. Computer Architecture. Farhat Masood BE Electrical (NUST) COLLEGE OF ELECTRICAL AND MECHANICAL ENGINEERING

What is a System on a Chip?

Introduction to Microprocessors

CSC 2405: Computer Systems II

A3 Computer Architecture

ELE 356 Computer Engineering II. Section 1 Foundations Class 6 Architecture

Translating C code to MIPS

Performance evaluation

Learning Outcomes. Simple CPU Operation and Buses. Composition of a CPU. A simple CPU design

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Microprocessor and Microcontroller Architecture

Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:

Software Pipelining. for (i=1, i<100, i++) { x := A[i]; x := x+1; A[i] := x

CPU Organisation and Operation

How To Write Portable Programs In C

CPU Performance Equation

EE282 Computer Architecture and Organization Midterm Exam February 13, (Total Time = 120 minutes, Total Points = 100)

1 Classical Universal Computer 3

Transcription:

CSE 141 Introduction to Computer Architecture Summer Session I, 2005 Lecture 1 Introduction Pramod V. Argade June 27, 2005

CSE141: Introduction to Computer Architecture Instructor: Pramod V. Argade (p2argade@cs.ucsd.edu) Office Hour: Mon. 4:30-5:30 PM (AP&M 4218) TAs: Chengmo Yang: c5yang@cs.ucsd.edu Wenjing Rao: wrao@cs.ucsd.edu Lecture: Mon/Wed. 6:00-8:50 PM, HSS 1330 Textbook: Web-page: Computer Organization & Design The Hardware Software Interface, 3 nd Edition. Authors: Patterson and Hennessy http://www.cse.ucsd.edu/classes/su05/cse141 1-2

What is a Computer? Components: Input (mouse, keyboard) Output (display, printer) Memory (disk drives, DRAM, SRAM, CD) Network Our primary focus: the processor (datapath and control) Implemented using millions of transistors Impossible to understand by looking at each transistor Rapidly changing field: Vacuum tube Transistor IC VLSI SoC Interesting history: http://www.islandnet.com/~kpolsson/comphist/ 1-3

Architecture: Dictionary Definition The art and science of designing and erecting buildings. A style and method of design and construction. Orderly arrangement of parts; structure.... 1-4

What is Computer Architecture? Hardware Designer Thinks about circuits, components, timing, functionality, ease of debugging construction engineer Computer Architect Thinks about high-level components, how they fit together, how they work together to deliver performance. building architect 1-5

The Challenge of Computer Architecture The industry changes faster than any other. design assuming processor technology trends The ground rules change every year. new problems new opportunities different tradeoffs It s all about making programs run faster than the current generation of processors 1-6

Forces on Computer Architecture Hardware Technology Programming Languages Applications Computer Architecture Compiler Technology Operating Systems Compatibility 1-7

Evolution of Intel Processors* Processor Year of Introduction # Transistors 4004 1971 2,250 8008 1972 2,500 8080 1974 5,000 8086 1978 29,000 286 1982 120,000 Intel I386 1985 275,000 Intel I486 1989 1,180,000 Intel Pentium 1993 3,100,000 Intel Pentium II 1997 7,500,000 Intel Pentium III 1999 24,000,000 Intel Pentium 4 2000 42,000,000 Intel Itanium 2002 220,000,000 Intel Itanium 2 2003 410,000,000 *http://www.intel.com/research/silicon/mooreslaw.htm 1-8

Pentium 2 Photomicrograph* *http://microscope.fsu.edu/chipshots/pentium/pent2large.html 1-9

Moore s Law* Gordon Moore made the observation in 1965, four years after planar integrated circuit was discovered. The law states that transistor density doubles every ~18 months *http://www.intel.com/research/silicon/mooreslaw.htm 1-10

Processor Performance 1200 DEC Alpha 21264/600 1000 Performance 800 600 DEC Alpha 5/500 400 200 0 Sun -4/ 260 MIPS M 2000 MIPS M/ 120 IBM RS/ 6000 HP 9000/ 750 DEC AXP/ 500 IBM POWER 100 87 88 89 90 91 92 93 94 95 96 97 Year DEC Alpha 5/300 DEC Alpha 4/266 1-11

Computer Architecture s Changing Definition 1950s to 1960s Computer Architecture Computer Arithmetic 1970s to mid 1980s Computer Architecture Instruction Set Design, especially ISA appropriate for compilers 1990s Computer Architecture Design of CPU, memory system, I/O system, Multiprocessors Instruction level Parallelism 2000s Computer Architecture System-on-a-chip (SoC) 1-12

Processors are everywhere! PC, Workstation Cell Phone DVD Player Digital Camera Car Watch Pacemaker Pill Camera... 1-13

Commercial Processor Architectures Embedded Processors ARM, MIPS, PowerPC, ARC, PC/Workstation Processors Intel x86, Sun SPARC, IBM/Motorola PowerPC, Super Computer IBM PowerPC, Intel Itanium, NEC, 1-14

Why do you care about architecture? If you become a SW designer Understand architecture to better utilize it What are performance bottlenecks & who will fix them? If you become a HW designer You may implement a processor architecture in your career If you become a System designer You may make HW and SW partition decision You may have to make purchasing decisions Buy from vendor A or B? Why? Everything amounts to cost/benefit tradeoff! What are the techniques for making such tradeoffs? 1-15

Which is faster? for (i=0; i<n; i=i+1) for (j=0; j<n; j=j+1) { r = 0; for (k=0; k<n; k=k+1) r = r + y[i][k] * z[k][j]; x[i][j] = r; } for (jj=0; jj<n; jj=jj+b) for (kk=0; kk<n; kk=kk+b) for (i=0; i<n; i=i+1) { for (j=jj; j<min(jj+b-1,n); j=j+1) r = 0; for (k=kk; k<min(kk+b-1,n); k=k+1) r = r + y[i][k] * z[k][j]; x[i][j] = x[i][j] + r; } 1-16

Which is faster? load R1, addr1 store R1, addr2 add R0, R2 -> R3 subtract R4, R3 -> R5 add R0, R6 ->R7 store R7, addr3 load R1, addr1 add R0, R2 -> R3 add R0, R6 -> R7 store R1, addr2 subtract R4, R3 -> R5 store R7, addr3 1-17

Levels of Abstraction Delving into the depths reveals more information An abstraction omits unneeded detail, helps us cope with complexity High Level Language Program Compiler temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; Assembly Language Program Assembler Machine Language Program Machine Interpretation SW $15, 0($2) LW $16, 4($2) SW $16, 0($2) SW $15, 4($2) 1000110001100010000000000000000 1000110011110010000000000000100 1010110011110010000000000000000 1010110001100010000000000000100 Control Signal Spec ALUOP[0:3] <= InstReg[9:11] & MASK 1-18

The five classic components of computers Computer Control Input Memory Datapath Output 1-19

Computer Architecture Topics Input/Output and Storage Memory Hierarchy Disks, WORM, Tape DRAM L2 Cache Coherence, Bandwidth, Latency RAID Emerging Technologies Interleaving Bus protocols VLSI L1 Cache Instruction Set Architecture Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, DSP Addressing, Protection, Exception Handling Pipelining and Instruction Level Parallelism 1-20

MIPS R10000 CPU

What you can expect to get out of this class Understand fundamental concepts in computer architecture Understand impact on application performance Evaluate architectural descriptions of processors Gain experience designing a working CPU completely from scratch Learn techniques used to evaluate advanced architecture design 1-22

Course Outline Instruction Set Architecture Computer System Performance and Performance Metrics Computer Arithmetic and Number System CPU Architecture Pipelining Memory Hierarchy and Caches Virtual Memory 1-23

Summer Session I, 2005 CSE141 Course Schedule Lecture # Date Time Room Topic Quiz topic 1 Mon. 6/27 6-8:50 PM HSS 1330 Introduction, Ch. 1 ISA, Ch. 2 2 Wed. 6/29 6-8:50 PM HSS 1330 Arithmetic, Ch. 3 Homework Due - - ISA Ch. 2 - Mon. 7/4 No Class July 4th Holiday - - 3 Wed. 7/6 6-8:50 PM HSS 1330 Performance, Ch. 4 Arithmetic Single-cycle CPU Ch. 5 Ch. 3 #2 4 Mon. 7/11 6-8:50 PM HSS 1330 Single-cycle CPU Ch. 5 Cont. Performance Multi-cycle CPU Ch. 5 Ch. 4 #3 5 Tue. 7/12 7:30-8:50 PM HSS 1330 Multi-cycle CPU Ch. 5 Cont. (July 4th make up class) - - 6 Wed. 7/13 6-8:50 PM HSS 1330 Single and Multicycle CPU Examples and Single-cycle CPU Review for Midterm Ch. 5-7 Mon. 7/18 6-8:50 PM HSS 1330 Mid-term Exam Exceptions - #4 8 Tue. 7/19 7:30-8:50 PM HSS 1330 Pipelining Ch. 6 (July 4th make up class) - - 9 Wed. 7/20 6-8:50 PM HSS 1330 Hazards, Ch. 6 - - 10 Mon. 7/25 6-8:50 PM HSS 1330 Memory Hierarchy & Caches Ch. 7 11 Wed. 7/27 6-8:50 PM HSS 1330 Virtual Memory, Ch. 7 Course Review Hazards Ch. 6 Cache Ch. 7 12 Sat. 7/30 TBD TBD Final Exam - - #1 #5 #6 1-24

Course Work and Grading Course Work Weekly homework assignments No late assignments accepted No re-grading of homework or quizzes Grading 10% homework Lowest homework score will be dropped 25% weekly quizzes (10 minutes, every Monday) Lowest quiz score will be dropped 25% midterm 40% final (which will cover the whole quarter) Tests & Quizzes Closed books and no notes 1-25

Quiz, Tests: What to expect? Material covered during lectures Material covered by homework assignments 1-26

CSE 141 Computer Architecture Spring 2005 Lecture 2 Instruction Set Architecute Pramod V. Argade June 27, 2005

Instruction Set Architecture (ISA) General Considerations 1-28

Stored Program Concept Control Datapath Memory for Program & Data Input Output Instructions are bits Programs are stored in memory to be read or written just like data Fetch & Execute Cycle Instructions are fetched and put into a special register Bits in the register "control" the subsequent actions Fetch the next instruction and continue 1-29

Memory Organization Viewed as a large, single-dimension array, with an address. A memory address is an index into the array "Byte addressing" means that the index points to a byte of memory. 0 1 2 3 4 5 6... 8 bits of data 8 bits of data 8 bits of data 8 bits of data 8 bits of data 8 bits of data 8 bits of data 1-30

Memory Organization Bytes are nice, but most data items use larger "words" For MIPS, a word is 32 bits or 4 bytes. 0 4 8 12... 32 bits of data 32 bits of data 32 bits of data 32 bits of data Words are aligned on 4-byte boundary i.e., what are the least 2 significant bits of a word address? 32 bits address 2 32 bytes with byte addresses from 0 to 2 32-1 2 30 words with byte addresses 0, 4, 8,... 2 32-4 1-31

Endian-ness: How to address bytes within words? Big Endian: address of most significant byte = word address IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA Little Endian: address of least significant byte = word address Intel 80x86, DEC Vax, DEC Alpha (Windows NT) 3 2 1 0 little endian byte # msb lsb 0 1 2 3 bigendianbyte # 0 1 2 3 4 5 6... 0x00 0x11 0x22 0x33 0x44 0x55 0x66 Big-endian 0 4 8 12... 0x00112233 0x44556677 0x8899aabb 0xccddeeff Word Data 0 0x33 1 0x22 2 0x11 3 0x00 4 0x77 5 0x66 6 0x55... Little-endian 1-32

Addressing: Alignment Aligned 0 1 2 3 Not Aligned Alignment: require that objects fall on address that is multiple of their size. MIPS requires address alignment Word addresses must be multiple of 4 Half word addresses must be multiple of 2 1-33

The Instruction Execution Cycle Instruction Fetch Obtain instruction from program storage Instruction Decode Determine required actions and instruction size Operand Fetch Execute Locate and obtain operand data Compute result value or status Result Store Deposit results in storage for later use Next Instruction Determine successor instruction

The Instruction Set Architecture Is the interface between all the software that runs on the machine and the hardware that executes it. Application Compiler Micro-code Operating System I/O system Instruction Set Architecture Digital Design Circuit Design Provides a level of abstraction in both directions Modern instruction set architectures: 80x86/Pentium, MIPS, SPARC, PowerPC, ARM, Tensilica,... 1-35

Instruction Set Architecture (ISA) Instructions: Words of a machine s language Instruction Set: Machine s vocabulary ISA: A very important abstraction Interface between hardware and low-level software Standardizes instructions, machine language bit patterns, etc. Advantage: different implementations of the same architecture Disadvantage: sometimes prevents using new innovations True or False? Binary compatibility is extraordinarily important. Part of the architecture that is visible to the programmer opcodes (available instructions) number and types of registers instruction formats storage access, addressing modes exceptional conditions 1-36

Key ISA Decisions Instruction length Fixed length Variable length Registers How many? Operand access Register Memory Instruction format Meaning of group of bits within machine instruction Operands How many per instruction, size (byte, word,..) Operations ADD, SUB, MUL,... 1-37

Accessing the Operands Operands are generally in one of two places: Registers: fast on-chip storage (how many, how wide?) Memory (how many locations?) Registers are Easy to specify Close to the processor Provide fast access Can read two operands and write one result per clock cycle The idea that we want to access registers whenever possible led to load-store architectures. Normal arithmetic instructions only access registers Only access memory with explicit loads and stores 1-38

Basic ISA Classes Comparing the Number of Instructions Code sequence for C = A + B for four classes of instruction sets: Stack Accumulator Register Register (register-memory) (load-store) Push A Load A Load R1,A Load R1,A Push B Add B Add R1,B Load R2,B Add Store C Store C, R1 Add R3,R1,R2 Pop C Store C,R3 1-39

MIPS Instruction Set Architecture 1-40

MIPS Instruction Set Architecture Typical RISC Instruction Set designed in 1980 s MIPS is found in products from: Silicon Graphics NEC Cisco Broadcom Nintendo Sony Ti Toshiba We will study various implementations of MIPS instruction set Later part of the course! 1-41

MIPS ISA: Key Points MIPS is a general-purpose register, load-store, fixedinstruction-length architecture. MIPS is optimized for fast pipelined performance, not for low instruction count Four principles of ISA Simplicity favors regularity: regular instruction set Smaller is faster: small number of formats, registers Good design demands good compromises: e.g. fixed length instructions Make the common case fast: encode constant within the instruction 1-42

Overview of MIPS ISA Fixed 32-bit instructions 3-operand, load-store architecture 32 general-purpose registers Registers are 32-bits wide (word) R0 always equals zero. Addressing modes Register, immediate, base+displacement, PC-relative and pseudodirect addressing modes 3 instruction formats will be covered in the class 1-43

Notes on MIPS Assembly Comments start with # Destination first (except for store instructions) ADD $t0, $s1, $s2 # $t0 = $s1 + $s2 Register access $n: register n (e.g. $1 is register 1) Register naming convention Name Register number Usage $zero 0 the constant value 0 $v0-$v1 2-3 values for results and expression evaluation $a0-$a3 4-7 arguments $t0-$t7 8-15 temporaries $s0-$s7 16-23 saved $t8-$t9 24-25 more temporaries $gp 28 global pointer $sp 29 stack pointer $fp 30 frame pointer $ra 31 return address 1-44

MIPS Instructions (Covered in this course) Arithmetic add, sub, addi Logical sll, srl, and, andi, or, ori, nor Data transfer lw, sw, lb, sb, lui Conditional branch beq, bne, slt, slti Unconditional jump j, jr Jump and link jal 1-45

MIPS Arithmetic Instructions All instructions have 3 operands Operand order is fixed (destination first) Example: C code: A = B + C MIPS code: add $s0, $s1, $s2 (associated with variables by compiler) Operands must be registers, only 32 registers provided Instruction format: R 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits op rs rt rd shamt funct Design Principle: simplicity favors regularity. Why? Design Principle: smaller is faster. Why? 1-46

How to specify constants? Small constants are used quite frequently (50% of operands) e.g., A = A + 5; B = B + 1; C = C - 18; Solutions Put 'typical constants' in memory and load them. create hard-wired registers (like $zero) for constants like one. Specify constant in the instruction: this commonly used MIPS Instructions: addi $s3, $t0, 4 slti $s0, $t1, 10 Instruction Format I 6 bits 5 bits 5 bits op rs rt 16 bit address 1-47

How about larger constants? We'd like to be able to load a 32 bit constant into a register Must use two instructions, new "load upper immediate" instruction lui $t0, 1010101010101010 1010101010101010 0000000000000000 Then must get the lower order bits right, i.e., ori $t0, $t0, 1010101010101010 filled with zeros 1010101010101010 0000000000000000 ori 0000000000000000 1010101010101010 1010101010101010 1010101010101010 1-48

Memory Access Instructions Load and store instructions Example: C code: A[8] = h + A[8]; MIPS code: lw $t0, 32($s3) add $t0, $s2, $t0 sw $t0, 32($s3) Store word has destination last Remember arithmetic operands are registers, not memory! Instruction format: I 6 bits 5 bits 5 bits op rs rt 16 bit address 1-49

Control Transfer Instructions Decision making instructions Alter the control flow, i.e., change the "next" instruction to be executed MIPS conditional branch instructions: bne $t0, $t1, Label beq $t0, $t1, Label Example: if (i==j) h = i + j; bne $s0, $s1, Label add $s3, $s0, $s1 Label:... Instruction format: I 6 bits 5 bits 5 bits op rs rt 16 bit address 1-50

MIPS Conditional Branches MIPS branches use PC-relative addressing BEQ, BNE BEQ $1, $2, addr => if( r1 == r2 ) goto addr MIPS has no Branch-If-Lless-Than It is in timing critical path Set-Less-Than, SLT SLT $1, $2, $3 => if( $2 < $3) $1 = 1; else 1 = 0 BEQ, BNE, SLT combined with $0 can implement all branch conditions: always, never,!=, ==, <=, >=, <, >, >(unsigned), <=(unsigned),... 1-51

MIPS Jump Instructions Need to transfer control Jump to an absolute address Jump to an address in a register Jump-And-Link to do procedure call and return Jump example: j 0x20000 => PC = 0x20000 Jump and Link example jal 0x40000 => $31 = PC+4, PC = 0x40000 Jump register example: jr $31 => PC = $31 (This is return instruction!) 1-52

MIPS unconditional branch instructions: j label Instruction format: J 6 bits op Unconditional Jumps 26 bit address Jump uses pseudo-direct addressing mode Program Counter 4 26 Instruction 6 26 4 26 Jump Destination Address 00 1-53

MIPS Instruction Format & Machine code Instruction format R I J 6 bits 5 bits 5 bits 5 bits 5 bits 6 bits op rs rt rd shamt funct op rs rt 16 bit address op 26 bit address The opcode tells the machine which format Machine instruction add r1, r2, r3 has opcode=0, funct=32, rs=2, rt=3, rd=1, sa=0 Machine code: 000000 00010 00011 00001 00000 100000 0000 0000 0100 0011 0000 1000 0010 0000 0x00430820 Expected to assemble and disassemble machine code 1-54

Summary of MIPS Instructions MIPS operands Name Example Comments $s0-$s7, $t0-$t9, $zero, Fast locations for data. In MIPS, data must be in registers to perform 32 registers $a0-$a3, $v0-$v1, $gp, arithmetic. MIPS register $zero always equals 0. Register $at is $fp, $sp, $ra, $at reserved for the assembler to handle large constants. Memory[0], Accessed only by data transfer instructions. MIPS uses byte addresses, so 2 30 memory Memory[4],..., sequential words differ by 4. Memory holds data structures, such as arrays, words Memory[4294967292] and spilled registers, such as those saved on procedure calls. MIPS assembly language Category Instruction Example Meaning Comments add add $s1, $s2, $s3 $s1 = $s2 + $s3 Three operands; data in registers Arithmetic subtract sub $s1, $s2, $s3 $s1 = $s2 - $s3 Three operands; data in registers add immediate addi $s1, $s2, 100 $s1 = $s2 + 100 Used to add constants load word lw $s1, 100($s2) $s1 = Memory[$s2 + 100] Word from memory to register store word sw $s1, 100($s2) Memory[$s2 + 100] = $s1 Word from register to memory Data transfer load byte lb $s1, 100($s2) $s1 = Memory[$s2 + 100] Byte from memory to register store byte sb $s1, 100($s2) Memory[$s2 + 100] = $s1 Byte from register to memory load upper immediate lui $s1, 100 $s1 = 100 * 2 16 Loads constant in upper 16 bits branch on equal beq $s1, $s2, 25 if ($s1 == $s2) go to PC + 4 + 100 branch on not equal bne $s1, $s2, 25 if ($s1!= $s2) go to Conditional PC + 4 + 100 branch set on less than slt $s1, $s2, $s3 if ($s2 < $s3) $s1 = 1; else $s1 = 0 Equal test; PC-relative branch Not equal test; PC-relative Compare less than; for beq, bne set less than immediate slti $s1, $s2, 100 if ( $s2 < 100) $s1 = 1; else $s1 = 0 Compare less than constant jump j 2500 go to 10000 Jump to target address Uncondi- jump register jr $ra go to $ra For switch, procedure return tional jump jump and link jal 2500 $ra = PC + 4; go to 10000 For procedure call 1-55

MIPS Addressing Modes 1. Immediate addressing op rs rt 2. Register addressing op rs rt e.g. addi $t0, $t1, 4 Immediate e.g. sub $t0, $t1, $t2 rd... funct Note: Parts 3 and 4 in Figure 2.24 (page 101) Edition 3 are incorrect! Registers Register 3. Base addressing e.g. lw $t0, 4( $t2) op rs rt Address Memory Register + Byte Halfword Word 4. PC-relative addressing op rs rt e.g. beq $t1, $t2, 32 Address Memory PC + Word 5. Pseudodirect addressing op Address e.g. j 0x1000 Memory PC Word 1-56

Assembly Language vs. Machine Language Assembly provides convenient symbolic representation Much easier than writing down numbers e.g., destination first Machine language is the underlying reality Used by the processor at run time Assembly can provide 'pseudo-instructions' e.g., move $t0, $t1 exists only in Assembly would be implemented using add $t0,$t1,$zero When considering performance you should count real instructions 1-57

Other Issues Things we are not going to cover support for procedures linkers, loaders, memory layout stacks, frames, recursion manipulating strings and pointers interrupts and exceptions system calls and conventions We've focused on architectural issues Basics of MIPS assembly language and machine code We ll build a processor to execute these instructions. 1-58

Alternative Architectures Design alternative: Provide more powerful operations Goal is to reduce number of instructions executed Danger is a slower cycle time and/or a higher CPI Sometimes referred to as RISC vs. CISC Virtually all new instruction sets since 1982 have been RISC VAX: minimize code size, make assembly language easy instructions from 1 to 54 bytes long! 1-59

Intel IA-32 Architecture 1978: The Intel 8086 is announced (16 bit architecture) 1980: The 8087 floating point coprocessor is added 1982: The 80286 increases address space to 24 bits, +instructions 1985: The 80386 extends to 32 bits, new addressing modes 1989-1995: The 80486, Pentium, Pentium Pro add a few instructions (mostly designed for higher performance) 1997: MMX is added 1999 Pentium II (same architecture) 2000 Pentium 4 (144 new multimedia instructions 2001 Itanium (new ISA which executes x86 code) 1-60

A dominant architecture: 80x86 See Section 2-16 for a more detailed description Complexity: Instructions from 1 to 17 bytes long One operand must act as both a source and destination One operand can come from memory Complex addressing modes e.g., base or scaled index with 8 or 32 bit displacement Saving grace: The most frequently used instructions are not too difficult to build Compilers avoid the portions of the architecture that are slow 1-61

Summary Instruction complexity is only one variable Lower instruction count vs. higher CPI / lower clock rate Four principles of ISA Simplicity favors regularity: regular instruction set Smaller is faster: small number of formats, small number of registers Good design demands good compromises: fixed length instructions Make the common case fast: encode constant within the instruction Instruction set architecture A very important abstraction indeed! In subsequent lectures we will show how to implement a subset of this ISA 1-62