Forecast. Instructions. Basics. Basics. Instructions are the words of a computer. Basics. ISA is its vocabulary. Registers and ALU ops

Similar documents
Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Reduced Instruction Set Computer (RISC)

Instruction Set Architecture

CSE 141 Introduction to Computer Architecture Summer Session I, Lecture 1 Introduction. Pramod V. Argade June 27, 2005

Instruction Set Architecture (ISA)

Review: MIPS Addressing Modes/Instruction Formats

Computer Architecture Lecture 2: Instruction Set Principles (Appendix A) Chih Wei Liu 劉 志 尉 National Chiao Tung University

Chapter 2 Topics. 2.1 Classification of Computers & Instructions 2.2 Classes of Instruction Sets 2.3 Informal Description of Simple RISC Computer, SRC

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Intel 8086 architecture

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Instruction Set Architecture (ISA) Design. Classification Categories

Instruction Set Design

CPU Organization and Assembly Language

LSN 2 Computer Processors

COMP 303 MIPS Processor Design Project 4: MIPS Processor Due Date: 11 December :59

Computer Organization and Architecture

Introducción. Diseño de sistemas digitales.1

Introduction to MIPS Assembly Programming

More MIPS: Recursion. Computer Science 104 Lecture 9

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

Chapter 5 Instructor's Manual

Computer Architectures

EECS 427 RISC PROCESSOR

Property of ISA vs. Uarch?

MIPS Assembler and Simulator

Computer organization

Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Assembly Language Programming

Typy danych. Data types: Literals:

Overview. CISC Developments. RISC Designs. CISC Designs. VAX: Addressing Modes. Digital VAX

Computer Architecture Lecture 3: ISA Tradeoffs. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013

a storage location directly on the CPU, used for temporary storage of small amounts of data during processing.

In the Beginning The first ISA appears on the IBM System 360 In the good old days

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

MIPS Assembly Code Layout

Winter 2002 MID-SESSION TEST Friday, March 1 6:30 to 8:00pm

Design of Pipelined MIPS Processor. Sept. 24 & 26, 1997

Lecture Outline. Stack machines The MIPS assembly language. Code Generation (I)

1 Classical Universal Computer 3

Addressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)

MICROPROCESSOR AND MICROCOMPUTER BASICS

Pipeline Hazards. Structure hazard Data hazard. ComputerArchitecture_PipelineHazard1

EC 362 Problem Set #2

Translating C code to MIPS

5 MIPS Assembly Language

Syscall 5. Erik Jonsson School of Engineering and Computer Science. The University of Texas at Dallas

MACHINE ARCHITECTURE & LANGUAGE

(Refer Slide Time: 00:01:16 min)

CS352H: Computer Systems Architecture

EE282 Computer Architecture and Organization Midterm Exam February 13, (Total Time = 120 minutes, Total Points = 100)

THUMB Instruction Set

Pipeline Hazards. Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. Based on the material prepared by Arvind and Krste Asanovic

CHAPTER 7: The CPU and Memory

A SystemC Transaction Level Model for the MIPS R3000 Processor

The AVR Microcontroller and C Compiler Co-Design Dr. Gaute Myklebust ATMEL Corporation ATMEL Development Center, Trondheim, Norway

An Introduction to the ARM 7 Architecture

How It All Works. Other M68000 Updates. Basic Control Signals. Basic Control Signals

Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:

Administrative Issues

Design of Digital Circuits (SS16)

Chapter 2 Logic Gates and Introduction to Computer Architecture

CHAPTER 4 MARIE: An Introduction to a Simple Computer


Solutions. Solution The values of the signals are as follows:

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

İSTANBUL AYDIN UNIVERSITY

Chapter 7D The Java Virtual Machine

Notes on Assembly Language

2) Write in detail the issues in the design of code generator.

CSE 141L Computer Architecture Lab Fall Lecture 2

WAR: Write After Read

Central Processing Unit (CPU)

1 The Java Virtual Machine

Instruction Set Reference

Exceptions in MIPS. know the exception mechanism in MIPS be able to write a simple exception handler for a MIPS machine

LC-3 Assembly Language

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

EE361: Digital Computer Organization Course Syllabus

Processor Architectures

Outline. Lecture 3. Basics. Logical vs. physical memory physical memory. x86 byte ordering

Lecture 3 Addressing Modes, Instruction Samples, Machine Code, Instruction Execution Cycle

VLIW Processors. VLIW Processors

ASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER

Compilers I - Chapter 4: Generating Better Code

M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

CPU Organisation and Operation

What to do when I have a load/store instruction?

Computer Organization and Components

Pentium vs. Power PC Computer Architecture and PCI Bus Interface

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of

Guide to RISC Processors

Figure 1: Graphical example of a mergesort 1.

CISC, RISC, and DSP Microprocessors

INSTRUCTION LEVEL PARALLELISM PART VII: REORDER BUFFER

Microprocessor and Microcontroller Architecture

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions

Where we are CS 4120 Introduction to Compilers Abstract Assembly Instruction selection mov e1 , e2 jmp e cmp e1 , e2 [jne je jgt ] l push e1 call e

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Transcription:

Instructions Forecast Instructions are the words of a computer ISA is its vocabulary With a few other things, this defines the interface of computers But implementations vary Basics Registers and ALU ops Memory and load/store Branches and jumps And more... We will study MIPS ISA but the next ISA is not too hard after the first Read Chapter 3 and skim through A.10 for MIPS ISA We won t write programs - EE362 EE365 Lecture Notes: Chapter 3 1 EE365 Lecture Notes: Chapter 3 2 Basics Basics C statement f = (g + h) - (i + j) MIPS instructions add t0, g, h add t1, i, j sub f, t0, t1 Opcode: Specifies the kind of operation (mnemonic) Operands: input and output data (source/destination) Operands t0, t1 are temps One operation, two inputs, one output Multiple instructions for one C statement Opcodes/mnemonic, operands, source/destination EE365 Lecture Notes: Chapter 3 3 EE365 Lecture Notes: Chapter 3 4

Why not have bigger instructions? Registers and ALU ops why not have f = (g + h) - (i + j) as one instruction? Church s thesis: A very primitive computer can compute anything that a fancy computer can compute - need only logical functions, read and write memory and data-dependent decisions Therefore, ISA selected for practical reasons performance and cost, not computability Regularity tends to improve both E.g., H/W to handle arbitrary number of operands complex and slow and NOT NECESSARY Ok, I lied! Operands must be registers, not variables add $8, $17, $18 add $9, $19, $20 sub $16, $8, $9 MIPS has 32 registers $0-$31 (figure next slide) Registers $8, $9 are temps, $16 - f, $17 - g, $18 - h, $19 - i, $20 - j MIPS also allows one constant called immediate later we will see immediate is 16 bits [-32768, 32767] EE365 Lecture Notes: Chapter 3 5 EE365 Lecture Notes: Chapter 3 6 Registers and ALU ALU ops Registers Processor $0 $31 ALU Some ALU ops: add, addi, addu, addiu (immediate, unsigned) sub... mul, div - weird! and, andi or, ori sll, srl,... Why registers? fit in instructions, smaller is faster Are registers enough? EE365 Lecture Notes: Chapter 3 7 EE365 Lecture Notes: Chapter 3 8

Memory and load/store Memory and load/store But need more than 32 words of storage Processor Memory An array of locations M[addr], indexed by addr (figure next slide) Data movement (on words or integers) load word for reg <-- memory lw $17, 1002 # get input g store word for reg --> memory sw $16, 1001 # save output f; Note: destination last! Registers $0 $31 ALU 0 1 2 3 1001 1002 f g maxmem EE365 Lecture Notes: Chapter 3 9 EE365 Lecture Notes: Chapter 3 10 Memory and load/store Memory and load/store I lied again! we need address bytes for character strings words take 4 bytes therefore, word addresses must be multiples of 4 figure next slide Registers Processor $0 $31 ALU 0 1 2 3 4004 4008 Memory f g maxmem EE365 Lecture Notes: Chapter 3 11 EE365 Lecture Notes: Chapter 3 12

Memory and load/store Important for arrays A[i] = A[i] + h (figure next slide) # $8 - temp, $18 - h, $21 - (i x 4) Astart is 8000 lw $8, Astart($21) # or 8000($21) add $8, $18, $8 sw $8, Astart($21) MIPS has other load/store for bytes and halfwords Registers Processor $0 $31 Memory and load/stor4 ALU 0 4004 4008 Memory f g 8000 A[0] 8004 A[1] 8008 A[2] maxmem EE365 Lecture Notes: Chapter 3 13 EE365 Lecture Notes: Chapter 3 14 Aside on Endian Midterms Big endian: MSB at address xxxxxx00 e.g., IBM, SPARC Little endian: MSB at address xxxxxx11 e.g., Intel x86 (Windows NT requires it) Midterm 1: 2/26/98, Thursday 7:00 p.m. - 8:00 p.m., EE 170 Midterm 2: 4/08/98, Wednesday 8:30 p.m. - 9:30 p.m., CHME 302 Mode selectable e.g., PowerPC, MIPS EE365 Lecture Notes: Chapter 3 15 EE365 Lecture Notes: Chapter 3 16

Branches and Jumps While ( i!= j) { j= j + i; i= i + 1; } # $8 is i, $9 is j, $10 is k Loop: beq $8, $9, Exit # not!= add $9, $9, $8 addi $8, $8, 1 j Loop Branches and Jumps Better yet: beq $8, $9, Exit # not!= Loop: add $9, $9, $8 addi $8, $8, 1 bne $8, $9, Loop Exit: Let compilers worry about such optimizations Exit: EE365 Lecture Notes: Chapter 3 17 EE365 Lecture Notes: Chapter 3 18 Branches and Jumps Branches and Jumps What does bne do really? read $8; read $9, compare set PC = PC + 4 or PC = Target To do compares other than = or!= e.g., blt $8, $9, Target # assembler pseudo-instruction expanded to slt $1, $8, $9 # $1 == ($8 < $9) == ($8 - $9 < 0) bne $1, $0, Target # $0 is always 0 Other MIPS branches/jumps beq $8, $9, imm # if ($8 == $9) PC = PC + imm<<2 else PC += 4 bne... slt, sle, sgt, sge with immediate, unsigned j addr # PC = addr jr $12 # PC = $12 jal addr # $31 = PC+4; PC = addr; used for procedure calls EE365 Lecture Notes: Chapter 3 19 EE365 Lecture Notes: Chapter 3 20

Layers of Software Notation - program: input data -> output data executable: input data -> output data loader: executable file -> executable in memory linker: object files -> executable file assembler: assembly file -> object file compiler: HLL file -> assembly file editor: editor commands -> HLL file Only possible because programs can be manipulated as data MIPS Machine Language All 32-bit instructions A.L. add $1, $2, $3 33222222222211111111110000000000 10987654321098765432109876543210 M.L. 00000000010000110000100000010000 000000 00010 00011 00001 00000 010000 alu-rr 2 3 1 zero add/signed EE365 Lecture Notes: Chapter 3 21 EE365 Lecture Notes: Chapter 3 22 Instruction Format Instruction Format R-format Other R-format: addu, sub, subi, etc opcode rs rt rd shamt funct 6 5 5 5 5 6 Digression: How do you store the number 4,392,976? Same as add $1, $2, $3 Stored program: instructions are represented as numbers programs can be read/written in memory like numbers A.L. lw $1, 100($2) M.L. 100011 00010 00001 0000000001100100 lw 2 1 100 (in binary) I-format opcode rs rt address/immediate 6 5 5 16 EE365 Lecture Notes: Chapter 3 23 EE365 Lecture Notes: Chapter 3 24

Instruction Format Instruction Format I-format also used for ALU ops with immediates addi $1, $2, 100 001000 00010 00001 0000000001100100 What about constants larger than 16 bits = [-32768, 32767] 1100 0000 0000 0000 1111? lui $4, 12 # $4 == 0000 0000 1100 0000 0000 0000 0000 ori $4, $4, 15 # $4 == 0000 0000 1100 0000 0000 0000 1111 beq $1, $2, 7 000100 00001 00010 0000 0000 0000 0111 PC = PC + (0000 0111 << 2) # word offset Finally, J-format opcode addr 6 26 addr is weird in MIPS: 4 MSB of PC // addr // 00 All loads and stores use I-format EE365 Lecture Notes: Chapter 3 25 EE365 Lecture Notes: Chapter 3 26 Summary: Instruction Formats R-format: opcode rs rt rd shamt funct 6 5 5 5 5 6 I-format: opcode rs rt address/immediate 6 5 5 16 J-format: opcode addr 6 26 Instr decode - Theory: Inst bits -> identify instrs -> control signals Procedure Calls See section 3.6 for more details save registers Proc: save more registers set up parameters do function call procedure set up results get results restore more registers restore registers return jal is only special instruction: the rest is software convention Practice: Instruction bits -> control signals EE365 Lecture Notes: Chapter 3 27 EE365 Lecture Notes: Chapter 3 28

Procedure Calls An important data structure is the stack Stack grows from larger to smaller addresses $29 is the stack pointer, it points just beyond valid data Push $2: Pop $2: addi $29, $29, -4 lw $2, 4($29) sw $2, 4($29) addi $29, $29, 4 the order cannot be changed. why? interrupts Procedure Example temp = v[k]; /* code for swap */ v[k] = v[k+1]; v[k+1] = temp; return (temp); swap: addi $29, $29, -8 # save registers sw $15, 4($29) # sw $2, 0($29) sw $16, 8($29) EE365 Lecture Notes: Chapter 3 29 EE365 Lecture Notes: Chapter 3 30 Procedure Example Procedure Example muli $2, $5, 4 # procedure body; $2 == k*4 add $2, $4, $2 # $2 == v + k*4 == address of v[k] lw $15, 0($2) # $15 == v[k] lw $16, 4($2) # $16 == v[k+1] sw $16, 0($2) # v[k] == $16 sw $15, 4($2) # v[k+1] == $15 mov $2, $15 # return value lw $15, 4($29) # lw $2, 0($29); restore registers lw $16, 8($29) addi $29, $29, 8 jr $31 # return return_value = swap(arr_a, i); /* call swap */ mov $4, $18 # first parameter mov $5, $17 # second parameter jal swap mov $20, $2 # copy return value EE365 Lecture Notes: Chapter 3 31 EE365 Lecture Notes: Chapter 3 32

Addressing modes There are many ways of getting data Register addressing Addressing Modes Base addressing (aka displacement) lw $1, 100($2) # $2 == 400, M[500] == 42 add $1, $2, $3 op rs rt address op rs rt rd... funct register 100 register 400 Memory Effective address 42 EE365 Lecture Notes: Chapter 3 33 EE365 Lecture Notes: Chapter 3 34 Addressing Modes Addressing Modes Immediate addressing addi $1, $2, 100 op rs rt immediate PC relative addressing beq $1, $2, 100 # if ($1 == $2) PC = PC + 100 op rs rt address PC Memory Effective address EE365 Lecture Notes: Chapter 3 35 EE365 Lecture Notes: Chapter 3 36

Addressing Modes Addressing Modes Not found in MIPS Indexed: add two registers - base + index Autoupdate lwupdate $1, 4[$2] # $1 == M[4+$2]; $2 == $2 + 4 Indirect: M[M[addr]] - two memory references op rs rt address Autoincrement/decrement: add data size Autoupdate - found in IBM PowerPC, HP PA-RISC like displacement but update register Delay register Effective Memory address EE365 Lecture Notes: Chapter 3 37 EE365 Lecture Notes: Chapter 3 38 Addressing Modes How to Choose ISA for (i = 0, I < N, i +=1) sum += A[i]; $7 - sum, $8 - address of a[i], $9 - N, $2 - tmp, $3 - i*4 inner: new inner: lw $2, 0($8) lwupdate $2, 4($8) addi $8, $8, 4 add $7, $7, $2 add $7, $7, $2 Minimize what? Instrs/prog x cycles/instr x sec/cycle In 1985-1995 technology, simple modes like MIPS good. As technology changes, computer design options change For memory is small, dense instructions important For high speed, pipelining important any problems with new inner? before the loop: sub $8, $8, 4 EE365 Lecture Notes: Chapter 3 39 EE365 Lecture Notes: Chapter 3 40

Intel x-86 (IA-32) Year CPU Comments 1978 8086 16-bit with 8-bit bus from 8080; selected for IBM PC - golden handcuffs! 1980 8087 Floating Point Unit 1982 80286 24-bit addresses, memory-map, protection 1985 80386 32-bit registers and addresses, paging 1989 80486 1992 Pentium 1995 Pentium Pro few changes; 1997 MMX (later) Intel 80386 ISA 8 32-bit registers two register machine: src1/dst src2 reg - reg, reg - immed, reg - mem, mem - reg, mem - imm seven addressing modes 16-bit and 32-bit ops on memory and registers 8-bit prefix to override default data size condition codes + part of normal operation, faster than MIPS extends operation time, often not test result EE365 Lecture Notes: Chapter 3 41 EE365 Lecture Notes: Chapter 3 42 Intel 80386 ISA Current Approach Decoding nightmare instructions 1 to 17 bytes prefixes, postfixes crazy formats - e.g., register specifiers move around but key instructions not terrible yet got to make all work correctly Current technique in Pentium Pro Instruction decode logic translates into RISCy Ops Execution unit runs RISCy ops + Backward compatibility Complex decoding + Execution unit as fast as RISC We work with MIPS to keep it simple and clean Learn x86 on the job! EE365 Lecture Notes: Chapter 3 43 EE365 Lecture Notes: Chapter 3 44

Complex Instructions More powerful instructions not necessarily faster execution E.g. - string copy option 1: move with repeat prefix for memory to memory move special-purpose option 2: use loads into register and then stores generic instructions option 2 faster on the same machine! Concluding Remarks Simple and regular same length instructions, fields in same place Small and fast Small number of registers Compromises inevitable Pipelining (buffet concept) should not be hindered Common case fast Read Chapter 3 EE365 Lecture Notes: Chapter 3 45 EE365 Lecture Notes: Chapter 3 46