MIPS ISA-II: Procedure Calls & Program Assembly


Module Outline
- Review ISA and understand instruction encodings
- Arithmetic and logical instructions
- Review memory organization
- Memory (data movement) instructions
- Control flow instructions
- Procedure/function calls
- Program assembly, linking, and encoding

Reading
- 2.8, 2.12
- Appendix A: A.1 - A.6
- Practice problems: 10, 14, 23

Goals
- Understand the binary encoding of complete program executables
  - How can procedures be independently compiled and linked (e.g., libraries)?
  - What makes up an executable?
  - How do libraries become part of the executable?
  - What is the role of the ISA in encoding programs?
  - What constitutes the hardware/software interface?

Procedure Calls
Basic functionality
- Transfer of parameters and control to the procedure
- Transfer of results and control back to the calling program
- Support for nested procedures
What is so hard about this?
- Consider independently compiled code modules
  - Where are the inputs?
  - Where should I place the outputs?
  - Recall: what do you need to know when you write procedures in C?

Specifics
Where do we pass data?
- Preferably in registers: make the common case fast
- Memory as an overflow area
Nested procedures
- The stack, $fp, $sp and $ra
- Saving and restoring machine state
Set of rules that developers/compilers abide by
- Which registers am I permitted to use with no consequence?
- Caller and callee save conventions for MIPS

Basic Parameter Passing
        .data
arg1:   .word 22, 20, 16, 4
arg2:   .word 33, 34, 45, 8

        .text
        addi $t0, $0, 4
        move $t3, $0
        move $t1, $0
        move $t2, $0
loop:   beq  $t0, $0, exit
        addi $t0, $t0, -1
        lw   $a0, arg1($t1)
        lw   $a1, arg2($t2)
        jal  func               # $31 = PC + 4, then PC = func
        add  $t3, $t3, $v0
        addi $t1, $t1, 4
        addi $t2, $t2, 4
        j    loop
func:   sub  $v0, $a0, $a1
        jr   $ra                # PC = $31
exit:   ---

- Register usage
- What about nested calls?
- What about excess arguments?

Leaf Procedure Example
C code:
    int leaf_example (int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }
- Arguments g, ..., j are passed in $a0, ..., $a3
- f in $s0 (we need to save $s0 on the stack; we will see why later)
- Results are returned in $v0, $v1
[Figure: argument registers $a0-$a3 feed the procedure; result registers $v0-$v1 carry values back.]

Procedure Call Instructions
Procedure call: jump and link
    jal ProcedureLabel
- Address of the following instruction is put in $ra
- Jumps to the target address
Procedure return: jump register
    jr $ra
- Copies $ra to the program counter
- Can also be used for computed jumps
  - e.g., for case/switch statements

Leaf Procedure Example
MIPS code:
leaf_example:
        addi $sp, $sp, -4       # save $s0 on stack
        sw   $s0, 0($sp)
        add  $t0, $a0, $a1      # procedure body
        add  $t1, $a2, $a3
        sub  $s0, $t0, $t1
        add  $v0, $s0, $zero    # result
        lw   $s0, 0($sp)        # restore $s0
        addi $sp, $sp, 4
        jr   $ra                # return

Procedure Call Mechanics
[Figure: system-wide memory map from high address to low address (stack, dynamic data, static data, text, reserved) with $sp, $fp, $gp and PC marked. The old and new stack frames are shown; the new frame holds argument registers, the return address, saved registers, and local variables. The compiler manages the layout; the ISA and hardware provide the addressing.]

Example of the Stack Frame
[Figure: a stack frame between $fp and $sp holding, from the top: arg 1, arg 2, ...; callee saved registers ($fp, $ra, $s0-$s7); caller saved registers ($a0-$a3, $t0-$t9); local variables.]

Call Sequence
1. Place excess arguments
2. Save caller save registers ($a0-$a3, $t0-$t9)
3. jal
4. Allocate stack frame
5. Save callee save registers ($s0-$s7, $fp, $ra)
6. Set frame pointer

Return
1. Place the function result in $v0
2. Restore callee save registers
3. Restore $fp
4. Pop frame
5. jr $31

Policy of Use Conventions
Name        Register number   Usage
$zero       0                 the constant value 0
$v0-$v1     2-3               values for results and expression evaluation
$a0-$a3     4-7               arguments
$t0-$t7     8-15              temporaries
$s0-$s7     16-23             saved
$t8-$t9     24-25             more temporaries
$gp         28                global pointer
$sp         29                stack pointer
$fp         30                frame pointer
$ra         31                return address
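The register table above maps directly onto a lookup that an assembler or a convention checker might use. The following is a minimal sketch in C, not part of the original slides: the names `REG_NAMES`, `reg_number`, and `is_callee_saved` are hypothetical, and the entries for registers 1, 26, and 27 ($at, $k0, $k1) are filled in from the standard MIPS convention because the slide's table omits them.

```c
#include <string.h>

/* Conventional MIPS register names, indexed by register number.
   Entries 1, 26 and 27 ($at, $k0, $k1) are not in the slide's table;
   they come from the standard MIPS convention (assumption). */
static const char *const REG_NAMES[32] = {
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0",   "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0",   "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8",   "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"
};

/* Return the register number for a symbolic name, or -1 if unknown. */
int reg_number(const char *name)
{
    for (int i = 0; i < 32; i++) {
        if (strcmp(REG_NAMES[i], name) == 0)
            return i;
    }
    return -1;
}

/* Nonzero if the call sequence on the slide lists this register as
   callee save: $s0-$s7 (16-23), $fp (30), $ra (31). */
int is_callee_saved(int reg)
{
    return (reg >= 16 && reg <= 23) || reg == 30 || reg == 31;
}
```

For instance, `reg_number("$a0")` gives 4 and `reg_number("$ra")` gives 31, matching the table.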

Summary: Register Usage
- $a0-$a3: arguments (regs 4-7)
- $v0, $v1: result values (regs 2 and 3)
- $t0-$t9: temporaries
  - Can be overwritten by the callee
- $s0-$s7: saved
  - Must be saved/restored by the callee
- $gp: global pointer for static data (reg 28)
- $sp: stack pointer (reg 29)
- $fp: frame pointer (reg 30)
- $ra: return address (reg 31)

Non-Leaf Procedures
- Procedures that call other procedures
- For a nested call, the caller needs to save on the stack:
  - Its return address
  - Any arguments and temporaries needed after the call
- Restore from the stack after the call

Non-Leaf Procedure Example
C code:
    int fact (int n)
    {
        if (n < 1) return 1;
        else return n * fact(n - 1);
    }
- Argument n in $a0
- Result in $v0

Template for a Procedure
1. Allocate stack frame (decrement stack pointer)
2. Save any registers (callee save registers)
3. Procedure body (remember some arguments may be on the stack!)
4. Restore registers (callee save registers)
5. Pop stack frame (increment stack pointer)
6. Return (jr $ra)

Non-Leaf Procedure Example
    int fact (int n)
    {                                   /* callee save */
        if (n < 1) return 1;
        else return n * fact(n - 1);
    }                                   /* restore */

Non-Leaf Procedure Example
MIPS code:
fact:
        addi $sp, $sp, -8       # adjust stack for 2 items    (callee save)
        sw   $ra, 4($sp)        # save return address
        sw   $a0, 0($sp)        # save argument
        slti $t0, $a0, 1        # test for n < 1              (termination check)
        beq  $t0, $zero, L1
        addi $v0, $zero, 1      # if so, result is 1          (leaf node)
        addi $sp, $sp, 8        # pop 2 items from stack
        jr   $ra                # and return
L1:     addi $a0, $a0, -1       # else decrement n            (recursive call)
        jal  fact               # recursive call
        lw   $a0, 0($sp)        # restore original n          (intermediate node)
        lw   $ra, 4($sp)        # and return address
        addi $sp, $sp, 8        # pop 2 items from stack
        mul  $v0, $a0, $v0      # multiply to get result
        jr   $ra                # and return
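To see why the argument must be saved before the recursive call, it can help to make the stack explicit. The sketch below is an illustration in C, not the slide author's code: the array `stack`, the index `sp`, and the size `STACK_WORDS` are hypothetical stand-ins for memory and $sp, and only the saved argument is modeled (C itself keeps the return address that the MIPS code stores with `sw $ra`).

```c
#include <assert.h>

#define STACK_WORDS 64           /* arbitrary size chosen for this sketch */

static int stack[STACK_WORDS];   /* stands in for the memory stack */
static int sp = STACK_WORDS;     /* grows toward lower indices, like $sp */

/* fact written so each step lines up with the MIPS code:
   push the argument, recurse on n - 1, pop, then multiply. */
int fact(int n)
{
    int result;

    assert(sp > 0);              /* guard against overflowing the sketch's stack */
    stack[--sp] = n;             /* sw $a0, 0($sp): save argument */

    if (n < 1) {
        result = 1;              /* leaf node: base case returns 1 */
    } else {
        result = fact(n - 1);    /* jal fact with $a0 = n - 1 */
        n = stack[sp];           /* lw $a0, 0($sp): restore original n */
        result = n * result;     /* mul $v0, $a0, $v0 */
    }

    sp++;                        /* pop the frame, as addi $sp, $sp, 8 does */
    return result;
}
```

Calling `fact(5)` returns 120. Each active call holds one slot of `stack`, just as each MIPS activation holds one two-word frame.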

Module Outline
- Review ISA and understand instruction encodings
- Arithmetic and logical instructions
- Review memory organization
- Memory (data movement) instructions
- Control flow instructions
- Procedure/function calls
- Program assembly, linking, and encoding

The Complete Picture
Reading: 2.12, A.2, A.3, A.4, A.5
[Figure: C program -> compiler -> assembly -> assembler -> object module (plus object library) -> linker -> executable -> loader -> memory]

The Assembler
Create a binary encoding of all native instructions
- Translation of all pseudo-instructions
- Computation of all branch offsets and jump addresses
- Symbol table for unresolved (library) references
Create an object file with all pertinent information
- Header (information)
- Text segment
- Data segment
- Relocation information
- Symbol table

Assembly Process
- One pass vs. two pass assembly
- Effect of fixed vs. variable length instructions
- Time, space and one pass assembly
- Local labels, global labels, external labels and the symbol table
  - What does it mean when a symbol is unresolved?
- Absolute addresses and relocation

Example
Assembly program:
        .data
L1:     .word 0x44, 22, 33, 55      # array
        .text
        .globl main
main:   la   $t0, L1
        li   $t1, 4
        add  $t2, $t2, $zero
loop:   lw   $t3, 0($t0)
        add  $t2, $t2, $t3
        addi $t0, $t0, 4
        addi $t1, $t1, -1
        bne  $t1, $zero, loop
        bgt  $t2, $0, then
        move $s0, $t2
        j    exit
then:   move $s1, $t2
exit:   li   $v0, 10
        syscall

Address      Assembled binary   Native instruction
[00400000]   3c081001           lui  $8, 4097 [L1]
[00400004]   34090004           ori  $9, $0, 4
[00400008]   01405020           add  $10, $10, $0
[0040000c]   8d0b0000           lw   $11, 0($8)
[00400010]   014b5020           add  $10, $10, $11
[00400014]   21080004           addi $8, $8, 4
[00400018]   2129ffff           addi $9, $9, -1
[0040001c]   1520fffc           bne  $9, $0, -16 [loop-0x0040001c]
[00400020]   000a082a           slt  $1, $0, $10
[00400024]   14200003           bne  $1, $0, 12 [then-0x00400024]
[00400028]   000a8021           addu $16, $0, $10
[0040002c]   0810000d           j    0x00400034 [exit]
[00400030]   000a8821           addu $17, $0, $10
[00400034]   3402000a           ori  $2, $0, 10
[00400038]   0000000c           syscall

What changes when you relocate code?

Linker & Loader
Linker
- Links independently compiled modules
- Determines real addresses
- Updates the executables with real addresses
Loader
- As the name implies
- Specifics are operating system dependent
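The I-type rows of the assembled listing in the Example slide above can be reproduced by packing fields by hand. The following is a minimal sketch in C; `encode_i` and `branch_word_offset` are hypothetical helper names. Note one assumption that matters: the listing measures branch offsets from the branch's own address (-16 bytes from 0x0040001c), whereas the MIPS architecture defines the offset relative to the following instruction (PC + 4), so the base is left as a parameter.

```c
#include <stdint.h>

/* Pack an I-type instruction: opcode(6) | rs(5) | rt(5) | immediate(16). */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int32_t imm)
{
    return ((opcode & 0x3Fu) << 26)
         | ((rs & 0x1Fu) << 21)
         | ((rt & 0x1Fu) << 16)
         | ((uint32_t)imm & 0xFFFFu);   /* keep the low 16 bits, two's complement */
}

/* Signed distance in words from `base` to `target`.
   Pass the branch's own address to match the listing above, or the
   branch address + 4 for the architectural definition. */
int32_t branch_word_offset(uint32_t base, uint32_t target)
{
    return (int32_t)(target - base) / 4;
}
```

With opcode 8 (addi), `encode_i(8, 8, 8, 4)` gives 0x21080004, the row at 0x00400014. With opcode 5 (bne), `encode_i(5, 9, 0, branch_word_offset(0x0040001c, 0x0040000c))` gives 0x1520fffc, the row at 0x0040001c. Because the offset is relative, it does not change when the code is relocated, which is one answer to the slide's question.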

Linking
[Figure: Program A and Program B are each assembled (Assembly A, Assembly B) into object files containing header, text, static data, reloc, symbol table, and debug sections; labels are cross-referenced between them.]
- Why do we need independent compilation?
- What are the issues with respect to independent compilation?
  - References across files (can be to data or code!)
  - Absolute addresses and relocation
Study: Example on pg. 127

Example:
# separate file
        .text
        addi $4, $0, 4          # 0x20040004
        addi $5, $0, 5          # 0x20050005
        jal  func_add           # opcode 000011, target unresolved
done:   ori  $2, $0, 10         # 0x3402000a
        syscall                 # 0x0000000c

# separate file
        .text
        .globl func_add
func_add:
        add  $2, $4, $5         # 0x00851020
        jr   $31                # 0x03e00008

After linking:
0x00400000   0x20040004
0x00400004   0x20050005
0x00400008   ?
0x0040000c   0x3402000a
0x00400010   0x0000000c
0x00400014   0x00851020
0x00400018   0x03e00008

Ans: 0x0c100005
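The answer to the linking example above follows from the J-type format: a 6-bit opcode followed by the target's word address. The sketch below, in C, shows that computation along with the R-type packing for `add` and `jr`, and a hypothetical `relocate_j` that patches a jump when the text segment moves. All function names are illustrative, and `relocate_j` assumes the old and new targets stay within the same 256 MB region, so the upper four address bits are ignored.

```c
#include <stdint.h>

/* R-type: opcode 0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_r(uint32_t rs, uint32_t rt, uint32_t rd,
                  uint32_t shamt, uint32_t funct)
{
    return ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16)
         | ((rd & 0x1Fu) << 11) | ((shamt & 0x1Fu) << 6)
         | (funct & 0x3Fu);
}

/* J-type: opcode(6) | bits 27..2 of the absolute target address. */
uint32_t encode_j(uint32_t opcode, uint32_t target_addr)
{
    return ((opcode & 0x3Fu) << 26) | ((target_addr >> 2) & 0x03FFFFFFu);
}

/* Patch a j/jal whose target moves along with the code from old_base
   to new_base (assumes both lie in the same 256 MB region). */
uint32_t relocate_j(uint32_t insn, uint32_t old_base, uint32_t new_base)
{
    uint32_t target = (insn & 0x03FFFFFFu) << 2;        /* recover byte address */
    uint32_t moved  = target - old_base + new_base;     /* shift by the same delta */
    return (insn & 0xFC000000u) | ((moved >> 2) & 0x03FFFFFFu);
}
```

With jal's opcode 3 and func_add placed at 0x00400014, `encode_j(3, 0x00400014)` yields 0x0c100005, the slide's answer. Unlike a branch offset, this field holds an absolute address, so it is one of the words a linker must rewrite when code is relocated.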

Loading a Program
Load from image file on disk into memory
1. Read header to determine segment sizes
2. Create virtual address space (later)
3. Copy text and initialized data into memory
   - Or set page table entries so they can be faulted in
4. Set up arguments on stack
5. Initialize registers (including $sp, $fp, $gp)
6. Jump to startup routine
   - Copies arguments to $a0, ... and calls main
   - When main returns, do exit syscall

Dynamic Linking
Static linking
- All labels are resolved at link time
- Link all procedures that may be called by the program
- Size of executables?
Dynamic linking: only link/load a library procedure when it is called
- Requires procedure code to be relocatable
- Avoids image bloat caused by static linking of all (transitively) referenced libraries
- Automatically picks up new library versions
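Linking a library procedure only when it is first called is usually done through a level of indirection: calls go through a table slot that initially points at a stub, and the stub asks the linker/loader to resolve the routine and then patches the slot. The following is a minimal sketch of that idea in C using a function pointer. Every name here (`square_slot`, `stub_square`, `resolve`, `lib_square`) is hypothetical, and a real dynamic linker maps code from a shared library rather than returning the address of a local function.

```c
typedef int (*proc_t)(int);

/* Stand-in for a routine that lives in a dynamically mapped library. */
static int lib_square(int x) { return x * x; }

static int resolve_count = 0;      /* how many times the linker/loader ran */
static int stub_square(int x);     /* forward declaration of the stub */

/* Indirection table with one entry: every call goes through this slot,
   which starts out pointing at the stub. */
static proc_t square_slot = stub_square;

/* Stand-in for the linker/loader: look up the routine by ID and
   return its address (the ID is unused in this one-routine sketch). */
static proc_t resolve(int routine_id)
{
    (void)routine_id;
    resolve_count++;
    return lib_square;
}

/* Stub: loads the routine ID, jumps to the linker/loader, patches the
   table, then forwards the original call. */
static int stub_square(int x)
{
    square_slot = resolve(0);
    return square_slot(x);
}

/* What a call site looks like: an indirect call through the table. */
int call_square(int x) { return square_slot(x); }

/* Exposed so the effect of patching can be observed. */
int times_resolved(void) { return resolve_count; }
```

After `call_square(3)` and `call_square(4)`, `times_resolved()` is 1: only the first call pays the resolution cost, and later calls pay one extra indirect jump.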

Lazy Linkage
[Figure: calls go through an indirection table; a stub loads the routine ID and jumps to the linker/loader; the linker/loader code maps the routine; later calls reach the dynamically mapped code.]

The Computing Model Revisited
[Figure: the processor contains the register file (programmer visible state, registers 0x00 through 0x1F), the Arithmetic Logic Unit (ALU), internal buses, and the memory interface, plus programmer invisible state (program counter, instruction register, kernel registers). The memory map, up to address 0xFFFFFFFF, holds the stack, dynamic data, the static data segment, the text segment, and a reserved region.]
Program execution and the von Neumann model

Instruction Set Architectures (ISA)
Instruction set architectures are characterized by several features
1. Operations
   - Types, precision, size
2. Organization of internal storage
   - Stack machine
   - Accumulator
   - General Purpose Registers (GPR)
3. Memory addressing
   - Operand location and addressing

Instruction Set Architectures
4. Memory abstractions
   - Segments, virtual address spaces (more later)
   - Memory mapped I/O (later)
5. Control flow
   - Condition codes
   - Types of control transfers: conditional vs. unconditional
ISA design is the result of many tradeoffs
- Decisions determine hardware implementation
- Impact on time, space, and energy
Check out ISAs for PowerPC, ARM, x86, SPARC, etc.

ARM & MIPS Similarities
- ARM: the most popular embedded core
- Similar basic set of instructions to MIPS

                        ARM              MIPS
Date announced          1985             1985
Instruction size        32 bits          32 bits
Address space           32-bit flat      32-bit flat
Data alignment          Aligned          Aligned
Data addressing modes   9                3
Registers               15 x 32-bit      31 x 32-bit
Input/output            Memory mapped    Memory mapped

Compare and Branch in ARM
Uses condition codes for the result of an arithmetic/logical instruction
- Negative, zero, carry, overflow
- Compare instructions set condition codes without keeping the result
Each instruction can be conditional
- Top 4 bits of instruction word: condition value
- Can avoid branches over single instructions
[Figure: CPU/core with registers $0, $1, ..., $31, an ALU, and the condition flags Z, V, C, N.]

Instruction Encoding
Differences?
[Figure: instruction encoding formats compared.]

The Intel x86 ISA
Evolution with backward compatibility
- 8080 (1974): 8-bit microprocessor
  - Accumulator, plus 3 index-register pairs
- 8086 (1978): 16-bit extension to 8080
  - Complex instruction set (CISC)
- 8087 (1980): floating-point coprocessor
  - Adds FP instructions and register stack
- 80286 (1982): 24-bit addresses, MMU
  - Segmented memory mapping and protection
- 80386 (1985): 32-bit extension (now IA-32)
  - Additional addressing modes and operations
  - Paged memory mapping as well as segments

The Intel x86 ISA
Further evolution
- i486 (1989): pipelined, on-chip caches and FPU
- Pentium (1993): superscalar, 64-bit datapath
  - Later versions added MMX (Multi-Media eXtension) instructions
  - The infamous FDIV bug
- Pentium Pro (1995), Pentium II (1997)
  - New microarchitecture (see Colwell, The Pentium Chronicles)
- Pentium III (1999)
  - Added SSE (Streaming SIMD Extensions) and associated registers
- Pentium 4 (2001)
  - New microarchitecture
  - Added SSE2 instructions

The Intel x86 ISA
And further
- AMD64 (2003): extended architecture to 64 bits
- EM64T, Extended Memory 64 Technology (2004)
  - AMD64 adopted by Intel (with refinements)
  - Added SSE3 instructions
- Intel Core (2006)
  - Added SSE4 instructions, virtual machine support
- AMD64 (announced 2007): SSE5 instructions
- Intel Advanced Vector Extension (AVX, announced 2008)
If Intel didn't extend with compatibility, its competitors would!
- Technical elegance ≠ market success
Commonly thought of as a Complex Instruction Set Architecture (CISC)

Basic x86 Registers
[Figure: the basic x86 register set.]

Basic x86 Addressing Modes
Two operands per instruction

Source/dest operand   Second source operand
Register              Register
Register              Immediate
Register              Memory
Memory                Register
Memory                Immediate

Memory addressing modes
- Address in register
- Address = Rbase + displacement
- Address = Rbase + 2^scale x Rindex (scale = 0, 1, 2, or 3)
- Address = Rbase + 2^scale x Rindex + displacement
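The most general addressing mode above subsumes the other three, which a short computation makes concrete. This is a sketch in C with a hypothetical function name, assuming 32-bit addresses that wrap around on overflow.

```c
#include <stdint.h>

/* Address = Rbase + 2^scale * Rindex + displacement, with scale in 0..3.
   Passing index = 0 and/or disp = 0 gives the simpler modes. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t disp)
{
    /* Shifting left by `scale` multiplies the index by 1, 2, 4, or 8. */
    return base + (index << (scale & 3u)) + (uint32_t)disp;
}
```

For example, indexing the fourth 4-byte element of an array that starts 8 bytes into a structure at 0x1000 is `effective_address(0x1000, 3, 2, 8)`, which is 0x1014. MIPS offers only the base + displacement form, so a compiler has to emit the shift and add as separate instructions.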

x86 Instruction Encoding
Variable length encoding
- Postfix bytes specify addressing mode
- Prefix bytes modify operation
  - Operand length, repetition, locking, ...

Implementing IA-32
Complex instruction set makes implementation difficult
- Hardware translates instructions to simpler microoperations
  - Simple instructions: 1-1
  - Complex instructions: 1-many
- Microengine similar to RISC
- Market share makes this economically viable
Comparable performance to RISC
- Compilers avoid complex instructions
Better code density

Fallacies
Powerful instruction => higher performance
- Fewer instructions required
- But complex instructions are hard to implement
  - May slow down all instructions, including simple ones
- Compilers are good at making fast code from simple instructions
Use assembly code for high performance
- But modern compilers are better at dealing with modern processors
- More lines of code => more errors and less productivity

Fallacies
Backward compatibility => instruction set does not change
- But they do accrete more instructions
[Figure: growth of the x86 instruction set.]

Summary
Instruction complexity is only one variable
- Lower instruction count vs. higher CPI / lower clock rate
Design principles:
- Simplicity favors regularity
- Smaller is faster
- Good design demands compromise
- Make the common case fast
Instruction set architecture
- A very important abstraction indeed!

Study Guide
- Compute the number of bytes needed to encode a SPIM program
- What does it mean for a code segment to be relocatable?
- Identify addresses that need to be modified when a program is relocated
  - Given the new start address, modify the necessary addresses
- Given the assembly of an independently compiled procedure, ensure that it follows the MIPS calling conventions, modifying it if necessary

Study Guide (cont.)
- Given a SPIM program with nested procedures, ensure that you know what registers are stored on the stack as a consequence of a call
- Encode/disassemble jal and jr instructions
- Compute jal encodings for independently compiled modules
- How can I make procedure calls faster?
  - Hint: what is it about a call that takes time?
- How are independently compiled modules linked into a single executable (assuming one calls a procedure located in another)?

Glossary
- Argument registers
- Caller save registers
- Callee save registers
- Disassembly
- Frame pointer
- Independent compilation
- Labels: local, global, external
- Linker/loader
- Linking: static vs. dynamic vs. lazy
- Native instructions
- Nested procedures
- Object file
- One/two pass assembly
- Procedure invocation
- Pseudo instructions
- Relocatable code
- Stack frame
- Stack pointer
- Symbol table
- Unresolved symbol