Lecture 27 C and Assembly



Similar documents
X86-64 Architecture Guide

Instruction Set Architecture

Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Assembly Language: Function Calls" Jennifer Rexford!

64-Bit NASM Notes. Invoking 64-Bit NASM

CS61: Systems Programing and Machine Organization

CS:APP Chapter 4 Computer Architecture Instruction Set Architecture. CS:APP2e

Simple C Programs. Goals for this Lecture. Help you learn about:

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

Machine-Level Programming II: Arithmetic & Control

Return-oriented programming without returns

Machine Programming II: Instruc8ons

Language Processing Systems

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

How to make the computer understand? Lecture 15: Putting it all together. Example (Output assembly code) Example (input program) Anatomy of a Computer

TODAY, FEW PROGRAMMERS USE ASSEMBLY LANGUAGE. Higher-level languages such

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

x64 Cheat Sheet Fall 2015

Compilers I - Chapter 4: Generating Better Code

Programming from the Ground Up

Notes on Assembly Language

Chapter 7D The Java Virtual Machine

Hacking Techniques & Intrusion Detection. Ali Al-Shemery arabnix [at] gmail

l C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language

Programming from the Ground Up

Programming from the Ground Up. Jonathan Bartlett

1 Classical Universal Computer 3

High-speed image processing algorithms using MMX hardware

A Tiny Guide to Programming in 32-bit x86 Assembly Language

Chapter 4 Processor Architecture

Andreas Herrmann. AMD Operating System Research Center

İSTANBUL AYDIN UNIVERSITY

CHAPTER 7: The CPU and Memory

Automated Repair of Binary and Assembly Programs for Cooperating Embedded Devices

MICROPROCESSOR AND MICROCOMPUTER BASICS

MACHINE ARCHITECTURE & LANGUAGE

a storage location directly on the CPU, used for temporary storage of small amounts of data during processing.

For a 64-bit system. I - Presentation Of The Shellcode

Introduction. Application Security. Reasons For Reverse Engineering. This lecture. Java Byte Code

An Overview of Stack Architecture and the PSC 1000 Microprocessor

Introduction. Compiler Design CSE 504. Overview. Programming problems are easier to solve in high-level languages

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

CSC 2405: Computer Systems II

Where we are CS 4120 Introduction to Compilers Abstract Assembly Instruction selection mov e1 , e2 jmp e cmp e1 , e2 [jne je jgt ] l push e1 call e

2) Write in detail the issues in the design of code generator.

LC-3 Assembly Language

A3 Computer Architecture

Intel Assembler. Project administration. Non-standard project. Project administration: Repository

PART B QUESTIONS AND ANSWERS UNIT I

Informatica e Sistemi in Tempo Reale

Instruction Set Design

Intel 8086 architecture

6.S096 Lecture 1 Introduction to C

Off-by-One exploitation tutorial

The programming language C. sws1 1

Programing the Microprocessor in C Microprocessor System Design and Interfacing ECE 362

Comp 255Q - 1M: Computer Organization Lab #3 - Machine Language Programs for the PDP-8

picojava TM : A Hardware Implementation of the Java Virtual Machine

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

PROBLEMS (Cap. 4 - Istruzioni macchina)

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Microprocessor & Assembly Language

Intro to Intel Galileo - IoT Apps GERARDO CARMONA

LSN 2 Computer Processors

Stack Allocation. Run-Time Data Structures. Static Structures

Compiler Construction

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

ASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER

Buffer Overflows. Security 2011

Chapter 4 Lecture 5 The Microarchitecture Level Integer JAVA Virtual Machine

Introduction to Embedded Systems. Software Update Problem

Keil C51 Cross Compiler

An Introduction to the ARM 7 Architecture

How To Port A Program To Dynamic C (C) (C-Based) (Program) (For A Non Portable Program) (Un Portable) (Permanent) (Non Portable) C-Based (Programs) (Powerpoint)

Use of Simulator in Teaching Introductory Computer Engineering*

Instruction Set Architecture. or How to talk to computers if you aren t in Star Trek

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

1 Abstract Data Types Information Hiding

W4118 Operating Systems. Junfeng Yang

Computer Systems Architecture

Introduction. What is an Operating System?

Computer Organization and Components

Instruction Set Architecture

Exceptions in MIPS. know the exception mechanism in MIPS be able to write a simple exception handler for a MIPS machine

CS:APP Chapter 4 Computer Architecture. Wrap-Up. William J. Taffe Plymouth State University. using the slides of

C Compiler Targeting the Java Virtual Machine

Systems Design & Programming Data Movement Instructions. Intel Assembly

Administrative Issues

Under The Hood: The System Call

Computer Organization and Architecture

The Hardware/Software Interface CSE 351 Spring 2016

1. General function and functionality of the malware

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Notes on x86-64 programming

Lecture 21: Buffer Overflow Attack. Lecture Notes on Computer and Network Security. by Avi Kak

Computer Architectures

Transcription:

Ananda Gunawardena Lecture 27 C and Assembly This is a quick introduction to working with x86 assembly. Some of the instructions and register names must be check for latest commands and register names. Programming in assembly language requires one to understand the instruction set architecture of the processor. Writing a program in machine language or assembly language is like programming a microprocessor kit. It requires the understanding of low level details of how a machine may execute a set of instructions, fetch-execute cycle among other things. Today most programmers don t deal directly with assembly language, unless the task requires direct interfacing with hardware. For example, a programmer may consider using an assembly language to write a device driver or optimize part of a game program. To understand the assembly code, Let us consider the simple code below. #this is in a file first.s movl $20, %eax movl $10, %ebx The first line of the program is a comment. The.globl assembler directive makes the symbol main visible to the linker. This line makes the program linked up with the C start up library. If we try to remove this line, then we get the following message % gcc first.s /usr/lib/crt1.o: In function `_start : : undefined reference to `main collect2: ld urned 1 exit status The commands like movl $20, %eax means that the bit pattern of 20 is moved to register %eax. Using the GNU C compiler S option, we can generate the assembly code for a source code. For example, consider the program simple.c int main(){ int x=10,y=15; urn 0; } When the program is compiled with S option gcc S simple.c

it produces the assembly code file simple.s as shown below. pushq %rbp movq %rsp, %rbp movl $10, -8(%rbp) movl $15, -4(%rbp) The intent here is to give some level of understanding of how assembly code works. There are traces of the initialization of x and y in the code as well as many uses of esp (stack pointer) and ebp (base pointer) references. The l at the end of each instruction indicates that we are using opcode that works with 32-bit operands. The registers are indicated by % in front and -4(%ebp) for example, indicates a reference to ebp-4 location. For example, movl $10, -4(%ebp) indicates moving the value 10 to ebp-4. Let us look at another example of a C program converted into assembly code. #include <stdio.h> int main(){ printf("hello world\n"); urn 0; } // Assembly code.lc0:.string "hello world" pushq %rbp movq %rsp, %rbp movl $.LC0, %edi call printf In 15-213 you will be required to program in x86 assembly. In this short introduction we only expect to understand how to interp simple assembly programs, how to write simple assembly programs, assemble programs into object code and how to link them up with a C program.

Looping with Assembly Unlike high level languages, assembly language does not have any direct loop constructs. Instead a programmer makes use of command like je (jump equal) to test for loop exit condition. Here is a program to find the factorial 4 and we can assemble and run this program # factorial of 4 # in file factorial.s.lc0:.string "%d \n".global main movl $4, %eax movl $1, %ebx L1: cmpl $0, %eax je L2 imul %eax, %ebx decl %eax jmp L1 L2: movl %ebx, 4(%esp) movl $.LC0, (%esp) call printf The program correctly prints out the factorial of 4. Caution: This program may not contain all assembly directives needed to compile and run. If you type this into a file and run, it may segfault. Functions A function is a piece of code that is designed to perform a subtask in the program. Functions can have local variables, receive arguments, and pass a result back to the calling program. Consider the following subroutine foo that urn the value 4 to main. # in file subroutine.c and subroutine.s.globl foo.type foo, @function foo: pushl %ebp movl %esp, %ebp movl $4, %eax popl %ebp.lc0:.string "The value is %d \n" pushl %ebp

movl %esp, %ebp subl %eax, %esp call foo movl %eax, -4(%ebp) movl -4(%ebp), %eax movl %eax, 4(%esp) movl $.LC0, (%esp) call printf A function is called with the instruction call foo. This instruction transfers the control of the program to foo function. The two special registers ebp (base pointer) and esp (stack pointer) handles call and urn mechanisms of function calls. The values are urned to the calling program via register eax. Stack A stack data structure allows two operations, push and pop. Both operations are handled from the top of the stack. Imagine a stack of books, where you can add a book to the top (push) and remove a book from the top (pop). Stacks are useful data structures for saving the status of a program before branching out. For example, when a subroutine is called from main, the status of the environment is pushed into the stack. Upon urn from the subroutine, the status variables are pop from the stack to restore the calling program status. The key operations of a stack are the push and pop. For example, popl %eax places the top element of the stack on the register eax and change the stack pointer esp. The stack pointer (esp) points to the top of the stack. To decrement the stack pointer esp we can simply add 4 as follows. Stacks may grow upward or downward. The instruction addl $4, %esp causes the stack pointer to point to the next 4 bytes of memory. Note that if the stack grows downward, then a push operation subtracts 4 from esp and pop operation adds 4 to esp. Local Variables C programs generally define local variables scope as the module where they are defined. The registers are used to manipulate the variables, but local variables are generally stored in the stack. Some variables, perhaps declared as register variables may hold space in a register, but most variables do not get register space but instead allocated space in the stack. Consider the following program that defines a local variable of value 10 and push that into the stack. Upon exit from foo all local variables are removed from the stack. In

the above code, call foo causes the address of the instruction after call foo is to be saved in the stack. # in file local.s.global main movl $3, %eax call foo # save the address of the next instruction on stack subl $4, %eax foo: pushl %ebp # save the address of base pointer movl %esp, %ebp # move the base pointer to where stack pointer movl $10, %ebx # push the local variable value to a register pushl %ebx # push the value of register to stack movl %ebp, %esp # restore the esp upon urn popl %ebp # restore the base pointer # pop the stack to see where to urn Mixing C and Assembly C programs and assembly programs can be mixed to form a complete program. For example, consider the following C program // file in main.c #include <stdio.h> int main(){ int i = foo(5); printf( The value is %d \n, i); urn 0; } Now consider the function foo written in assembly # file in foo.s.global foo foo: movl 4(%esp), %eax # (esp+4) contains the value 5 imull %eax, %eax # multiply the register eax by itself # urn values are given back thru eax Both programs can be compiled, linked and run using gcc main.c foo.s./a.out Global Variables A global variable in a C program is declared outside of any function (including main). For example, the following program defines a global variable x as follows. // in file global.c

int x = 10; int main( ) { int y = x; } The assembly output produced by C compiler includes a global identification of the variable x..globl x.data.align 4.type x, @object.size x, 4 x:.long 10 We have provided only a part of the output. The full assembly code can be generated by compiling the file global.c > gcc S global.c Conclusion This brief introductory lecture was intended to give you some very basic facts about assembly language, the stack, linking assembly and C programs. Here are few more references. References: [1] From C to Assembly Linux Gazette, Issue 94, September 2003 - Hiran Ramakutty [2] Guide to Assembly Language Programming in Linux (Paperback) by Sivarama P. Dandamudi, Springer-Verlag, 2005 [3] Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron Prentice Hall, 2003. [ 15-213 textbook]