64-Bit NASM Notes. Invoking 64-Bit NASM



Similar documents
Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Hacking Techniques & Intrusion Detection. Ali Al-Shemery arabnix [at] gmail

5. Calling conventions for different C++ compilers and operating systems

x64 Cheat Sheet Fall 2015

Test Driven Development in Assembler a little story about growing software from nothing

X86-64 Architecture Guide

Andreas Herrmann. AMD Operating System Research Center

Abysssec Research. 1) Advisory information. 2) Vulnerable version

Lecture 27 C and Assembly

Assembly Language: Function Calls" Jennifer Rexford!

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

Systems Design & Programming Data Movement Instructions. Intel Assembly

The Plan Today... System Calls and API's Basics of OS design Virtual Machines

Buffer Overflows. Security 2011

Return-oriented programming without returns

Instruction Set Architecture

For a 64-bit system. I - Presentation Of The Shellcode

Off-by-One exploitation tutorial

Stack Overflows. Mitchell Adair

CS61: Systems Programing and Machine Organization

CS:APP Chapter 4 Computer Architecture Instruction Set Architecture. CS:APP2e

OS Virtualization Frank Hofmann

A Tiny Guide to Programming in 32-bit x86 Assembly Language

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

Intel 8086 architecture

CHAPTER 6 TASK MANAGEMENT

High-speed image processing algorithms using MMX hardware

Format string exploitation on windows Using Immunity Debugger / Python. By Abysssec Inc

Assembly Language Tutorial

THE VELOX STACK Patrick Marlier (UniNE)

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer

Notes on x86-64 programming

Intel Vanderpool Technology for IA-32 Processors (VT-x) Preliminary Specification

Overview of IA-32 assembly programming. Lars Ailo Bongo University of Tromsø

Understanding a Simple Operating System

Exploiting nginx chunked overflow bug, the undisclosed attack vector

Introduction. Figure 1 Schema of DarunGrim2

Software Fingerprinting for Automated Malicious Code Analysis

TODAY, FEW PROGRAMMERS USE ASSEMBLY LANGUAGE. Higher-level languages such

Using the RDTSC Instruction for Performance Monitoring

Machine-Level Programming I: Basics

W4118 Operating Systems. Junfeng Yang

Instruction Set Architecture

Lecture 26: Obfuscation

An introduction to the Return Oriented Programming. Why and How

Introduction. Application Security. Reasons For Reverse Engineering. This lecture. Java Byte Code

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

esrever gnireenigne tfosorcim seiranib

Using MMX Instructions to Convert RGB To YUV Color Conversion

Syscall Proxying - Simulating remote execution Maximiliano Caceres <maximiliano.caceres@corest.com> Copyright 2002 CORE SECURITY TECHNOLOGIES

Computer Architectures

Software Vulnerabilities

1. General function and functionality of the malware

Computer Architectures

Machine-Level Programming II: Arithmetic & Control

Attacking x86 Windows Binaries by Jump Oriented Programming

System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version

8. MACROS, Modules, and Mouse

PC Assembly Language. Paul A. Carter

Computer Organization and Architecture

System/Networking performance analytics with perf. Hannes Frederic Sowa

612 CHAPTER 11 PROCESSOR FAMILIES (Corrisponde al cap Famiglie di processori) PROBLEMS

Dynamic Behavior Analysis Using Binary Instrumentation

The programming language C. sws1 1

Computer Organization and Assembly Language

AES Flow Interception : Key Snooping Method on Virtual Machine. - Exception Handling Attack for AES-NI -

Sources: On the Web: Slides will be available on:

and Symbiotic Optimization

Automated Repair of Binary and Assembly Programs for Cooperating Embedded Devices

Binary Code Extraction and Interface Identification for Security Applications

Informatica e Sistemi in Tempo Reale

How Compilers Work. by Walter Bright. Digital Mars

Chapter 7D The Java Virtual Machine

Faculty of Engineering Student Number:

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

CSC 2405: Computer Systems II

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Unpacked BCD Arithmetic. BCD (ASCII) Arithmetic. Where and Why is BCD used? From the SQL Server Manual. Packed BCD, ASCII, Unpacked BCD

Bypassing Windows Hardware-enforced Data Execution Prevention

Under The Hood: The System Call

CSC230 Getting Starting in C. Tyler Bletsch

x86-64 Machine-Level Programming

Computer Organization and Components

Compiler Construction

Practical taint analysis for protecting buggy binaries

Automatic Logging of Operating System Effects to Guide Application-Level Architecture Simulation

The Beast is Resting in Your Memory On Return-Oriented Programming Attacks and Mitigation Techniques To appear at USENIX Security & BlackHat USA, 2014

Language Processing Systems

How To Use A Computer With A Screen On It (For A Powerbook)

Introduction to Reverse Engineering

TitanMist: Your First Step to Reversing Nirvana TitanMist. mist.reversinglabs.com

The IA-32 processor architecture

Programming from the Ground Up. Jonathan Bartlett

Harnessing Intelligence from Malware Repositories

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T)

Pentium vs. Power PC Computer Architecture and PCI Bus Interface

IA-32 Intel Architecture Software Developer s Manual

Transcription:

64-Bit NASM Notes The transition from 32- to 64-bit architectures is no joke, as anyone who has wrestled with 32/64 bit incompatibilities will attest We note here some key differences between 32- and 64-bit Intel assembly language programming, both in general and with NASM specifically It s a good idea to know, for various operating systems, how to detect the underlying version/architecture (on Unix-like platforms, uname -a does the trick) Invoking 64-Bit NASM NASM became 64-bit capable as of version 2.0: invoke nasm -v to check the version you re running When assembling, make sure to specify a 64-bit format elf64 for most 64-bit Linux architectures macho64 for 64-bit Mac OS X win64 for 64-bit Windows Given the right object files, no command changes should be necessary when linking via gcc

Registers The primary new capability in 64-bit architectures is the ability to operate on a quadword s worth of data in a single instruction Addressable memory, both virtual and physical, becomes larger by virtue of 64-bit pointers/addresses Structurally, most registers are larger (64 bits wide, duh), and there are more of them (16 general-purpose registers vs. 8 in 32-bit Intel CPUs) Registers eax, ebx, ecx, edx, ebp, esp, esi, and edi are now 64-bit: rax, rbx, rcx, rdx, rbp, rsp, rsi, rdi The new general-purpose registers are r8, r9, r10, r11, r12, r13, r14, and r15 these are also available in 32-bit flavors r8d r15d Most of the time, operands that are smaller than 64 bits zero-extend to 64 bits The default operand size is 32 bits except when pushing/popping the stack: that s 64- or 16-bit only When in doubt, consult Chapter 3 of the Intel 64 and IA-32 Architectures Software Developer s Manual, Volume 1

Calling Conventions Calling conventions are platform-specific, each with official documentation typically called the application binary interface (ABI) For Linux, Windows, and Mac OS X respectively, the specifications can be found at: http://www.x86-64.org/documentation/abi.pdf http://msdn2.microsoft.com/en-gb/library/ms794533.aspx http://developer.apple.com/mac/library/documentation/ DeveloperTools/Conceptual/LowLevelABI Some calling convention highlights on 64-bit Linux: Integer/pointer parameters are placed, in order, in rdi, rsi, rdx, rcx, r8, and r9 Floating-point arguments go to the xmm registers Variable-argument subroutines require a value in rax for the number of vector registers used Registers rbp, rbx, and r12 through r15 are callerowned the called function must preserve them (either don t touch them, or save-and-restore via the stack or other mechanism) Integer/pointer urn values are placed in rax or possibly rdx; floating point goes in xmm0 or xmm1

64-Bit Examples The following listings include direct conversions of some of the 32-bit examples in Prof. Toal s x86assembly and nasmexamples pages to 64-bit Linux Note how, aside from calling conventions and selected conversion to 64-bit registers, not much has actually changed i.e., the main concepts of good assembly language programming remain the same Conversion to other 64-bit Intel operating systems is left as an interesting and beneficial exercise :) Of course, we start with helloworld global _start _start: ; write(1, message, 13) mov eax, 4 ; system call 4 is write mov ebx, 1 ; file handle 1 is stdout mov ecx, message ; address of string to output mov edx, 13 ; number of bytes int 80h No changes here, since we use interrupts instead of subroutines! ; exit(0) mov eax, 1 ; system call 1 is exit mov ebx, 0 ; we want urn code 0 int 80h message: db "Hello, World", 10 The version that uses printf is another story: compare this to the 32-bit version main: mov rdi, message ; rdi gets the first argument (a pointer) xor rax, rax ; printf has a variable number of arguments, ; so rax needs to be set to the number of ; vector registers used...zero in this case message: db 'Hello, World', 10, 0

powers.asm needs a similar makeover since it also uses printf note the use of the stack for register preservation section.data format: db '%d', 10, 0 main: L1: mov esi, 1 ; current value mov edi, 31 ; counter push rsi ; save registers push rdi mov rdi, format ; address of format string ; second argument, the current number, is already in rsi xor eax, eax ; zero vector registers (eax is OK) pop rdi ; restore registers pop rsi add esi, esi ; double value dec edi ; keep counting jne L1 main: print: push rbx ; we have to save this since we use it ; 32-bit operands will zero-extend to 64 bits mov ecx, 40 ; ecx will countdown from 40 to 0 xor eax, eax ; eax will hold the current number xor ebx, ebx ; ebx will hold the next number inc ebx ; ebx is originally 1 ; We need to, but we are using eax, ebx, and ecx. ; printf may destroy eax and ecx so we will save these before ; the call and restore them afterwards. push rax ; 32-bit stack operands are not encodable push rcx ; in 64-bit mode, so we use the "r" names mov rdi, format ; arg 1 is a pointer mov rsi, rax ; arg 2 is the current number xor eax, eax ; no vector registers in use 64-bit fib.asm must preserve the caller-owned rbx register pop pop rcx rax mov edx, eax ; save the current number mov eax, ebx ; next number is now current add ebx, edx ; get the new next number dec ecx ; count down jnz print ; if not done counting, do some more pop rbx ; restore ebx before urning format: db '%10d', 10, 0

maxofthree in 32- and 64-bit incarnations global maxofthree maxofthree: mov eax, [esp + 4] mov ecx, [esp + 8] mov edx, [esp + 12] cmp eax, ecx cmovl eax, ecx cmp eax, edx cmovl eax, edx global maxofthree maxofthree: cmp edi, esi ; compare args 1 and 2 cmovl edi, esi ; set edi to the larger cmp edi, edx ; compare against arg 3 cmovl edi, edx ; set edi to the larger mov eax, edi ; urn value in rax (note how we can use the operands right away; urn value remains expected in eax) #include <stdio.h> both work with the same C source (why?). int maxofthree(int, int, int); int main() { printf("%d\n", maxofthree(1, -4, -7)); printf("%d\n", maxofthree(2, -6, 1)); printf("%d\n", maxofthree(2, 3, 1)); printf("%d\n", maxofthree(-2, 4, 3)); printf("%d\n", maxofthree(2, -6, 5)); printf("%d\n", maxofthree(2, 4, 6)); urn 0; } 64-bit does not change how main is still just a function but accordingly, command line arguments need to be processed using the new ABI, as seen in 64-bit echo.asm main: top: mov rcx, rdi ; argc mov rdx, rsi ; argv push rcx ; save registers that printf wastes push rdx mov rdi, format ; the format string mov rsi, [rdx] ; the argument string to display xor rax, rax ; zero vector registers pop rdx ; restore registers printf used pop rcx add rdx, 8 ; point to next argument dec rcx ; count down jnz top ; if not done counting keep going format: db '%s', 10, 0