Return-oriented programming without returns



Similar documents
Assembly Language: Function Calls" Jennifer Rexford!

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

The Beast is Resting in Your Memory On Return-Oriented Programming Attacks and Mitigation Techniques To appear at USENIX Security & BlackHat USA, 2014

Stitching the Gadgets On the Ineffectiveness of Coarse-Grained Control-Flow Integrity Protection

Attacking x86 Windows Binaries by Jump Oriented Programming

Abysssec Research. 1) Advisory information. 2) Vulnerable version

Return-oriented Programming: Exploitation without Code Injection

Off-by-One exploitation tutorial

Lecture 7: Machine-Level Programming I: Basics Mohamed Zahran (aka Z)

Buffer Overflows. Security 2011

Instruction Set Architecture

CS61: Systems Programing and Machine Organization

Software Vulnerabilities

CS:APP Chapter 4 Computer Architecture Instruction Set Architecture. CS:APP2e

Hacking Techniques & Intrusion Detection. Ali Al-Shemery arabnix [at] gmail

X86-64 Architecture Guide

64-Bit NASM Notes. Invoking 64-Bit NASM

Machine-Level Programming II: Arithmetic & Control

Stack Overflows. Mitchell Adair

I Control Your Code Attack Vectors Through the Eyes of Software-based Fault Isolation. Mathias Payer, ETH Zurich

Lecture 27 C and Assembly

Software Fingerprinting for Automated Malicious Code Analysis

Heap-based Buffer Overflow Vulnerability in Adobe Flash Player

l C-Programming l A real computer language l Data Representation l Everything goes down to bits and bytes l Machine representation Language

derop: Removing Return-Oriented Programming from Malware

esrever gnireenigne tfosorcim seiranib

Hardware-Assisted Fine-Grained Control-Flow Integrity: Towards Efficient Protection of Embedded Systems Against Software Exploitation

G-Free: Defeating Return-Oriented Programming through Gadget-less Binaries

Bypassing Windows Hardware-enforced Data Execution Prevention

Software security. Buffer overflow attacks SQL injections. Lecture 11 EIT060 Computer Security

Practical taint analysis for protecting buggy binaries

telnetd exploit FreeBSD Telnetd Remote Exploit Für Compass Security AG Öffentliche Version 1.0 Januar 2012

Computer Organization and Architecture

Introduction. Application Security. Reasons For Reverse Engineering. This lecture. Java Byte Code

Syscall Proxying - Simulating remote execution Maximiliano Caceres <maximiliano.caceres@corest.com> Copyright 2002 CORE SECURITY TECHNOLOGIES

Systems Design & Programming Data Movement Instructions. Intel Assembly

An introduction to the Return Oriented Programming. Why and How

Intel 8086 architecture

Where s the FEEB? The Effectiveness of Instruction Set Randomization

MSc Computer Science Dissertation

Gadge Me If You Can Secure and Efficient Ad-hoc Instruction-Level Randomization for x86 and ARM

A Tiny Guide to Programming in 32-bit x86 Assembly Language

x64 Cheat Sheet Fall 2015

Hotpatching and the Rise of Third-Party Patches

There s a kernel security researcher named Dan Rosenberg whose done a lot of linux kernel vulnerability research

Evaluating a ROP Defense Mechanism. Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis Columbia University

Administration. Instruction scheduling. Modern processors. Examples. Simplified architecture model. CS 412 Introduction to Compilers

Application-Specific Attacks: Leveraging the ActionScript Virtual Machine

Betriebssysteme KU Security

CS 161 Computer Security

Machine Programming II: Instruc8ons

TitanMist: Your First Step to Reversing Nirvana TitanMist. mist.reversinglabs.com

Andreas Herrmann. AMD Operating System Research Center

CHAPTER 6 TASK MANAGEMENT

Fighting malware on your own

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

Chapter 7D The Java Virtual Machine

Violating Database - Enforced Security Mechanisms

CS:APP Chapter 4 Computer Architecture Instruction Set Architecture

For a 64-bit system. I - Presentation Of The Shellcode

Automating Mimicry Attacks Using Static Binary Analysis

Chapter 4 Processor Architecture

Bypassing Memory Protections: The Future of Exploitation

Out Of Control: Overcoming Control-Flow Integrity

Introduction. Figure 1 Schema of DarunGrim2

Bypassing Browser Memory Protections in Windows Vista

WLSI Windows Local Shellcode Injection. Cesar Cerrudo Argeniss (

Buffer Overflows. Code Security: Buffer Overflows. Buffer Overflows are everywhere. 13 Buffer Overflow 12 Nov 2015

Computer Organization and Assembly Language

CS 152 Computer Architecture and Engineering. Lecture 22: Virtual Machines

The Plan Today... System Calls and API's Basics of OS design Virtual Machines

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

Computer Organization and Components

風 水. Heap Feng Shui in JavaScript. Alexander Sotirov.

Intel Assembler. Project administration. Non-standard project. Project administration: Repository

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

TODAY, FEW PROGRAMMERS USE ASSEMBLY LANGUAGE. Higher-level languages such

Securing software by enforcing data-flow integrity

REpsych. : psycholigical warfare in reverse engineering. def con 2015 // domas

1. General function and functionality of the malware

Vigilante: End-to-End Containment of Internet Worms

Computer Architectures

How to Sandbox IIS Automatically without 0 False Positive and Negative

Format string exploitation on windows Using Immunity Debugger / Python. By Abysssec Inc

Transparent ROP Detection using CPU Performance Counters. 他 山 之 石, 可 以 攻 玉 Stones from other hills may serve to polish jade

Exploiting nginx chunked overflow bug, the undisclosed attack vector

General Introduction

Understanding a Simple Operating System

Using fuzzing to detect security vulnerabilities

Linux exploit development part 2 (rev 2) - Real app demo (part 2)

High-speed image processing algorithms using MMX hardware

Overview of IA-32 assembly programming. Lars Ailo Bongo University of Tromsø

Windows XP SP3 Registry Handling Buffer Overflow

Phoenix Technologies Ltd.

Dynamically Translating x86 to LLVM using QEMU

Transcription:

Faculty of Computer Science Institute for System Architecture, Operating Systems Group Return-oriented programming without urns S. Checkoway, L. Davi, A. Dmitrienko, A. Sadeghi, H. Shacham, M. Winandy Dresden, 2010-10-20

Fundamental problem with stacks User input gets written to the stack. x86 allows to specify only read/write rights. Idea: Create programs so that memory pages are either writable or executable, never both. W ^ X paradigm Software: OpenBSD W^X, PaX, RedHat ExecShield Hardware: Intel XD, AMD NX, ARM XN

A perfect W^X world User input ends up in writable stack pages. No execution of this data possible problem solved. But: existing code assumes executable stacks Windows contains a DLL function to disable execution prevention used e.g. for IE <= 6 Nested functions: GCC generates trampoline code on stack

Circumventing W^X We cannot anymore: execute code on the stack directly We still can: Place data on the stack Format string attacks, non-stack overflows, Idea: modify urn address to start of function known to be available e.g., a libc function such as execve() put additional parameters on stack, too urn-to-libc attack

Chaining urns Not restricted to a single function: Modify stack to urn to another function after the first: Param 1 for bar <addr bar> Param 1 for foo Param 2 for foo <addr foo> ESP And why only urn to function beginnings?

Return anywhere x86 instructions have variable lengths (1 16 bytes) x86 allows jumping (urning) to an arbitrary address Idea: scan binaries/libs and find all possible instructions Native RETs: 0xC3 RET bytes within other instructions, e.g. MOV %EAX, %EBX 0x89 0xC3 ADD $1000, %EBX 0x81 0xC3 0x00 0x10 0x00 0x00

Return anywhere Example instruction stream:.. 0x72 0xf2 0x01 0xd1 0xf6 0xc3 0x02 0x74 0x08.. 0x72 0xf2 jb <-12> 0x01 0xd1 add %edx, %ecx 0xf6 0xc3 0x02 test $0x2, %bl 0x74 0x08 je <+8> Three byte forward:.. 0x72 0xf2 0x01 0xd1 0xf6 0xc3 0x02 0x74 0x08.. 0xd1 0xf6 0xc3 shl, %esi

Many different RETs Claim: Any sufficiently large code base e.g. libc, libqt,... consists of 0xC3 bytes == RET with sufficiently many different prefixes == a few x86 instructions terminating in RET (in [Sha07]: gadget) sufficiently many : /lib/libc.so.6 on Ubuntu 10.4 ~17,000 sequences (~6,000 unique)

Return-Oriented Programming Return addresses jump to code gadgets performing a small amount of work Stack contains Data arguments Chain of addresses urning to gadgets Claim: This is enough to write arbitrary programs (and thus: shell code). Return-oriented Programming

ROP: Load constant into register EIP pop %edx 0x00C0FFEE Return Addr ESP EDX: Stack

ROP: Load constant into register EIP pop %edx 0x00C0FFEE ESP EDX: Stack

ROP: Load constant into register EIP pop %edx 0x00C0FFEE ESP Stack EDX: 0x00C0FFEE

ROP: Add 23 to EAX EIP (1) EAX: 19 EDX: 0 (4) ptr to 23 (3) (1) (2) ESP (2) pop %edi (3) pop %edx EDI: 0 23 (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 19 EDX: 0 (4) ptr to 23 EIP (2) pop %edi EDI: 0 (3) (1) (2) ESP (3) pop %edx 23 (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 19 EDX: 0 (4) ptr to 23 EIP (2) pop %edi EDI: addr of (1) (3) (1) (2) ESP (3) pop %edx 23 (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 19 EDX: 0 (4) ptr to 23 (3) ESP (2) pop %edi EDI: addr of (1) (1) (2) EIP (3) pop %edx 23 (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 19 EDX: addr of '23' (4) ptr to 23 ESP (2) pop %edi EDI: addr of (1) (3) (1) (2) EIP (3) pop %edx 23 (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 19 EDX: addr of '23' (4) ptr to 23 ESP (2) pop %edi EDI: addr of (1) (3) (1) (2) (3) pop %edx 23 EIP (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 42 EDX: addr of '23' (4) ptr to 23 ESP (2) pop %edi EDI: addr of (1) (3) (1) (2) (3) pop %edx 23 EIP (4) addl (%edx), %eax push %edi

ROP: Add 23 to EAX (1) EAX: 42 EDX: addr of '23' (1) (4) ptr to 23 ESP (2) pop %edi EDI: addr of (1) (3) (1) (2) (3) pop %edx 23 EIP (4) addl (%edx), %eax push %edi

Return-oriented programming More samples in the paper it is assumed to be Turing-complete. Problem: need to use existing gadgets, limited freedom Yet another limitation, but no show stopper. Good news: Writing ROP code can be automated, there is a C-to-ROP compiler.

ROP protection Assuming use of RETs: Detect abnormal frequency of executed RETs Ensure LIFO principle for stack pointer Compile binaries without 0xC3 bytes Shadow urn stack Other: Address-space layout randomization Runtime CFI checking

ROP without RETs Dissecting RET: 2 operations at once Memory-indirect JMP (modifies control flow) Update processor state (stack pop on x86, register load on ARM) Is it necessary to use it? No! RET-less compilers show exactly this. Just use some sequence that does exactly the same: pop %edx // modifies stack jmp *(%edx) // indirect jump

Update-load-branch Update: update control structures to point to next gadget Load: load next gadget's address Branch: Jump Problem: occurs much less frequent than RET Solution: use exactly one Update-Load-Branch sequence as a trampoline reserve a register as pointer to trampoline then: all sequences ending in indirect jmp through register can serve as gadgets

The many faces of update-load-branch Any pop X; jmp *X sequence suffices. Doubly indirect jump JMP on x86 can have register or memory operand Use memory operand: adversary data can contain a table of usable gadgets sequence catalog May even contain immediate operands, such as jmp *4(%edx) Both, jmp and ljmp are valid.

ROP gadgets without RET Debian libc contains no ULB sequence! add Mozilla's libxul and libphp or customize attack to target application Trampoline from libxul uses %ebx Trampoline address stored in %edx Gadgets must end with jmp *(%edx) Chose 34 sequences to construct 19 gadgets to show Turingcompleteness of approach. Only a subset of possible sequences Still far fewer than the 6,000 RET sequences in my libc

Load register / Store memory pop %eax sub %dh, %bl jmp *(%edx) mov 4(%eax), %ecx jmp *(%edx) mov %esi, -0xb(%eax) jmp *(%edx)

Not-so-difficult gadgets Move within memory: combine Load from memory to register Store from register to memory Arithmetic negate, phase 1: (Goal: %esi := - <val>) xor %esi, %esi // %esi := 0 jmp (%edx) // trampoline Arithmetic negate, getting tricky: subl -0x7D(%ebp, %ecx, 1), %esi // %esi := - (%ebp + 1*%ecx 0x7D) // requires // %ebp == <val> + 0x7D - <jmp target> jmp (%ecx) // next gadget

Set-less-than Goal: if (a < b) result = -1; else result = 0;

Getting the attack to run Need attacks that don't require any RET Stack overflow: Don't overflow RET address (would violate LIFO order) Instead overwrite higher-level function's local data, especially if this is later used for determining where to branch Overwrite SETJMP buffers Overwrite C++ vtables and function pointers Deemed practically impossible without use of RET

Discussion Is CFI the ultimate solution? Overhead More code more gadgets? but all jmp sequences look identical CFI vs. JIT compilation??? Allowing JNI on Android (or in any JVM) is obviously broken. Is everything lost?