Software Code Protection Through Software Obfuscation




Presented by: Sabu Emmanuel, PhD
School of Computer Engineering, Nanyang Technological University, Singapore
E-mail: asemmanuel@ntu.edu.sg
20 Mar 2009, IIT Guwahati
3/21/2009 1

What is Ahead
- Motivation
- Software obfuscation techniques
- Obfuscation space
- Design obfuscation
- Layout obfuscation
- Data flow obfuscation
- Control flow obfuscation
- Diablo & PLTO
- IDA Pro
- Demos

Software Piracy [figure slides]

Software Protection
Threats addressed:
- Reverse engineering
- Software piracy
- Malicious modifications
- Discovering vulnerabilities
[Figure: Company A and Company B; Algorithm G in O(n log n) and Algorithm G in O(n)]

Software Protection: Encryption vs. Obfuscation
- Execution-time overhead is higher for encryption.
- Protection level is higher for encryption.
- Decryption needs to be applied before the encrypted program can execute.
- Encryption is a high-cost approach.
- Run-time decryption is more costly.
- Encryption -> protection of software; obfuscation -> reverse engineering becomes too expensive.

Software Protection: Obfuscation
Goals:
- Confuse the understanding of data in the program.
- Confuse the cracker about the execution path of the program.
- Increase the complexity of designing automatic reverse engineering tools.
- Prevent attackers from dynamic tracing.

Compilation & Reverse Engineering Process Flow [figure]

Obfuscation Space
- Source code level
- Assembly language level
- Byte code level (Java)
- Binary level
- Compile time
- Link time

Reverse Engineering
Reverse engineering is the process of recovering higher-level structure and semantics from a machine code program.
- Disassembly: machine code -> assembly language code
- Decompilation: assembly language code -> higher-level structure and semantics
Disassembly approaches:
- Static
  - Linear sweep
  - Recursive traversal
- Dynamic

Static Disassembly - Linear Sweep
Used by the GNU utility objdump.

  global strt_addr, end_addr;

  proc DisasmLinear(addr)
  begin
    while (strt_addr <= addr < end_addr) do
      I := decode instruction at address addr;
      addr += length(I);
    od
  end

  proc main()
  begin
    strt_addr := address of the first executable byte;
    end_addr := strt_addr + text section size;
    DisasmLinear(strt_addr);
  end

Static Disassembly - Recursive Traversal
Used by IDA Pro.

  global strt_addr, end_addr;

  proc DisasmRec(addr)
  begin
    while (strt_addr <= addr < end_addr) do
      if (addr has been visited already) return;
      I := decode instruction at address addr;
      mark addr as visited;
      if (I is a branch or function call)
        for each possible target t of I do
          DisasmRec(t);
        od
        return;
      else
        addr += length(I);
    od
  end

  proc main()
  begin
    strt_addr := program entry point;
    end_addr := strt_addr + text section size;
    DisasmRec(strt_addr);
  end

Evaluation Metric
Obfuscations are evaluated with respect to:
- Potency: to what degree is a human reader confused?
- Resilience: how well are automatic deobfuscation attacks resisted?
- Cost: how much time/space overhead is added?
- Stealth: how well does the obfuscated code blend in with the original code?

Evaluation Metric
To be potent, an obfuscation should:
- Increase overall program size and introduce new classes and methods
- Introduce new predicates and increase the nesting level of conditional and looping constructs
- Increase the number of method arguments and inter-class instance variable dependencies
- Increase the height of the inheritance tree
- Increase long-range variable dependencies

Evaluation Metric
Resilience of obfuscation: [figure]

Obfuscation Classes
- Design obfuscation
- Layout obfuscation
- Data flow obfuscation
- Control flow obfuscation

Design Obfuscation (M. Sosonkin et al.)
Design obfuscation obscures the design intent of object-oriented software.
- Class merging: transforms a program by merging two or more of its classes into a single class.
- Class splitting: splits a single class in a program into several classes.
- Type hiding: uses Java interfaces to obscure the types of objects manipulated by the program.

Design Obfuscation: Class Merging (M. Sosonkin et al.) [figure]

Design Obfuscation: Class Splitting (M. Sosonkin et al.) [figure]

Layout Obfuscation
- Source code formatting
- Comment removal
- Scrambling identifiers
These obfuscations add no execution time or space overhead.

Data Flow Obfuscation
- Change data encoding
- Promote scalar to object
- Split variable: one variable is expressed as two or more variables (for example, one 8-bit variable obtained from two 4-bit variables)
- Merge variable: two or more variables are expressed as one variable (for example, two 4-bit variables merged into one 8-bit variable)
- Flatten array: uses a lower-dimensional array for a higher-dimensional array
- Fold array: expresses a lower-dimensional array as a higher-dimensional array
- Change variable lifetime: converts a global variable to a local variable
- Convert static data to procedure: the procedure produces the static data when called
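The split-variable and merge-variable items above can be sketched in C. This is only an illustrative sketch and is not taken from the slides; the type split_u8 and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical "split variable" layout: one 8-bit value is kept as two
 * separate 4-bit halves instead of a single byte. */
typedef struct {
    uint8_t hi; /* upper 4 bits of the original value */
    uint8_t lo; /* lower 4 bits of the original value */
} split_u8;

split_u8 split_value(uint8_t v) {
    split_u8 s;
    s.hi = (uint8_t)(v >> 4);
    s.lo = (uint8_t)(v & 0x0F);
    return s;
}

/* Reassemble the original 8-bit value from its two halves. */
uint8_t join_value(split_u8 s) {
    return (uint8_t)((s.hi << 4) | s.lo);
}

/* "Merge variable": two independent 4-bit variables share one byte. */
uint8_t merge_vars(uint8_t a4, uint8_t b4) {
    return (uint8_t)(((a4 & 0x0F) << 4) | (b4 & 0x0F));
}

uint8_t merged_first(uint8_t m)  { return (uint8_t)(m >> 4); }
uint8_t merged_second(uint8_t m) { return (uint8_t)(m & 0x0F); }
```

In an obfuscated program, every read and write of the original variable would be rewritten to go through helpers like these, usually inlined so that no single location reveals the original value.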

Change Data Encoding
Original:
  int i = 1;
  while (i < 10) { ...; i++; }
Obfuscated, using the encoding i' = 2*i + 3:
  int i = 5;
  while (i < 23) { ...; i += 2; }
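To check that the encoded loop does the same work, here is a small C sketch. The loop body (summing the counter) and the function names are assumptions added for illustration; the slide leaves the body unspecified.

```c
/* Original loop: i runs over 1..9 and the body sums the counter. */
int sum_original(void) {
    int sum = 0;
    int i = 1;
    while (i < 10) {
        sum += i;
        i++;
    }
    return sum;
}

/* Encoded loop: the counter stores e = 2*i + 3 instead of i. */
int sum_encoded(void) {
    int sum = 0;
    int e = 5;                /* 2*1 + 3 */
    while (e < 23) {          /* 2*10 + 3 */
        sum += (e - 3) / 2;   /* decode only where the value is used */
        e += 2;               /* i++ becomes e += 2 */
    }
    return sum;
}
```

Both functions return the same sum; only the representation of the loop counter differs.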

Promote Scalar to Object
Original:
  int k = 0;
  while (k < 10) { k++; ...; }
After promoting the scalar to an object:
  Int k = new Int(0);
  while (k.value < 10) { k.value++; ...; }

Array Restructuring [figure]

Control Flow Obfuscation
- Opaque variables: a variable V is opaque at a point p in a program if V has a property q at p that is known at obfuscation time.
- Opaque predicates: a predicate P is opaque at p if its outcome, either True or False, is known at obfuscation time.

  {
    int v, a = 2, b = 8;
    v = a + b;                  /* v is 10 here */
    if (b > 9) ...              /* always False */
    if (random(1, 7) < 8) ...   /* always True */
  }

Control Flow Obfuscation
Opaque predicates (contd.):
- Insert dead or irrelevant code
- Extend loop conditions:
    while (i < 10) { } => while ((i < 10) && (3 > 2) && (1 < 2)) { }
All of these use opaque predicates.
Dynamic opaque predicates: predicates that are constant over a single program run but vary across different program runs.
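A minimal C sketch of both uses, assuming a simple number-theoretic predicate (the product of two consecutive integers is always even). The predicate and the function names are illustrative choices, not taken from the slides.

```c
/* Opaquely true predicate: x * (x + 1) is always even. Unsigned
 * arithmetic wraps modulo a power of two, which preserves parity. */
int opaque_true(unsigned x) {
    return (x * (x + 1u)) % 2u == 0u;
}

/* Original: counts the iterations of while (i < 10). */
int count_original(void) {
    int i = 0, n = 0;
    while (i < 10) {
        n++;
        i++;
    }
    return n;
}

/* Obfuscated: extended loop condition plus a dead branch. */
int count_obfuscated(unsigned seed) {
    int i = 0, n = 0;
    while (i < 10 && opaque_true(seed + (unsigned)i)) {
        if (!opaque_true(seed)) {
            n = -1000;  /* dead code: this branch is never taken */
        }
        n++;
        i++;
    }
    return n;
}
```

A predicate this simple can be folded away by an optimizing compiler or a deobfuscator; it only shows where opaque predicates are placed, not how to make them resilient.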

Control Flow Obfuscation
Computation transformations:
- Insert dead or irrelevant code
- Extend loop conditions: add an opaque predicate to a loop to change the control flow graph while keeping the semantics of the program
- Include a pseudo-code interpreter: choose a section of code in the unobfuscated program, create a virtual process that does not exist in the host language, and add a pseudo-code interpreter for that virtual process to the obfuscated program
- Remove library calls
- Add redundant operands
- Parallelize code
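The pseudo-code interpreter item can be sketched as a tiny stack machine in C. The opcode set, the bytecode, and the function names are all hypothetical and chosen only to keep the example short; the sketch does no stack bounds checking.

```c
/* Hypothetical opcodes of a tiny stack-based virtual machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Bytecode that replaces the plain expression (2 + 3) * 4. */
static const int program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Interpreter loop: fetch an opcode, dispatch, repeat until OP_HALT. */
int run_vm(const int *code) {
    int stack[16];
    int sp = 0; /* number of values currently on the stack */
    int pc = 0; /* index of the next opcode */
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1; /* unknown opcode */
        }
    }
}

/* The unobfuscated computation and its interpreted replacement. */
int compute_plain(void)       { return (2 + 3) * 4; }
int compute_virtualized(void) { return run_vm(program); }
```

An attacker now has to understand the interpreter and the bytecode format before the original expression becomes visible, at the price of a noticeable run-time overhead.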

Control Flow Obfuscation
Aggregation transformations:
- Inlining of procedures
- Outlining of procedures
- Interleaving procedures
- Cloning procedures
Ordering transformations:
- Change the locality of terms within expressions, statements within basic blocks, basic blocks within procedures, procedures within classes, and classes within files.
Signal flow approaches
Self-modifying code

Control Flow Flattening
- C. Wang, J. Hill, J. Knight, and J. Davidson, "Software tamper resistance: Obstructing static analysis of programs," University of Virginia, Charlottesville, VA, 2000.
- S.K. Udupa, S.K. Debray, and M. Madou, "Deobfuscation: Reverse Engineering Obfuscated Code," in Reverse Engineering, 12th Working Conference on, 2005, pp. 45-54.

Control Flow Flattening (Chenxi Wang et al.)
- The program is rewritten with if-then-goto constructs, and the target addresses of the gotos are determined dynamically.
- In C, the goto statement is implemented using a switch statement.
- Control flow is made data dependent, which makes control flow analysis and data flow analysis complex.

Control Flow Flattening (Chenxi Wang et al.)
(a) Original program:
  int a, b;
  a = 1;
  b = 2;
  while (a < 10) {
    b = a + b;
    if (b > 10)
      b--;
    a++;
  }
  use(b);
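One possible flattened form of this loop is sketched below in C. It follows the switch-based idea described for control flow flattening, but the block numbering and the names original_flow, flattened_flow, and sw are assumptions and do not reproduce the figure from Wang et al.; use(b) is replaced by returning b.

```c
/* The original program, returning b instead of calling use(b). */
int original_flow(void) {
    int a = 1, b = 2;
    while (a < 10) {
        b = a + b;
        if (b > 10)
            b--;
        a++;
    }
    return b;
}

/* Flattened version: every basic block is a case of one switch, and the
 * dispatch variable sw decides at run time which block executes next. */
int flattened_flow(void) {
    int a = 0, b = 0;
    int sw = 1; /* dispatch variable (hypothetical name) */
    for (;;) {
        switch (sw) {
        case 1: a = 1; b = 2; sw = 2; break;              /* initialization */
        case 2: sw = (a < 10) ? 3 : 6; break;             /* loop test */
        case 3: b = a + b; sw = (b > 10) ? 4 : 5; break;  /* body, if test */
        case 4: b--; sw = 5; break;                       /* then branch */
        case 5: a++; sw = 2; break;                       /* increment */
        case 6: return b;                                 /* use(b) */
        default: return -1;                               /* unreachable */
        }
    }
}
```

All blocks now sit at the same nesting level, so the loop and branch structure is only recoverable by tracking the values assigned to sw.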

Control Flow Flattening: flattened control flow (Chenxi Wang et al.) [figure]

Control Flow Flattening (Chenxi Wang et al.) [figure]

Control Flow Flattening (Udupa et al.)
Control flow flattening with:
- Global arrays for dispatch variables
- Dummy basic blocks and pointers

Control Flow Flattening (Udupa et al.)
[Figure: flattened control flow in which a random index a selects entries of a global dispatch array swvar (swvar{a}, swvar{a+1}, ...), Switch(swvar) dispatches to blocks L1 to L6, and L9 is a dummy block]

Signal-based Obfuscation
- I.V. Popov, S.K. Debray, and G.R. Andrews, "Binary obfuscation using signals," in Proceedings of the 16th USENIX Security Symposium. USENIX Association, Berkeley, CA, USA, 2007.

Signal-based Obfuscation (Igor V. Popov et al.)
A signal, or software interrupt, is a message mechanism between different processes.
Three example signals (name, number, meaning):
- SIGINT, 2: keyboard interrupt (such as when "break" is pressed)
- SIGILL, 4: illegal instruction
- SIGSEGV, 11: illegal memory usage
List all signals with the command kill -l; see usage with the command man 7 signal.

Signal-based Obfuscation (Igor V. Popov et al.)
Signal install instructions: install the user's signal handler with the signal() function.
[Figure elements: Code-before, Trap instruction, Kernel trap handler, User's signal handler, Code-after, Kernel restore function]
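The basic mechanism (install a handler, trigger a signal between code-before and code-after, continue afterwards) can be sketched in portable C. This is a simplified illustration only: it uses raise(SIGINT) where the scheme by Popov et al. uses a trap instruction, and it does not redirect control flow. The names handled, on_signal, and demo_signal are hypothetical.

```c
#include <signal.h>

/* Set by the handler so the caller can observe that it ran. */
static volatile sig_atomic_t handled = 0;

/* The user's signal handler. */
static void on_signal(int signo) {
    (void)signo;
    handled = 1;
}

/* Installs the handler, raises the signal, and reports whether the
 * handler ran before execution continued with the code-after part. */
int demo_signal(void) {
    handled = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    /* code-before would run here */
    raise(SIGINT);            /* stands in for the trap instruction */
    /* code-after would run here */
    signal(SIGINT, SIG_DFL);  /* restore the default disposition */
    return handled;
}
```

A static disassembler sees only the raise call (or, in the real scheme, an instruction that traps), not the transfer of control that the handler performs.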

Signal-based Obfuscation (Igor V. Popov et al.)
Signal install instructions:
1. The instruction that installs the user's signal handler is placed somewhere in _start, before the call to __libc_start_main.
2. The user's signal handler is installed with the signal() function.
Setup code (inserted before the trap instruction):
1. Set a flag to inform our signal handler that this trap is raised by us rather than by the system.
2. Save the machine state.
3. Reserve stack space to save the trap instruction address.
Our signal handler:
1. Overwrite the kernel restore function's return address with our restore function's address.
2. Save the address of the trap instruction on the stack, so that the restore function can hash the target address.
Our restore function:
1. Restore the machine state from the moment the trap happened.
2. Hash the target address with the trap instruction address, push the address onto the stack, and transfer control to the target (Subfun address) with a return instruction.
[Figure: the original sequence "Code-before, call Subfun Addr, Code-after" becomes "Code-before, Setup code, Trap instruction, Code-after"; control passes through the Kernel trap handler, Our signal handler, the Kernel restore function, and Our restore function before reaching Subfun Addr]

Signal-based Obfuscation: tool flow
[Figure: file.c is compiled by GCC and processed by PLTO. Labelled artifacts:]
- file.o3: relocatable file
- file.prof: profiled file
- file.counts: counter file containing information on how many times basic blocks (BBLs) and edges have been executed
- file.obf: binary executable

Self-Modifying Code Techniques
- Y. Kanzaki, A. Monden, M. Nakamura, and K. Matsumoto, "Exploiting self-modification mechanism for program protection."
- M. Madou, B. Anckaert, P. Moseley, S. Debray, B. De Sutter, and K. De Bosschere, "Software Protection Through Dynamic Code Mutation."

Exploiting Self-Modification Mechanism for Program Protection (Yuichiro Kanzaki et al.)
- RR_i: restores the dummy instruction Dummy_i to the target instruction.
- HR_i: hides the target instruction (replaces it with the dummy instruction).
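The restore/hide idea can be simulated without touching real machine code. The C sketch below applies it to an array of toy instructions: this is an assumption made for portability, whereas the technique by Kanzaki et al. rewrites actual instructions in memory. The opcode names and run_protected are hypothetical.

```c
/* Toy instruction set: two arithmetic operations plus the restoring (RR)
 * and hiding (HR) routines, modeled here as instructions. */
enum { OP_ADD, OP_SUB, OP_RESTORE, OP_HIDE, OP_HALT };

typedef struct { int op; int arg; } insn;

int run_protected(int x) {
    /* Mutable "code". Slot 1 statically holds a dummy SUB 5, while the
     * real target instruction is ADD 5. */
    insn code[] = {
        { OP_RESTORE, 1 }, /* RR: write the target instruction into slot 1 */
        { OP_SUB, 5 },     /* dummy instruction seen by a static reader */
        { OP_HIDE, 1 },    /* HR: put the dummy back into slot 1 */
        { OP_HALT, 0 }
    };
    const insn target = { OP_ADD, 5 };
    const insn dummy  = { OP_SUB, 5 };
    int pc = 0;
    for (;;) {
        insn i = code[pc++];
        switch (i.op) {
        case OP_ADD:     x += i.arg; break;
        case OP_SUB:     x -= i.arg; break;
        case OP_RESTORE: code[i.arg] = target; break;
        case OP_HIDE:    code[i.arg] = dummy; break;
        case OP_HALT:    return x;
        default:         return -1; /* unknown opcode */
        }
    }
}
```

Reading the code array statically suggests that the program subtracts 5, while the executed program adds 5, because the target instruction exists only between the RR and HR steps.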

Software Protection Through Dynamic Code Mutation (Matias Madou et al.)
- Uses an edit engine plus an edit script
- One-pass protection
- Cluster protection
[Figure: Call F and Call G; F() and G() are each represented by addresses, 1) an edit script, and 2) a template. An opaque variable feeds a pseudorandom generator whose output is XORed with the edit script; the code editing engine applies the edit script to the template]

References
[1] C. Collberg, C. Thomborson, and D. Low, "Manufacturing cheap, resilient, and stealthy opaque constructs," in Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York, NY, USA, 1998, pp. 184-196.
[2] A. Majumdar and C. Thomborson, "Manufacturing opaque predicates in distributed systems for code obfuscation," in Proceedings of the 29th Australasian Computer Science Conference, Volume 48. Australian Computer Society, Inc., Darlinghurst, Australia, 2006, pp. 187-196.
[3] I.V. Popov, S.K. Debray, and G.R. Andrews, "Binary obfuscation using signals," in Proceedings of the 16th USENIX Security Symposium. USENIX Association, Berkeley, CA, USA, 2007.
[4] M. Madou, B. Anckaert, P. Moseley, S. Debray, B. De Sutter, and K. De Bosschere, "Software Protection Through Dynamic Code Mutation," Lecture Notes in Computer Science, vol. 3786, pp. 194, 2006.
[5] J. Ge, S. Chaudhuri, and A. Tyagi, "Control flow based obfuscation," in Proceedings of the 5th ACM Workshop on Digital Rights Management. ACM, New York, NY, USA, 2005, pp. 83-92.
[6] Y. Kanzaki, A. Monden, M. Nakamura, and K. Matsumoto, "Exploiting self-modification mechanism for program protection," in Proceedings of the 27th Annual International Computer Software and Applications Conference (COMPSAC 2003), 2003, pp. 170-179.
[7] C. Wang, J. Davidson, J. Hill, and J. Knight, "Protection of Software-based Survivability Mechanisms," in Proc. International Conference of Dependable Systems and Networks, July 2001.
[8] Chenxi Wang, "A Security Architecture for Survivability Mechanisms," PhD thesis, Department of Computer Science, University of Virginia, October 2000.
[9] S. Chow, Y. Gu, H. Johnson, and V.A. Zakharov, "An Approach to the Obfuscation of Control-Flow of Sequential Computer Programs," in Proc. 4th Information Security Conference (ISC 2001), pp. 144-155, 2001.
[10] C. Wang, J. Hill, J. Knight, and J. Davidson, "Software tamper resistance: Obstructing static analysis of programs," University of Virginia, Charlottesville, VA, 2000.
[11] B. Anckaert, M. Madou, and K. De Bosschere, "A Model for Self-Modifying Code," Lecture Notes in Computer Science, vol. 3825, pp. 197-204, 2006.
[12] M. Madou, B. Anckaert, B. De Sutter, and K. De Bosschere, "Hybrid static-dynamic attacks against software protection mechanisms," in Proceedings of the ACM Workshop on Digital Rights Management, 2005.

Obfuscation Prototypes and Demo
- Diablo: link-time binary rewriting system
- PLTO: link-time binary rewriting system
- IDA Pro: professional disassembler tool
- Demo

Thank You!!