Automated Repair of Binary and Assembly Programs for Cooperating Embedded Devices Eric Schulte 1 Jonathan DiLorenzo 2 Westley Weimer 2 Stephanie Forrest 1 1 Department of Computer Science University of New Mexico Albuquerque, NM 87131-0001 2 Department of Computer Science University of Virginia Charlottesville, VA 22904-4740 March 19, 2013 ASPLOS 2013 0
i Outline Background Technical Approach Empirical Results Distributed Program Repair Discussion Conclusion b; i if(a! a; int argc, []) main(int int mai argc, char *argv[]) ASPLOS 2013 1
Embedded Devices Computing Device Sales 500 tablet pc smartphone 400 Million Units 300 200 100 0 2008 2009 2010 2011 Years [Maier, 2011] ASPLOS 2013 2
Embedded Devices Resource Constraints Small disks Less memory Slow processors Slow, costly comm. ASPLOS 2013 2
Genprog: Automatically Repairing Software Bugs Use of algorithmic and heuristic methods to search for, generate, and evaluate program repairs. Strengths Repaired 55/105 bugs for $8 each [Le Goues et al., 2012a] Repairs multiple classes of bugs and security defects [Weimer et al., 2009] Wins human-competitive awards [Forrest et al., 2009; Le Goues et al., 2012b] Limitations Requires source code Requires build tool chain Requires program instrumentation Expensive fitness function (compilation, test execution) ASPLOS 2013 3
Software Repair Algorithm: Baseline Input Evolve Fitness Evaluation Output Accept X Discard Mutants ASPLOS 2013 4
Software Repair Algorithm: Contributions if(a==0) puts(b); while(b!=0) How do we mutate? Where do we mutate? if(a>b) puts(a); a=a-b; else b=b-a; ASPLOS 2013 5
Software Repair Algorithm: Contributions if(a==0) puts(b); while(b!=0) How do we mutate? Where do we mutate? if(a>b) puts(a); a=a-b; else b=b-a; ASPLOS 2013 5
ASM and ELF Program Representations Source 1 if (a ==0){ 2 printf ("%g\n", b); } 3 else { 4 while (b!=0){ 5 if (a>b){ a=a- b; } 6 else { b=b-a; } } } 7 printf ("%g\n", a); AST printf("%g\n",b); a=a-b; if(a==0) if(a>b) while(b!=0) else b=b-a; printf("%g\n",a); ASPLOS 2013 6
ASM and ELF Program Representations Source 1 if (a ==0){ 2 printf ("%g\n", b); } 3 else { 4 while (b!=0){ 5 if (a>b){ a=a- b; } 6 else { b=b-a; } } } 7 printf ("%g\n", a); ASM.file "gcd.c".globl main.type main, @function main:.cfi startproc pushq %rbp.cfi def cfa offset 16.cfi offset 6, -16 movq %rsp, %rbp.cfi def cfa register 6 subq $48, %rsp AST if(a==0) printf("%g\n",b); if(a>b) a=a-b; while(b!=0) printf("%g\n",a); else b=b-a; ASPLOS 2013 6
ASM and ELF Program Representations Source 1 if (a ==0){ 2 printf ("%g\n", b); } 3 else { 4 while (b!=0){ 5 if (a>b){ a=a- b; } 6 else { b=b-a; } } } 7 printf ("%g\n", a); AST if(a==0) printf("%g\n",b); if(a>b) a=a-b; while(b!=0) printf("%g\n",a); else b=b-a; ASM.file "gcd.c".globl main.type main, @function main:.cfi startproc pushq %rbp.cfi def cfa offset 16.cfi offset 6, -16 movq %rsp, %rbp.cfi def cfa register 6 subq $48, %rsp ELF ELF\? ELF header program header table section 1....text section [55] [48 89 e5] [48 83 ec 20] [48 89 7d e8] [89 75 e4] [83 7d e4 01] [7e 60]...... section n section header table ASPLOS 2013 6
ASM and ELF Program Mutations Delete Insert Swap movq 8(%rdx), %rdi xorl %eax, %eax movq %rdx, -80(%rbp) addl $1, %r14d call atoi movq -80(%rbp), %rdx movl %eax, (%r15) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax addl $1, %r14d call atoi movq %rdx, -80(%rbp) movl %eax, (%r15) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movq -80(%rbp), %rdx addl $1, %r14d call atoi movq -80(%rbp), %rdx movl %eax, (%r15) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movq -80(%rbp), %rdx addl $1, %r14d call atoi movq %rdx, -80(%rbp) movq -80(%rbp), %rdx movl %eax, (%r15) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movq %rdx, -80(%rbp) addl $1, %r14d call atoi movq -80(%rbp), %rdx movl %eax, (%r15) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movq -80(%rbp), %rdx addl $1, %r14d call atoi movq %rdx, -80(%rbp) movl %eax, (%r15) addq $4, %r15 (... additional ELF bookkeeping... ) ASPLOS 2013 7
Genprog: Automatically Repairing Software Bugs if(a==0) puts(b); while(b!=0) How do we mutate? Where do we mutate? if(a>b) puts(a); a=a-b; else b=b-a; ASPLOS 2013 8
Genprog: Automatically Repairing Software Bugs if(a==0) puts(b); while(b!=0) How do we mutate? Where do we mutate? if(a>b) puts(a); a=a-b; else b=b-a; ASPLOS 2013 8
Genprog: Automatically Repairing Software Bugs if(a==0) puts(b); while(b!=0) How do we mutate? Where do we mutate? if(a>b) puts(a); a=a-b; else b=b-a; Fault Localization ASPLOS 2013 8
Light Weight Fault Localization 1. Sample program counter. 2. Translate memory addresses to program offsets. 3. Smooth sample with Gaussian convolution. CPU movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) addl $1, %r14d call atoi movq -80(%rbp), %rdx movq %rdx, -80(%rbp) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) Machine-code Instructions ASPLOS 2013 9
Light Weight Fault Localization 1. Sample program counter. 2. Translate memory addresses to program offsets. 3. Smooth sample with Gaussian convolution. CPU movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) addl $1, %r14d call atoi movq -80(%rbp), %rdx movq %rdx, -80(%rbp) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) Machine-code Instructions ASPLOS 2013 9
Light Weight Fault Localization 1. Sample program counter. 2. Translate memory addresses to program offsets. 3. Smooth sample with Gaussian convolution. CPU memory addr. to instruction movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) addl $1, %r14d call atoi movq -80(%rbp), %rdx movq %rdx, -80(%rbp) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) Machine-code Instructions ASPLOS 2013 9
Light Weight Fault Localization 1. Sample program counter. 2. Translate memory addresses to program offsets. 3. Smooth sample with Gaussian convolution. CPU memory addr. to instruction movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) addl $1, %r14d call atoi movq -80(%rbp), %rdx movq %rdx, -80(%rbp) addq $4, %r15 movq 8(%rdx), %rdi xorl %eax, %eax movl %eax, (%r15) Machine-code Instructions ASPLOS 2013 9
Fault Localization Comparison Execution Count 700 600 500 400 300 200 100 0 Merge Sort Full Trace Sampling 0x4005D8 0x400510 0x400768 0x4006A0 Address 0x400B50 0x400A88 0x4009C0 0x4008F8 0x400830 ASPLOS 2013 10
Genprog: Automatically Repairing Software Bugs Use of algorithmic and heuristic methods to search for, generate, and evaluate program repairs. Strengths Repaired 55/105 bugs for $8 each [Le Goues et al., 2012a] Repairs multiple classes of bugs and security defects [Weimer et al., 2009] Wins human-competitive awards [Forrest et al., 2009; Le Goues et al., 2012b] Limitations Requires source code Requires build tool chain Requires program instrumentation Expensive fitness function (compilation, test execution) ASPLOS 2013 11
Genprog: Automatically Repairing Software Bugs Use of algorithmic and heuristic methods to search for, generate, and evaluate program repairs. Strengths Repaired 55/105 bugs for $8 each [Le Goues et al., 2012a] Repairs multiple classes of bugs and security defects [Weimer et al., 2009] Wins human-competitive awards [Forrest et al., 2009; Le Goues et al., 2012b] Limitations Requires source code Requires build tool chain Requires program instrumentation Expensive fitness function (compilation, test execution) ASPLOS 2013 11
Benchmark Programs Program Program Description Bug atris graphical tetris game local stack buffer exploi ccrypt encryption utility segfault deroff document processing segfault flex lexical analyzer generator segfault indent source code processing infinite loop look svr4 dictionary lookup infinite loop look ultrix dictionary lookup infinite loop merge merge sort duplicate inputs merge-cpp merge sort (in C++) duplicate inputs s3 sendmail utility buffer overflow uniq duplicate text processing segfault units metric conversion segfault zune embedded media player infinite loop ASPLOS 2013 12
Empirical Results Effective 62% faster runtime 95% smaller disk footprint 86% less memory Total bugs repaired Rep. Num. Bugs AST 13 ASM 12 ELF 11 Average success rate 100 runs per bug Rep. Success Rate AST 78.17% ASM 70.75% ELF 65.83% ASPLOS 2013 13
Empirical Results Effective 62% faster runtime 95% smaller disk footprint 86% less memory Expected fitness evaluations Rep. Evaluations AST 583.98 ASM 188.38 ELF 207.15 Total runtime Rep. Sec. AST 229.50 ASM 278.30 ELF 74.20 ASPLOS 2013 13
Empirical Results Effective 62% faster runtime 95% smaller disk footprint 86% less memory Example: Merge Sort Repair by representation AST, 2 of 4900 Swaps ASM, 1 of 280 Deletes merge.c 44 if( left [l-mid -1] <= right [0]){ 45 result = list ; } 46 else {/* fix : swap branches */ 47 result = merge (left,l-mid, 48 right, mid ); } merge.s 210 cmpl % eax, % edx ; fix : del. 211 jg.l12 212 movq -72(% rbp ), % rax ASPLOS 2013 13
Empirical Results Disk size Effective 62% faster runtime 95% smaller disk footprint 86% less memory Rep. AST ASM ELF Requirements Source code & build toolchain Assembly code & linker Compiled executable ASPLOS 2013 13
Empirical Results Working memory Effective 62% faster runtime 95% smaller disk footprint 86% less memory Rep. MB AST 1402 ASM 756 ELF 200 ASPLOS 2013 13
Distributed Genetic Repair Algorithm 1 4 2 3 ASPLOS 2013 14
Distributed Genetic Repair Algorithm 1 4 2 3 ASPLOS 2013 14
Distributed Genetic Repair Algorithm delete(45) 1 swap(23,86) 4 2 insert(4,23) 3 delete(4) ASPLOS 2013 14
Distributed Genetic Repair Evaluation Relative performance of DGA Expected Wall Clock # Nodes Fitness Evaluations Seconds w/sms 1 1 1 1 2 0.94 0.89 1.07 3 0.84 0.67 0.81 4 0.80 0.55 0.63 ASPLOS 2013 15
Distributed Genetic Repair Evaluation Relative performance of DGA Expected Wall Clock # Nodes Fitness Evaluations Seconds w/sms 1 1 1 1 2 0.94 0.89 1.07 3 0.84 0.67 0.81 4 0.80 0.55 0.63 ASPLOS 2013 15
Distributed Genetic Repair Evaluation Relative performance of DGA Expected Wall Clock # Nodes Fitness Evaluations Seconds w/sms 1 1 1 1 2 0.94 0.89 1.07 3 0.84 0.67 0.81 4 0.80 0.55 0.63 ASPLOS 2013 15
Discussion ASM & ELF search space Search space System protection Program size 3 more assembly instructions than C statements Search space size program size = alphabet Possible program coverage Possible Programs Reachable Programs Original Program ASPLOS 2013 16
Discussion Search space System protection System protection Arbitrary assembly is arbitrarily dangerous Light weight sandboxing (ulimit and chroot) ASPLOS 2013 16
Conclusion ASM and ELF representation Remove requirement for source code and build toolchain Language Agnostic; x86 or ARM assembly or ELF Change program repair search space Reduce resources; 95% smaller disk footprint, 86% less memory, 62% faster runtime Distributed Genetic Program Repair Allows multiple devices to collaborate Fewer fitness evaluations and faster runtime ASPLOS 2013 17
Conclusion Take away Assembly mutations cause semantic changes! Software is not brittle! ASPLOS 2013 17
Conclusion Take away Assembly mutations cause semantic changes! Software is not brittle! ASPLOS 2013 17
Thank You Contact Name Email Homepage Eric Schulte eschulte@cs.unm.edu http://cs.unm.edu/~eschulte Code Program Repair Tool ptrace Tracer ELF (C) ELF (Common Lisp) http://genprog.cs.virginia.edu http://github.com/eschulte/tracer http://github.com/eschulte/rw-elf http://github.com/eschulte/elf ASPLOS 2013 18
Bibliography S. Forrest, W. Weimer, T. Nguyen, and C. Le Goues. A genetic programming approach to automated software repair. In Genetic and Evolutionary Computing Conference, 2009. C. Le Goues, M. Dewey-Vogt, S. Forrest, and W. Weimer. A systematic study of automated program repair: Fixing 55 out of 105 bugs for $8 each. In International Conference on Software Engineering, 2012a. C. Le Goues, W. Weimer, and S. Forrest. Representations and operators for improving evolutionary software repair. In Proceedings of the fourteenth international conference on Genetic and evolutionary computation conference, pages 959 966. ACM, 2012b. D. Maier. Sales of smartphones and tablets to exceed pcs. Practical Ecommerce, October 2011. http://www.practicalecommerce.com/articles/ 3069-Sales-of-Smartphones-and-Tablets-to-Exceed-PCs-. W. Weimer, T. Nguyen, C. Le Goues, and S. Forrest. Automatically finding patches using genetic programming. In International Conference on Software Engineering, pages 364 367, 2009. ASPLOS 2013 19
Backup Slides ASPLOS 2013 20
Backup: Wall Clock Times Mean wall clock time in seconds to find a successful repair DGA Naïve Parallel # Nodes Seconds Rounds Seconds 1 205.531 2 173.868 43.2 195.821 3 135.17 28.2 201.346 4 115.566 14.5 211.989 ASPLOS 2013 21
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite Specification Program Test Suite ASPLOS 2013 22
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite Specification Program 1 Test Suite ASPLOS 2013 22
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite 2 Specification Program 1 Test Suite ASPLOS 2013 22
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite 2 Specification Program 1 3 Test Suite ASPLOS 2013 22
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite 2 Specification 4 Program 1 3 Test Suite ASPLOS 2013 22
Backup: Software Mutational Robustness Semantic Space, Specification & Test Suite 2 Specification 4 5 Program 1 3 Test Suite ASPLOS 2013 22