Compiler-Assisted Binary Parsing



Similar documents
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x Architecture

Analysis of Memory Sensitive SPEC CPU2006 Integer Benchmarks for Big Data Benchmarking

Achieving QoS in Server Virtualization

Computer Architecture

An OS-oriented performance monitoring tool for multicore systems

How Much Power Oversubscription is Safe and Allowed in Data Centers?

IDEA: A New Intrusion Detection Data Source

Full and Para Virtualization

DELL VS. SUN SERVERS: R910 PERFORMANCE COMPARISON SPECint_rate_base2006

CPU PERFORMANCE COMPARISON OF TWO CLOUD SOLUTIONS: VMWARE VCLOUD HYBRID SERVICE AND MICROSOFT AZURE

Fine-Grained User-Space Security Through Virtualization. Mathias Payer and Thomas R. Gross ETH Zurich

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs

RISC-V Software Ecosystem. Andrew Waterman UC Berkeley

Streamline Computing Linux Cluster User Training. ( Nottingham University)

Oracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011

Benchmarking the Amazon Elastic Compute Cloud (EC2)

HPSA Agent Characterization

Performance Monitoring of the Software Frameworks for LHC Experiments

Drupal Performance Tuning

System Requirements Table of contents

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

icer Bioinformatics Support Fall 2011

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

Demystifying Deduplication for Backup with the Dell DR4000

Chapter 14 Analyzing Network Traffic. Ed Crowley

Gadge Me If You Can Secure and Efficient Ad-hoc Instruction-Level Randomization for x86 and ARM

AgencyPortal v5.1 Performance Test Summary Table of Contents

PERFORMANCE TOOLS DEVELOPMENTS

EEM 486: Computer Architecture. Lecture 4. Performance

GRIDCENTRIC VMS TECHNOLOGY VDI PERFORMANCE STUDY

Multi-Threading Performance on Commodity Multi-Core Processors

Lecture 36: Chapter 6

Oracle Database Scalability in VMware ESX VMware ESX 3.5

UT69R000 MicroController Software Tools Product Brief

CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate:

Very Large Enterprise Network, Deployment, Users

Building a More Efficient Data Center from Servers to Software. Aman Kansal

MAGENTO HOSTING Progressive Server Performance Improvements

1 Review Information About this Guide

Practical taint analysis for protecting buggy binaries

Efficient Parallel Execution of Sequence Similarity Analysis Via Dynamic Load Balancing

Transparent Optimization of Grid Server Selection with Real-Time Passive Network Measurements. Marcia Zangrilli and Bruce Lowekamp

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Improving Job Scheduling by using Machine Learning & Yet an another SLURM simulator. David Glesser, Yiannis Georgiou (BULL) Denis Trystram(LIG)

The Impact of Memory Subsystem Resource Sharing on Datacenter Applications. Lingia Tang Jason Mars Neil Vachharajani Robert Hundt Mary Lou Soffa

Virtualization for Cloud Computing

Pattern Insight Clone Detection

How To Trace

umps software development

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

IaaS Performance and Value Analysis A study of performance among 14 top public cloud infrastructure providers

Intel Cloud Builders Guide to Cloud Design and Deployment on Intel Platforms

Applying Data Analysis to Big Data Benchmarks. Jazmine Olinger

1/20/2016 INTRODUCTION

ProTrack: A Simple Provenance-tracking Filesystem

CODE FACTORING IN GCC ON DIFFERENT INTERMEDIATE LANGUAGES

Virtuoso and Database Scalability

Pydgin: Generating Fast Instruction Set Simulators from Simple Architecture Descriptions with Meta-Tracing JIT Compilers

Testing for Security

Operating System Impact on SMT Architecture

Virtualization in Linux a Key Component for Cloud Computing

Kashif Iqbal - PhD Kashif.iqbal@ichec.ie

Red Hat Developer Toolset 1.1

Outline. hardware components programming environments. installing Python executing Python code. decimal and binary notations running Sage

Polyglot: Automatic Extraction of Protocol Message Format using Dynamic Binary Analysis

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

Very Large Enterprise Network Deployment, 25,000+ Users

Replication on Virtual Machines

High level code and machine code

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

FACT: a Framework for Adaptive Contention-aware Thread migrations

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis

FlashSoft for VMware vsphere VM Density Test

Rally Installation Guide

Introduction to Linux and Cluster Basics for the CCR General Computing Cluster

Analyzing ChromeOS s Boot Performance

Practical Memory Checking with Dr. Memory

SGI. High Throughput Computing (HTC) Wrapper Program for Bioinformatics on SGI ICE and SGI UV Systems. January, Abstract. Haruna Cofer*, PhD

Process-Level Virtualization for Runtime Adaptation of Embedded Software

1 Bull, 2011 Bull Extreme Computing

MySQL and Virtualization Guide

Implementation of a Purely Hardware-assisted VMM for x86 Architecture

Computer Architecture. Secure communication and encryption.

Transcription:

Compiler-Assisted Binary Parsing Tugrul Ince tugrul@cs.umd.edu PD Week 2012 26 27 March 2012

Parsing Binary Files Binary analysis is common for o Performance modeling o Computer security o Maintenance o Binary modification Parsing: first step in most binary analyses o Not straight-forward o Time consuming 2

Objective Improve parsing speed and accuracy Store more data in binary files o Basic block locations o Edge information (source, target, type) Binary analysis tools read this extra information o Create basic block, edge, and finally CFG abstractions 3

Difficulties in Parsing Distinguishing code and data Disassembly is tricky o Identifying functions o Finding instruction boundaries Variable-length instruction set architectures Building Control Flow Graphs o Identify Basic Block boundaries o Identify edges between basic blocks 4

Compiler Assistance for Parsing Developed new compilation mechanism o Wrappers for GNU compiler suite (gcc/g++) o Transparent to the end user Support most standard flags o Pass flags to underlying system compiler o Intercept output flags (-c, -S, -o, etc.) Augments binary files with tables o Basic Block Table o Edge Table 5

Compiler Infrastructure Analyze intermediate assembly files o Generate information about basic blocks and edges o Store in a section that is not loaded at runtime 6

Basic Block -Edge Tables 7

Assembly Modification Function Model o Block of code o type @function o.size Modifications o Add Basic Block and Edge Tables o Add shadow symbol 8

Merge Duplicate Functions Weak functions are merged by linker o Functions included multiple times o Binary code might slightly differ o Only one weak function survives Tables cannot be merged o Need to uniquely match functions and tables o Use shadow symbol in function to extract file name o Use file name and function name to identify tables 9

Reconstruction Binary analysis tools operate on executables directly o No interaction with the compiler 10

Reconstruction Parsing a functions involves: o Finding the shadow symbol stored in the function File name is extracted o Locating Basic Block and Edge Tables with the function name and file name pair o Reading in the tables o Adding function start address to offsets o Creating basic block and edge abstractions No need to parse individual instructions 11

Benchmarks o SPEC CINT2006 o PETSc snes package o Firefox (v. 9.0.1) Systems Evaluation o 64-bit Linux machines o server: 24-core Intel Xeon, 48 GB total memory o laptop: AMD Turion, 2 GB total memory Methodology o Executed running time experiments 5 times o Reporting mean 12

Normalized Parsing Time SPEC CINT2006 1 Benchmark GNU Tables astar 0.21 0.05 bzip2 0.29 0.07 gcc 21.91 5.6 gobmk 4.78 1.35 h264ref 2.24 0.54 hmmer 1.56 0.42 0.9 0.8 0.7 0.6 0.5 Benchmark GNU Tables libquantum 0.19 0.05 mcf 0.06 0.02 omnetpp 3.67 1.11 perlbench 7.65 2 sjeng 0.79 0.18 Xalan 20.06 7.07 0.4 0.3 0.2 0.1 0 13

Normalized Parsing Time PETScsnesPackage 1 Benchmark GNU Tables 0.9 ex14 28.67 6.81 ex18 30.18 7.28 0.8 ex19 29.72 6.98 0.7 ex1f 29.56 7.17 ex1 30.01 6.83 0.6 ex20 30.15 7.07 0.5 ex21 29.24 6.87 ex22 29.62 7.05 0.4 ex23 29.95 7.25 0.3 Benchmark GNU Tables ex24 29.62 7.15 ex25 29.65 7.02 ex26 29.53 7.08 ex27 29.58 7.08 ex29 29.72 7.10 ex2 28.65 6.84 ex30 30.07 7.21 ex31 29.65 7.32 Benchmark GNU Tables ex34f90 31.02 7.48 ex3 29.34 7.00 ex42 28.68 6.77 ex43 28.53 6.87 ex5f90 30.32 7.09 ex5f 29.70 7.07 ex5 29.56 6.97 ex6 28.86 6.89 0.2 0.1 0 14

Normalized Parsing Time 1 0.9 0.8 0.7 0.6 0.5 0.4 Firefox Version 9.0.1 Benchmark GNU Tables firefox 0.29 0.08 libbrowsercomps 0.64 0.17 libdbusservice 0.04 0.01 libfreebl3 1.90 0.47 libmozalloc 0.02 0.01 libmozgnome 0.13 0.04 libmozsqlite3 3.91 1.08 Benchmark GNU Tables libnkgnomevfs 0.11 0.03 libnspr4 1.47 0.40 libnss3 8.06 2.23 libnssckbi 0.64 0.20 libnssdbm3 1.19 0.32 libnssutil3 0.43 0.12 Benchmark GNU Tables libplc4 0.08 0.03 libplds4 0.04 0.01 libsmime3 1.11 0.31 libsoftokn3 1.83 0.60 libssl3 1.45 0.38 libxpcom 0.02 0.01 0.3 0.2 0.1 0 15

Build Time Metrics File Size Without Debug With Debug Compilation Time SPEC CINT2006 2.21x 1.38x 1.25x PETSc 1.50x 1.09x 1.32x Firefox 1.17x 1.21x 1.13x OVERALL 1.63x 1.23x 1.23x File size increase on disk o Not reflected to memory footprint Small increase in compilation time o One time cost o Not reflected to running time performance 16

Runtime Metrics Memory Footprint Running Time SPEC CINT2006 1.00x 0.97x PETSc 1.00x 0.95x Firefox 1.00x 0.94x OVERALL 1.00x 0.95x Virtually no change in runtime metrics o Memory requirement is almost constant o Change in running time is within noise Hard to measure Firefox running time o No workload o Use V8 Benchmark 17

V8 Benchmarks for Firefox V8 Benchmark Value Firefox with gcc 2549.2 Firefox with our mechanism 2587.6 V8: JavaScript benchmark o Higher scores are better o Cannot be converted to time No significant change in performance 18

Limitations / Future Work Hand-written assembly o When branches use offsets in assembly 2n more symbols (n: number of functions) Compilation takes 23% more time o Integrate compilation mechanism into gcc File size increases o Compress tables about 78% compression ratio 19

Conclusion Developed a new compilation mechanism o Creates Basic Block and Edge Tables o Transparent to end user Improved parsing speed o On average 73% decrease in parsing time o No memory or runtime overhead 20