A Brief Introduction to Static Analysis



Similar documents
A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

How To Develop A Static Analysis System For Large Programs

Infer: An Automatic Program Verifier for Memory Safety of C Programs

The Course.

Static Analysis. Find the Bug! : Analysis of Software Artifacts. Jonathan Aldrich. disable interrupts. ERROR: returning with interrupts disabled

Introduction to Static Analysis for Assurance

IKOS: A Framework for Static Analysis based on Abstract Interpretation (Tool Paper)

Secure Software Programming and Vulnerability Analysis

Program Analysis: Theory and Practice

Linux Kernel. Security Report

Grigore Rosu Founder, President and CEO Professor of Computer Science, University of Illinois

Towards practical reactive security audit using extended static checkers 1

Automatic Verification by Abstract Interpretation

Reasoning about Safety Critical Java

Static Checking of C Programs for Vulnerabilities. Aaron Brown

Static Analysis of the Mars Exploration Rover Flight Software

Source Code Security Analysis Tool Functional Specification Version 1.0

Comprehensive Static Analysis Using Polyspace Products. A Solution to Today s Embedded Software Verification Challenges WHITE PAPER

Týr: a dependent type system for spatial memory safety in LLVM

Visualizing Information Flow through C Programs

Trustworthy Software Systems

Towards Checking the Usefulness of Verification Tools

The Software Model Checker BLAST: Applications to Software Engineering

Oracle Solaris Studio Code Analyzer

Verification of Device Drivers and Intelligent Controllers: a Case Study

HybriDroid: Analysis Framework for Android Hybrid Applications

Towards a Framework for Generating Tests to Satisfy Complex Code Coverage in Java Pathfinder

Software safety - DEF-STAN 00-55

Probabilistic Assertions

Static analysis: from theory to practice

Soundness by Static Analysis and False-alarm Removal by Statistical Analysis: Our Airac Experience

Software Testing & Analysis (F22ST3): Static Analysis Techniques 2. Andrew Ireland

A puzzle. Suppose that you want to run some JavaScript programs---all the cool kids are doing it.

The software model checker BLAST

Rigorous Software Development CSCI-GA

The FDA Forensics Lab, New Tools and Capabilities

TOOL EVALUATION REPORT: FORTIFY

Static analysis of numerical programs

A Case for Static Analyzers in the Cloud (Position paper)

Model Checking: An Introduction

Coverability for Parallel Programs

Monitoring Java enviroment / applications

Static Analysis of Dynamic Properties - Automatic Program Verification to Prove the Absence of Dynamic Runtime Errors

Pluggable Type Systems. Gilad Bracha

Automated Theorem Proving - summary of lecture 1

Static Analysis for Software Verification. Leon Moonen

Chair of Software Engineering. Software Verification. Assertion Inference. Carlo A. Furia

The CompCert verified C compiler

InvGen: An Efficient Invariant Generator

Regression Verification: Status Report

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification

Database Application Developer Tools Using Static Analysis and Dynamic Profiling

Static Program Transformations for Efficient Software Model Checking

Know or Go Practical Quest for Reliable Software

A Memory Model for Static Analysis of C Programs

Testing static analyzers with randomly generated programs

Static Taint-Analysis on Binary Executables

Optimizing Generation of Object Graphs in Java PathFinder

Combining Static Analysis and Test Generation for C Program Debugging

Comparing the Effectiveness of Penetration Testing and Static Code Analysis

Astrée: Building a Static Analyzer for Real-Life Programs

A Test Suite for Basic CWE Effectiveness. Paul E. Black.

Common Errors in C/C++ Code and Static Analysis

Fully Automated Static Analysis of Fedora Packages

Securing Network Software using Static Analysis

µz An Efficient Engine for Fixed points with Constraints

Applications of formal verification for secure Cloud environments at CEA LIST

Continuous Code-Quality Assurance with SAFE

SSVChecker: Unifying Static Security Vulnerability Detection Tools in an Eclipse Plug-In

Gold Standard Method for Benchmarking C Source Code Static Analysis Tools

CSCI E 98: Managed Environments for the Execution of Programs

The Eighth International Conference INCOSE_IL Formal Methods Security Tools in the Service of Cyber Security

Software security assessment based on static analysis

The programming language C. sws1 1

MPI-Checker Static Analysis for MPI

CARAMEL: Detecting and Fixing Performance Problems That Have Non-Intrusive Fixes

Performance Measurement of Dynamically Compiled Java Executions

Tachyon: a Meta-circular Optimizing JavaScript Virtual Machine

Software Reliability Estimation Based on Static Error Detection

Platform-independent static binary code analysis using a metaassembly

KITES TECHNOLOGY COURSE MODULE (C, C++, DS)

Writing new FindBugs detectors

Advances in Programming Languages

Data races in Java and static analysis

A Comparative Study of Industrial Static Analysis Tools (Extended Version)

SAFECode Security Development Lifecycle (SDL)

QUIRE: : Lightweight Provenance for Smart Phone Operating Systems

Measuring the Attack Surfaces of SAP Business Applications

The Model Checker SPIN

Software security specification and verification

Access Control Fundamentals

Source Code Review Using Static Analysis Tools

Improving Software Quality with Static Analysis

Software Testing. Quality & Testing. Software Testing

Not agree with bug 3, precision actually was. 8,5 not set in the code. Not agree with bug 3, precision actually was

Numerology - A Case Study in Network Marketing Fractions

APPROACHES TO SOFTWARE TESTING PROGRAM VERIFICATION AND VALIDATION

The C Programming Language course syllabus associate level

On the value of hybrid security testing

PROBLEM SOLVING SEVENTH EDITION WALTER SAVITCH UNIVERSITY OF CALIFORNIA, SAN DIEGO CONTRIBUTOR KENRICK MOCK UNIVERSITY OF ALASKA, ANCHORAGE PEARSON

Transcription:

A Brief Introduction to Static Analysis Sam Blackshear March 13, 2012

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 2

An Interesting Problem We ve written a program P and we want to know... Does P satisfy property of interest φ (for example, P does not dereference a null pointer, all casts in P are safe,...) Manually verifying that P satisfies φ may be tedious or impractical if P is large or complex Would be great to write a program (or static analysis tool) that can inspect the source code of P and determine if φ holds! 3

An Inconvenient Truth We cannot write such a program; the problem is undecidable in general! Rice s Theorem (paraphrase): For any nontrivial program property φ, no general automated method can determine whether φ holds for P. Where nontrivial means we can there exists both a program that has property φ and one that does not. So should we give up on writing that program? 4

A Way Out Only if not getting an exact answer bothers us (and it shouldn t). Key insight: we can abstract the behaviors of the program into a decidable overapproximation or underapproximation, then attempt to prove the property in the abstract version of the program This way, we can have our decidability, but can also make very strong guarantees about the program 5

Analysis Choices A sound static analysis overapproximates the behaviors of the program. A sound static analyzer is guaranteed to identify all violations of our property φ, but may also report some false alarms, or violations of φ that cannot actually occur. A complete static analysis underapproximates the behaviors of the program. Any violation of our property φ reported by a complete static analyzer corresponds to an actual violation of φ, but there is no guarantee that all actual violations of φ will be reported Note that when a sound static analyzer reports no errors, our program is guaranteed not to violate φ! This is a powerful guarantee. As a result, most static analysis tools choose to be sound rather than complete. 6

Visualizing Soundness and Completeness Overapproximate (sound) analysis All program behaviors Underapproximate (complete) analysis 7

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 8

An Example Static Analysis : Sign Analysis Classic example: sign analysis! Abstract concrete domain of integer values into abstract domain of signs (-, 0, +) {, } Step 1: Define abstraction function abstract() over integer literals: -abstract(i) = - if i < 0 -abstract(i) = 0 if i == 0 -abstract(0) = + if i > 0 9

Sign Analysis (continued) Define transfer functions over abstract domain that show how to evaluate expressions in the abstract domain: + + + = + - + - = - 0 + 0 = 0 0 + + = + 0 + - = -. + + - = (analysis notation for unknown ) {+, -, 0, } + = {+, -, 0, } / 0 = (analysis notation for undefined ). 10

An Example Static Analysis (continued) Now, every integer value and expression has been abstracted into a {+, -, 0,, }. An exceedingly silly analysis, but even so can be used to: Check for division by zero (as simple as looking for occurrences of ) Optimize: store + variables as unsigned integers or 0 s as false boolean literals See if (say) a banking program erroneously allows negative account values (see if balance variable is - or ) More? 11

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 12

What Are Static Analysis Tools Used For? To name a few: Compilers (type checking, optimization) Bugfinding Formal Verification 13

Compilers - Type Checking Most common and practically useful example of static analysis Statically ensure that arithmetic operations are computable (prevent adding an integer to a boolean in C++, for example) Guarantee that functions are called with the correct number/type of arguments Real-world static analysis success story: undefined variable analysis to ensure that an undefined variable is never read - Common source of nondeterminism in C; causes nasty bugs - Analysis built in to Java compiler! Statically guarantees that an undefined variable will never be read Object o; o.foo(); Compiler: Variable o might not have been initialized! 14

Compilers - Optimization Examples: Machine-aware optimizations: convert x*2 into x + x Loop-invariant code motion: move code out of a loop int a = 7, b = 6, sum, z, i; for (i = 0; i < 25; i++) z = a + b; sum = sum + z + i; Lift z = a + b out of the loop Function inlining - save overhead of procedure call by inserting code for procedure (related: loop unrolling) Many more! 15

Bugfinding Big picture: identify illegal or undesirable language behaviors, see if program can trigger them Null pointer dereference analysis (C, C++, Java... ) Buffer overflow analysis: can the program write past the bounds of a buffer? (C, C++, Objective-C) Cast safety analysis: can a cast from one type to another fail? Taint analysis: can a program leak secret data, or use untrusted input in an insecure way? (web application privacy, SQL injection,... ) Memory leak analysis: is malloc() called without free()? (C, C++) Is a heap location that is never read again reachable from the GC roots? (Java) Race condition checking: Can threads interleave in such a way that threads t 1 and t 2 simultaneously access variable x, where at least one access is a write? 16

Formal Verification Given a rigorous complete or partial specification, prove that no possible behavior of the program violates the specification Assertion checking - user writes assert() statements that fail at runtime if assertion evaluates to false. We can use static analysis to prove that an assertion can never fail Given a specification for an algorithm and a formal semantics for the language the program is written in, can prove that the implementation (not just the algorithm!) is correct. Note: giving specification is sometimes harder than checking it! Example: Specification for sorting. How would you define? Takes input l, 0-indexed array of integers, returns l, 0-indexed array of integers, where: 17

Formal Verification Given a rigorous complete or partial specification, prove that no possible behavior of the program violates the specification Assertion checking - user writes assert() statements that fail at runtime if assertion evaluates to false. We can use static analysis to prove that an assertion can never fail Given a specification for an algorithm and a formal semantics for the language the program is written in, can prove that the implementation (not just the algorithm!) is correct. Note: giving specification is sometimes harder than checking it! Example: Specification for sorting. How would you define? Takes input l, 0-indexed array of integers, returns l, 0-indexed array of integers, where: (1) for i in [0, length(l ) - 1), l [i] l [i + 1] (obvious) 17

Formal Verification Given a rigorous complete or partial specification, prove that no possible behavior of the program violates the specification Assertion checking - user writes assert() statements that fail at runtime if assertion evaluates to false. We can use static analysis to prove that an assertion can never fail Given a specification for an algorithm and a formal semantics for the language the program is written in, can prove that the implementation (not just the algorithm!) is correct. Note: giving specification is sometimes harder than checking it! Example: Specification for sorting. How would you define? Takes input l, 0-indexed array of integers, returns l, 0-indexed array of integers, where: (1) for i in [0, length(l ) - 1), l [i] l [i + 1] (obvious) (2) l is a permutation of l (subtle!) 17

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 18

Commercial Successes Astree Coverity Microsoft Static Driver Verifier Java Pathfinder Microsoft Visual C/C++ Static Analyzer 19

Astrée Goal: Prove absence of undefined behavior and runtime errors in C (null pointer dereference, integer, overflow, divide by zero, buffer overflow,... ) Developed by INRIA (France), commercial sponsorship by Airbus (aircraft manufacturer) Astrée proved absence of errors for 132,000 lines of flight control software in only 50 minutes! Has also been used to verify absence of runtime errors in docking software used for the International Space Station Over 20 publications on techniques developed/used More information at http://www.astree.ens.fr/, http://www.absint.com/astree/index.htm 20

Coverity Goal: Catch bugs in C, C++, Java, and C# code during development Commercial spinoff of Stanford static analysis tools Analysis is neither sound nor complete; emphasis is on finding bugs that are easy to explain to the user: Explaining errors is often more difficult than finding them. A misunderstood explanation means the error is ignored or, worse, transmuted into a false positive. Used by over 1100 companies! Very interesting CACM article on challenges of commercializing a research tool A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World. http://www.coverity.com 21

Microsoft Static Driver Verifier (SLAM) Goal: Find bugs in Windows device drivers written in C Major effort in defining what constitutes a bug; requires developing rigorous model for operating system and specifying what behaviors are unacceptable Uses model checking: intelligently enumerating and exploring state space of program to ensure that no error states are reachable from the initial state Tool is sound and quite precise (5% false positive rate) Works in a few minutes for typical drivers; up to 20 minutes for complex ones msdn.microsoft.com/en-us/windows/hardware/gg487506 22

Java PathFinder Goal: find bugs in mission-critical NASA code NASA-developed model checker that runs on Java bytecode Specialty is detecting concurrency errors such as race conditions; also handles uncaught exceptions Now free and open source! Can configure JPF to check properties you are interested in by adding plugins Drawback: can t handle native Java libraries, must use models instead Sound, but not very precise http://babelfish.arc.nasa.gov/trac/jpf 23

Microsoft Visual C/C++ Static Analyzer Invoke using /analyze flag when compiling Emphasis on finding security-critical errors such as buffer overruns and printf format string vulnerabilities Neither sound nor complete; lots of heuristics, but finds lots of bugs Developer legend John Carmack (of Quake fame) is a big fan: http://altdevblogaday.com/2011/12/24/ static-code-analysis/ http://msdn.microsoft.com/en-us/library/ms182025.aspx 24

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 25

Some (Good) Free and Open Source Static Analysis Tools Clang static analyzer FindBugs WALA vellvm 26

Clang Static Analyzer Part of llvm compiler infrastructure; works only on C and Objective-C programs Over 30 checks built into default analyzer Built for debugging ios apps, so includes extensive functionality for finding memory problems Neither sound nor complete Can suppress false positives reported by tool by adding annotations to code Very snappy IDE Integration with Xcode Support for C++ coming soon! http://clang-analyzer.llvm.org/ 27

FindBugs University of Maryland research tool by David Hovemeyer and Bill Pugh (of SkipList fame) Rather than using rigorous formal methods, focus is on heuristics; in particular, identifying common bug patterns Looks for 45 bug patterns such as equal objects must have equal hashcodes and wait not in loop Bugs are given a severity rating from 1-20 No soundness or completeness guarantee, but proven to be very useful in practice Integration with numerous IDEs including Eclipse and NetBeans http://findbugs.sourceforge.net/ 28

SAFECode + Vellvm Research tool from UPenn and UIUC Automatically add memory safety to existing C source code: no buffer overflows, dangling pointers, e.t.c All you have to do is recompile your source using llvm-clang with the SAFECode flag Runtime overhead only 10-40%; small price to avoid getting hacked! Implementation of compiler transformation proven correct with proof assistant (Coq). Very strong guarantee! http://sva.cs.illinois.edu/downloads.html 29

Outline A theoretical problem and how to ignore it An example static analysis What is static analysis used for? Commercial successes Free static analysis tools you should use Current research in static analysis 30

Current Research in Static Analysis: a Taste Too many topics to enumerate here, but will mention a few: Concurrency - the state spaces of large concurrent programs are much too large to explore exhaustively. How can we make guarantees about the correctness of concurrent programs while only exploring a fraction of this space? JavaScript and other dynamic languages - how can we deal with dynamic behavior like reflection and dynamic code evaluation with eval()? Checking deeper properties - most static analyses prove properties that must be true of all programs in their target language. How can we automatically infer and check properties that correspond more closely to program correctness? 31

Further Reading Foundations of Static Analysis - Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. Patrick Cousot and Radhia Cousot. POPL 1977. Astrée - A Static Analyzer for Large Safety-Critical Software. Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and Xavier Rival. PLDI 2003. Coverity - Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. SOSP 2001. 32

Further Reading (continued) Microsoft Static Driver Verifier/SLAM - Automatic Predicate Abstraction of C Programs. Thomas Ball, Rupak Majumdar, Todd D. Millstein, Sriram K. Rajamani, PLDI 2001. Java PathFinder - Test Input Generation with Java PathFinder. W. Visser, C. Pasareanu, S. Khurshid. ISSTA 2004. FindBugs - Finding Bugs is Easy. David Hovemeyer and William Pugh. OOPSLA 2004. SAFECode - Formalizing the LLVM Intermediate Representation for Verified Program Transformation. Jianzhou Zhao, Santosh Nagarakatte, Milo M K Martin and Steve Zdancewic. POPL 2012 33

Further Reading (still continued) Analysis of concurrent programs - Reducing Concurrent Analysis Under a Context Bound to Sequential Analysis. Akash Lal and Thomas Reps. Formal Methods in System Design (FMSD) 2009. Static analysis of JavaScript - The Essence of JavaScript. Arjun Guha, Claudiu Saftoiu, Shriram Krishnamurthi. ECOOP 2010. Specification inference - Probabilistic, Modular and Scalable Inference of Typestate Specifications, Nels E. Beckman and Aditya V. Nori. PLDI 2011. 34

Executive Summary Static analysis is infeasible in theoretical terms, but quite feasible in practice Static analysis has been used in industry for important applications such as finding bugs in aircraft software There are free and open source static analysis tools that can help you find common problems in your code You should use them! Current research in static analysis is attacking interesting problems that will continue to push the boundaries of automated reasonng 35