Static analysis: from theory to practice



Similar documents
A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

Static analysis of numerical programs

Numerology - A Case Study in Network Marketing Fractions

Abstract Interpretation-based Static Analysis Tools:

Automatic Verification by Abstract Interpretation

Chair of Software Engineering. Software Verification. Assertion Inference. Carlo A. Furia

Program Analysis: Theory and Practice

Introduction to Static Analysis for Assurance

Motivations 1. What is (or should be) the essential preoccupation of computer scientists?

InvGen: An Efficient Invariant Generator

Analyse statique de programmes numériques avec calculs flottants

Know or Go Practical Quest for Reliable Software

Relational Abstract Domains for the Detection of Floating-Point Run-Time Errors

A Static Analyzer for Large Safety-Critical Software

Automated Theorem Proving - summary of lecture 1

On some Potential Research Contributions to the Multi-Core Enterprise

The Course.

Rigorous Software Development CSCI-GA

Astrée: Building a Static Analyzer for Real-Life Programs

A Case for Static Analyzers in the Cloud (Position paper)

Formal Verification and Linear-time Model Checking

A Brief Introduction to Static Analysis

A Static Analyzer for Large Safety-Critical Software

Static Analysis of the Mars Exploration Rover Flight Software

A puzzle. Suppose that you want to run some JavaScript programs---all the cool kids are doing it.

Static Program Transformations for Efficient Software Model Checking

Operation Count; Numerical Linear Algebra

How To Develop A Static Analysis System For Large Programs

µz An Efficient Engine for Fixed points with Constraints

IKOS: A Framework for Static Analysis based on Abstract Interpretation (Tool Paper)

[Refer Slide Time: 05:10]

An On-Line Algorithm for Checkpoint Placement

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

Numerical Matrix Analysis

Rigorous Software Development CSCI-GA

Overview Motivating Examples Interleaving Model Semantics of Correctness Testing, Debugging, and Verification

Binary search tree with SIMD bandwidth optimization using SSE

Adversary Modelling 1

Certification Authorities Software Team (CAST) Position Paper CAST-13

MEng, BSc Applied Computer Science

16.1 MAPREDUCE. For personal use only, not for distribution. 333

Euler: A System for Numerical Optimization of Programs

Static Analysis of Dynamic Properties - Automatic Program Verification to Prove the Absence of Dynamic Runtime Errors

Achieving business benefits through automated software testing. By Dr. Mike Bartley, Founder and CEO, TVS

Coverability for Parallel Programs

Model Checking based Software Verification

MEng, BSc Computer Science with Artificial Intelligence

Infer: An Automatic Program Verifier for Memory Safety of C Programs

Verification of Device Drivers and Intelligent Controllers: a Case Study

Formal verification of contracts for synchronous software components using NuSMV

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Learning Outcomes. Simple CPU Operation and Buses. Composition of a CPU. A simple CPU design

SCADE Suite in Space Applications

Software Active Online Monitoring Under. Anticipatory Semantics

Indiana State Core Curriculum Standards updated 2009 Algebra I

ECE 0142 Computer Organization. Lecture 3 Floating Point Representations

CS510 Software Engineering

The pitfalls of verifying floating-point computations

Software Testing & Analysis (F22ST3): Static Analysis Techniques 2. Andrew Ireland

Real Time Programming: Concepts

Reconfigurable Architecture Requirements for Co-Designed Virtual Machines

Context-Bounded Model Checking of LTL Properties for ANSI-C Software. Jeremy Morse, Lucas Cordeiro, Bernd Fischer, Denis Nicole

Visualizing Information Flow through C Programs

Lecture 2: Universality

Static Analysis by Abstract Interpretation of Numerical Programs and Systems

Solution of Linear Systems

An Abstract Monte-Carlo Method for the Analysis of Probabilistic Programs

Regression Verification: Status Report

Algorithms are the threads that tie together most of the subfields of computer science.

Factoring & Primality

A Framework for the Semantics of Behavioral Contracts

Rotation Rate of a Trajectory of an Algebraic Vector Field Around an Algebraic Curve

Operations Research and Financial Engineering. Courses

POMPDs Make Better Hackers: Accounting for Uncertainty in Penetration Testing. By: Chris Abbott

How To Get A Computer Science Degree At Appalachian State

Gold Standard Benchmark for Static Source Code Analyzers

FROM SAFETY TO SECURITY SOFTWARE ASSESSMENTS AND GUARANTEES FLORENT KIRCHNER (LIST)

Scheduling Real-time Tasks: Algorithms and Complexity

T Reactive Systems: Introduction and Finite State Automata

GETTING STARTED WITH LABVIEW POINT-BY-POINT VIS

Fixed-Point Logics and Computation

Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs

Global Constraints in Software Testing Applications

o-minimality and Uniformity in n 1 Graphs

CSCI E 98: Managed Environments for the Execution of Programs

Static Analysis. Find the Bug! : Analysis of Software Artifacts. Jonathan Aldrich. disable interrupts. ERROR: returning with interrupts disabled

Real-Time Software. Basic Scheduling and Response-Time Analysis. René Rydhof Hansen. 21. september 2010

MPLAB TM C30 Managed PSV Pointers. Beta support included with MPLAB C30 V3.00

Introduction to Embedded Systems. Software Update Problem

Development of dynamically evolving and self-adaptive software. 1. Background

CHAPTER 5 Round-off errors

Dynamic load balancing of parallel cellular automata

Chapter 1. NP Completeness I Introduction. By Sariel Har-Peled, December 30, Version: 1.05

Chapter 7 Memory Management

Monday January 19th 2015 Title: "Transmathematics - a survey of recent results on division by zero" Facilitator: TheNumberNullity / James Anderson, UK

Software Verification and Testing. Lecture Notes: Temporal Logics

This Unit: Floating Point Arithmetic. CIS 371 Computer Organization and Design. Readings. Floating Point (FP) Numbers

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

Transcription:

Static analysis: from theory to practice David Monniaux CNRS / VERIMAG A joint laboratory of CNRS, Université Joseph Fourier (Grenoble) and Grenoble-INP. 19 juin 2009 David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 1 / 38

HDR presented before Rapporteurs Pr Roberto Giacobazzi (U. Verona) Pr Eric Goubault (École polytechnique / CEA) Pr Andreas Podelski (U. Freiburg) Other members Pr Gilles Dowek (École polytechnique) Pr Roland Groz (Grenoble-INP) Pr Helmut Seidl (TU-München) David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 2 / 38

Research area in a nutshell Program analysis Program (source code or object code) automatic analysis Facts about the program Connex areas Semantics of real-life programming languages (floating-point, asynchronous parallelism) Use of proof assistants Decision of logic theories, quantifier elimination David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 3 / 38

Flashback: PhD thesis 2001 PhD thesis on the notion of abstract interpretation of probabilistic programs (semantics as Markov chains, Markov decision processes). Mostly theoretical. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 4 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 5 / 38

The Team Semantics and abstract interpretation team at LIENS Bruno Blanchet Patrick Cousot (project leader) Radhia Cousot Jérôme Feret Laurent Mauborgne Antoine Miné yours truly Xavier Rival David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 6 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 7 / 38

1996: Ariane 501 Before After This event raised great awareness of the consequences of bugs in critical systems. Earlier accidents (e.g. Therac-25 radiotherapy machine, killed patients) had less publicity. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 8 / 38

Fly-by-wire controls: Airbus A380 Exemplified by sidesticks: commands from pilots get sent to computers, which control the active surfaces. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 9 / 38

Fly-by-wire controls: Boeing 777-200ER The control yoke is fake : it is not mechanically tied to the active surfaces, but to a computer. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 10 / 38

Airbus: a long-standing investment in quality software Airbus develops control-command software using high-level tools such as SAO and SCADE (Lustre). Airbus uses verification techniques (e.g. the CAVEAT Hoare logic proof assistant). Airbus uses abstract interpretation AbsInt s worst-case execution time analysis tool. CEA s Fluctuat etc. tools. Astrée David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 11 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 12 / 38

The challenge Prove the absence of runtime errors in critical code (from C source): overflow (see e.g. Ariane 501) divide by zero array access out of bound bad pointer other undefined behaviours Challenges: very large code high precision requested (few false alarms) lots of floating-point David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 13 / 38

Difficulties High precision In static analysis for optimization, 95% completeness (95% of code without any alarms shown) is marvellous. In static analysis for certification, this is very bad. On a 100,000 line program, this means 5,000 warnings far too many for engineers! C language and floating-point Somewhat fuzzy and complex semantics. Thus not so-well studied by scientists, who prefer nicer cases. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 14 / 38

Static analysis by abstract interpretation Reachability analysis X 0 = initial states X n = states reachable in n steps X n+1 = φ(x n ) where X = X 0 f (X n ) X = reachable states = lim n X n = n X n. Abstract interpretation Replace X n (difficult to represent exactly) by overapproximation γ(x n) where X n machine representable. Example : X n pair (a, b) defining the interval γ(x n) = [a, b]. If needed, accelerate convergence using widening operators. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 15 / 38

Concrete results http://www.astree.ens.fr Absence of runtime errors, no/few false alarms in: 2003 Primary fly-by-wire control A340 2005 Fly-by-wire control A380 2008 Automatic docking software of the ATV (automatic transfer vehicle for International space station) (and other industrial programs) David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 16 / 38

How these results were achieved Development of specific abstract domains. Maximizing synergies between abstract domains. Coarse analysis as long as its coarseness is not a problem. Careful use of data structures (large programs = large memory = any weakness may cause inacceptably large analysis time or memory usage)...and research and innovations on little-studied areas of static analysis. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 17 / 38

Semantics of C and floating-point Difficulties: Memory model. Common assumptions outside of C specification. Interactions between C and IEEE-754 floating-point (a mess see TOPLAS 08 article). These issues need careful practical studies. Standards are fuzzy and anyway compilers do not respect them. Some assumptions made by some proof tools are actually unsound. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 18 / 38

Floating-point and numerics Misleading prejudice Floating-point is just like reals. (optimistic) Floating-point can yield just about anything, it s so unpredictable. (pessimistic) Floating-point programs run identically on all IEEE-754 systems. (optimistic) Floating-point is too complicated for analysis. (pessimistic) My opinion Floating-point is partially specified, and what is specified is enough to derive bounds that can prove many properties in programs written by normal people. Programs relying on finer properties not provable using these bounds should only be written by experts (e.g. William Kahan). David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 19 / 38

Floating-point and numerics analysis Floating-point Floating-point = exact real computation + boundable error [Miné]: floating-point semantics real nondeterministic semantics real (ideal) analysis float implementation of analysis Filtering Numerical filtering: in control-command theory books, bounds on floating-point roundoff errors ignored ( useless or too pessimistic ). Results by Feret and Monniaux: any linear filter implemented in floating-point can be abstracted as O = R.I + ɛ, ɛ E. I (R ideal impulse response expressed as its Z-transform using rational functions). [CAV 05] David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 20 / 38

Low-level device drivers Modern hardware controllers are programmable and transfer data using bus-master DMA. Positive point: offload work from the CPU to the controller, acts quite autonomously, does not need interrupts and reprogramming all the time. Negative point: more like a shared-memory multi-cpu system than a single-cpu system. Contribution: attempt at modelization of USB OHCI controller as an asynchronous process composed with the driver ran on CPU [EMSOFT 07]. All other driver analyses ignore devices, some even ignore other threads. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 21 / 38

Parallelization of analysis Exploit the structure of the program to be analyzed in order to derive independently analyzable parts. Synchronous code with multiple clocks statically scheduled: phase = 0; while (true) { executephase(phase); phase = phase+1; if (phase==10) phase=0; } David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 22 / 38

Parallelization of analysis (2) Abstract as: while (true) { phase = [0, 9]; executephase(phase); } Run the analysis of all phases in parallel (or grouped in chunks, per number of CPUs). David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 23 / 38

What I got out of it For analysis designers It is possible to design static analyzers with very low levels of false alarms by targeting specific domains (e.g. control-command codes written in a certain way). This requires significant work. Orders of magnitude more than for toy languages and examples. We academics should take into account the difference between clean models and the actual systems they are meant to represent. For industrials Static analysis is not a replacement for testing. Static analysis should be integrated into the development process early on, not as an afterthought. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 24 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 25 / 38

A case for formally proved tools We prove properties of programs... But how about bugs in the compiler? (see work by e.g. X. Leroy, X. Rival) How about bugs in the analyzer? Formalizations of elements of static analyzers in Coq: The balanced tree data structure used in Astrée, with correctness proofs. Minimalistic notion of widening [under subm. HOSC]. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 26 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 27 / 38

Known weaknesses of many abstract interpretation schemes Lack of modularity (but see e.g. Logozzo s class-invariant analysis) must re-run analysis if some part changes cannot analyze program fragments with environment abstracted away Widening operators no guarantee of precision non-monotonic results design somewhat of a black art What can we do? David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 28 / 38

Direct computation of least invariants I is an inductive invariant iff it contains the initial state and it is stable by the denotational semantics of the program: Suppose x 0 I (1) x x I (x) P (x, x ) I (x ) (2) the state x is a tuple x 1,..., x n of reals I is a conjunction of P i (x 1,..., x n ) C i, 1 i m where the P i are fixed polynomials and the C i are free variables (parameters). Then Eqn. 1 is a formula with free variables C 1,..., C m whose solutions are the shape parameters defining inductive invariants. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 29 / 38

Set of solutions Example Choose x p in [a, b] Infinite loop: Choose x in [a, b] If x > x p + δ then x := x p + δ If x < x p δ then x := x p δ x p := x Find invariants of the form x p [c, d]. The condition becomes: x p a x p b c x p d x p c x p d x, x, x (x > x p + δ? x = x p + δ : x = x) (x < x p δ? x = x p δ : x = x ) c x d (3) where p? xa : b is short for (p a) ( p b). David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 30 / 38

Quantifier elimination yields least invariants The above formula can be simplified as c a d b. If we ask for minimal d and maximal c (least invariant) the formula simplifies to c = a d = b. But how can we do this automatically? By quantifier elimination [SAS 07, POPL 09]. From a formula with quantifiers, output an equivalent formula without quantifiers. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 31 / 38

Work on quantifier elimination Past state of the art Work The nonlinear problem over the reals is very hard (CAD, best known algorithm, is very complex to implement and slow). Known algorithms for the linear problem (LRA) were also slow. Developed a quantifier elimination for LRA based on SMT-solving (SAT modulo the LRA theory) and eager constraint generation [LPAR 08]. Implemented it in a tool. Developed a lazy version and implemented it, currently benchmarking it. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 32 / 38

SMT-solving: use of numerical techniques Most proof assistants / analysis tools / decision procedures use exact arithmetic, which can be costly. Can we use numerical methods? Polynomial inequalities Finding Positivstellensatz witnesses reduced to semidefinite programming (SDP). Solve SDP using numerical methods. Does not work well due to fatal geometric degeneracies (hard to get rid of). Linear inequalities Floating-point simplex for exact computations: interesting efficiency in some cases [CAV 09]. Interior point method for the witness problem: remains to be tested (degeneracy problem easier than for Positivstellensatz). David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 33 / 38

Plan 1 Astrée project Industrial context Scientific work 2 Formal proofs 3 Exotic analysis methods 4 Prospects David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 34 / 38

Short and medium term Current Work with graduate student Julien Le Guen, Nicolas Halbwachs and STMicroelectronics on using static analysis for optimization. Work on quantifier elimination (lazy strategy). Work on numerical methods for finding invariants or checking formulas. Medium term Possibly work on extracting formal proofs. Application to the modular analysis of LUSTRE? Another round at analysis of C? David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 35 / 38

Industrial lessons learned There are many interesting and hard research problems arising from industrial needs. Some industrial difficulties are actually self-inflicted organisational issues, and are not in need of a technical solution. There is a vast difference between what researchers like to research (clean semantics, mathematical structures, etc.) and real-life languages and systems. It is more reasonable to start with modest goals (domain-specific static analysis) than to target all kinds of programs. Verification (= proof of absence of bugs) is not at all the same as bug-finding. David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 36 / 38

Research lessons learned There are many interesting techniques coming from operational research, signal processing, automatic control theory... Yet the need for sound proofs often makes them unsuitable for direct application in formal methods. Sometimes, these techniques can be used with some adaptation (bounding ε s, checking phase in exact precision, etc.). David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 37 / 38

Questions David Monniaux (VERIMAG) Static analysis: from theory to practice 19 juin 2009 38 / 38