Static Analysis. Find the Bug! 15-654: Analysis of Software Artifacts. Jonathan Aldrich. disable interrupts. ERROR: returning with interrupts disabled



Similar documents
Introduction to Static Analysis for Assurance

Secure Software Programming and Vulnerability Analysis

A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

Lecture 2 Introduction to Data Flow Analysis

Class notes Program Analysis course given by Prof. Mooly Sagiv Computer Science Department, Tel Aviv University second lecture 8/3/2007

APPROACHES TO SOFTWARE TESTING PROGRAM VERIFICATION AND VALIDATION

Static Taint-Analysis on Binary Executables

Linux Kernel. Security Report

Oracle Solaris Studio Code Analyzer

Model Checking: An Introduction

Software testing. Objectives

Testing LTL Formula Translation into Büchi Automata

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Compiler Construction

Write Barrier Removal by Static Analysis

The FDA Forensics Lab, New Tools and Capabilities

CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013

6.852: Distributed Algorithms Fall, Class 2

The Model Checker SPIN

The Course.

Reminder: Complexity (1) Parallel Complexity Theory. Reminder: Complexity (2) Complexity-new

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

Testing and Inspecting to Ensure High Quality

System modeling. Budapest University of Technology and Economics Department of Measurement and Information Systems

Software Testing & Analysis (F22ST3): Static Analysis Techniques 2. Andrew Ireland

Regression Verification: Status Report

Static Analysis of Virtualization- Obfuscated Binaries

Algorithms are the threads that tie together most of the subfields of computer science.

Statically Checking API Protocol Conformance with Mined Multi-Object Specifications Companion Report

Platform-independent static binary code analysis using a metaassembly

AUTOMATED TEST GENERATION FOR SOFTWARE COMPONENTS

Basics of VTune Performance Analyzer. Intel Software College. Objectives. VTune Performance Analyzer. Agenda

Eliminate Memory Errors and Improve Program Stability

Towards practical reactive security audit using extended static checkers 1

Mathematical Induction

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

Detecting Critical Defects on the Developer s Desktop

Introduction. What is an Operating System?

GameTime: A Toolkit for Timing Analysis of Software

Verification and Validation of Software Components and Component Based Software Systems

Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.

Module 10. Coding and Testing. Version 2 CSE IIT, Kharagpur

Using the Intel Inspector XE

Transparent Monitoring of a Process Self in a Virtual Environment

Outlook 2010 Setup Guide (POP3)

10CS35: Data Structures Using C

Lecture Notes on Static Analysis

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

Introduction to Computers and Programming. Testing

Introduction to Automated Testing

Y. Xiang, Constraint Satisfaction Problems

Outline. 1 Denitions. 2 Principles. 4 Implementation and Evaluation. 5 Debugging. 6 References

Factoring & Primality

Optimizations. Optimization Safety. Optimization Safety. Control Flow Graphs. Code transformations to improve program

Informatica e Sistemi in Tempo Reale

Fuzzing in Microsoft and FuzzGuru framework

Sources: On the Web: Slides will be available on:

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

Reducing Clocks in Timed Automata while Preserving Bisimulation

1 The Java Virtual Machine

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

2) Write in detail the issues in the design of code generator.

Static Code Analysis Procedures in the Development Cycle

Change Impact Analysis

Software Testing. System, Acceptance and Regression Testing

The Darwin Game 2.0 Programming Guide

Static Program Transformations for Efficient Software Model Checking

Applications of formal verification for secure Cloud environments at CEA LIST

Scoping (Readings 7.1,7.4,7.6) Parameter passing methods (7.5) Building symbol tables (7.6)

Java Interview Questions and Answers

T Reactive Systems: Introduction and Finite State Automata

Source Code Security Analysis Tool Functional Specification Version 1.0

Exercise 4 Learning Python language fundamentals

Data Structure with C

TOOL EVALUATION REPORT: FORTIFY

Pixy: A Static Analysis Tool for Detecting Web Application Vulnerabilities (Technical Report)

Chapter 7: Functional Programming Languages

SOFT 437. Software Performance Analysis. Ch 5:Web Applications and Other Distributed Systems

Open-source Versus Commercial Software: A Quantitative Comparison

Coverity White Paper. Reduce Your Costs: Eliminate Critical Security Vulnerabilities with Development Testing

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

General Introduction

Intrusion Detection via Static Analysis

MonitorExplorer: A State Exploration-Based Approach to Testing Java Monitors

Transcription:

Static Analysis 15-654: Analysis of Software Artifacts Jonathan Aldrich 1 Find the Bug! Source: Engler et al., Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions, OSDI 00. disable interrupts ERROR: returning with interrupts disabled re-enable interrupts 2 1

Metal Interrupt Analysis Source: Engler et al., Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions, OSDI 00. enable => err(double enable) is_enabled enable disable is_disabled end path => err(end path with/intr disabled) disable => err(double disable) 3 Applying the Analysis Source: Engler et al., Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions, OSDI 00. initial state is_enabled transition to is_disabled final state is_disabled: ERROR! transition to is_enabled final state is_enabled is OK 4 2

Outline Why static analysis? The limits of testing and inspection What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 5 6 3

Static Analysis Finds Mechanical Errors Defects that result from inconsistently following simple, mechanical design rules Security vulnerabilities Buffer overruns, unvalidated input Memory errors Null dereference, uninitialized data Resource leaks Memory, OS resources Violations of API or framework rules e.g. Windows device drivers; real time libraries; GUI frameworks Exceptions Arithmetic/library/user-defined Encapsulation violations Accessing internal data, calling private functions Race conditions Two threads access the same data without synchronization 7 Difficult to Find with Testing, Inspection Non-local, uncommon paths Security vulnerabilities Memory errors Resource leaks Violations of API or framework rules Exceptions Encapsulation violations Non-deterministic Race conditions 8 4

Quality Assurance at Microsoft (Part 1) Original process: manual code inspection Effective when system and team are small Too many paths to consider as system grew Early 1990s: add massive system and unit testing Tests took weeks to run Diversity of platforms and configurations Sheer volume of tests Inefficient detection of common patterns, security holes Non-local, intermittent, uncommon path bugs Was treading water in Windows Vista development Early 2000s: add static analysis More on this later 9 Process, Cost, and Quality Slide: William Scherlis Process intervention, testing, and inspection yield first-order software quality improvement Software Quality Additional technology and tools are needed to close the gap Perfection (unattainable) Critical Systems Acceptability CMM: 1 2 3 4 5 S&S, Agile, RUP, etc: less rigorous... more rigorous Process Rigor, Cost 10 5

Outline Why static analysis? What is static analysis? Abstract state space exploration Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 11 Static Analysis Definition Static program analysis is the systematic examination of an abstraction of a program s state space Metal interrupt analysis Abstraction 2 states: enabled and disabled All program information variable values, heap contents is abstracted by these two states, plus the program counter Systematic Examines all paths through a function What about loops? More later Each path explored for each reachable state Assume interrupts initially enabled (Linux practice) Since the two states abstract all program information, the exploration is exhaustive 12 6

Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 13 An Analysis We ve Seen Hoare logic Useful for proving correctness Requires a lot of work (even for ESC/Java) Automated tool is unsound So is manual proof, without a proof checker 14 7

Motivation: Dataflow Analysis Catch interesting errors Non-local: x is null, x is written to y, y is dereferenced Optimize code Reduce run time, memory usage Soundness required Safety-critical domain Assure lack of certain errors Cannot optimize unless it is proven safe Correctness comes before performance Automation required Dramatically decreases cost Makes cost/benefit worthwhile for far more purposes 15 Dataflow analysis Tracks value flow through program Can distinguish order of operations Did you read the file after you closed it? Does this null value flow to that dereference? Differs from AST walker Walker simply collects information or checks patterns Tracking flow allows more interesting properties Abstracts values Chooses abstraction particular to property Is a variable null? Is a file open or closed? Could a variable be 0? Where did this value come from? More specialized than Hoare logic Hoare logic allows any property to be expressed Specialization allows automation and soundness 16 8

Zero Analysis Could variable x be 0? Useful to know if you have an expression y/x In C, useful for null pointer analysis Program semantics η maps every variable to an integer Semantic abstraction σ maps every variable to non zero (NZ), zero(z), or maybe zero (MZ) Abstraction function for integers α ZI : α ZI (0) = Z α ZI (n) = NZ for all n 0 We may not know if a value is zero or not Analysis is always an approximation Need MZ option, too 17 Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x α ZI (10)] 18 9

Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y σ(x)] 19 Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y NZ] σ =[x NZ,y NZ,z α ZI (0)] 20 10

Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y NZ] σ =[x NZ,y NZ,z Z] σ =[x NZ,y NZ,z Z] σ =[x NZ,y NZ,z Z] σ =[x NZ,y MZ,z Z] σ =[x NZ,y MZ,z NZ] 21 Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y NZ] σ =[x NZ,y NZ,z Z] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y NZ,z Z] σ =[x NZ,y MZ,z Z] σ =[x NZ,y MZ,z NZ] 22 11

Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y NZ] σ =[x NZ,y NZ,z Z] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y MZ,z Z] σ =[x NZ,y MZ,z NZ] 23 Zero Analysis Example x := 10; y := x; z := 0; while y > -1 do x := x / y; y := y-1; z := 5; σ =[] σ =[x NZ] σ =[x NZ,y NZ] σ =[x NZ,y NZ,z Z] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y MZ,z MZ] σ =[x NZ,y MZ,z NZ] Nothing more happens! 24 12

Zero Analysis Termination The analysis values will not change, no matter how many times we execute the loop Proof: our analysis is deterministic We run through the loop with the current analysis values, none of them change Therefore, no matter how many times we run the loop, the results will remain the same Therefore, we have computed the dataflow analysis results for any number of loop iterations Why does this work If we simulate the loop, the data values could (in principle) keep changing indefinitely There are an infinite number of data values possible Not true for 32-bit integers, but might as well be true Counting to 2 32 is slow, even on today s processors Dataflow analysis only tracks 2 possibilities! So once we ve explored them all, nothing more will change This is the secret of abstraction We will make this argument more precise later 25 Using Zero Analysis Visit each division in the program Get the results of zero analysis for the divisor If the results are definitely zero, report an error If the results are possibly zero, report a warning 26 13

Quick Quiz Fill in the table to show how what information zero analysis will compute for the function given. Program Statement 0: <beginning of program> 1: x := 0 2: y := 1 3: if (z == 0) 4: x := x + y 5: else y := y 1 6: w := y Analysis Info after that statement 27 Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 28 14

Defining Dataflow Analyses Lattice Describes program data abstractly Abstract equivalent of environment Abstraction function Maps concrete environment to lattice element Flow functions Describes how abstract data changes Abstract equivalent of expression semantics Control flow graph Determines how abstract data propagates from statement to statement Abstract equivalent of statement semantics 29 Lattice A lattice is a tuple (L,,,, ) L is a set of abstract elements is a partial order on L Means at least as precise as is the least upper bound of two elements Must exist for every two elements in L Used to merge two abstract values (bottom) is the least element of L Means we haven t yet analyzed this yet Will become clear later (top) is the greatest element of L Means we don t know anything L may be infinite Typically should have finite height All paths from to should be finite We ll see why later Z =MZ NZ less precise more precise 30 15

Zero Analysis Lattice Integer zero lattice L ZI = {, Z, NZ, MZ } Z, NZ, NZ MZ, Z MZ MZ holds by transitivity defined as join for x y = z iff z is an upper bound of x and y z is the least such bound Obeys laws: X= X, X=, X X= X Also Z NZ = MZ = X. X = MZ X. X Z =MZ NZ 31 Zero Analysis Lattice Integer zero lattice L ZI = {, Z, NZ, MZ } Z, NZ, NZ MZ, Z MZ defined as join for = = MZ Program lattice is a tuple lattice L Z is the set of all maps from Var to L ZI σ 1 Z σ 2 iff x Var. σ 1 (x) ZI σ 2 (x) σ 1 Z σ 2 = { x σ 1 (x) ZI σ 2 (x) x Var } = { x ZI x Var } = { x ZI x Var } = { x MZ x Var } Can produce a tuple lattice from any base lattice Just define as above Z =MZ NZ 32 16

Tuple Lattices Visually For Var = { x,y } =MZ {x Z, y MZ} {x NZ, y MZ} {x MZ, y Z} {x MZ, y NZ} {x MZ, y ZI } {x Z, y Z} {x Z, y NZ} {x Z, y ZI } {x NZ, y ZI } {x ZI, y Z} {x ZI, y NZ} ={x ZI, y ZI } 33 One Path in a Tuple Lattice ={w MZ, x MZ, y MZ, z MZ} ={w Z, x MZ, y MZ, z MZ} ={w Z, x MZ, y NZ, z MZ} ={w Z, x NZ, y ZI, z ZI } ={w ZI, x NZ, y ZI, z ZI } ={w ZI, x ZI, y ZI, z ZI } 34 17

Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 35 Abstraction Function Maps each concrete program state to a lattice element For tuple lattices, the function can be defined for values and lifted to tuples Integer Zero abstraction function α ZI : α ZI (0) = Z α ZI (n) = NZ for all n 0 Zero Analysis abstraction function α ZA : α ZA (η) = {x α ZI (η(x)) x Var } This is just the tuple form of α ZI (n) Can be done for any tuple lattice 36 18

Quick Quiz Consider the following two tuple lattice values: [x Z, y MZ] and [x MZ, y NZ] How do the two compare in the lattice ordering for zero analysis? What is the join of these two tuple lattice values? 37 Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 38 19

Control Flow Graph (CFG) Shows order of statement execution Determines where data flows Decomposes expressions into primitive operations Typically one CFG node per useful AST node constants, variables, binary operations, assignments, if, while Loops are written out Form a loop in the CFG Benefit: analysis is defined one operation at a time 39 Intuition for Building a CFG Connect nodes in order of operation Defined by language Java order of operation Expressions, assignment, sequence Evaluate subexpressions left to right Evaluate node after children (postfix) While, If Evaluate condition first, then if/while if branches to else and then while branches to loop body and exit 40 20

Control Flow Graph Example while i*2 < 10 do if x < i+2 then x := x + 5 else i := i + 1 < while if END BEGIN * 10 < := := i 2 x + x + i + i 2 x 5 i 1 41 Quick Quiz Draw a CFG for the following program: 1: x := 0 2: y := 1 3: if (z == 0) 4: x := x + y 5: else y := y 1 6: w := y 42 21

Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 43 Flow Functions Compute dataflow information after a statement from dataflow information before the statement Formally, map a lattice element and a CFG node to a new lattice element Analysis performed on 3-address code inspired by 3 addresses in assembly language: add x,y,z Convert complex expressions to 3-address code Each subexpression represented by a temporary variable x+3*y t 1 :=3; t 2 := t 1 *y; t 3 :=x+t 2 44 22

While3Addr copy binary op literal unary op label jump branch x = y x = y op z (op {+,-,*,/, }) x = n x = op y label lab jump lab btrue x lab (op {-,!,++, }) 45 Zero Analysis Flow Functions ƒ ZA (σ, [x := y]) = [x σ(y)] σ ƒ ZA (σ, [x := n]) = if n==0 then [x Z]σ else [x NZ]σ ƒ ZA (σ, [x := ]) = [x MZ] σ Could be more precise, e.g. ƒ ZA (σ, [x := y + z]) = if σ[y]=z && σ[z]=z then [x Z]σ else [x MZ]σ ƒ ZA (σ, /* any non-assignment */) = σ 46 23

Zero Analysis Example BEGIN [;] 13 END x := 0; while x > 3 do x := x+1 [:=] 3 [while] 7 [x] 1 [0] 2 [<] 6 [:=] 12 [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 47 Zero Analysis Example Initial dataflow σ ι = { x MZ x Var } BEGIN [;] 13 END [:=] 3 [while] 7 Intuition: We know nothing about initial variable values. We could use a precondition if we had one. [x] 1 [0] 2 [<] 6 [x] 4 [3] 5 [x] 8 [:=] 12 [+] 11 [x] 9 [1] 10 48 24

Zero Analysis Example σ ι = { x MZ x Var } BEGIN [;] 13 END σ 2 = ƒ ZA (σ ι, [t 2 := 0]) = [t 2 Z] σ ι [:=] 3 [while] 7 [x] 1 [0] 2 [<] 6 [:=] 12 ƒ ZA (σ, [x := n]) = if n==0 then [x Z]σ else [x NZ]σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 49 Zero Analysis Example σ ι = { x MZ x Var } σ 2 = [t 2 Z] σ ι BEGIN [;] 13 END σ 3 = ƒ ZA (σ 2, [x := t 2 ]) = [x σ 2 (t 2 )] σ 2 = [x Z] σ 2 = [x Z, t 2 Z] σ ι [:=] 3 [x] 1 [0] 2 [<] 6 [while] 7 [:=] 12 ƒ ZA (σ, [x := y]) = [x σ(y)] σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 50 25

Zero Analysis Example σ ι = { x MZ x Var } σ 3 = [x Z, t 2 Z] σ ι BEGIN [;] 13 END Input to [3] 5 comes from [:=] 3 and [:=] 12 Input should be σ 3 σ 12 What is σ 12? Solution: assume Benefit: σ 3 = σ 3 Same result as ignoring back edge first time [:=] 3 [x] 1 [0] 2 [<] 6 [x] 4 [3] 5 [while] 7 [x] 8 [:=] 12 [+] 11 [x] 9 [1] 10 51 Zero Analysis Example σ ι = { x MZ x Var } σ 3 = [x Z, t 2 Z] σ ι BEGIN [;] 13 END σ 12 = σ 5 = ƒ ZA (σ 3 σ 12, [t 5 := 3]) = ƒ ZA (σ 3, [t 5 := 3]) = ƒ ZA (σ 3, [t 5 := 3]) = [t 5 NZ] σ 3 [:=] 3 [x] 1 [0] 2 [<] 6 [while] 7 [:=] 12 ƒ ZA (σ, [x := n]) = if n==0 then [x Z]σ else [x NZ]σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 52 26

Zero Analysis Example σ ι = { x MZ x Var } σ 3 = [x Z, t 2 Z] σ ι BEGIN [;] 13 END σ 12 = σ 5 = [t 5 NZ] σ 3 σ 6 = ƒ ZA (σ 5, [t 6 := x< t 5 ]) [:=] 3 [while] 7 = σ 5 = [t 5 NZ] σ 3 [x] 1 [0] 2 [<] 6 [:=] 12 ƒ ZA (σ, /* any other */) = σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 53 Zero Analysis Example σ ι = { x MZ x Var } σ 3 = [x Z, t 2 Z] σ ι BEGIN [;] 13 END σ 12 = σ 6 = [t 5 NZ] σ 3 [:=] 3 [while] 7 Skipping similar nodes [x] 1 [0] 2 [<] 6 [:=] 12 [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 54 27

Zero Analysis Example σ ι = { x MZ x Var } σ 3 σ 12 = [x Z, t 2 Z] σ ι BEGIN [;] 13 END = σ 10 = [t 10 NZ, ] σ 3 [:=] 3 [while] 7 σ 11 = ƒ ZA (σ 10, [t 11 := x + t 10 ]) = [t 11 MZ]σ 10 ƒ ZA (σ, [x := y op z]) = [x MZ] σ [x] 1 [0] 2 [<] 6 [:=] 12 [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 55 Zero Analysis Example σ ι = { x MZ x Var } BEGIN [;] 13 END σ 3 σ 12 = [x Z, t 2 Z] σ ι = σ 11 = [t 10 NZ,t 11 MZ, ]σ 3 [:=] 3 [while] 7 σ 12 = ƒ ZA (σ 11, [x :=t 11 ]) = [x σ 11 (t 11 )] σ 11 = [x MZ] σ 11 = [x MZ, ] σ 3 [x] 1 [0] 2 [<] 6 [:=] 12 ƒ ZA (σ, [x := y]) = [x σ(y)] σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 56 28

Zero Analysis Example σ ι = { x MZ x Var } BEGIN [;] 13 END σ 3 = [x Z, t 2 Z] σ ι σ 12 = [x MZ, ] σ 3 [:=] 3 [while] 7 σ 5 = ƒ ZA (σ 3 σ 12, [t 5 := 3]) = ƒ ZA ([x MZ]σ 3, [t 5 := 3]) [x] 1 [0] 2 [<] 6 [:=] 12 = [t 5 NZ] [x MZ, ]σ 3 = [t 5 NZ, x MZ, ] σ 3 ƒ ZA (σ, [x] k ) = [t k σ(x)] σ [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 57 Zero Analysis Example σ ι = { x MZ x Var } BEGIN [;] 13 END σ 3 = [x Z, t 2 Z] σ ι σ 12 = [x MZ, ] σ 3 [:=] 3 [while] 7 [x] 1 [0] 2 [<] 6 [:=] 12 Propagation of x MZ continues σ 12 does not change, so no need to iterate again [x] 4 [3] 5 [x] 8 [+] 11 [x] 9 [1] 10 58 29

Quick Quiz Explain in detail how the dataflow lattice value for after the statement w := y is computed, using the CFG below as your point of reference. begin x := 0 y := 1 z==0 Answer: x := x+y y := y-1 w := y 59 Outline Why static analysis? What is static analysis? Introduction to Dataflow Analysis Dataflow Analysis Frameworks Lattices Abstraction functions Control flow graphs Flow functions Worklist algorithm 60 30

Worklist Dataflow Analysis Algorithm worklist = new Set(); for all node indexes i do results[i] = A ; results[entry] = ι A ; worklist.add(all nodes); while (!worklist.isempty()) do i = worklist.pop(); before = k pred(i) results[k]; after = ƒ A (before, node(i)); if (!(after results[i])) results[i] = after; for all k succ(i) do worklist.add(k); Ok to just add entry node if flow functions cannot return A (examples will assume this) Pop removes the most recently added element from the set (performance optimization) 61 Example of Worklist [a := 0] 1 [b := 0] 2 while [a < 2] 3 do [b := a] 4 ; [a := a + 1] 5 ; [a := 0] 6 Control Flow Graph 1 2 3 6 Position Worklist a b 0 1 MZ MZ 1 2 Z MZ 2 3 Z Z 3 4,6 Z Z 4 5,6 Z Z 5 3,6 MZ Z 3 4,6 MZ Z 4 5,6 MZ MZ 5 3,6 MZ MZ 3 4,6 MZ MZ 4 6 MZ MZ 6 Z MZ 4 5 62 31

Quick Quiz Show how the worklist algorithm given in class operates on the program given, by filling in the table below. 1: x := 0 2: y := 1 3: if (z == 0) 4: x := x + y 5: else y := y 1 6: w := y Position 0 Worklist x y w 63 Worklist Algorithm Performance Performance Visits node whenever input gets less precise up to h = height of lattice Propagates data along control flow edges up to e = max outbound edges per node Assume lattice operation cost is o Overall, O(h*e*o) Typically h,o,e bounded by n = number of statements in program O(n 3 ) for many data flow analyses O(n 2 ) if you assume a number of edges per node is small Good enough to run on a function Usually not run on an entire program at once, because n is too big 64 32