Compiled Code Verification Survey and Prospects




Amitabha Sanyal
Department of Computer Science & Engineering, IIT Bombay
(Copyright © 2004 Amitabha Sanyal)

Acknowledgements
Achyut Jagtap, Aditya Kanade, Abhijat Vichare, Supratim Biswas, Uday Khedker

So you thought compilers were always right?
Programmers place a large amount of trust in the correctness of compilers. Is this trust justifiable?
Yes, because:
- Compiling algorithms are well studied.
- They have sound theoretical foundations.
- Large parts of compilers are automatically generated by tools that have become trustworthy through use.
However...

GCC bug database: http://gcc.gnu.org/bugzilla/query.cgi

Bug #15549

    #include <stdlib.h>

    int lessthan (_Bool b, unsigned char c)
    {
      return b < c;
    }

    int main ()
    {
      if (!lessthan (1, 'a'))
        abort ();
    }

Problem: b < c is compiled as (b == 0) & (c != 0).
From the release of GCC 3.4.0 to 3.4.3, the number of open bugs increased from 1400 to 2115.
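The size of the discrepancy can be seen by comparing the two expressions over every possible input. The following is a minimal sketch in Python, not taken from the slides: `intended` and `miscompiled` are illustrative names, `_Bool` is modelled as 0/1 and `unsigned char` as 0..255, and the garbled test argument is read as the character 'a' (an assumption).

```python
def intended(b: int, c: int) -> int:
    """What the C source asks for: b < c."""
    return int(b < c)


def miscompiled(b: int, c: int) -> int:
    """What the slide reports was generated: (b == 0) & (c != 0)."""
    return int(b == 0) & int(c != 0)


# Enumerate every (_Bool, unsigned char) input and collect the disagreements.
mismatches = [(b, c) for b in (0, 1) for c in range(256)
              if intended(b, c) != miscompiled(b, c)]

# Number of inputs on which the two expressions differ.
print(len(mismatches))
# Whether the test case lessthan(1, 'a') is one of them.
print((1, ord("a")) in mismatches)
```

Under this model the two expressions agree whenever b is 0 and disagree only for b equal to 1 with c at least 2, which is why a test with b = 1 exposes the bug.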

Increasing Trustworthiness through Compiler Verification

Increasing Trustworthiness through Compiler Verification
Formalize and verify the following diagram for every source program P:

                 comp
        P ----------------> comp(P)
        |                      |
      smean                  tmean
        |                      |
        v       abstract       v
    smean(P) <----------- tmean(comp(P))

comp represents the transformations due to a compiler (the harder problem), or a model of a compiler (easier). In the second case: is the model faithful?

Compiler Verification
Source program P:
    y = 5
    x = y + 1
comp(P):
    push 5
    store a0
    push a0
    push 1
    add
    store a1
smean(P):          y = 5, x = 6
tmean(comp(P)):    a0 = 5, a1 = 6, stack = [...]
abstract maps the target state (a0 = 5, a1 = 6) back to the source state (y = 5, x = 6).
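The diagram can be exercised on this example with a toy implementation. The sketch below is an illustration only: the program representation, the function bodies and the rule that `push` with an address operand pushes the contents of that address are assumptions chosen to reproduce the six instructions on the slide. Checking one program this way is a test, not the proof for every P that compiler verification asks for.

```python
def smean(prog):
    """Source meaning: run assignments `var = operand + operand + ...`."""
    state = {}
    for var, operands in prog:
        state[var] = sum(state[o] if isinstance(o, str) else o for o in operands)
    return state


def comp(prog):
    """Toy compiler: returns (target code, data map from variables to addresses)."""
    dmap, code = {}, []
    for var, operands in prog:
        for k, o in enumerate(operands):
            code.append(("push", dmap[o] if isinstance(o, str) else o))
            if k > 0:
                code.append(("add",))
        dmap.setdefault(var, f"a{len(dmap)}")
        code.append(("store", dmap[var]))
    return code, dmap


def tmean(code):
    """Target meaning: run the stack machine, return (stack, memory)."""
    stack, mem = [], {}
    for ins in code:
        if ins[0] == "push":
            arg = ins[1]
            # Assumption: an address operand pushes the value stored there.
            stack.append(mem[arg] if isinstance(arg, str) else arg)
        elif ins[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif ins[0] == "store":
            mem[ins[1]] = stack.pop()
    return stack, mem


def abstract(target_state, dmap):
    """Map a target state back to a source state through the data map."""
    _, mem = target_state
    return {var: mem[addr] for var, addr in dmap.items()}


P = [("y", [5]), ("x", ["y", 1])]      # y = 5; x = y + 1
code, dmap = comp(P)
# The diagram commutes for this P if both paths give the same source state.
assert abstract(tmean(code), dmap) == smean(P)
print(code)
```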

Compiler Verification History
- J. McCarthy and J. Painter, Correctness of a Compiler for Arithmetic Expressions, Mathematical Aspects of Computer Science, 1967.
- R. M. Burstall, Proving Properties of Programs by Structural Induction, Computer Journal, 1969.
- R. Milner and R. W. Weyhrauch, Proving Compiler Correctness in a Mechanized Logic, Machine Intelligence 7, Edinburgh University Press, 1972.
- F. Lockwood Morris, Advice on Structuring Compilers and Proving Them Correct, Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, 1973.
- J. W. Thatcher, E. G. Wagner and J. B. Wright, More on Advice on Structuring Compilers and Proving Them Correct, Lecture Notes in Computer Science, Volume 94, 1980.
- Wolfgang Polak, Compiler Specification and Verification, Lecture Notes in Computer Science, Volume 124, 1981.
- L. M. Chirica and D. F. Martin, Toward Compiler Implementation Correctness Proofs, ACM TOPLAS, 1986.
- C. A. R. Hoare and He Jifeng, Refinement Algebra Proves Correctness of a Compiler, Proceedings of the International Summer School on Programming and Mathematical Method, ed. M. Broy, 1992.

Compiler Verification - Polak
- Verifies an actual compiler and not a model.
- Source language: Pascal.
- Target machine: a realistic stack machine.
- Compiler written by the author.
- No optimization phase.
- Machine-aided proof (uses the Stanford Verifier).

Compiler Verification - Drawbacks
Complexity:
1. Requires reasoning about the actual compiler implementation.
2. Requires reasoning about the behaviour of the compiler for an infinite number of programs and their translations.
Automation: unlikely.
Proof reuse?

Increasing Trustworthiness through Compiled Code Verification

Increasing Trustworthiness through Compiled Code Verification
Formalize and verify the following diagram for a given source program:
Source program P:
    y = 5
    x = y + 1
comp(P):
    push 5
    store a0
    push a0
    push 1
    add
    store a1
smean(P):          y = 5, x = 6
tmean(comp(P)):    a0 = 5, a1 = 6, stack = [...]
abstract maps the target state back to the source state.

Compiled Code Verification
- Less complex.
- Involves reasoning about a given pair of programs.
- The compiler can be made to provide information to help verification.
- Automation: likely.

Compiled Code Verification - Major Approaches
- Martin C. Rinard, Credible Compilation, Technical Report 776, MIT Lab for Computer Science, 1999.
- G. Necula, Translation Validation of an Optimizing Compiler, Proceedings of PLDI 2000.
- L. Zuck, Amir Pnueli, Yi Fang and Benjamin Goldberg, VOC: A Translation Validator for Optimizing Compilers, Electronic Notes in Theoretical Computer Science 65, No. 2 (2002).
- A. P. Heffter and M. Gawkoski, Towards Proof Generating Compilers, Proceedings of COCV 2004.

Execution trace
A source program...

    int a[100];
    int g;

    int foo(int n)
    {
      int i = 0;
      while (i < n) {
        a[i] = g * i + 3;
        i = i + 1;
      }
      return i;
    }

    int main ()
    {
      foo(50);
      return 0;
    }

... and its execution trace.

    main ();
    foo(50);
    i = 0;
    (i < n)  a[i] = g * i + 3;  i = i + 1;
    (i < n)  a[i] = g * i + 3;  i = i + 1;
    (i < n)  a[i] = g * i + 3;  i = i + 1;
    ...
    return i;
    return 0;

Observation points and observables
For a given input, the source program gives rise to an execution trace. The same holds for the target program.
Observation points: where the correspondence of source and target is ensured.
1. At every assignment statement.
2. At call-return boundaries.
3. At print statements.
Observables: things which must correspond.
1. Every variable and its value.
2. Arguments and return values.
3. Arguments of print statements.
Equivalent traces: two traces are equivalent if they have the same sequence of observation points and, at each observation point, the observables correspond.

Correctness Criterion
For any input, the source and the target should have equivalent execution traces.
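As a concrete reading of this criterion, the sketch below records a trace for the source function foo and for a hand-written optimized variant, then compares them. It is an illustration under stated assumptions: the observation points chosen here are the call, each store to a[i] and the return; `target_trace` is a hypothetical strength-reduced version of the loop, not the output of a real compiler; and the `start` parameter exists only to inject a fault.

```python
def source_trace(n, g):
    """foo from the slide, recording (kind, payload) at each observation point."""
    trace, i = [("call", n)], 0
    while i < n:
        trace.append(("store", (i, g * i + 3)))   # a[i] = g * i + 3
        i = i + 1
    trace.append(("return", i))
    return trace


def target_trace(n, g, start=3):
    """Hypothetical optimized code: while -> if + repeat, g * i kept as a running sum."""
    trace, r0, r2 = [("call", n)], 0, start
    if 0 < n:
        while True:
            trace.append(("store", (r0, r2)))     # a[r0] = r2
            r0, r2 = r0 + 1, r2 + g               # strength reduction
            if not r0 < n:
                break
    trace.append(("return", r0))
    return trace


def equivalent(t1, t2):
    """Same sequence of observation points with matching observables."""
    return t1 == t2


print(equivalent(source_trace(4, 2), target_trace(4, 2)))
print(equivalent(source_trace(4, 2), target_trace(4, 2, start=0)))
```

Comparing traces on sample inputs only tests the criterion. The methods that follow aim to establish it for all inputs without running the programs.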

The Necula method

The Necula method
An example program: the body of a function.
[Figure: the source and the optimized target of the example. Transformations labelled in the figure: while converted to if + repeat, code hoisting, strength reduction.]

Data Space
Source:
    params: n
    locals: i, t, u, v
    mem:    &g, &a
Target:
    regs:   r0, r2
    stack:  bp-8, bp-4
    mem:    &g, &a

Data Mapping
1. Globals -> globals.
2. Locals and params -> stack and regs.

Observation points and Observables
Observation points: call-return boundaries.
Observables:
- At a call point: arguments and global memory.
- At a return point: return values and global memory.

Key question
If we want the condition i = r0 and m = m (source memory equals target memory) to hold at b2 and b5, how should the variables be related at b0 and b4?

How does one express the values of variables at b2 in terms of the variables at b0? Through symbolic evaluation.

Symbolic Evaluation
Symbolic evaluation expresses the values at the end of a block in terms of the values at its start:
    m = m,  i = i + 1,  t = t,  u = t * i,  v = t * i + 3
    b0(m, i, t, u, v) = b1(m, i+1, t, t * i, t * i + 3)
    b0(m, i, t) = b1(m, i+1, t * i + 3)

A branch becomes a conditional expression, and a store becomes a memory update:
    b0(m, i) = i < n ? b1(m, 0) : b2(m, n)
    b0(m, i) = b1(upd(m, &a+i, sel(m,&g) * i + 3), i+1)
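A minimal sketch of symbolic evaluation for a straight-line block follows. It is not Necula's implementation: the tuple representation of expressions, the helper names and the order of the three assignments in `block` are assumptions, chosen so that the result matches the values listed on the slide (i + 1, t * i and t * i + 3).

```python
def subst(expr, state):
    """Replace each variable in expr by its current symbolic value."""
    if isinstance(expr, str):
        return state.get(expr, expr)
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return (op, subst(lhs, state), subst(rhs, state))
    return expr  # integer constant


def symbolic_eval(block, variables):
    """Return the value of every variable at block exit, in terms of entry values."""
    state = {v: v for v in variables}      # at entry each variable stands for itself
    for var, rhs in block:
        state[var] = subst(rhs, state)
    return state


def show(expr):
    """Render an expression tree as a fully parenthesized string."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return f"({show(lhs)} {op} {show(rhs)})"
    return str(expr)


# Assumed block body: u = t * i; v = u + 3; i = i + 1
block = [("u", ("*", "t", "i")), ("v", ("+", "u", 3)), ("i", ("+", "i", 1))]
final = symbolic_eval(block, ["m", "i", "t", "u", "v"])
print({v: show(e) for v, e in final.items()})
```

The final state maps v to an expression over the entry values of t and i only, which is what lets the block be written as b0(m, i, t, u, v) = b1(...) with every argument expressed at b0.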

Generating and Solving Constraints
Constraints are generated to represent the condition that observables at observation points correspond.
Solving the constraints establishes this correspondence.

Generating and Solving Constraints: pair (b0(m,n), b4(m,[BP-4]))
Symbolic evaluation gives:
    b0(m,n) = 0 < n ? b2(m,0) : b3(m,0,n)
    b4(m,[BP-4]) = 0 < [BP-4] ? b5(m,0) : b6(m,0,[BP-4])
Constraints generated:
    n = [BP-4]
    constraints(b2(m,0), b5(m,0))
    constraints(b3(m,0,n), b6(m,0,[BP-4]))

Generating and Solving Constraints: pair (b2(m,i), b5(m,r0))
Symbolic evaluation gives:
    b2(m,i) = ret(m,i)
    b5(m,r0) = ret(m,r0)
Constraints generated:
    m = m
    i = r0

Generating and Solving Constraints: pair (b3(m,i,n), b6(m,r0,[BP-4]))
Symbolic evaluation gives:
    b3(m,i,n) = i+1 < n ? b2(upd(m, &a+i, sel(m,&g) * i + 3), i+1)
                        : b3(upd(m, &a+i, sel(m,&g) * i + 3), i+1, n)
    b6(m,r0,[BP-4]) = r0+1 < [BP-4] ? b5(upd(m, &a+r0, 3), r0+1)
                                    : b7(upd(m, &a+r0, 3), r0+1, [BP-4])
Constraints generated:
    i = r0
    n = [BP-4]
    constraints(b2(upd(m, &a+i, sel(m,&g) * i + 3), i+1), b5(upd(m, &a+r0, 3), r0+1))
    constraints(b3(upd(m, &a+i, sel(m,&g) * i + 3), i+1, n), b7(upd(m, &a+r0, 3), r0+1, [BP-4]))

Generating and Solving Constraints: pair (b3(m,i,n), b7(m,r0,r2,[BP-4],[BP-8]))
Symbolic evaluation gives:
    b3(m,i,n) = i+1 < n ? b2(upd(m, &a+i, sel(m,&g) * i + 3), i+1)
                        : b3(upd(m, &a+i, sel(m,&g) * i + 3), i+1, n)
    b7(m,r0,r2,[BP-4],[BP-8]) = r0+1 < [BP-4] ? b5(upd(m, &a+r0, r2), r0+1)
                                              : b7(upd(m, &a+r0, r2), r0+1, r2+[BP-8], [BP-4], [BP-8])
Constraints generated:
    i = r0
    n = [BP-4]
    constraints(b2(upd(m, &a+i, sel(m,&g) * i + 3), i+1), b5(upd(m, &a+r0, r2), r0+1))
    constraints(b3(upd(m, &a+i, sel(m,&g) * i + 3), i+1, n), b7(upd(m, &a+r0, r2), r0+1, r2+[BP-8], [BP-4], [BP-8]))

Satisfying the Constraints
The constraints are solved by propagating the requirements of each pair into the constraints(...) terms that mention it:
- The pair (b2(m,i), b5(m,r0)) requires i = r0 and m = m.
- In the pair (b0, b4), constraints(b2(m,0), b5(m,0)) instantiates this requirement with i = 0 and r0 = 0. It is satisfied when m = m, so (b0, b4) requires n = [BP-4] and m = m.
- In the pair (b3, b6), constraints(b2(upd(m, &a+i, sel(m,&g) * i + 3), i+1), b5(upd(m, &a+r0, 3), r0+1)) requires the two updated memories to be equal and i+1 = r0+1. Given i = r0, this adds m = m and sel(m,&g) * i = 0 to the requirements of (b3, b6).

Satisfying the Constraints: solution
    (b0(m,n), b4(m,[BP-4])):                  n = [BP-4], m = m
    (b2(m,i), b5(m,r0)):                      i = r0, m = m
    (b3(m,i,n), b6(m,r0,[BP-4])):             i = r0, n = [BP-4], m = m, sel(m,&g) * i = 0
    (b3(m,i,n), b7(m,r0,r2,[BP-4],[BP-8])):   i = r0, n = [BP-4], m = m, sel(m,&g) * i + 3 = r2

Demonstrating correctness
The solved constraints form a simulation relation:
    (b0, b4):  n = [BP-4], m = m
    (b2, b5):  i = r0, m = m
    (b3, b6):  i = r0, n = [BP-4], m = m, sel(m,&g) * i = 0
    (b3, b7):  i = r0, n = [BP-4], m = m, sel(m,&g) * i + 3 = r2
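The relation can be sanity-checked by running source and target side by side and asserting it at each paired point. The sketch below is a dynamic check on concrete inputs, not the static proof that Necula's method produces. The target code is a hypothetical reading of the optimized function (g hoisted into [BP-8], the first iteration storing the constant 3, later iterations keeping g * i + 3 in r2), and the relation is taken in the form derived on the "Satisfying the Constraints" slide (i = r0, n = [BP-4], m = m).

```python
def check_simulation(n, g):
    """Run both programs in lockstep; raise AssertionError if the relation fails."""
    src_mem, tgt_mem = {"g": g}, {"g": g}   # global memory of source and target
    bp4 = n                                  # [BP-4] holds the parameter n
    # (b0, b4): n = [BP-4], m = m
    assert n == bp4 and src_mem == tgt_mem

    i = r0 = 0
    if 0 < n:
        bp8 = tgt_mem["g"]                   # [BP-8]: hoisted load of g
        r2 = 3                               # first iteration stores the constant 3
        first = True
        while True:
            assert i == r0 and n == bp4 and src_mem == tgt_mem
            if first:
                # (b3, b6): sel(m,&g) * i = 0
                assert src_mem["g"] * i == 0
            else:
                # (b3, b7): sel(m,&g) * i + 3 = r2
                assert src_mem["g"] * i + 3 == r2
            # One iteration of the source loop body.
            src_mem[("a", i)] = src_mem["g"] * i + 3
            i += 1
            # One iteration of the target loop body.
            tgt_mem[("a", r0)] = r2
            r0 += 1
            r2 += bp8
            first = False
            if not i < n:
                break
    # (b2, b5): i = r0, m = m at the return point
    assert i == r0 and src_mem == tgt_mem
    return i


for n in range(6):
    for g in range(-2, 3):
        check_simulation(n, g)
print("relation held on all sampled runs")
```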


Adding procedure calls

    Source               Target                  Relation at this point
    main entry           main entry
    call foo(i+j)        call foo(r0)            i + j = r0, m = m
    foo(n) entry         foo([BP-4]) entry       n = [BP-4], m = m
    return i             return r0               i = r0, m = m
    k = returnval        r1 = returnval          k = r1, m = m
    return k             return r1

At the call: {n = [BP-4], m = m} with n replaced by i+j and [BP-4] replaced by r0 gives i + j = r0, m = m.
At the return: {i = r0, m = m} with i replaced by k and r0 replaced by r1 gives k = r1, m = m.

Heffter-Gawkoski Approach


Heffter-Gawkoski Approach
Example source program:
    x = 7 + x; y = 8; x + y
Semantics:
    evalprog :: (Sprogram, State) -> Integer

Heffter-Gawkoski Approach
Example target program:
    push 7
    load a0
    add
    store a0
    push 8
    store a1
    load a0
    load a1
    add
Semantics:
    type Store = (Stack, Memory)
    exec :: (Tprogram, Store) -> Store

Heffter-Gawkoski Approach
- Memory map (dmap)
- Initialization of memory (initmem)

Heffter-Gawkoski Approach
    ctrans sp tp =def ∃ dmap. ∀ st : evalprog sp st = top (fst (exec tp ([ ], initmem st dmap)))
The value of the source program is the same as the value left on top of the stack by the target program.

Heffter-Gawkoski Approach
For:
    sp = [x = 7 + x; y = 8] x + y
    tp = [push 7; load a0; add; store a0; push 8; store a1; load a0; load a1; add]
    dmap = [x -> a0; y -> a1]
the compiler generates a proof script of:
    ∀ st : evalprog sp st = top (fst (exec tp ([ ], initmem st dmap)))
This proves that if x = a0 = α initially, then both programs return α + 15.
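The property being proved can be made concrete with a small executable model. The sketch below is an illustration, not the formalization used in the paper: the Python encodings of `evalprog`, `exec_` (named with an underscore to avoid the Python built-in) and `initmem`, and the choice to give y an initial value of 0, are assumptions. It checks the equation on a finite set of states, whereas the generated proof script establishes it for every state.

```python
def evalprog(sp, st):
    """sp = (assignments, result); expressions are ints, names, or ('+', e1, e2)."""
    assigns, result = sp
    st = dict(st)

    def ev(e):
        if isinstance(e, tuple):
            return ev(e[1]) + ev(e[2])
        return st[e] if isinstance(e, str) else e

    for var, e in assigns:
        st[var] = ev(e)
    return ev(result)


def exec_(tp, store):
    """Run a target program on a (stack, memory) store and return the new store."""
    stack, mem = list(store[0]), dict(store[1])
    for op, *arg in tp:
        if op == "push":
            stack.append(arg[0])
        elif op == "load":
            stack.append(mem[arg[0]])
        elif op == "store":
            mem[arg[0]] = stack.pop()
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack, mem


def initmem(st, dmap):
    """Initial target memory: each source variable's value at its mapped address."""
    return {dmap[v]: val for v, val in st.items()}


sp = ([("x", ("+", 7, "x")), ("y", 8)], ("+", "x", "y"))
tp = [("push", 7), ("load", "a0"), ("add",), ("store", "a0"),
      ("push", 8), ("store", "a1"), ("load", "a0"), ("load", "a1"), ("add",)]
dmap = {"x": "a0", "y": "a1"}


def ctrans_holds_on(sp, tp, dmap, states):
    """Check evalprog sp st = top (fst (exec tp ([], initmem st dmap))) on given states."""
    return all(evalprog(sp, st) == exec_(tp, ([], initmem(st, dmap)))[0][-1]
               for st in states)


states = [{"x": alpha, "y": 0} for alpha in range(-5, 6)]
print(ctrans_holds_on(sp, tp, dmap, states))
```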

Heffter-Gawkoski Approach
The proof is long and complex.
Can checking of semantic equivalence be replaced by checking of syntactic equivalence?

Heffter-Gawkoski Approach
    S --Compiler--> T1
    S --Compiler (O1)--> T2
    S --Compiler (O2)--> T3
    S --model of compiler (tr)--> T1, T2, T3
tr models the compiler as closely and conveniently as possible.
tr should be easily checkable.

Heffter-Gawkoski Approach
[Figure: regions labelled Compiler, Model, Correct translation and Incorrect translation.]

Heffter-Gawkoski Approach
If we additionally prove the result:
    ∀ dmap, sp, tp : (tr sp dmap tp) ⇒ (ctrans sp tp)
[Figure: regions labelled Compiler, Model, Correct translation and Incorrect translation.]

Heffter-Gawkoski Approach
Suppose T is the target produced by the compiler for a source S. Are S and T related by tr?
1. Yes: T is a correct compilation of S.
2. No: either T is incorrect, or this is a false negative due to an inadequate model.
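The two outcomes can be illustrated with a deliberately simple stand-in for tr. Everything in the sketch below is hypothetical: the slides do not define tr, so `model` is an assumed postorder translation of expressions and `tr` just compares the target against the model's output. It shows a target accepted by the check, a correct but constant-folded target rejected as a false negative, and an incorrect target rejected.

```python
def model(e, dmap):
    """Hypothetical compiler model: postorder translation of an expression."""
    if isinstance(e, tuple):                       # ('+', e1, e2)
        return model(e[1], dmap) + model(e[2], dmap) + [("add",)]
    if isinstance(e, str):                         # variable
        return [("load", dmap[e])]
    return [("push", e)]                           # integer constant


def tr(sp, dmap, tp):
    """Syntactic check: is tp exactly what the model produces for sp?"""
    return tp == model(sp, dmap)


S = ("+", ("+", 7, 8), "x")                        # (7 + 8) + x
dmap = {"x": "a0"}

T1 = [("push", 7), ("push", 8), ("add",), ("load", "a0"), ("add",)]
T2 = [("push", 15), ("load", "a0"), ("add",)]      # correct, but constant-folded
T3 = [("push", 7), ("load", "a0"), ("add",)]       # incorrect: drops the 8

for name, target in (("T1", T1), ("T2", T2), ("T3", T3)):
    print(name, "accepted" if tr(S, dmap, target) else "rejected")
```

A model that also described constant folding would accept T2. This is the sense in which the model should follow the compiler as closely as possible.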

Looking Ahead

Towards More Realistic Compiled Code Verifiers
Can Necula's method be extended to more realistic compilers?
Modelling of richer source language features:
- Richer data types.
- Dynamic memory allocation.
- Aliasing.
- Expressions with multiple side effects.
- Function calls and returns.
Realistic modelling of the target machine:
- Accurate modelling of the target data space (aliasing in stack, heap).
- Addressing modes.
- Modelling target data types (bit, byte, word, long).

Looking Ahead
Relating the source to the target:
- Elaborate data mapping.
- Operator mapping.
The constraint solver:
- Designing an algorithm for constraint solving.

Our goal
To build a realistic and usable compiled code verifier for the MISRA C subset of GCC C:
1. No mallocs.
2. No unions.
3. Multiple side effects in an expression are not allowed.
4. No pointer arithmetic.
Use Necula's method as the basis, with minimal compiler instrumentation.

Conclusions
Building a realistic compiled code verification system for a moderately simple language seems possible.