Deductive Program Verication with Why3 A Tutorial

Transcription

1 Lecture in the Logic and Semantics group at the Institute of Cybernetics (IoC), Tallinn, Estonia Deductive Program Verication with Why3 A Tutorial Jean-Christophe Filliâtre January 2013

2 Information This lecture has been prepared using Why3 version 0.80 (October 2012). The material related to the lecture (slides, exercises, source code from this document, etc.) is available at In addition, one may consult Why3's documentation and examples from Why3's web site Author: Jean-Christophe Filliâtre LRI - bâtiment 650 Université Paris-Sud Orsay Cedex France [email protected] Web : 2

3 Contents 1 Introduction 5 2 Logic First Contact with Why Talking to Theorem Provers Program Verication First Examples Turing's Proof of a Program McCarthy's 91 Function Arrays Algebraic Data Types Other Data Structures Using Why3 as an Intermediate Language Machine Arithmetic Memory Models

4 4

5 Chapter 1 Introduction Any computer scientist knows John McCarthy's 91 function [15]. function dened as follows: It is the recursive { n 10 if n > 100, f(n) = f(f(n + 11)) otherwise. One may wonder whether this function always return 91, as suggested by its name, or, otherwise, under which condition it does. One may also want to prove its termination, which is not obvious, or to prove that it is equivalent to the following iterative code: e 1 while e > 0 do if n > 100 then n n 10 e e 1 else n n + 11 e e + 1 return n These are questions related to program verication. The programs above are given in pseudo-code, but we can pose similar questions for programs written in mainstream programming languages. For instance, one may wish to prove that the following Java code indeed sorts an array of Boolean values int i = 0, j = a.length - 1; while (i < j) if (!a[i]) i++; else if (a[j]) j; else swap(a, i++, j); or that the following C program, although purposely obfuscated, indeed computes the number of solutions to the N-queens problem: t(a,b,c){int d=0,e=a&~b&~c,f=1;if(a)for(f=0;d=(e-=d)&-e;f+=t(a-d,(b+d)*2,( c+d)/2));return f;}main(q){scanf("%d",&q);printf("%d\n",t(~(~0q),0,0));} 5

6 This lecture is an introduction to Why3 [4], a set of tools to perform deductive program verication [10]. Deductive verication means that we express the correctness of a program as a mathematical statement and then we prove it, which can be summarized as follows: program + specification verification conditions proof This idea is as old as computer science [18, 13], but it is making a giant leap today thanks to recent progress in automated theorem provers, especially those of the SMT family 1. SMT solvers are able to discharge a huge amount of the verication tasks, if not all. The Why3 platform provides a logic language (called Why) and a programming language (called WhyML). The logic is an extension of rst-order logic with polymorphism, algebraic data types, and inductive predicates. The programming language is a rst-order ML-like language with imperative features, pattern matching, and exceptions. The two languages are intimately tied: any logic symbol can be used in programs, and the logic is used to specify programs (via traditional pre- and postconditions, loop invariants, etc.). A standard weakest precondition calculus [8] is used to extract verication conditions from programs. The resulting formulas are then going through several transformations to be sent to a wide set of external theorem provers. The data ow may be depicted as follows: file.mlw file.why VCgen WhyML Why transform/translate print/run Coq Alt-Ergo CVC3 Z3 etc. There are basically three ways of using Why3. for its logic only It is perfectly ne using Why3 without writing a single program. It means you use the logic only (declaring, dening, axiomatizing, stating goals) and you see Why3 as a mere front- for dozens of theorem provers. Yet you save the burden of learning the input languages of all these tools, and the various ways of calling them. You can also use Why3 to manipulate formulas by implementing your own transformations (there is an OCaml API). 1 A computer scientist recently coined the concept of SMT revolution. 6

7 to verify algorithms and data structures Though WhyML has limited imperative features (aliasing is not allowed), a lot of programs and data structures can be implemented in WhyML. Even when some data structure cannot be implemented, it is usually easy to model it and the remaining of the verication can be carried out. WhyML programs can be translated to executable OCaml code. as an intermediate language To verify programs written in a mainstream programming language, it is convenient to use Why3 as an intermediate language. A suitable memory model is designed in Why3 logic and then program constructs are compiled into WhyML constructs. The task of extracting verication conditions (and to pass them to theorem provers) is left to Why3. So far, Why3 has been successfully used that way to verify Java programs [16], C programs [11, 17], Ada programs [12], probabilistic programs [2], and cryptographic programs [1]. The following three chapters describe these three ways of using Why3. Many technical aspects are not covered in this tutorial, such as the transformation process to encode Why3's logic into SMT logic, the type checking rules to exclude aliasing, or the weakest precondition calculus used to extract verication conditions. We invite to reader to get more information from Why3's web site ( and from the author's habilitation thesis [9]. 7

8 8

9 Chapter 2 Logic In this chapter, we introduce Why3 logic. Program verication will be covered in the next two chapters. 2.1 First Contact with Why3 In a le demo_logic.why, let us dene the algebraic data type of polymorphic lists. To do so, we introduce a theory List containing the denition of type list α 1. theory List type list α = Nil Cons α (list α) This le can be processed in a batch way using the following command: why3 demo_logic.why The contents of le demo_logic.why is checked, then printed on standard output. Let us add the denition of a recursive predicate mem checking for the presence of an element x in a list l. predicate mem (x: α) (l: list α) = match l with Nil false Cons y r x = y mem x r The system automatically checks for the terminations of mem, looking for a subset of its arguments that ensures a lexicographic structural decreasing. In this case, it is limited to argument l. If we had written mem x l instead of mem x r, then the denition of mem would have been rejected: why3 demo_logic.why File "demo_logic.why", line 10, characters 2-6: Cannot prove the termination of mem Let us add a third declarations to theory List, namely a goal to show that 2 belongs to the list [1;2;3]. We state it as follows: 1 The type variable α is written 'a in the concrete syntax, as in OCaml. 9

10 goal G1: mem 2 (Cons 1 (Cons 2 (Cons 3 Nil))) It is then possible to attempt discharging this goal using one of the theorem provers supported by Why3. Let us assume that the SMT solver Alt-Ergo is installed and recognized by Why3 (this can be checked with why3 list-provers). Then we pass goal G1 to Alt-Ergo using option -P. why3 -P alt-ergo demo_logic.why demo_logic.why List G1 : Valid (0.03s) As we see, the goal is declared valid, without surprise. We can use the graphical user interface why3ide instead. why3ide demo_logic.why Then we can launch theorem provers interactively, including proof assistants such as Coq. Let us dene now the length of a list. We do that in a second theory Length, whose rst two declarations import the previous theory List, as well as the theory of integer arithmetic from Why3 standard library, namely int.int. theory Length use import List use import int.int In theory List, we did not have to import arithmetic, since only integer literals were used; but here we need addition and order relation as well. We dene a recursive function length over lists, similarly to what we did for predicate mem. function length (l: list α) : int = match l with Nil 0 Cons _ r length r + 1 Once again, the system automatically checks for termination. Then we state a lemma saying that a list length is always nonnegative. lemma length_nonnegative: l:list α. length(l) 0 The automated theorem prover Alt-Ergo is not able to discharge it. More specically, it quickly aborts with message Unknown, which means it could not check the validity of the goal. why3 -P alt-ergo demo_logic.why demo_logic.why List G1 : Valid (0.01s) demo_logic.why Length length_nonnegative : Unknown: Unknown (0.01s) Indeed, proving lemma length_nonnegative requires induction, which is out of reach of SMT solvers. One may use the Coq proof assistant instead to do that proof, by calling it from why3ide for instance. However, Alt-Ergo is able to discharge the following goal, using lemma length_nonnegative as an hypothesis: goal G3: x: int, l: list int. length (Cons x l) > 0 10

11 This is the distinction between declarations goal and lemma: the latter introduces a goal that may be used in the following. It is easy to check that lemma length_nonnegative is indeed an hypothesis of goal G3 from the graphical user interface (in the top-right window). Let us now dene the notion of sorted list. We start with lists of integers. We introduce a new theory SortedList for that purpose, containing the declarations of an inductive predicate sorted 2. theory SortedList use import List use import int.int inductive sorted (list int) = sorted_nil: sorted Nil sorted_one: x: int. sorted (Cons x Nil) sorted_two: x y: int, l: list int. x y sorted (Cons y l) sorted (Cons x (Cons y l)) Such a declaration denes sorted as the smallest predicate satisfying the three axioms sorted_nil, sorted_one, and sorted_two. We can state a goal saying that the list [1;2;3] is sorted: goal sorted123: sorted (Cons 1 (Cons 2 (Cons 3 Nil))) and it is easily discharged by automated theorem provers. However, dening sorted over lists of integers only, and using the order relation only, is not satisfactory. As much as possible, we could prefer dening sorted in a generic way, once and for all. The logic of Why3 being a rst-order logic, it is not possible to pass the order relation as an argument to predicate sorted. Though, there is an alternative. We modify theory SortedList to use an abstract data type t instead of integers and an uninterpreted binary predicate over t. theory SortedList use import List type t predicate ( ) t t To make it right, we add three axioms to express that is an order relation. 2 We could have dened predicate sorted recursively, as follows: predicate sorted (l: list int) = match l with Nil Cons _ Nil true Cons x (Cons y _ as r) x y sorted r But the purpose here is to introduce inductive predicates. 11

12 axiom le_refl: x: t. x x axiom le_asym: x y: t. x y y x x = y axiom le_trans: x y z: t. x y y z x z The denition of sorted is mostly unchanged: we simply substitute t for int and relation over type t for relation over type int. inductive sorted (list t) = sorted_nil: sorted Nil sorted_one: x: t. sorted (Cons x Nil) sorted_two: x y: t, l: list t. x y sorted (Cons y l) sorted (Cons x (Cons y l)) To handle the particular case of lists of integers, we rst set up a new theory in which we import lists and integers: theory SortedIntList use import int.int use import List Then we instantiate the generic theory SortedList with type int and with order relation over integers. To do so, we use the clone command, rather than use, as follows: clone import SortedList with type t = int, predicate ( ) = ( ), lemma le_refl, lemma le_asym, lemma le_trans This command makes a copy of theory SortedList, while substituting type int to type t and the order relation over integers to the uninterpreted predicate (in this case, the two predicates have the same name but this is not a requirement). This command introduces the declaration of a new inductive predicate sorted, with an argument of type list int. We can use it to state the following goal: goal sorted123: sorted (Cons 1 (Cons 2 (Cons 3 Nil))) Additionally, we notice that the clone command above also included lemma le_refl, lemma le_asym, and lemma le_trans. It means that these three axioms must be proved 3. They are turned into lemmas. We can easily check this out by looking at the set of goals contained in le demo_logic.why: why3 -P alt-ergo demo_logic.why... demo_logic.why SortedIntList le_refl : Valid (0.00s) demo_logic.why SortedIntList le_asym : Valid (0.00s) demo_logic.why SortedIntList le_trans : Valid (0.00s) demo_logic.why SortedIntList sorted123 : Valid (0.00s) 3 The system cannot automatically infer axioms that could become lemmas; this is not decidable. Thus it is up to the user to indicate the axioms to be turned into lemmas. 12

13 As illustrated by this example, the concept of cloning allows the design of generic theories to be instantiated later in various ways. This is analogous to generic classes in Java or functors in ML. But this is also slightly more exible, as parameters do not have to be set once and for all. A theory is thus parameterized in various ways simultaneously. In the example above, we can introduce another generic theory, namely that of order relation. It contains type t, order relation, and the three axioms. theory Order type t predicate ( ) t t axiom le_refl: x: t. x x axiom le_asym: x y: t. x y y x x = y axiom le_trans: x y z: t. x y y z x z Then we simply clone it inside theory SortedList, to get the exact same theory as before: theory SortedList use import List clone export Order... The benet here is the ability to reuse theory Order in other contexts. Figure 2.1 summarizes this new version of theory SortedList and its use on lists of integers. The Why3 standard library is built this way. Beside the obvious benet of factorization, splitting the declarations into small theories also allows a ne grain control of the context of each goal. With SMT solvers, in particular, limiting the size of the logical context may improve performances signicantly. Summary. The logic of Why3 is an extension of rst-order logic with polymorphism, mutually recursive algebraic data types, mutually recursive function/predicate symbols, mutually inductive predicates, and constructs let-in, match-with, and if-then-else. Logical declarations are of four dierent kinds: type declaration; function or predicate declaration; inductive predicate declaration; axiom, lemma, or goal declaration. Logical declarations are organized into theories. A theory T 1 may be used (use) in another theory T 2. In that case, symbols from T 1 are shared, axioms from T 1 remain axioms, lemmas from T 1 become axioms, and goals from T 1 are discarded. or cloned (clone) in another theory T 2. In that case, declarations from T 1 are copied or substituted, axioms from T 1 remain axioms or become lemmas/goals, lemmas from T 1 become axioms, and goals from T 1 are discarded. 13

14 theory List type list α = Nil Cons α (list α) theory Order type t predicate ( ) t t axiom le_refl: x: t. x x axiom le_asym: x y: t. x y y x x = y axiom le_trans: x y z: t. x y y z x z theory SortedList use import List clone export Order inductive sorted (list t) = sorted_nil: sorted Nil sorted_one: x: t. sorted (Cons x Nil) sorted_two: x y: t, l: list t. x y sorted (Cons y l) sorted (Cons x (Cons y l)) theory SortedIntList use import int.int use import List clone import SortedList with type t = int, predicate ( ) = ( ), lemma le_refl, lemma le_asym, lemma le_trans goal sorted123: sorted (Cons 1 (Cons 2 (Cons 3 Nil))) Figure 2.1: A generic theory of sorted lists. 14

15 2.2 Talking to Theorem Provers One of the rst benets of Why3 is to provide a technology to talk to provers. There are various theorem provers available and they have distinct logical languages, predened theories, or type systems. Why3 provides a common language and a common tool to use them. This technology is organized around the notion of task. A task is a logical context, that is a list of declarations, followed but exactly one goal, that is a formula. Tasks are extracted from the various theories. Given a task and a target theorem prover, Why3 performs a series of transformations on the task, so that it ts in the logic of the prover. For example, if the target prover is Z3, Why3 will apply, among other things, a transformation to get rid of inductive predicates, since Z3 knows nothing about it. Once all transformations are applied, Why3 simply prints the resulting task into the native syntax of the prover, using a pretty-printer. We can display the task ow as follows: theory Alt-Ergo theory T 1 T 2 P goal goal goal Z3 theory Vampire This journey is driven by a le (called a driver in Why3's terminology). This le lists the transformations to apply, the output format (that is the input syntax of the prover as well as predened symbols and axioms), and regular expressions to diagnose messages from the prover. Such a le can be set up by the user, for instance to add support for a new prover or to conduct experiments with a prover. Drivers are provided with Why3 for the following proves: Alt-Ergo, CVC3, Yices, E-prover, Gappa, MathSAT, Simplify, Spass, Vampire, VeriT, Z3, Zenon, iprover, Mathematica, Coq, PVS. Additionally, Why3 provides an OCaml API. One can build terms, formulas, declarations, tasks, theories (that are all guaranteed to be well-formed), and call provers. Using plug-ins, the system can be exted with new input languages, new transformations, and new pretty-printers. In particular, everything needed to add support for a new prover can be added to the system. More details can be found in the paper Why3: Shepherd your herd of provers [5] as well as Why3's manual [4]. 15

16 16

17 Chapter 3 Program Verication We now turn to program verication using Why First Examples Turing's Proof of a Program As a rst example, let us consider what is most likely the very rst proof of a program. It is a 1949 paper by Alan Turing called Checking a Large Routine [18]. The program computes n! using only additions. It is given as a control ow graph: STOP r = 1 u = 1 v = u TEST r n s = 1 u = u + v s = s + 1 r = r + 1 TEST s r The notation u = u+v stands for the assignment of u+v to variable u, that is u u+v using a modern notation. Variable n contains the input parameter and is not modied. When the program exits, variable u contains n!, the factorial of n. There are two nested loops. In the outer loop, variable u contains r! and its contents is copied into variable v. The inner loop computes (r + 1)! in u, using repetitive additions of v. Variables r and s are loop counters, so we can easily rewrite the program using two for loops. With a slight modication of the control ow 1, the program above is thus equivalent to: u 1 for r = 0 to n 1 do v u for s = 1 to r do u u + v Let us prove the correctness of this program using Why3. In a le demo_turing.mlw we write the program above in the syntax of WhyML, the programming language of Why3. 1 Finding which is left to the reader as an exercise. 17

18 Program declarations are grouped in modules (the programming equivalent to logical theories). module CheckingALargeRoutine use import int.fact use import ref.ref The rst command imports the logical theory int.fact, that is theory Fact from Why3 standard library. This theory provides a function symbol fact denoting the factorial. The second command imports a module from the standard library of WhyML, namely the module Ref providing references. As in ML, references are mutable variables: a reference is created with function ref, we access the contents of reference x with!x, and we assign it with x := e. We can now write a function routine taking an integer argument n as argument and computing its factorial using the algorithm above: let routine (n: int) : int requires { n 0 } ensures { result = fact n } = let u = ref 1 in for r = 0 to n-1 do let v =!u in for s = 1 to r do u :=!u + v done done;!u The code itself is similar to OCaml code. The specication is introduced with keywords requires and ensures. requires introduces the precondition n 0 and ensures the postcondition result = fact n. Within the postcondition, variable result stands for the value returned by the function. As we did in the previous chapter with a le containing logical declarations, we can pass this le to Why3 for type checking and proof processing. why3 -P alt-ergo demo_turing.mlw WP CheckingALargeRoutine WP_parameter routine : Unknown: Unknown (0.08s) We see that Alt-Ergo was not able to discharge the verication condition. If we switch to the graphical user interface (why3ide) which is quickly mandatory when performing program verication we can have a look at the goal expressing the correctness of function routine. Omitting the context, it reduces to the following formula: n, n 0 (0 > n 1 1 = fact n) (0 n 1 u, u = fact n) The rst part of the formula corresponds to the case where we do not even enter the outer for loop, since n = 0. Then we have to show 1 = fact n, that is provable. The other part of the formula corresponds to the case where we enter the for loop. Then we have to prove that the nal value of variable u, here written u, is equal to the factorial of n. But we do not have any information over u, which makes the formula unprovable. 18

19 module CheckingALargeRoutine use import int.fact use import ref.ref let routine (n: int) : int requires { n 0 } ensures { result = fact n } = let u = ref 1 in for r = 0 to n-1 do invariant {!u = fact r } let v =!u in for s = 1 to r do invariant {!u = s * fact r } u :=!u + v done done;!u Figure 3.1: Computing the factorial using additions (Turing, 1949). Indeed, we have to rst set a loop invariant to get information regarding the nal value of u. In our case, this invariant states that u contains the factorial of r. We write it as follows: for r = 0 to n-1 do invariant {!u = fact r } Using this invariant, it is now possible to establish the postcondition when exiting function routine. To see it, we can use the Split command in the graphical user interface, which results in four sub-goals. The validity of the postcondition is the fourth one and is easily discharged by a solver. On the contrary, we cannot discharge the third goal, which states the preservation of the loop invariant we just added. Without any surprise, we have to equip the inner loop with an invariant as well, namely the following: for s = 1 to r do invariant {!u = s * fact r } It is now easy for a solver to show the correctness of the program. There is no termination to prove here, as it is automatically guaranteed in the case of for loops. The whole le is given gure McCarthy's 91 Function Let us consider now another example, namely the aforementioned McCarthy's 91 function [15]. It is the following recursive function: { n 10 if n > 100, f(n) = f(f(n + 11)) otherwise. 19

20 In WhyML syntax, we dene such a function using the let rec construct. let rec f (n:int) : int = if n 100 then f (f (n + 11)) else n - 10 We int to prove the termination of this function. To do so, we must equip it with a variant, that is a term that decreases at each recursive call for a well-founded order relation. The simplest solution is to use an integer term and the following relation: x y def = x < y 0 y. In the case of function 91, the variant is quite easy to gure out. It is n. In WhyML syntax, we give it as follows: let rec f (n:int) : int variant { n } = if n 100 then f (f (n + 11)) else n - 10 As a result, we get a verication condition to show that the variant indeed decreases in each of the two recursive calls. In the rst case, the sub-goal is easily discharged. Indeed we have 100 (n + 11) 100 n when n 100. On the contrary, the second sub-goal cannot be discharged, for we don't have enough information regarding the value returned by the rst recursive call. Thus we have to give function f a specication, namely a postcondition: let rec f (n: int) : int variant { 100-n } ensures { result = if n 100 then 91 else n-10 } = if n 100 then f (f (n + 11)) else n - 10 Then we get a verication condition that is easily discharged by automated provers. It expresses both termination and correctness. This is an example where proving termination requires proving a behavioral property as well. (Another example is Floyd's algorithm for cycle detection, also know as tortoise and hare algorithm; see [9].) Let us consider now a non-recursive version of function 91, namely the following iterative program: let iter (n0: int) = let n = ref n0 in let e = ref 1 in while!e > 0 do if!n > 100 then begin n :=!n - 10; e :=!e - 1 else begin n :=!n + 11; e :=!e

21 done;!n The idea behind this program is the following. It is not necessary to maintain a stack of all recursive calls to f. The total number of such calls is enough. It is maintained in variable e, and initialized to 1. If n > 100 we subtract 10 from n and we decrement e, to account for one completed call to f. Otherwise, we add 11 to argument n and we increment e to account for a completed call to f and to two new calls to be performed. Proving termination of this program is both simpler and more complex than proving the termination of the recursive function. It is simpler as we do not have to prove anything regarding the values computed by the program. But it is also more complex, as the variant is now a pair, namely (101 n + 10e, e), that decreases for the lexicographic order, each component being ordered by relation above. Such a lexicographic order relation is dened in Why3 standard library and we simply need to import it: use import int.lex2 We can now set this variant to the while loop, using the following syntax: while!e > 0 do variant { (100 -!n + 10 *!e,!e) with lex }... where keyword with is used to specify the order relation. The verication condition is discharged automatically, without any other annotation in the program. We could stop here. But proving the behavioral correctness of this iterative code appears to be interesting as well. Indeed, this is not obvious that this new code computes f(n). The loop invariant is f e (n) = f(n 0 ) where n 0 stands for the initial value of n. Therefore, when we exit the loop, with e = 0, we get n = f(n 0 ) and we return this value. To write down such an invariant, we rst introduce the logical function f. function f (x: int) : int = if x 101 then x-10 else 91 This is not the recursive function we proved above, but rather the specication of function 91. Then we dene f k (x). A logical function cannot be dened recursively over integers (only structural recursion is allowed) thus we cannot dene f k (x) recursion over k. As a consequence, we turn to an axiomatization: function iter int int : int axiom iter_0: x: int. iter 0 x = x axiom iter_s: k x: int. 0 < k iter k x = iter (k-1) (f x) More generally, Why3 standard library contains a theory int.iter that axiomatizes the iteration of a given function. Thus we can conveniently replace the three declarations above with the appropriate cloning of that theory: 21

22 use import int.int use import int.lex2 use import ref.ref function f (n: int) : int = if n 100 then 91 else n-10 clone import int.iter with type t = int, function f = f let iter (n0: int) : int ensures { result = f n0 } = let n = ref n0 in let e = ref 1 in while!e > 0 do invariant {!e 0 iter!e!n = f n0 } variant { (100 -!n + 10 *!e,!e) with lex } if!n > 100 then begin n :=!n - 10; e :=!e - 1 else begin n :=!n + 11; e :=!e + 1 done;!n Figure 3.2: Non-recursive version of McCarthy's 91 function. clone import int.iter with type t = int, function f = f The annotated code is given gure 3.2. condition is automatically discharged. 3.2 Arrays As for the recursive version, the verication Up to now, we had programs using only integers. Let us consider now programs using arrays. Why3 standard library provides arrays in module array.array. use import array.array This module declares a polymorphic type array α, an access operation written a[e], an assignment operation written a[e1] e2, and various operations such as create, length, app, sub, or copy. As an example, let us consider proving the correctness of the following algorithm that sorts an array a of n Boolean values according to False < True. 22

23 i, j 0, n 1 while i < j if a[i] then i i + 1 else if a[j] then j j 1 else swap a[i] and a[j]; i i + 1; j j 1 False?...? True i j We rst set up the program specication. We have to state that array a is sorted when the program exits. We dene a predicate sorted to do so: predicate () (x y: bool) = x = False y = True predicate sorted (a: array bool) = i1 i2: int. 0 i1 i2 < a.length a[i1] a[i2] We also have to specify that the nal contents of array a is ane permutation of its initial contents (otherwise, a program setting all elements to False would be accepted as a valid sort). There are multiple ways to dene the permutation property. We could count the number of False (or True) values and assert that it is unchanged, or, equivalently, assert that the two multi-sets of elements are the same. Another solution, well-suited for the program above, consists in dening the permutation property as the smallest equivalence relation containing transpositions (swaps to two elements). Thus we dene permut as an inductive predicate inductive permut (a1 a2: array bool) = with two clauses for reexive and transitive properties permut_refl: a: array bool. permut a a permut_trans: a1 a2 a3: array bool. permut a1 a2 permut a2 a3 permut a1 a3 and a third clause to indicate that permut contains transpositions: permut_swap: a1 a2: array bool. i j : int. length a1 = length a2 0 i < length a1 0 j < length a2 a1[i] = a2[j] a1[j] = a2[i] ( k: int. 0 k < length a1 k i k j a1[k]=a1[k]) permut a1 a2 Note that we have no clause to say that permut is symmetric (which is, in a general case, required to dene an equivalence relation). There are several reasons: rst, we can prove that relation permut is symmetric; second, this extra property is not needed for our proof. The generic denition of permut for arrays with elements of any type is provided by Why3 standard library, in module array.arraypermut. We can now write the specication. It is as follows: let two_way_sort (a: array bool) : unit ensures { sorted a } ensures { permut (old a) a } 23

24 =... Notation old a in the postcondition stands for the initial value of a, that is its values at the function entry. We still need a loop invariant. It decomposes into four parts. First, we have two arithmetic relations over i and j. invariant { 0!i!j < length a } Then we state that all values left to i are all False. invariant { k: int. 0 k <!i a[k] = False } Similarly, we state that all values right to j are all True. invariant { k: int.!j < k < length a a[k] = True } Finally, we have to state the permutation property, that is that array a is always a permutation of its initial contents. To do so, we introduce a label Init at the of the function body, with construct 'Init:, then we use that label in the loop invariant to refer to the value of a at this program point, with notation at a 'Init. invariant { permut (at a 'Init) a } This completes the loop invariant. Besides, we prove termination in a trivial way, using j i as variant. The code is summarized in gure 3.3. Exercise: Dijkstra's Dutch national ag problem. Prove the correctness of a program sorting an array with three dierent values, standing for the three colors of the Dutch national ag (type color = Blue White Red). The code is given on the lecture web site. 3.3 Algebraic Data Types A key idea of Hoare logic [13] is that all symbols from the logic can be used in programs. In particular, we already used type int from the logic in many programs. Similarly, we can use algebraic data types in programs. In Why3 standard library, we nd the following algebraic data types: type bool = True False type option α = None Some α type list α = Nil Cons α (list α) (in bool.bool) (in option.option) (in list.list) The user is free to dene other algebraic data types and to use them in programs. Let us illustrate this on an example. This is the famous same fringe problem: given two binary trees holding values at nodes, we want to check whether they have the same elements when traversed in inorder. The problem is subtle, as the two trees may have dierent shapes. For instance, the two trees 24

25 use import int.int use import bool.bool use import ref.refint use import array.array use import array.arraypermut predicate () (x y: bool) = x = False y = True predicate sorted (a: array bool) = i1 i2: int. 0 i1 i2 < a.length a[i1] a[i2] let two_way_sort (a: array bool) : unit ensures { sorted a } ensures { permut (old a) a } = 'Init: let i = ref 0 in let j = ref (length a - 1) in while!i <!j do invariant { 0!i!j < length a } invariant { k: int. 0 k <!i a[k] = False } invariant { k: int.!j < k < length a a[k] = True } invariant { permut (at a 'Init) a } variant {!j -!i } if not a[!i] then incr i else if a[!j] then decr j else begin let tmp = a[!i] in a[!i] a[!j]; a[!j] tmp; incr i; decr j done Figure 3.3: Sorting an array of Boolean values. 25

26 both have elements 1, 3, 4, 5, 8 when traversed in inorder. This problem has practical application: it provides an order relation over binary search trees (BST) that is shapeinsensitive. This is useful to build other BSTs whose elements are BSTs (to implement sets of sets). Let us write a WhyML program to solve the same fringe problem and let us prove its correctness. We introduce some uninterpreted type elt for the elements (the nature of the elements is irrelevant here) and a type tree for binary search trees. type elt type tree = Empty Node tree elt tree For the purpose of the specication, we introduce a function elements returning the result of an inorder traversal of a tree. function elements (t: tree) : list elt = match t with Empty Nil Node l x r elements l ++ Cons x (elements r) The function we seek to dene and prove is thus as follows: let same_fringe (t1 t2: tree) : bool ensures { result=true elements t1 = elements t2 } =... Of course, we could simply use function elements to solve the problem. However, this would be a quite bad solution: rst, list concatenations are costly (they could reach a quadratic total cost) second, it is unnecessary to build the two lists if they quickly dier (at least in the context of a strict language as WhyML). Thus we consider a better solution, consisting in considering the leftmost branch of each tree as a list, from bottom up. Each element on this list is paired with its right sub-tree. It is thus a list of pairs (x i, t i ), as follows:... x n x 2 t n x 1 t 2 t 1 26

27 On the example above, the two lists are the following: The comparison then proceeds as follows. We rst compare the rst element of each list. If they dier, we are done. Otherwise, we replace the rst element in each list by the leftmost branch of its right sub-tree, and then we recursively perform the comparison. If at some point a list is exhausted while the other is not, we answer negatively. Finally, we successfully terminate if both lists exhaust at the same time. To write this algorithm in WhyML, we rst introduce a data type enum for the lists of pairs (in the following, we call such a list an enumerator). type enum = Done Next elt tree enum Then we dene a recursive function enum_elements to get the inorder traversal of an enumerator, analogous to function elements, for specication purposes only. function enum_elements (e: enum) : list elt = match e with Done Nil Next x r e Cons x (elements r ++ enum_elements e) Now we proceed to the code itself. We start with a function enum that builds an enumerator of type enum from a tree t of type tree. We proceed recursively along the leftmost branch of t. To do so, we pass a second argument e of type enum to accumulate. let rec enum (t: tree) (e: enum) : enum ensures { enum_elements result = elements t ++ enum_elements e } = match t with Empty e Node l x r enum l (Next x r e) The specication states that the traversal of the result is the concatenation of the traversal of t and that of e. Then we write the core of the algorithm as a function eq_enum comparing two enumerators e1 and e2. We proceed recursively, until we exhaust at least one list. let rec eq_enum (e1 e2: enum) : bool ensures { result=true enum_elements e1 = enum_elements e2 } = match e1, e2 with Done, Done True 27

28 Next x1 r1 e1, Next x2 r2 e2 x1 = x2 && eq_enum (enum r1 e1) (enum r2 e2) _ False The specication states that we decide equality of the traversals of e1 and e2. Finally, the solution to the same fringe problem is a mere call to eq_enum on the enumerators for the two trees. let same_fringe (t1 t2: tree) : bool ensures { result=true elements t1 = elements t2 } = eq_enum (enum t1 Done) (enum t2 Done) The solution is summarized in gure 3.4. Exercise: inorder traversal. Using the same type for binary trees, we consider the following recursive function that lls an array a with the elements of a tree t, following an inorder traversal. fill t start def = if t = Empty then return start if t = Node l x r then res fill l start if res = length a then return res a[res] x fill r (res + 1) The WhyML code for this program, as well as the properties to prove, are available from the lecture web site. 3.4 Other Data Structures If we look into Why3 standard library module array.array we nd the following declarations for the type of arrays: type array α model { length: int; mutable elts: map int α } Such a declaration has two meanings. In programs, it is analogous to the declaration of an abstract type, that would be type array α In the logic, however, it is equivalent to the declaration of an immutable record type, that would be type array α = { length: int; elts: map int α } Said otherwise, an array is modeled in the logic as a record with two elds, the length of type int, and the contents of type map int α (a purely applicative map). 28

29 use import int.int use import list.list use import list.app type elt type tree = Empty Node tree elt tree function elements (t: tree) : list elt = match t with Empty Nil Node l x r elements l ++ Cons x (elements r) type enum = Done Next elt tree enum function enum_elements (e: enum) : list elt = match e with Done Nil Next x r e Cons x (elements r ++ enum_elements e) let rec enum (t: tree) (e: enum) : enum ensures { enum_elements result = elements t ++ enum_elements e } = match t with Empty e Node l x r enum l (Next x r e) let rec eq_enum (e1 e2: enum) : bool ensures { result=true enum_elements e1 = enum_elements e2 } = match e1, e2 with Done, Done True Next x1 r1 e1, Next x2 r2 e2 x1 = x2 && eq_enum (enum r1 e1) (enum r2 e2) _ False let same_fringe (t1 t2: tree) : bool ensures { result=true elements t1 = elements t2 } = eq_enum (enum t1 Done) (enum t2 Done) Figure 3.4: Solving the same fringe problem. 29

30 The type being abstract in programs, we cannot implement operations over type array α. But we can declare them. For instance, the access operation is declared as follows: val ([]) (a: array α) (i: int) : α reads {a} requires { 0 i < length a } ensures { result = a[i] } This function takes a and i as arguments, together with a precondition to ensure an array access within bounds. It returns a value of type α, and the postcondition states that the returned value is the value contained in the purely applicative map, that is a.elts. As we see, logical annotations, such as the pre- and postcondition here, have access to the record elds length and elts. It is the purpose of a model type to be available in the logic (and only in the logic). The annotation reads {a} indicates that the function accesses the contents of array a (but does not modify it). The assignment operation is declared in a similar way: val ([] ) (a: array α) (i: int) (v: α) : unit writes {a} requires { 0 i < length a } ensures { a.elts = M.set (old a.elts) i v } The main dierence is the annotation writes {a}, which indicates that the contents of a is modied by a call to this function. This is allowed since the eld elts was declared to be mutable. Modeling Hash Tables. As an example, let us do a similar model of hash tables 2 More precisely, let us consider polymorphic hash tables with the following interface (in OCaml syntax): type t 'a 'b val create: int -> t 'a 'b val clear: t 'a 'b -> unit val add: t 'a 'b -> 'a -> 'b -> unit exception Not_found val find: t 'a 'b -> 'a -> 'b Function create returns a fresh, empty table. Function clear empties a given table. Function add inserts a new entry. Finally, function find returns the value mapped to a key, if any, and raises exception Not_found otherwise. As for arrays above, we rst declare a type t α β for the hash tables, modeling the contents using a purely applicative map. type t α β model { mutable contents: map α (option β) } We choose to map at most one value for each key. Thus the elds contents maps each key k of type α to a value of type option β: None stands for the absence of value for key k, and Some v stands for an entry k v in the table. To ease forthcoming specications, we set up notation h[k] for the entry for key k in table h. function ([]) (h: t α β) (k: α) : option β = get h.contents k 2 Hash tables are provided in Why3 standard library since version

31 Function create takes an integer n as argument (the initial size of the bucket array) and returns a fresh, empty table. val create (n:int) : t α β requires { 0 < n } ensures { k: α. result[k] = None } Function clear empties a given table. Thus it has the same postcondition as function create. But it also has species that table h is modied (writes {h}). val clear (h: t α β) : unit writes { h } ensures { k: α. h[k] = None } Function add also has a writes {h} annotation: val add (h: t α β) (k: α) (v: β) : unit writes { h } ensures { h[k] = Some v k': α. k' k h[k'] = (old h)[k'] } Its postcondition states that key k is now mapped to value v in the new contents of h and that any other key k' is still mapped to the same value as before, that is (old h)[k']. Indeed, annotation writes h denotes a possible modication of all the contents of table h and thus it is here the role of the postcondition to state what is modied and what is not. Finally, we declare function find. exception Not_found val find (h: t α β) (k: α) : β reads { h } ensures { h[k] = Some result } raises { Not_found h[k] = None } The keyword raises introduces a postcondition corresponding to the case where an exception is raised (Not_found in that case), while ensures declares a postcondition corresponding to the case where a value is returned. Figure 3.5 summarizes our model of hash tables. It is worth mentioning that there is currently no possibility to ensure that a given WhyML implementation of hash tables conforms to this model. 31

32 use import option.option use import int.int use import map.map type t α β model { mutable contents: map α (option β) } function ([]) (h: t α β) (k: α) : option β = get h.contents k val create (n:int) : t α β requires { 0 < n } ensures { k: α. result[k] = None } val clear (h: t α β) : unit writes { h } ensures { k: α. h[k] = None } val add (h: t α β) (k: α) (v: β) : unit writes { h } ensures { h[k] = Some v k': α. k' k h[k'] = (old h)[k'] } exception Not_found val find (h: t α β) (k: α) : β reads { h } ensures { h[k] = Some result } raises { Not_found h[k] = None } Figure 3.5: Modeling hash tables. 32

33 Chapter 4 Using Why3 as an Intermediate Language Using Why3 to verify programs written in a mainstream programming language (such as C or Java) requires an adequate modeling of its semantics in the logic of Why3. In this chapter, we briey explain how to do so on two examples. 4.1 Machine Arithmetic So far we have use arithmetic from Why3 standard library, that is mathematical integers of unbounded precision. Let us assume we need to model machine arithmetic instead (say, signed 32-bit integers), either to show the absence of arithmetic overow in a program, or to reason about possible overows. The main diculty is that we do not want to loose the arithmetic capabilities of SMT solvers (which only know about mathematical arithmetic). One way to do is to introduce a new, uninterpreted type int32 for machine integers type int32 together with a function giving the corresponding value of type int: function toint int32 : int The idea is then to use only type int in program annotations, that is to apply function toint in a systematic way around sub-expressions of type int32. The range of 32-bit integers is dened with two constants constant min_int: int = constant max_int: int = and an axiom states that any value of type int32 is bounded by these constants: axiom int32_domain: x: int32. min_int toint x max_int If our purpose is to build a model to prove the absence of arithmetic overow, we simply need to function to build a value of type int32 from a value of type int with a suitable precondition: 33

34 val ofint (x:int) : int32 requires { min_int x max_int } ensures { toint result = x } Then we can translate any program manipulating machine integers into a WhyML program where each arithmetic operation uses function ofint. For instance, an expression such as x + y is translated into ofint (toint x) (toint y). Equivalently, we can also declare (or dene) a function performing this computation, that is val (+) (x: int32) (y: int32) : int32 requires { min_int toint x + toint y max_int } ensures { toint result = toint x + toint y } The lecture web site contains an application of this model to the verication of a program searching for a value in a sorted array using binary search. This way we nd the famous bug related to the computation of the mean value of l and u using (l+u)/2 [3]. Then we can x the bug, for instance using l+(u-l)/2 instead, and we can show that this new program is safe from any arithmetic overow. 4.2 Memory Models A key idea of Hoare Logic, implemented in Why3, is that the system can statically identify the various memory locations. This is absence of aliasing. In particular, memory locations are not rst-class values. Thus to handle programs with mutable data structures such as linked lists or mutable trees, or programs with explicit pointers, one has to model the memory heap. A trivial example would be that of C programs with explicit pointers of type int*. A simple way to model such programs would be to introduce a type for pointers and a global reference holding the contents of the memory: type pointer val memory: ref (map pointer int) Then a C expression such as *p is translated into!memory[p]. If pointer arithmetic is allowed, then either the type pointer is dened as an alias for type int, or operations over type pointer (such as shifting, comparisons, etc.) are introduced. This is somehow a rather crude model, where the user quickly get stuck into a specication nightmare. There are more subtle models, such as the component-as-array model [7, 6]. Say we want to model the following C type for simply linked lists: struct List { int head; struct List *next; }; The idea behind the component-as-array model is to have several maps to model the various structure elds. In this case, we have (at least) two global maps for elds head and next: 34

35 type pointer val head: ref (map pointer int) val next: ref (map pointer pointer) An obvious benet is that any modication to reference head does not have any impact on the contents of reference next. The benet is even greater when there are many C elds (being they of the same type or not, of from the same structure or not). The componentas-array model may be even further rened using a region-analysis of pointers [14]. On the contrary, a drawback is that low-level tricks such as pointer casts or pointer arithmetic are not easily allowed anymore. 35

36 36

37 Bibliography [1] M. Barbosa, Jean-Christophe Filliâtre, J. Sousa Pinto, and B. Vieira. A Deductive Verication Platform for Cryptographic Software. In 4th International Workshop on Foundations and Techniques for Open Source Software Certication (OpenCert 2010), volume 33, Pisa, Italy, September Electronic Communications of the EASST. [2] Gilles Barthe, Benjamin Grégoire, and Santiago Zanella Béguelin. Formal certication of code-based cryptographic proofs. In Zhong Shao and Benjamin C. Pierce, editors, POPL, pages 90101, Savannah, GA, USA, January ACM Press. [3] Joshua Bloch. Nearly all binary searches and mergesorts are broken, extra-extra-read-all-about-it-nearly.html. [4] François Bobot, Jean-Christophe Filliâtre, Claude Marché, Guillaume Melquiond, and Andrei Paskevich. The Why3 platform. LRI, CNRS & Univ. Paris-Sud & INRIA Saclay, version 0.64 edition, February [5] François Bobot, Jean-Christophe Filliâtre, Claude Marché, and Andrei Paskevich. Why3: Shepherd your herd of provers. In Boogie 2011: First International Workshop on Intermediate Verication Languages, pages 5364, Wrocªaw, Poland, August [6] Richard Bornat. Proving pointer programs in Hoare logic. In Mathematics of Program Construction, pages , [7] Rod Burstall. Some techniques for proving correctness of programs which alter data structures. Machine Intelligence, 7:2350, [8] Edsger W. Dijkstra. A discipline of programming. Series in Automatic Computation. Prentice Hall Int., [9] Jean-Christophe Filliâtre. Deductive Program Verication. Thèse d'habilitation, Université Paris-Sud, December In English, hdr/memoire.pdf. [10] Jean-Christophe Filliâtre. Deductive software verication. International Journal on Software Tools for Technology Transfer (STTT), 13(5):397403, August

38 [11] Jean-Christophe Filliâtre and Claude Marché. Multi-Prover Verication of C Programs. In Sixth International Conference on Formal Engineering Methods, pages Springer, [12] Jérôme Guitton, Johannes Kanig, and Yannick Moy. Why Hi-Lite Ada? In Boogie 2011: First International Workshop on Intermediate Verication Languages, pages 2739, Wrocªaw, Poland, August [13] C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12(10): and 583, October [14] Thierry Hubert and Claude Marché. Separation analysis for deductive verication. Technical report, Université Paris XI, separation.pdf. [15] Z. Manna and J. Mccarthy. Properties of programs and partial function logic. In Machine Intelligence, volume 5, [16] Claude Marché, Christine Paulin, and Xavier Urbain. The Krakatoa proof tool, [17] Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January [18] Alan Mathison Turing. Checking a large routine. In Report of a Conference on High Speed Automatic Calculing Machines, pages 6769, Cambridge, Mathematical Laboratory. 38