Computer-Assisted Theorem Proving for Assuring the Correct Operation of Software

Amy Felty
University of Ottawa
Introduction to CSI5110

The General Problem

From Evan I. Schwartz, "Trust Me, I'm Your Software," Discover Magazine, May 1996:

    All very complex computer programs will, at some time, fail. How often? No one knows; the programs are too complex to test. So where should we use them? How about in planes, nuclear power plants, weaponry...

From Edmund M. Clarke and Jeannette M. Wing, ACM Computing Surveys, Volume 28, December 1996:

    Hardware and software systems will inevitably grow in scale and functionality. Because of this increase in complexity, the likelihood of subtle errors is much greater. Moreover, some of these errors may cause catastrophic loss of money, time, or even human life.

A Potential Solution: Formal Methods

Also from Clarke and Wing:

A major goal of software engineering is to enable developers to construct systems that operate reliably despite this complexity. One way of achieving this goal is by using formal methods, which are mathematically-based languages, techniques, and tools for specifying and verifying such systems.

Use of formal methods does not a priori guarantee correctness. However, they can greatly increase our understanding of a system by revealing inconsistencies, ambiguities, and incompletenesses that might otherwise go undetected.

The use of formal methods can be integrated into the system development process, used at some or all stages, and combined with informal methods (such as testing). They have been most successful at the specification and verification stages. Further exploration is ongoing for other stages such as requirements analysis, refinement, and testing.

Example Applications

- Safety critical: Darlington Nuclear Generating Station, near Toronto; applied to the decision-making logic for the shutdown system implemented in software.
- Commercial applications: IBM Customer Information Control System (CICS); applied to a large transaction processing system used by banks, insurance companies, manufacturing firms, airlines, and others.
- Security applications: see the next slide.

Security Example

Subject: [Coq-Club] Formal Methods in the industry - a successful story
From: NGUYEN Quang-Huy at gemalto.com
Date: Thu, 6 Sep 2007

Dear colleagues,

We are proud to announce that we have just successfully completed a Common Criteria (CC) evaluation on a Java Card based commercial product. This evaluation will lead to the world's first CC certificate of a Java product involving EAL7 components.

The specific feature of the evaluation is that all the CC requirements on the development of the product (the ADV class) have been fulfilled at their highest level thanks to the use of a formal tool (the Coq proof assistant)...

From a more technical point of view, the formal models and proofs developed in this work ensure the safe execution of any bytecode-verified applet on the product...

===============================
Formal Methods group
Gemalto Technology & Innovation

Formalization Spectrum

From less formal to more formal:

- natural language text description
- mathematical proof
- specification languages
- specification + mathematical proof and/or automated tools
- model checking, automated deduction/theorem proving
- theorem proving with user interaction

Logic and Inference Rules

If it is raining, then the ground is getting wet.
It is raining.
Therefore, the ground is getting wet.

The modus ponens inference rule:

\[ \frac{P \to Q \qquad P}{Q} \]

P := it is raining
Q := the ground is getting wet

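The rule above can be stated and checked in Coq. This is a minimal sketch: the names modus_ponens, raining, ground_wet, rain_wets, and it_rains are chosen here for illustration and are not from the slides.

```coq
(* Modus ponens as a lemma: from P -> Q and P, conclude Q. *)
Lemma modus_ponens : forall P Q : Prop, (P -> Q) -> P -> Q.
Proof.
  intros P Q HPQ HP.   (* assume the two premises *)
  apply HPQ.           (* reduce the goal Q to the goal P *)
  exact HP.            (* P is one of the premises *)
Qed.

(* The rain instance, with hypothetical propositions. *)
Section Rain.
  Variables raining ground_wet : Prop.
  Hypothesis rain_wets : raining -> ground_wet.
  Hypothesis it_rains : raining.

  Lemma ground_is_wet : ground_wet.
  Proof.
    exact (modus_ponens raining ground_wet rain_wets it_rains).
  Qed.
End Rain.
```
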
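Another Example

All humans are mortal.
Socrates is human.
Therefore, Socrates is mortal.

Universal instantiation:

\[ \frac{\forall x.\, P(x)}{P(t)} \]

A proof that Socrates is mortal:

\[ \frac{\dfrac{\forall x.\, \mathit{human}(x) \to \mathit{mortal}(x)}{\mathit{human}(\mathit{socrates}) \to \mathit{mortal}(\mathit{socrates})} \qquad \mathit{human}(\mathit{socrates})}{\mathit{mortal}(\mathit{socrates})} \]
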
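The same two-step derivation (universal instantiation, then modus ponens) might look as follows in Coq. The type name entity and the hypothesis names are assumptions made for this sketch.

```coq
Section Socrates.
  (* Hypothetical domain of discourse and predicates. *)
  Variable entity : Type.
  Variables human mortal : entity -> Prop.
  Variable socrates : entity.

  Hypothesis all_humans_mortal : forall x, human x -> mortal x.
  Hypothesis socrates_human : human socrates.

  Lemma socrates_mortal : mortal socrates.
  Proof.
    (* Universal instantiation with t := socrates, used backwards:
       the goal becomes human socrates. *)
    apply (all_humans_mortal socrates).
    (* The remaining premise of modus ponens. *)
    exact socrates_human.
  Qed.
End Socrates.
```
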
Logics Come in Many Varieties

Logics can be:

- specialized to express various notions, for example
  - temporal logics: $\Box P$, $\Diamond P$
  - programming logics: $\{x > 0\}\ x := x + 1\ \{x > 1\}$
- more expressive or less expressive

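As a rough illustration of the programming-logic example: under the usual assignment rule, the triple {x > 0} x := x + 1 {x > 1} holds if the precondition implies the postcondition with x replaced by x + 1. The sketch below assumes x ranges over natural numbers and proves that implication; the lemma name is hypothetical.

```coq
Require Import Lia.

(* Verification condition for {x > 0} x := x + 1 {x > 1}:
   the precondition on the old x implies the postcondition on x + 1. *)
Lemma incr_triple_vc : forall x : nat, x > 0 -> x + 1 > 1.
Proof.
  intros x Hpre.
  lia.   (* linear arithmetic over nat *)
Qed.
```
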
Natural Deduction Theorem Proving

\[
\dfrac{\dfrac{\dfrac{p, q \vdash p \qquad \dfrac{p, q \vdash q}{p, q \vdash q \lor r}\ \lor\text{-I}}{p, q \vdash p \land (q \lor r)}\ \land\text{-I}}{p \land q \vdash p \land (q \lor r)}\ \land\text{-E}}{\vdash (p \land q) \to (p \land (q \lor r))}\ {\to}\text{-I}
\]

- theorem prover: a program for finding such proofs
- goal-directed = bottom-up search for proofs
- more expressive logics are harder to automate, e.g.,
  - propositional logic
  - predicate logic: quantification
  - higher-order logic: allows quantification over predicates and functions, e.g., $\exists R.\, R(1, 2)$
- interaction allows user to guide search
  (+) more powerful, general, flexible
  (-) requires sophisticated knowledge

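A goal-directed proof of the formula in the derivation can be replayed interactively in Coq, reading the tree from the bottom up. This is only a sketch; the lemma name is hypothetical and each tactic is annotated with the rule it roughly corresponds to.

```coq
Lemma and_or_example : forall p q r : Prop, (p /\ q) -> (p /\ (q \/ r)).
Proof.
  intros p q r H.          (* implication introduction: assume p /\ q *)
  destruct H as [Hp Hq].   (* conjunction elimination: obtain p and q *)
  split.                   (* conjunction introduction: two subgoals *)
  - exact Hp.              (* leaf: p, q |- p *)
  - left.                  (* disjunction introduction: pick q in q \/ r *)
    exact Hq.              (* leaf: p, q |- q *)
Qed.
```

Each tactic replaces the current goal by the premises of one rule, which is the bottom-up search the slide describes.
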
Booleans and Negation

bool = {true, false}

$\neg b$: the negation function maps true to false and conversely.

    Inductive bool : Set :=
      | true : bool
      | false : bool.

    Definition neg (b:bool) :=
      match b with
      | true => false
      | false => true
      end.

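A first property one might prove about neg is that applying it twice gives back the original bit. The definitions are repeated so the block loads on its own; the lemma name neg_involutive is chosen here and is not from the slides.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Definition neg (b : bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

(* Negating twice is the identity. *)
Lemma neg_involutive : forall b : bool, neg (neg b) = b.
Proof.
  intros b.
  destruct b; reflexivity.   (* check both cases by computation *)
Qed.
```
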
Bit Strings

Bit strings or boolean words are represented as lists of booleans.

    Inductive word : Set :=
      | empty : word
      | bit : bool -> word -> word.

For example, [true, false, true, false] is written:

    (bit true (bit false (bit true (bit false empty))))

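To show how functions over words are defined by recursion on this type, here is a sketch of a length function applied to the example word. The names word_length and example_word are hypothetical helpers, not part of the slides.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Inductive word : Set :=
  | empty : word
  | bit : bool -> word -> word.

(* Hypothetical helper: the number of bits in a word. *)
Fixpoint word_length (w : word) : nat :=
  match w with
  | empty => 0
  | bit _ v => S (word_length v)
  end.

(* The word [true, false, true, false]. *)
Definition example_word : word :=
  bit true (bit false (bit true (bit false empty))).

Lemma example_word_length : word_length example_word = 4.
Proof. reflexivity. Qed.
```
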
Alternating Words

A word w is alternating if for some bit b, w is of the form $[b, \neg b, b, \neg b, \ldots]$.

    Inductive alt: bool -> word -> Prop :=
      | alt_empty: forall (b:bool), alt b empty
      | alt_bit: forall (b:bool) (w:word), alt (neg b) w -> alt b (bit b w).

A version without an explicit first bit:

    Inductive alternate (w:word): Prop :=
      alter: forall (b:bool), alt b w -> alternate w.

Compare (alt b w), which fixes the first bit b, with (alternate w), which only asserts that some such b exists.

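A small worked instance may help: the sketch below shows that [true, false, true, false] alternates starting from true, and therefore satisfies alternate. The definitions are repeated so the block loads on its own; the two lemma names are hypothetical.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Definition neg (b : bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

Inductive word : Set :=
  | empty : word
  | bit : bool -> word -> word.

Inductive alt : bool -> word -> Prop :=
  | alt_empty : forall (b : bool), alt b empty
  | alt_bit : forall (b : bool) (w : word),
      alt (neg b) w -> alt b (bit b w).

Inductive alternate (w : word) : Prop :=
  | alter : forall (b : bool), alt b w -> alternate w.

Lemma alt_example :
  alt true (bit true (bit false (bit true (bit false empty)))).
Proof.
  apply alt_bit; simpl.   (* goal: alt false [false, true, false] *)
  apply alt_bit; simpl.   (* goal: alt true [true, false] *)
  apply alt_bit; simpl.   (* goal: alt false [false] *)
  apply alt_bit; simpl.   (* goal: alt true empty *)
  apply alt_empty.
Qed.

Lemma alternate_example :
  alternate (bit true (bit false (bit true (bit false empty)))).
Proof.
  apply alter with (b := true).   (* supply the first bit as witness *)
  exact alt_example.
Qed.
```
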
Paired Words

A word w is said to be paired if it is of the form $[b_1, \neg b_1, b_2, \neg b_2, \ldots]$.

    Inductive paired: word -> Prop :=
      | paired_empty: paired empty
      | paired_bit: forall (w:word) (b:bool),
          paired w -> paired (bit (neg b) (bit b w)).

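As an illustration, the word [true, false, false, true] is paired: it splits into the pairs (true, false) and (false, true). The sketch below repeats the definitions so it loads on its own; the lemma name is hypothetical.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Definition neg (b : bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

Inductive word : Set :=
  | empty : word
  | bit : bool -> word -> word.

Inductive paired : word -> Prop :=
  | paired_empty : paired empty
  | paired_bit : forall (w : word) (b : bool),
      paired w -> paired (bit (neg b) (bit b w)).

Lemma paired_example :
  paired (bit true (bit false (bit false (bit true empty)))).
Proof.
  apply (paired_bit _ false).   (* first pair: neg false = true, then false *)
  apply (paired_bit _ true).    (* second pair: neg true = false, then true *)
  apply paired_empty.
Qed.
```

The bit b is passed explicitly at each step because the constructor mentions (neg b), and Coq may not be able to solve for b from that on its own.
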
Shuffle

Shuffling u and v to obtain w: at each step a bit is taken from either u or v and put at the end of w.

[Figure: words u and v shuffled into w]

    Inductive shuffle: word -> word -> word -> Prop :=
      | shuffle_empty: shuffle empty empty empty
      | shuffle_bit_left: forall (u v w:word) (b:bool),
          shuffle u v w -> shuffle (bit b u) v (bit b w)
      | shuffle_bit_right: forall (u v w:word) (b:bool),
          shuffle u v w -> shuffle u (bit b v) (bit b w).

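A concrete shuffle can be checked constructor by constructor. In this sketch, u = [true, false] and v = [false] are shuffled into w = [true, false, false]; the definitions are repeated so the block loads on its own, and the lemma name is hypothetical.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Inductive word : Set :=
  | empty : word
  | bit : bool -> word -> word.

Inductive shuffle : word -> word -> word -> Prop :=
  | shuffle_empty : shuffle empty empty empty
  | shuffle_bit_left : forall (u v w : word) (b : bool),
      shuffle u v w -> shuffle (bit b u) v (bit b w)
  | shuffle_bit_right : forall (u v w : word) (b : bool),
      shuffle u v w -> shuffle u (bit b v) (bit b w).

Lemma shuffle_example :
  shuffle (bit true (bit false empty))
          (bit false empty)
          (bit true (bit false (bit false empty))).
Proof.
  apply shuffle_bit_left.    (* take true from u *)
  apply shuffle_bit_right.   (* take false from v *)
  apply shuffle_bit_left.    (* take false from u *)
  apply shuffle_empty.       (* both words are exhausted *)
Qed.
```
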
The Card Trick Theorem

Theorem: Let x be an alternating word of even length. Let u and v be two words such that their concatenation is x. Let w be a shuffling of u and v. If u and v begin with opposite bits, then w is paired. Otherwise the word obtained by moving the first bit of w to the end is paired.

[Figure: x split into u and v, which are shuffled into w]

    Theorem Gilbreath: forall x:word, even x -> alternate x ->
      forall u v:word, x=(append u v) ->
      forall w:word, shuffle u v w ->
      IF opposite u v then paired w else paired (rotate w).

Note: The definitions and proofs require approx. 750 lines of input to the theorem prover.

(Gérard Huet, The Gallina Specification Language: A case study, in Proceedings of the Twelfth Conference on Foundations of Software Technology and Theoretical Computer Science, 1992)

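The theorem statement uses append and rotate without showing their definitions. The sketch below gives one plausible reading of them (assumptions made here, not the definitions from Huet's development) and checks a single instance of the "otherwise" case: x = [true, false, true, false] split as u = v = [true, false], with w = [true, true, false, false] as one possible shuffle.

```coq
Inductive bool : Set :=
  | true : bool
  | false : bool.

Definition neg (b : bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

Inductive word : Set :=
  | empty : word
  | bit : bool -> word -> word.

Inductive paired : word -> Prop :=
  | paired_empty : paired empty
  | paired_bit : forall (w : word) (b : bool),
      paired w -> paired (bit (neg b) (bit b w)).

(* Assumed definition: concatenation of two words. *)
Fixpoint append (u v : word) : word :=
  match u with
  | empty => v
  | bit b u' => bit b (append u' v)
  end.

(* Assumed definition: move the first bit of a word to the end. *)
Definition rotate (w : word) : word :=
  match w with
  | empty => empty
  | bit b w' => append w' (bit b empty)
  end.

(* Rotating [true, true, false, false] gives [true, false, false, true]. *)
Lemma rotate_example :
  rotate (bit true (bit true (bit false (bit false empty))))
  = bit true (bit false (bit false (bit true empty))).
Proof. reflexivity. Qed.

(* u and v begin with the same bit, so the rotated shuffle is paired. *)
Lemma rotated_shuffle_paired :
  paired (rotate (bit true (bit true (bit false (bit false empty))))).
Proof.
  rewrite rotate_example.
  apply (paired_bit _ false).
  apply (paired_bit _ true).
  apply paired_empty.
Qed.
```

This checks one instance only; the general theorem quantifies over all x, u, v, and w, which is where the roughly 750 lines of proof effort go.
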
Verification of SRT Division

- Implemented by the Intel Pentium chip, with a well-publicized division error.
- Similar to the third grade division algorithm, with 2 main differences:
  1. The quotient digit is approximated by considering only the first few digits of the divisor and dividend and using table look-up.
  2. The partial remainder is computed by adding or subtracting, depending upon whether the quotient digit is guessed correctly or overestimated by 1.
- Testing was unlikely to catch the error, which was caused by 5 wrong entries in the look-up table.
- Cost of the error estimated to be almost $500 million.

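The two differences listed above correspond to the standard textbook recurrence for digit-recurrence division, sketched here. The symbols are assumptions made for this sketch and do not appear on the slides: $r$ is the radix, $D$ the divisor, $P_j$ the partial remainder after step $j$, $q_{j+1}$ the quotient digit read from the look-up table, and $a$ the largest digit magnitude.

```latex
\[
  P_{j+1} = r\,P_j - q_{j+1}\,D,
  \qquad q_{j+1} \in \{-a, \ldots, -1, 0, 1, \ldots, a\}
\]
```

Because the digit set includes negative values, a step with $q_{j+1} < 0$ adds a multiple of $D$ rather than subtracting it, which is how an earlier overestimate gets corrected. A wrong table entry yields a digit outside the range for which this correction still works.
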
3 Theorem Proving Verifications

Verified after the fact using theorem proving systems by 3 research groups: Carnegie Mellon, SUNY Albany, and SRI.

Example: SRI

1. General math: formalization of textbook knowledge of the SRT division algorithm and proof of correctness.
2. Specific data-path circuit (bit-vector signals over time) to compute the partial remainder.
3. Specific look-up table. Missing table entries lead to subgoals that can't be proven.

Scalable Coherent Interface (SCI)

- IEEE standard for specifying communication between up to 64,000 multiprocessors in a shared memory network.
- A good representative of the kinds of protocols for which verification is important.
- Also representative of the level of complexity that can be handled by verification tools.

Cache Memory

[Figure: Memory connected to Cache Memory, which is connected to the Processor]

Multiprocessor with Cache Memory

[Figure: one shared Memory connected to three Caches, each attached to its own Processor]

SCI Cache Coherence

Highlights of the Protocol

- Each processor keeps some local data indicating which parts of its own cache have the most up-to-date values, which memory locations it can write to, etc.
- Processors communicate information such as correct values, granting permission to write, etc., by sending messages back and forth.
- A doubly-linked list is used to keep track of the order in which processors request to read and write.
- Correctness is expressed as 5 logical formulas stating, for example:
  - There is always at most one processor with permission to write a particular memory location.
  - Every processor that requests to write will eventually get a turn.
- The proof requires 14 lemmas, including 8 fairly easy and 6 much harder.

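The two example properties can be written in temporal-logic style, with $\Box$ read as "always" and $\Diamond$ as "eventually". This is only a sketch of their shape: the predicates $\mathit{maywrite}$, $\mathit{requests}$, and $\mathit{granted}$ and the location $a$ are hypothetical names, not the formulas used in the actual proof; $P$ is the set of processors.

```latex
\[
  \Box\, \forall p, q \in P.\
  \bigl(\mathit{maywrite}(p, a) \land \mathit{maywrite}(q, a)\bigr) \to p = q
\]
\[
  \Box\, \forall p \in P.\
  \bigl(\mathit{requests}(p) \to \Diamond\, \mathit{granted}(p)\bigr)
\]
```

The first is a safety property (nothing bad ever happens); the second is a liveness property (something good eventually happens).
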
Invariant 13

(a) (status p = Off status p = Pending) cs p = invalid.
    status p = Inlist cs p invalid.
    status p = Purging (cs p = dirty succ p = nil).
    status p = Ftod (cs p = fresh pred p = m).
    status p = Inqueue (pred p = m succ p = nil).
    delrightq(q, r, cs) buf [p] (r nil cs invalid).
    (visiting(p) status q Delleft delrightq(p, r, cs) buf [q]) succ p = q.
    (visiting(p) delrightr(q, ok) buf [p]) succ p = q.
(b) (status p Inqueue cs p = invalid) (pred p = nil succ p = nil).
    cs p invalid pred p nil.
(c) head m = nil p P.(idle(p) leaving(p)).
(d) (head m = p p nil) p is maximal ranked active processor.
(e) (idle(p) entering(p) leaving(p) p is maximal ranked visiting processor) staying(q) succ q p purgeq(q) buf [p] purger(q, p) buf [r].
(f) (read cache freshr(m, q, cv, arg) buf [p] read cache goner(m, q, cv, arg) buf [p]) ((q = nil rank(p) = 0 q P. visiting(q )) (q P [entering(q) q is maximal ranked processor] rank(p) = rank(q)+1)) cv m = cv.
(g) (visiting(p) status p Purging succ p = nil) rank(p) = 0.
(h) (visiting(p) succ p = q q nil) (visiting(q) rank(p) = rank(q) + 1).
(i) prependq(q) buf [p] rank(q) = rank(p) + 1 (entering(p) p is maximal ranked visiting processor).
(j) (delleftq(q, r, cv) buf [p] visiting(q)) (succ q = r pred q = p).
(k) prependr(q, q, ok, cv, cs) buf [p] q is maximal ranked visiting processor cs invalid pred q = p rank(p) = rank(q) + 1 (staying(p ) pred p m).
(l) prependr(q, nil, ok, cv, cs) buf [p] p P. visiting(p ) rank(p) = 0 cs invalid.
(m) prependr(q, r, retry, cv, cs) buf [p] (entering(r) r is maximal ranked visiting processor) rank(p) = rank(r) + 1 [(visiting(r) q r) q = r].
(n) purgeq(q) buf [p] (visiting(p) rank(q) = rank(p) + 1).
(o) purger(q, r) buf [p] visiting(q) [(r = nil rank(p) = 0) (r nil rank(p) = rank(r) + 1 visiting(r))].
(p) (pred p = m staying(p)) p is maximal ranked staying processor.
    p is maximal ranked staying processor (pred p = m q.prependr(p, p, ok, cv, cs) buf [q]).
    delrightq(q, m, cs) buf [p] q is maximal ranked staying processor.
(q) cs p invalid pred p = m [ q P. pred p = q cs q=invalid q is the smallest ranked entering or staying processor with rank(q)>rank(p))].
    (visiting(q) delrightq(q, r, cs) buf [p]) r = m (r P cs r = invalid) (r P cs r invalid r is the smallest ranked entering or staying processor with rank(r)>rank(q)).

Design by Contract with Java Modeling Language

Design by Contract (DBC)

- A contract between a class and its clients.
- A client must guarantee certain conditions before calling a method defined by the class (preconditions).
- A class guarantees properties that hold after execution of the method (postconditions).
- Contracts are executable, i.e., can be checked by tools.

Java Modeling Language (JML)

- JML assertions are annotations in the Java code (seen as comments by Java).
- JML extends Java with keywords such as:
  - requires for preconditions
  - ensures for postconditions
  - invariant for properties that hold at the beginning and end of all methods, and at the end of a constructor execution
  - \result to denote the result of a method call

An Example: Computing Square Root

    // JML model class that provides approximatelyEqualTo
    import org.jmlspecs.models.JMLDouble;

    public class SqrtExample {
        public final static double epsilon = 0.0001;

        /*@ requires x >= 0.0;
          @
          @ ensures JMLDouble.approximatelyEqualTo(x, \result * \result, epsilon);
          @*/
        public static double sqrt(double x) {
            return Math.sqrt(x);
        }
    }

The Problem: Code Safety

[Figure: the code producer compiles a source program to native code; the code consumer executes it. Does it do no harm?]

Native code:

    load r3, 4(r2)
    add r2,r4,r1
    store 1, 0(r7)
    store r1, 4(r7)
    add r7,0,r3
    add r7,8,r7
    beq r3,.-20

Proof-Carrying Code [Necula & Lee, 1997]

Code Producer

- Implements a program and compiles it to native machine code C.
- The verification condition (safe C) is sent to a prover, which proves it (automatically) and outputs a proof P. The compiler also sends hints to the prover.
- The code producer communicates the code and proof to the code consumer.

Code Consumer

- Checks that P is a proof of (safe C).
- If successful, executes C as needed.

Safety Policy

- Set ahead of time by the code consumer.
- Defined by a set of inference rules.

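To make the roles concrete, here is a toy sketch in Coq. Everything in it is an assumption for illustration: the two-instruction language, the address bound limit, and the rule names are invented here and are far simpler than a real safety policy. The policy is a set of inference rules (an inductive predicate), the producer supplies the proof P, and Coq's proof checker plays the part of the consumer's checker.

```coq
Require Import List Lia.
Import ListNotations.

(* Hypothetical instruction set: memory accesses at a given address. *)
Inductive instr : Set :=
  | load : nat -> instr
  | store : nat -> instr.

(* Hypothetical policy parameter: addresses must be below this bound. *)
Definition limit : nat := 8.

(* Safety policy, given as inference rules. *)
Inductive safe_instr : instr -> Prop :=
  | safe_load : forall a, a < limit -> safe_instr (load a)
  | safe_store : forall a, a < limit -> safe_instr (store a).

Inductive safe : list instr -> Prop :=
  | safe_nil : safe []
  | safe_cons : forall i c, safe_instr i -> safe c -> safe (i :: c).

(* The code C shipped by the producer. *)
Definition C : list instr := [load 4; store 0].

(* The proof P shipped with it; Qed makes the checker validate it. *)
Lemma P : safe C.
Proof.
  unfold C.
  apply safe_cons.
  - apply safe_load. unfold limit. lia.
  - apply safe_cons.
    + apply safe_store. unfold limit. lia.
    + apply safe_nil.
Qed.
```

Changing C to include, say, load 9 would leave the subgoal 9 < 8, which cannot be proven, so no proof could be shipped for that code.
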
Proof-Carrying Code

[Figure: on the code producer side, a certifying compiler turns the source program into native code C and sends hints to the prover, which outputs a safety proof of safe(C). On the code consumer side, the checker validates the proof and, if it answers OK, the code is executed. The trusted code base is the checker.]

Advantages of Proof-Carrying Code

- Trusted Computing Base is quite small: includes only the checker. No need to trust compiler or prover.
- The safety policy (meaning of safe) can be general and flexible. Can use types, dataflow, induction, or any other provable property.
- Automated proof is possible for a large class of properties.
  - Safety properties of interest are relatively simple.
  - Hints from the compiler provide help.

Some Current and Future Uses of Theorem Proving

- Software, Hardware, and Protocol Correctness
- Safety and Security of Mobile Code
  - web browsers executing applets from foreign sites
- As Specification Languages
- As Teaching Tools
  - for Logic
  - for Mathematics
- Tools for Mathematicians