How To Prove A Fact In The Kernel Of Truth

Similar documents
Foundational Proof Certificates

Automated Theorem Proving - summary of lecture 1

FORMALLY CERTIFIED SATISFIABILITY SOLVING. Duck Ki Oe. An Abstract

CS510 Software Engineering

CHAPTER 7 GENERAL PROOF SYSTEMS

Summary Last Lecture. Automated Reasoning. Outline of the Lecture. Definition sequent calculus. Theorem (Normalisation and Strong Normalisation)

Software Verification and System Assurance

Regression Verification: Status Report

(LMCS, p. 317) V.1. First Order Logic. This is the most powerful, most expressive logic that we will examine.

Lecture 8: Resolution theorem-proving

Relations: their uses in programming and computational specifications

Slot machines. an approach to the Strategy Challenge in SMT solving. Stéphane Graham-Lengrand CNRS - Ecole Polytechnique - SRI International,

Rigorous Software Development CSCI-GA

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

Introducing Formal Methods. Software Engineering and Formal Methods

CHAPTER 3. Methods of Proofs. 1. Logical Arguments and Formal Proofs

Static Program Transformations for Efficient Software Model Checking

Schedule. Logic (master program) Literature & Online Material. gic. Time and Place. Literature. Exercises & Exam. Online Material

3. Mathematical Induction

Model Checking: An Introduction

Markov random fields and Gibbs measures

Handout #1: Mathematical Reasoning

Scalable Automated Symbolic Analysis of Administrative Role-Based Access Control Policies by SMT solving

Jedd: A BDD-based Relational Extension of Java

MATH10040 Chapter 2: Prime and relatively prime numbers

ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS

InvGen: An Efficient Invariant Generator

Mathematical Induction

ML for the Working Programmer

Summary of Last Lecture. Automated Reasoning. Outline of the Lecture. Definition. Example. Lemma non-redundant superposition inferences are liftable

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

2 Temporal Logic Model Checking

Optimizing Description Logic Subsumption

Computational Methods for Database Repair by Signed Formulae

Tableaux Modulo Theories using Superdeduction

Remarks on Non-Fregean Logic

High Integrity Software Conference, Albuquerque, New Mexico, October 1997.

µz An Efficient Engine for Fixed points with Constraints

Lecture 13 of 41. More Propositional and Predicate Logic

Jieh Hsiang Department of Computer Science State University of New York Stony brook, NY 11794

INCIDENCE-BETWEENNESS GEOMETRY

Regular Languages and Finite Automata

Software Engineering and Automated Deduction

TECH. Requirements. Why are requirements important? The Requirements Process REQUIREMENTS ELICITATION AND ANALYSIS. Requirements vs.

Satisfiability Checking

Fast Reflexive Arithmetic Tactics the linear case and beyond

DEFINABLE TYPES IN PRESBURGER ARITHMETIC

Mathematical Induction

Chair of Software Engineering. Software Verification. Assertion Inference. Carlo A. Furia

University of Ostrava. Reasoning in Description Logic with Semantic Tableau Binary Trees

The Lean Theorem Prover

COMPUTER SCIENCE TRIPOS

8 Divisibility and prime numbers

Repair Checking in Inconsistent Databases: Algorithms and Complexity

Equality and dependent type theory. Oberwolfach, March 2 (with some later corrections)

Sudoku as a SAT Problem

Chapter 17. Orthogonal Matrices and Symmetries of Space

Jaakko Hintikka Boston University. and. Ilpo Halonen University of Helsinki INTERPOLATION AS EXPLANATION

Testing LTL Formula Translation into Büchi Automata

Automata-based Verification - I

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

5 Homogeneous systems

Trust but Verify: Authorization for Web Services. The University of Vermont

def: An axiom is a statement that is assumed to be true, or in the case of a mathematical system, is used to specify the system.

Logic in general. Inference rules and theorem proving

v w is orthogonal to both v and w. the three vectors v, w and v w form a right-handed set of vectors.

Math 319 Problem Set #3 Solution 21 February 2002

Computability Theory

Sample Induction Proofs

Lesson 9: Radicals and Conjugates

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Lecture 2: Universality

A Static Analyzer for Large Safety-Critical Software. Considered Programs and Semantics. Automatic Program Verification by Abstract Interpretation

Research Note. Bi-intuitionistic Boolean Bunched Logic

A Propositional Dynamic Logic for CCS Programs

Minimum Satisfying Assignments for SMT

First-Order Theories

Verifying Specifications with Proof Scores in CafeOBJ

Set theory as a foundation for mathematics

Reliability Applications (Independence and Bayes Rule)

Bi-approximation Semantics for Substructural Logic at Work

Overview of the TACITUS Project

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Cassandra. References:

On the generation of elliptic curves with 16 rational torsion points by Pythagorean triples

3. Prime and maximal ideals Definitions and Examples.

Left-Handed Completeness

CSE 459/598: Logic for Computer Scientists (Spring 2012)

Semantic Groundedness

Formalization of the CRM: Initial Thoughts

Software Modeling and Verification

Quotient Rings and Field Extensions

First-Order Logics and Truth Degrees

NP-Completeness and Cook s Theorem

Kevin James. MTHSC 412 Section 2.4 Prime Factors and Greatest Comm

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

T ( a i x i ) = a i T (x i ).

1. Give the 16 bit signed (twos complement) representation of the following decimal numbers, and convert to hexadecimal:

Coverability for Parallel Programs

Transcription:

The Kernel of Truth 1 N. Shankar Computer Science Laboratory SRI International Menlo Park, CA Mar 3, 2010 1 This research was supported NSF Grants CSR-EHCS(CPS)-0834810 and CNS-0917375.

Overview Deduction can be carried out by rigorous formal rules of inference. With mechanization, we can, in principle, achieve nearly absolute certainty, but in practice, there are many gaps. How can we combine a high degree of automation in verification tools while retaining trust? Check the verification, but verify the checker. The Kernel of Truth contains verified checkers whose verifications have been checked.

N. G. de Bruijn s on Trust... we ask whether this guarantee would be weakened by leaving the mechanical verification to a machine. This is a very reasonable, relevant and important question. It is related to proving the correctness of fairly extensive computer programs, and checking the interpretation of the specifications of those programs. And there is more: the hardware, the operating system have to be inspected thoroughly, as well as the syntax, the semantics and the compiler of the programming language. And even if all this would be covered to satisfaction, there is the fear that a computer might make errors without indicating them by total breakdown. I do not see how we ever can get to an absolute guarantee. But one has to admit that compared to human mechanical verification, computers are superior in every respect.

Trusting Inference Procedures Absolute proofs of consistency are ruled out by Gödel s second incompleteness theorem, but relative consistency proofs can be quite useful. We could hope for correctness relative to a kernel proof system, as in the foundational systems Automath and LCF. Caveat: LCF-based systems are themselves not free of kernel-level unsoundness.

Proof Generation to Verified Inference Procedures If we believe claims that have valid formal proofs, then we have spectrum of options. We can generate formal proofs that are validated by a primitive proof checker. This kernel proof checker and its runtime environment will have to be trusted. Proof generation imposes a serious time, space, effort overhead. Alternately, we can verify the inference procedure by proving that every claim has a proof. We have to trust the inference procedures used in this verification. Verifying cutting-edge inference tools is a fool s errand. We advocate a middle way between proof generation and verification.

Proof Generation: LCF A core of primitive inferences captured as an abstract datatype thm of provable theorems. Each primitive inference rule maps thm list to thm, and validations also have this type. Tactics written in ML are used to convert goals to subgoals so that τ(g) = {S 1,..., S n } with a validation v such that v(s 1,..., S n ) = G. HOL, Nuprl, Coq, and Isabelle use the LCF approach; Isabelle is a metalogical framework.

Equational Logic in LCF [Harrison] module type Birkhoff = sig type thm val axiom : formula -> thm val inst : (string, term) func -> thm -> thm val refl : term -> thm val sym : thm -> thm val trans : thm -> thm -> thm val cong : string -> thm list -> thm val dest_thm : thm -> formula list * formula end;;

Equational Logic in LCF module Proven : Birkhoff = struct type thm = formula list * formula let axiom p = match p with Atom("=",[s;t]) -> ([p],p) _ -> failwith "axiom: not an equation" let inst i (asm,p) = (asm,formsubst i p) let refl t = ([],Atom("=",[t;t])) let sym (asm,atom("=",[s;t])) = (asm,atom("=",[t;s])) let trans (asm1,atom("=",[s;t])) (asm2,atom("=",[t ;u])) = if t = t then (union asm1 asm2,atom("=",[s;u])) else failwith "trans: theorems don t match up". let dest_thm th = th end;;

Proof Generation: SAT When SAT procedure generates a satisfying assignment M for φ, it is easy to check that M = φ. But, when it returns UNSAT, how can we be convinced that there is no satisfying assignment? At the 2002 SAT competition, many SAT solvers were found to be buggy. Some SAT solvers, like Zchaff and Minisat do produce proofs. Malik/Zhang (DATE 2003) show that the overhead of proof generation is small (< 10%), and proof-checking time is modest compared to checking satisfiability. However, the proof sizes can get large (on the order of gigabytes). Fuzz testing on SMT solvers also reveals lots of bugs.

Proof generation for SMT CVC and CVC Lite generate proof logs. CVC Lite proof logs can be verified by HOL Light. Z3 also generates proof traces with external SMT calls. The overhead is small and checking is done relative to an external SAT solver. Checking is quick, but trace sizes as shown in the table below (from Leonardo de Moura) can get take up hundreds of megabytes. Benchmark Without Proofs With Proofs NEQ016 size7 26.01 secs 9.1 Mb 39.84 secs 426 Mb PEQ010 size8 6.65 secs 9.3 Mb 7.63 secs 40 Mb fischer6-mutex-14 7.01 secs 14 Mb 16.38 secs 142 Mb cache.inv16 15.99 secs 11 Mb 24.61 secs 159 Mb xs 20 40 2.2 secs 5.8 Mb 2.70 secs 15.5 Mb

Verifying the Verifier Reflexively Reflection was first introduced in the seventies with Davis/Schwarz, Weyhrauch, and Boyer/Moore s metafunctions. The syntax of the logic, or a fragment of the logic, is encoded in the logic itself and the tactics are essentially proved correct. In computational reflection, we define an interpreter for the reflected syntax of a fragment of the logic, e.g., arithmetic expressions, and construct a verified simplifier. Computational reflection (metafunctions) can be directly implemented in any logic that supports syntactic representation and evaluation.

Proof Reflection In proof reflection, we represent formal proofs and show that a new inference rule is derivable. For example, we can define a predicate Provable(A) and establish that Provable(f (A)) = Provable(A). Chaieb and Nipkow show that the reflected quantifier elimination procedures runs 60 to 130 times faster than the corresponding tactic. Jared Davis has built a fairly sophisticated self-verified prover incorporating induction, rewriting, and simplification. He defines 11 layers of proof checkers of increasing sophistication so that proofs at level i + 1 can be justified by proofs at level i, for 1 i 10.

Verifying Inference Procedures Instead of reflection, one can just use a verification system to verify decision procedures. There is a long history of work in verifying decision procedures, including 1 Satisfiability solvers 2 Shostak combination 3 BDD packages 4 Gröbner basis computation 5 Presburger arithmetic procedures However, these procedures are not comparable in complexity to state-of-the-art implementations.

Should Verifiers be Verified? Short answer: NO! There s many a slip betwixt cup and lip with respect to software. Verifying the verifier will only marginally impact software reliability or quality. Effective tools tend to be highly experimental in both construction and use. It would be hard for verification to keep up with the cutting edge in tool development. However, it does make sense for verifiers to generate certificates ranging from proofs to witnesses. These certificates can be checked offline by verified checkers.

Kernel of Truth Untrusted Frontline Verifier Hints Verified Offline Verifier Certificates Verified Checker Proofs Trusted Proof Kernel Verified Verifiers Proof generation

The Kernel of Truth (KoT) The kernel contains a reference proof system formalizing ZFC. It also contains several verified checkers for specialized certificate formats. If the checker validates the certificate for a claim, then there is a proof of the claim. These certificates can be more compact than proofs. Generating and checking certificates is easier than generating proofs. Proof generation (including LCF) and verification are subsumed. Verifying the checkers is (a lot) easier than verifying the inference procedures. But, why should we trust the latter verification?

The Kernel Proof Checker: Syntax The kernel proof checker is built on first-order logic. The symbols consist of variables, function symbols, predicate symbols, and quantifiers. Function and predicate symbols can be interpreted or uninterpreted. Interpreted symbols are used for the defined operations. Uninterpreted symbols are used as schematic variables, e.g., Skolem constants. The basic propositional connectives are and, and the existential quantifier is chosen as basic.

Kernel Proof Checker: One-Sided Sequents Ax Cut A, A, A, A, A, B, A B, A, B, (A B), A, A, The other connectives can be defined in terms of and.

Kernel Proof Checker: Quantifiers f p A[t/x], x.a, A[c/x], x.a, [λx.s/f ] [λx.a/p] The uninterpreted constant c in must not occur in the conclusion, and there are no free variables in t, λx.s, λx.a. The universal quantifier can be defined as a macro in terms of.

Kernel Proof Checker: Equality Equality is an interpreted predicate. Reflex Predicate Congruence a = a, a 1 = b 1,... a n = b n, p(a 1,..., a n ), p(b 1,..., b n ), Transitivity, symmetry, and function congruence can be derived from reflexivity and predicate congruence.

Formalizing ZFC The axioms of ZFC are added to the core first-order logic. Uninterpreted predicates can be used for formulating axiom schemes as axioms. For example, the comprehension axiom scheme of set theory can be written as y. z. x.(x z x y p(x)), where p is a schematic predicate. Here, p can be replaced by a lambda-expression of the form λw.a to yield y. z. x.x z x y A[x/w]. Similarly, the replacement axiom scheme can be written as ( ) ( x w.!y.q(x, y, w)) w. = z. y.(y z x w.q(x, y, w)) where q is a schematic predicate.

Verified Checkers: Rewriting A rewriter is a complex inference procedure with matching and strategies. It might interact with other inference tools, as is the case in PVS. Rewriters can be instrumented to generate traces that can be easily checked. Restricting attention to unconditional rewrite rules of the form x.l = r, for terms l and r, with vars(l) vars(r). Let π be a path, a finite sequence of positive natural numbers, and let s = s, and f (s 1,..., s n ) i;π = s i π.

Verified Checkers: Rewriting The result of replacing the subterm s π by a term t is represented by s π t. Certificate for a rewrite from e to e is given by a trace: a a sequence of triples τ 1,..., τ m. Each triple τ i of the form R i, π i, σ i with rewrite rule R i, a path π i, and a substitution σ i. Such a sequence of triples is a valid certificate for the claim e = e if there is a sequence of terms e 0,..., e m such that e e 0 and e e m, and e i+1 e i πi+1 σ i+1 (r i+1 ), where l i+1 and r i+1 are the left-hand and right-hand sides of the rewrite rule R i+1 and e i πi+1 σ i+1 (l i+1 ) for 0 i < m.

Verified Checkers: Rewriting We can write a function that checks the validity of a trace. The metatheorem states that when a trace is valid for showing e = e, there is a proof that e = e. This proof can be constructed using, congruence, and transitivity. More generally, we can show that A A for a rewrite within a formula A. The validity of the corresponding checker follows from the equivalence theorem which states that if a = b, then A[a] B[a].

Verified Checkers: Resolution Resolution can be used to construct a proof of from a set of clauses K. More generally, resolution can be used to construct the proof of a clause κ from some subset of clauses in K. Each resolution step where a clause κ is derived from the clauses κ 1 and κ 2 is represented by the proof of the sequent κ 1, κ 2, κ. This proof can be constructed using Ax,, and.

Verified Checkers: Logic Front-Ends We have seen how the KoT kernel can be used to verify checkers for inference procedures (e.g., rewriting) and proof formats (e.g., resolution). The kernel can also be used as the back-end for various logics, e.g., equational logic, higher-order logic and modal, temporal, and program logics. This is done by giving a ZFC semantics for these logics. The proof rules for the logic are then justified relative to this semantics.

Trusting the Verified Checker Since the checkers have been verified by untrusted tools, can we trust the checker? The untrusted verifier U has its results checked by V. Suppose that V is also capable of generating a proof. And, we have used U to verify V. Then, we can also generate an independently checkable proof for the correctness of V as verified by V. So that there is no need to trust V with its own verification.

A Hierarchy of Checkers Many inference tools can have their claims certified relative to other inference tools. For example, the computations of a BDD package can be certified by a SAT solver. Similarly, a static analysis tool can be certified by an SMT solver. An SMT solver can itself be certified using a SAT solver and certificate checkers for the individual theories. A SAT solver can be certified by generating resolution proofs. But we can also have verified reference tools, like a verified SAT solver.

SAT as a Kernel Core Propositional satisfiability (SAT) is the problem of checking if a Boolean formula φ has a truth assignment M such that M = φ. In particular, if φ is unsatisfiable, then φ is valid. The validation of many verifiers can be reduced to SAT plus a little bit. SAT can therefore be used as a key component of a kernel that can be used to check claims generated by other untrusted solvers.

Reduction to SAT: Binary Decision Diagrams BDD packages provide an operation sum of cubes to extract the disjunctive normal form. Thus, if BDD G φ represent the formula φ and let σ 1... σ n be the sum-of-cubes representation, with κ 1... κ n as the CNF of its negation. The correspondence between G φ and φ can be checked by testing the satisfiability of G φ κ 1... κ n and G φ σ i, for 1 1 n.

Reduction to SAT: Symbolic Model Checking In symbolic model checking, we have a transition system model M given by I, N with an initial set of states I and a transition relation N. A formula φ holds in the model if M = φ. The set of reachable states is the smallest set of states containing I and closed under the image operation with respect to N. The set R is an overapproximation of the reachable states if I (s) R(s) and R(s) N(s, s ) R(s ) are both unsatisfiable. AGP holds if R(s) P(s) is unsatisfiable. AFP can be validated by a sequence of sets S 0,..., S n such that S 0 (s) P(s) is unsatisfiable, S i+1 (s) N(s, s ) S i (s ) is unsatisfiable for each i n, and I (s) S 0 (s)... S n (s).

Reduction to SAT+: SMT A theory is a set of models closed under isomorphism. A formula φ is T -satisfiable for theory T if there is an M T such that M = φ. An SMT solver checks the theory satisfiability of a formula. When φ is unsatisfiable, it generates theory lemmas θ, such that θ is T -valid and θ φ is propositionally unsatisfiable. The theory lemmas θ can be supported by proofs or by certificates that can be checked by a verified checker. Certificates for arithmetic proofs can be obtained from Farkas lemma, Hilbert s Nullstellensatz, and Stengle s Positivstellensatz.

Reduction to SAT+: Quantified Logic Herbrand s theorem asserts that if a formula in prenex form is unsatisfiable, then some finite conjunction of ground Herbrand instances is unsatisfiable. For example, the formula x. y : P(x) P(y) can be Herbrandized as x.p(x) P(f (x)). The ground Herbrand instance (P(c) P(f (c))) (P(f (c)) P(f (f (c)))). Herbrand s theorem for first-order logic (without equality) can be used to reduce the validation of first-order proofs to SAT. Similarly, Herbrand s theorem for first-order logic with equality and higher-order logic can be used to reduce these logics to SMT (SAT + EUF).

Verifying a DPLL-Based SAT Solver Since SAT is a key component of the verified reference kernel, can we construct a verified reference SAT solver? With Marc Vaucher, we verified a SAT solver that is, in principle, executable. The bulk of the proof is termination. Marc had no prior experience in verification. We formalized it together in two weeks, and then Marc essentially completed the termination proof in another six weeks. Fixing up his termination proof and checking soundness/completeness arguments took up a few more days.

Inference Systems An inference system is a triple Ψ, Λ,. Ψ is a set of inference states Λ is a mapping from inference states to formulas is a binary relation between inference states that is 1 Conservative: If ψ ψ, then Λ(ψ) and Λ(ψ ) must be equisatisfiable. 2 Progressive: For any subset S of Ψ, there is a state ψ S such that there is no ψ S where ψ ψ. 3 Canonizing: If ψ Ψ is irreducible, that is, there is no ψ such that ψ ψ, then either ψ or Λ(ψ) is satisfiable.

DPLL-based SAT solving An inference system with state h, K, C, M and inference rules 1 Propagation: Add implied literal (with explanation) l[l κ] to M at level h, where clause l κ in K C where l is unassigned in M and κ is falsified by M. 2 Conflict: If a clause κ in K C is falsified by M at level 0, then the original formula is unsatisfiable. 3 Backjump: If a conflict is found at level h = i + 1, then the conflict is analyzed to yield a conflict lemma κ l where l is assigned level h but κ is falsified at level i < h. The search is continued with h = i with l added to M i and κ l added to C. 4 Decide: The search is continued by adding a new level h + 1 by adding an assignment for a previously unassigned variable.

Certifying SAT When SAT finds the input to be unsatisfiable, the conflict is a level 0. Each assignment at level 0 is the consequence of a unit resolution with prior assignments and the explanation clause. The explanation clauses are either input clauses or lemmas. Each lemma is itself derived from input clauses and prior lemmas using linear resolution the certificate can just be given as a sequence of lemma indices and literals. A conflict at level 0 can be proved by unit resolution from the unit clauses. This approach can be combined with the use of a verified SAT solver depending on which works best.

DPLL Example Let K be {p q, p q, p q, s p q, s p q, p r, q r}. step h M K C Γ select s 1 ; s K select r 2 ; s; r K propagate 2 ; s; r, q[ q r] K propagate 2 ; s; r, q, p[p q] K conflict 2 ; s; r, q, p K p q

DPLL Example (contd.) step h M K C Γ conflict 2 ; s; r, q, p K p q analyze 0 K q propagate 0 q[q] K q propagate 0 q, p[p q] K q propagate 0 q, p, r[ p r] K q conflict 0 q, p, r K q q r

DPLL Certificate The unsatisfiability results from a conflict on input clause q r with the level 0 assignment q, p, r, where 1 q is a lemma proved by linear resolution from p q and p q. 2 p follows by unit propagation from p q and q 3 r follows by unit propagation from p r and p. 4 follows by unit resolution from q r and q and r. We can build compact, easily checkable resolution certificates.

Conclusions Inference tools, even simple ones, do have bugs. Sometimes these bugs can lead to unsoundness. Verifying working inference procedures is somewhat pointless. Proof generation imposes a high overhead, particularly for experimental tools. The Kernel of Truth approach: Check the verification; verify the checker.