Outline. Database Theory. Perfect Query Optimization. Turing Machines. Question. Definition VU , SS Trakhtenbrot s Theorem

Similar documents
CS 3719 (Theory of Computation and Algorithms) Lecture 4

CSE 135: Introduction to Theory of Computation Decidability and Recognizability

CS154. Turing Machines. Turing Machine. Turing Machines versus DFAs FINITE STATE CONTROL AI N P U T INFINITE TAPE. read write move.

Computability Theory

Notes on Complexity Theory Last updated: August, Lecture 1

Computational Models Lecture 8, Spring 2009

The Halting Problem is Undecidable

Theory of Computation Chapter 2: Turing Machines

CAs and Turing Machines. The Basis for Universal Computation

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

Turing Machines: An Introduction

3515ICT Theory of Computation Turing Machines

NP-Completeness and Cook s Theorem

Chapter 7 Uncomputability

THE TURING DEGREES AND THEIR LACK OF LINEAR ORDER

Lecture 2: Universality

ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS

Bounded Treewidth in Knowledge Representation and Reasoning 1

6.080 / Great Ideas in Theoretical Computer Science Spring 2008

(IALC, Chapters 8 and 9) Introduction to Turing s life, Turing machines, universal machines, unsolvable problems.

How To Understand The Theory Of Computer Science

Fixed-Point Logics and Computation

CHAPTER 7 GENERAL PROOF SYSTEMS

Scalable Automated Symbolic Analysis of Administrative Role-Based Access Control Policies by SMT solving

Turing Machines, Part I

Notes on Complexity Theory Last updated: August, Lecture 1

Simulation-Based Security with Inexhaustible Interactive Turing Machines

Diagonalization. Ahto Buldas. Lecture 3 of Complexity Theory October 8, Slides based on S.Aurora, B.Barak. Complexity Theory: A Modern Approach.

Reading 13 : Finite State Automata and Regular Expressions

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

Mathematical Induction

1 Definition of a Turing machine

24 Uses of Turing Machines

Introduction to Logic in Computer Science: Autumn 2006

Properties of Stabilizing Computations

Oracle Turing machines faced with the verification problem

Super Turing-Machines

Theoretical Computer Science (Bridging Course) Complexity

Solutions to In-Class Problems Week 4, Mon.

SQL Database queries and their equivalence to predicate calculus

Monitoring Metric First-order Temporal Properties

Introduction to Turing Machines

Handout #1: Mathematical Reasoning

Automata and Formal Languages

The composition of Mappings in a Nautural Interface

3. Mathematical Induction

Informatique Fondamentale IMA S8

THE SEARCH FOR NATURAL DEFINABILITY IN THE TURING DEGREES

Universality in the theory of algorithms and computer science

How To Compare A Markov Algorithm To A Turing Machine

6.080/6.089 GITCS Feb 12, Lecture 3

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

Outline 2. 1 Turing Machines. 2 Coding and Universality. 3 The Busy Beaver Problem. 4 Wolfram Prize. 5 Church-Turing Thesis.

Introduction to NP-Completeness Written and copyright c by Jie Wang 1

The Classes P and NP

(LMCS, p. 317) V.1. First Order Logic. This is the most powerful, most expressive logic that we will examine.

A first step towards modeling semistructured data in hybrid multimodal logic

C H A P T E R Regular Expressions regular expression

EQUATIONAL LOGIC AND ABSTRACT ALGEBRA * ABSTRACT

Regular Languages and Finite State Machines

Automata and Computability. Solutions to Exercises

Turing Machines, Busy Beavers, and Big Questions about Computing

Deterministic Finite Automata

4.6 The Primitive Recursive Functions

Hypercomputation: computing more than the Turing machine

Introduction to computer science

How To Understand The Relation Between Simplicity And Probability In Computer Science

Introduction to Automata Theory. Reading: Chapter 1

University of Ostrava. Reasoning in Description Logic with Semantic Tableau Binary Trees

Universal Turing Machine: A Model for all Computational Problems

Predicate Logic. Example: All men are mortal. Socrates is a man. Socrates is mortal.

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Predicate Logic. For example, consider the following argument:

Solving Linear Equations in One Variable. Worked Examples

Degrees that are not degrees of categoricity

On the Structure of Turing Universe: The Non-Linear Ordering of Turing Degrees

Regular Languages and Finite Automata

SECTION 10-2 Mathematical Induction

WRITING PROOFS. Christopher Heil Georgia Institute of Technology

Welcome to... Problem Analysis and Complexity Theory , 3 VU

Kolmogorov Complexity and the Incompressibility Method

Complexity of Higher-Order Queries

x < y iff x < y, or x and y are incomparable and x χ(x,y) < y χ(x,y).

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Testing LTL Formula Translation into Büchi Automata

Which Semantics for Neighbourhood Semantics?

Solutions Q1, Q3, Q4.(a), Q5, Q6 to INTLOGS16 Test 1

! " # The Logic of Descriptions. Logics for Data and Knowledge Representation. Terminology. Overview. Three Basic Features. Some History on DLs

The Recovery of a Schema Mapping: Bringing Exchanged Data Back

An example of a computable

Turing Machines and Understanding Computational Complexity

Lecture 1: Course overview, circuits, and formulas

Secrecy for Bounded Security Protocols With Freshness Check is NEXPTIME-complete

Computational complexity theory

So let us begin our quest to find the holy grail of real analysis.

Question Answering and the Nature of Intercomplete Databases

Software Verification and Testing. Lecture Notes: Temporal Logics

INTRODUCTORY SET THEORY

One More Decidable Class of Finitely Ground Programs

Biinterpretability up to double jump in the degrees

Transcription:

Database Theory Database Theory Outline Database Theory VU 181.140, SS 2016 4. Trakhtenbrot s Theorem Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 4. Trakhtenbrot s Theorem 4.1 Motivation 4.2 Turing Machines and Undecidability 4.3 Trakhtenbrot s Theorem 4.4 Finite vs. Infinite Domain 12 April, 2016 Pichler 12 April, 2016 Page 1 Pichler 12 April, 2016 Page 2 Database Theory 4. Trakhtenbrot s Theorem 4.1. Motivation Perfect Query Optimization Database Theory 4. Trakhtenbrot s Theorem 4.2. Turing Machines and Undecidability Turing Machines A legitimate question: Question Given a query Q in RA, does there exist at least one database A such that Q(A)? If there is no such database, then the query Q makes no sense and we can directly replace it by the empty result. Could save much run-time! We shall show that this problem is undecidable! We first recall some basic notions and results from the lecture Formale Methoden der Informatik. Turing machines are a formal model of algorithms to solve problems: Definition A Turing machine is a quadruple M = (K, Σ, δ, s) with a finite set of states K, a finite set of symbols Σ (alphabet of M) so that, Σ, a transition function δ: K Σ (K {q halt, q yes, q no }) Σ {+1, 1, 0}, a halting state q halt, an accepting state q yes, a rejecting state q no, and R/W head directions: +1 (right), 1 (left), and 0 (stay). Pichler 12 April, 2016 Page 3 Pichler 12 April, 2016 Page 4

Database Theory 4. Trakhtenbrot s Theorem 4.2. Turing Machines and Undecidability Function δ is the program of the machine. For the current state q K and the current symbol σ Σ, δ(q, σ) = (p, ρ, D) where p is the new state, ρ is the symbol to be overwritten on σ, and D {+1, 1, 0} is the direction in which the R/W head will move. For any states p and q, δ(q, ) = (p, ρ, D) with ρ = and D = +1. In other words: The delimiter is never overwritten by another symbol, and the R/W head never moves off the left end of the tape. The machine starts as follows: (i) the initial state of M = (K, Σ, δ, s) is s, (ii) the tape is initialized to the infinite string I..., where I is a finitely long string in (Σ { }) (I is the input of the machine) and (iii) the R/W head points to. The machine halts iff q halt, q yes, or q no has been reached. If q yes has been reached, then the machine accepts the input. If q no has been reached, then the machine rejects the input. If q halt has been reached, then the machine produces output. Database Theory 4. Trakhtenbrot s Theorem 4.2. Turing Machines and Undecidability Church-Turing Thesis Church-Turing Thesis Any reasonable attempt to model mathematically computer algorithms ends up with a model of computation that is equivalent to Turing machines. Evidence for this thesis All of the following models can be shown to have precisely the same expressive power as Turing machines: Random access machines µ-recursive functions any conventional programming language (Java, C,... ) Strengthening of the Church-Turing Thesis Turing machines are not less efficient than other models of computation! Pichler 12 April, 2016 Page 5 Pichler 12 April, 2016 Page 6 Database Theory 4. Trakhtenbrot s Theorem 4.2. Turing Machines and Undecidability Halting Problem Trakhtenbrot s Theorem HALTING INSTANCE: A Turing machine M, an input string I. QUESTION: Does M halt on I? Theorem HALTING is undecidable, i.e. there does not exist a Turing machine that decides HALTING. Undecidability applies already to the following variant of HALTING: HALTING-ɛ INSTANCE: A Turing machine M. QUESTION: Does M halt on the empty string ɛ, i.e. does M reach q halt, q yes, or q no when run on the initial tape contents...? Theorem (Trakhtenbrot s Theorem, 1950) For every relational vocabulary σ with at least one binary relation symbol, it is undecidable to check whether an FO sentence ϕ over σ is finitely satisfiable (i.e. has a finite model). This theorem rules out perfect query optimization. Translated into database terminology, it reads: Theorem For a database schema σ with at least one binary relation, it is undecidable whether a Boolean FO or RA query Q over σ is satisfied by at least one database. Pichler 12 April, 2016 Page 7 Pichler 12 April, 2016 Page 8

Idea to prove Trakhtenbrot s Theorem Proof of Trakhtenbrot s Theorem Define a relational signature σ suitable for encoding finite computations of a TM. Given an arbitrary TM M, we construct an FO formula ϕ M encoding the computation of M and a halting condition, such that: ϕ M has a finite model iff M halts on ɛ. The undecidability of HALTING-ɛ together with the reduction proves Trakhtenbrot s Theorem! Assume a machine M = (K, Σ, δ, q start ). Simplifying assumptions: σ may have several unary and binary relations Exercise. We could easily encode them into a single binary relation. Tape alphabet of M is Σ = {0, 1,, } Can always be obtained by simple binary encoding, e.g., let Σ = {a 1,..., a k } with k 8, then we use the following encoding: a 0 000, a 1 001, a 2 010, a 3 011, etc. Pichler 12 April, 2016 Page 9 Pichler 12 April, 2016 Page 10 We use the following relations: Binary < will encode a linear order (as usual, we ll write x < y instead of < (x, y)). The elements of this linear order will be used to simulate both time instants and tape positions (= cell numbers). Unary Min will denote the smallest element of <. Note: instead of a relation Min we can use a constant min. Binary Succ will encode the successor relation w.r.t. the linear order. Binary T 0, T 1, T, T are tape predicates: T α (p, t) indicates that cell number p at time t contains α. Binary H will store the head position: H(p, t) indicates that the R/W head at time t is at position p (i.e., at cell number p). Binary S will store the state: S(q, t) indicates that at time instant t the machine is in state q. We let ϕ M be the conjunction ϕ M = ϕ < ϕ Min ϕ comp that is explained next: < must be a strict linear order (a total, transitive, antisymmetric, irreflexive relation). Thus ϕ < is the conjunction of: x, y.(x y (x < y y < x)) x, y, z.((x < y y < z) x < z) x, y. (x < y y < x) We axiomatize the successor relation based on < as follows: x, y.(succ(x, y) (x < y) z.(x < z z < y)) Pichler 12 April, 2016 Page 11 Pichler 12 April, 2016 Page 12

Min must contain the minimal element of <. Thus ϕ Min is: The formula ϕ comp is defined as x, y.(min(x) (x = y x < y)) ϕ comp y 0, y 1,..., y k (ϕ states ϕ rest ), where each variable y i corresponds to the state q i of M (we assume the TM has k + 1 states), and ϕ states 0 i<j k y i y j. Intuitively, using the y 0, y 1,..., y k prefix and ϕ states we associate to each state of M a distinct domain element. The formula ϕ rest is the conjunction of several formulas defined next (R1-R6) to describe the behaviour of M. (R1) Formula defining the initial configuration of M with... on its input tape. At time instant 0 the tape has in the first cell of the tape: All other cells contain at time 0: p.(min(p) T (p, p)) p, t.((min(t) Min(p)) T (p, t)) The head is initially at the start position 0: t(min(t) H(t, t)) The machine is initially in state q start : t(min(t) S(y start, t)) Pichler 12 April, 2016 Page 13 Pichler 12 April, 2016 Page 14 (R2) Formulas stating that in every configuration, each cell of the tape contains exactly one symbol: p, t.(t 0 (p, t) T 1 (p, t) T (p, t) T (p, t)), (R5) Formulas describing the transitions. In particular, for each tuple (q 1, σ 1, q 2, σ 2, D) such that δ(q 1, σ 1 ) = (q 2, σ 2, D), we have the formula: ( p, t (H(p, t) T σ1 (p, t) S(y 1, t)) p, t. ( FollowTo(p, p ) Succ(t, t ) p, t.( T σ1 (p, t) T σ2 (p, t)), for all σ 1 σ 2 Σ (R3) A formula stating that at any time the machine is in exactly one state: t.(( S(y i, t)) (S(y i, t) S(y j, t))) 0 i k 0 i<j k (R4) A formula stating that at any time the head is at exactly one position: t. ( [ p.(h(p, t)] p, p.[h(p, t) H(p, t) p = p ] ) where: H(p, t ) S(y 2, t ) T σ2 (p, t ) r.(r p T 0 (r, t) T 0 (r, t )) r.(r p T 1 (r, t) T 1 (r, t )) r.(r p T (r, t) T (r, t )) r.(r p T (r, t) T (r, t )) )) Succ(p, p ) if D = +1, FollowTo(p, p ) Succ(p, p) if D = 1, p = p if D = 0. Pichler 12 April, 2016 Page 15 Pichler 12 April, 2016 Page 16

Further Consequences of Trakhtenbrot s Theorem (R6) A formula ϕ halt saying that M halts on input I : t.(s(y halt, t) S(y yes, t) S(y no, t)). This completes the description of the formula ϕ M, which faithfully describes the computation of M on the empty word ɛ. By construction of ϕ M, we have: ϕ M has a finite model iff M halts on ɛ The following problems can now be easily shown undecidable: checking whether an FO query is domain independent, checking query containment of two FO (or RA) queries; recall that this means: A : Q 1 (A) Q 2 (A); checking equivalence of two FO (or RA) queries. This completes the reduction from HALTING-ɛ and proves Trakhtenbrot s Theorem. Pichler 12 April, 2016 Page 17 Pichler 12 April, 2016 Page 18 Proof Sketches Undecidability of Domain Independence By reduction from finite satisfiability: Let ϕ be an arbitrary instance of finite satisfiability. Construct the following instance ψ of Domain Dependence: w.l.o.g. let x be a variable not occurring in ϕ; then we set ψ = R(x) ϕ. Undecidability of Query Containment and Query Equivalence By reduction from finite unsatisfiability: Let ϕ be an arbitrary instance of finite unsatisfiability; w.l.o.g., suppose that ϕ has no free variables (i.e., simply add existential quantifiers). Let χ be a trivially unsatisfiable query, e.g., χ = ( x)(r(x) R(x)). Define the instance (ϕ, χ) of Query Containment or Query Equivalence. Finite vs. Infinite Domain Motivation Recall the following property of the formula ϕ M in the proof of Trakhtenbrot s Theorem: ϕ M has a finite model iff M halts on ɛ. Question. What about arbitrary models (with possibly infinite domain)? It turns out that the ( direction of the) equivalence ϕ M has an arbitrary model iff M halts on ɛ does not hold. Indeed, suppose that M does not terminate on input ɛ. Then ϕ M has the following (infinite) model: Choose as domain D the natural numbers {0, 1,..., } plus some additional element a. Choose the ordering such that a is greater than all natural numbers. By assumption, M runs forever and we set S(, n), T σi (n, m), and H(n, m) according to the intended meaning of these predicates. Moreover, we set S(y halt, a) to true. This is consistent with the rest since, intuitively, time instant a is never reached. Pichler 12 April, 2016 Page 19 Pichler 12 April, 2016 Page 20

Finite vs. Infinite Domain (2) Question. How should we modify the problem reduction to prove undecidability of the Entscheidungsproblem (i.e. validity or, equivalently, unsatisfiability of FO without the restriction to finite models)? Undecidability of the Entscheidungsproblem We modify the problem reduction as follows: Transform the formula ϕ M into ϕ M as follows: we replace the subformula ϕ halt in ϕ M by ϕ halt. Then we have: ϕ M has no model at all iff M halts on ɛ. In other words, we have reduced HALTING-ɛ to Unsatisfiability. Question. Does this reduction also work for finite unsatisfiability? The answer is no, because of the the direction. Indeed, suppose that M does not terminate on input ɛ. Then, by the above equivalence, ϕ M has a model but no finite model! Intuitively, since M does not halt, any model refers to infinitely many time instants. Semi-Decidability By the Completeness Theorem, we know that Validity or, equivalently, Unsatisfiability of FO is semi-decidable. Question. What about finite validity or finite unsatisfiability? (i.e., is an FO formula true in every resp. no interpretation with finite domain.) Observation We have proved Trakhtenbrot s Theorem by reduction of the HALTING-ɛ problem to the finite satisfiability problem. This reduction can of course also be seen as a reduction from co-halting-ɛ to finite unsatisfiability. We know that the co-problem of HALTING is not semi-decidable. Hence, co-halting-ɛ is not semi-decidability either. Therefore, finite unsatisfiability is not semi-decidable. Pichler 12 April, 2016 Page 21 Pichler 12 April, 2016 Page 22 Semi-Decidability (2) Learning Objectives Recall that satisfiability of FO is not semi-decidable. In contrast, we now show that finite satisfiability is semi-decidable. Proof idea The evaluation of an FO formula in an interpretation is defined by a recursive algorithm. This algorithm terminates over finite domains. Hence, it is decidable if a given formula ϕ is satisfied by a finite interpretation I. Hence, for finite signatures, the problem whether an FO formula has a model with a given finite cardinality is decidable. Therefore, for finite signatures, finite satisfiability of FO is semi-decidable. Short recapitulation of Turing machines, undecidability (the HALTING problem). Formulation of Trakhtenbrot s Theorem in terms of FO logic and databases. Proof of Trakhtenbrot s Theorem. Further undecidability results. Differences between finite and infinite domain. Pichler 12 April, 2016 Page 23 Pichler 12 April, 2016 Page 24