The Satisfiability Problem. Cook s Theorem: An NP-Complete Problem Restricted SAT: CSAT, 3SAT

Similar documents
The Classes P and NP

NP-Completeness and Cook s Theorem

Introduction to Logic in Computer Science: Autumn 2006

Lecture 2: Universality

OHJ-2306 Introduction to Theoretical Computer Science, Fall

1. Nondeterministically guess a solution (called a certificate) 2. Check whether the solution solves the problem (called verification)

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

CMPSCI611: Approximating MAX-CUT Lecture 20

Theoretical Computer Science (Bridging Course) Complexity

Boolean Algebra Part 1

Page 1. CSCE 310J Data Structures & Algorithms. CSCE 310J Data Structures & Algorithms. P, NP, and NP-Complete. Polynomial-Time Algorithms

Notes on Complexity Theory Last updated: August, Lecture 1

Introduction to NP-Completeness Written and copyright c by Jie Wang 1

Notes on NP Completeness

Chapter 1. NP Completeness I Introduction. By Sariel Har-Peled, December 30, Version: 1.05

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

CSE 135: Introduction to Theory of Computation Decidability and Recognizability

Lecture 19: Introduction to NP-Completeness Steven Skiena. Department of Computer Science State University of New York Stony Brook, NY

Reading 13 : Finite State Automata and Regular Expressions

Notes on Complexity Theory Last updated: August, Lecture 1

Welcome to... Problem Analysis and Complexity Theory , 3 VU

Mathematical Induction. Lecture 10-11

Lecture 7: NP-Complete Problems

CS510 Software Engineering

Resolution. Informatics 1 School of Informatics, University of Edinburgh

The Halting Problem is Undecidable

CoNP and Function Problems

WRITING PROOFS. Christopher Heil Georgia Institute of Technology

Diagonalization. Ahto Buldas. Lecture 3 of Complexity Theory October 8, Slides based on S.Aurora, B.Barak. Complexity Theory: A Modern Approach.

Mathematical Induction

Regular Expressions and Automata using Haskell

6.080/6.089 GITCS Feb 12, Lecture 3

Lecture 5 - CPA security, Pseudorandom functions

Quotient Rings and Field Extensions

Tutorial 8. NP-Complete Problems

CS154. Turing Machines. Turing Machine. Turing Machines versus DFAs FINITE STATE CONTROL AI N P U T INFINITE TAPE. read write move.

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Theory of Computation Chapter 2: Turing Machines

Logic in Computer Science: Logic Gates

Data Structures Fibonacci Heaps, Amortized Analysis

Introduction to Algorithms. Part 3: P, NP Hard Problems

Logic in general. Inference rules and theorem proving

Near Optimal Solutions

Discrete Optimization

(IALC, Chapters 8 and 9) Introduction to Turing s life, Turing machines, universal machines, unsolvable problems.

Universality in the theory of algorithms and computer science

Guessing Game: NP-Complete?

C H A P T E R Regular Expressions regular expression

1 Definition of a Turing machine

The P versus NP Solution

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

CAs and Turing Machines. The Basis for Universal Computation

Reminder: Complexity (1) Parallel Complexity Theory. Reminder: Complexity (2) Complexity-new

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Ian Stewart on Minesweeper

Basic Proof Techniques

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

Full and Complete Binary Trees

Proseminar on Semantic Theory Fall 2013 Ling 720. Problem Set on the Formal Preliminaries : Answers and Notes

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Generating models of a matched formula with a polynomial delay

CMPSCI 250: Introduction to Computation. Lecture #19: Regular Expressions and Their Languages David Mix Barrington 11 April 2013

CSC 373: Algorithm Design and Analysis Lecture 16

3515ICT Theory of Computation Turing Machines

Computational Models Lecture 8, Spring 2009

Vieta s Formulas and the Identity Theorem

One last point: we started off this book by introducing another famously hard search problem:

6.080 / Great Ideas in Theoretical Computer Science Spring 2008

Chapter Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling

8 Divisibility and prime numbers

Computability Theory

SECTION 10-2 Mathematical Induction

Lecture 3: Finding integer solutions to systems of linear equations

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

INCIDENCE-BETWEENNESS GEOMETRY

COMPUTER SCIENCE TRIPOS

Chapter. NP-Completeness. Contents

Classification - Examples

Automata and Computability. Solutions to Exercises

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

CSEE 3827: Fundamentals of Computer Systems. Standard Forms and Simplification with Karnaugh Maps

A Working Knowledge of Computational Complexity for an Optimizer

Lecture 1: Oracle Turing Machines

6.3 Conditional Probability and Independence

1 if 1 x 0 1 if 0 x 1

Discuss the size of the instance for the minimum spanning tree problem.

6.2 Permutations continued

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

Dynamic Programming. Lecture Overview Introduction

Handout #1: Mathematical Reasoning

3. Mathematical Induction

LEARNING OBJECTIVES FOR THIS CHAPTER

SIMPLIFYING ALGEBRAIC FRACTIONS

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

Reductions & NP-completeness as part of Foundations of Computer Science undergraduate course

Handout #Ch7 San Skulrattanakulchai Gustavus Adolphus College Dec 6, Chapter 7: Digraphs

The Chinese Remainder Theorem

Transcription:

The Satisfiability Problem Cook s Theorem: An NP-Complete Problem Restricted SAT: CSAT, 3SAT 1

Boolean Expressions Boolean, or propositional-logic expressions are built from variables and constants using the operators AND, OR, and NOT. Constants are true and false, represented by 1 and 0, respectively. We ll use concatenation (juxtaposition) for AND, + for OR, - for NOT, unlike the text. 2

Example: Boolean expression (x+y)(-x + -y) is true only when variables x and y have opposite truth values. Note: parentheses can be used at will, and are needed to modify the precedence order NOT (highest), AND, OR. 3

The Satisfiability Problem (SAT ) Study of boolean functions generally is concerned with the set of truth assignments (assignments of 0 or 1 to each of the variables) that make the function true. NP-completeness needs only a simpler question (SAT): does there exist a truth assignment making the function true? 4

Example: SAT (x+y)(-x + -y) is satisfiable. There are, in fact, two satisfying truth assignments: 1. x=0; y=1. 2. x=1; y=0. x(-x) is not satisfiable. 5

SAT as a Language/Problem An instance of SAT is a boolean function. Must be coded in a finite alphabet. Use special symbols (, ), +, - as themselves. Represent the i-th variable by symbol x followed by integer i in binary. 6

Example: Encoding for SAT (x+y)(-x + -y) would be encoded by the string (x1+x10)(-x1+-x10) 7

SAT is in NP There is a multitape NTM that can decide if a Boolean formula of length n is satisfiable. The NTM takes O(n 2 ) time along any path. Use nondeterminism to guess a truth assignment on a second tape. Replace all variables by guessed truth values. Evaluate the formula for this assignment. Accept if true. 8

Cook s Theorem SAT is NP-complete. Really a stronger result: formulas may be in conjunctive normal form (CSAT) later. To prove, we must show how to construct a polytime reduction from each language L in NP to SAT. Start by assuming the most resticted possible form of NTM for L (next slide). 9

Assumptions About NTM for L 1. One tape only. 2. Head never moves left of the initial position. 3. States and tape symbols are disjoint. Key Points: States can be named arbitrarily, and the constructions many-tapes-to-one and two-wayinfinite-tape-to-one at most square the time. 10

More About the NTM M for L Let p(n) be a polynomial time bound for M. Let w be an input of length n to M. If M accepts w, it does so through a sequence I 0 I 1 I p(n) of p(n)+1 ID s. Assume trivial move from a final state. Each ID is of length at most p(n)+1, counting the state. 11

From ID Sequences to Boolean Functions The Boolean function that the transducer for L will construct from w will have (p(n)+1) 2 variables. Let variable X ij represent the j-th position of the i-th ID in the accepting sequence for w, if there is one. i and j each range from 0 to p(n). 12

Picture of Computation as an Array Initial ID I 1 X 00 X 01 X 0p(n) X 10 X 11 X 1p(n)...... I p(n) X p(n)0 X p(n)1 X p(n)p(n) 13

Intuition From M and w we construct a boolean formula that forces the X s to represent one of the possible ID sequences of NTM M with input w, if it is to be satisfiable. It is satisfiable iff some sequence leads to acceptance. 14

From ID s to Boolean Variables The X ij s are not boolean variables; they are states and tape symbols of M. However, we can represent the value of each X ij by a family of Boolean variables y ija, for each possible state or tape symbol A. y ija is true if and only if X ij = A. 15

Points to Remember 1. The boolean function has components that depend on n. These must be of size polynomial in n. 2. Other pieces depend only on M. No matter how many states/symbols m has, these are of constant size. 3. Any logical formula about a set of variables whose size is independent of n can be written somehow. 16

Designing the Function We want the Boolean function that describes the X ij s to be satisfiable if and only if the NTM M accepts w. Four conditions: 1. Unique: only one symbol per position. 2. Starts right: initial ID is q 0 w. 3. Moves right: each ID follows from the next by a move of M. 4. Finishes right: M accepts. 17

Unique Take the AND over all i, j, Y, and Z of (-y ijy + -y ijz ). That is, it is not possible for X ij to be both symbols Y and Z. 18

Starts Right The Boolean Function needs to assert that the first ID is the correct one with w = a 1 a n as input. 1. X 00 = q 0. 2. X 0i = a i for i = 1,, n. 3. X 0i = B (blank) for i = n+1,, p(n). Formula is the AND of y 0iZ for all i, where Z is the symbol in position i. 19

Finishes Right Somewhere, there must be an accepting state. Form the OR of Boolean variables y ijq, where i and j are arbitrary and q is an accepting state. Note: differs from text. 20

Running Time So Far Unique requires O(p 2 (n)) symbols be written. Parentheses, signs, propositional variables. Algorithm is easy, so it takes no more time than O(p 2 (n)). Starts Right takes O(p(n)) time. Finishes Right takes O(p 2 (n)) time. Variation over symbols Y and Z is independent of n, so covered by constant. Variation over i and j 21

Running Time (2) Caveat: Technically, the propositions that are output of the transducer must be coded in a fixed alphabet, e.g., x10011 rather than y ija. Thus, the time and output length have an additional factor O(log n) because there are O(p 2 (n)) variables. But log factors do not affect polynomials 22

Moves Right Works because Unique assures only one y ijx true. X ij = X i-1,j whenever the state is none of X i-1,j-1, X i-1,j, or X i-1,j+1. For each i and j, construct a formula that says (in propositional variables) the OR of X ij =X i-1,j and all y i-1,k,a where A is a state symbol (k = i-1, i, or i+1). Note: X ij =X i-1,j is the OR of y ija.y i-1,ja for all symbols A. 23

Constraining the Next Symbol A B C B A q C??? Easy case; must be B Hard case; all three may depend on the move of M 24

Moves Right (2) In the case where the state is nearby, we need to write an expression that: 1. Picks one of the possible moves of the NTM M. 2. Enforces the condition that when X i-1,j is the state, the values of X i,j-1, X i,j, and X i,j+1. are related to X i-1,j-1,x i-1,j, and X i-1,j+1 in a way that reflects the move. 25

Example: Moves Right Suppose δ(q, A) contains (p, B, L). Then one option for any i, j, and C is: C q A p C B If δ(q, A) contains (p, B, R), then an option for any i, j, and C is: C q A C B p 26

Moves Right (3) For each possible move, express the constraints on the six X s by a Boolean formula. For each i and j, take the OR over all possible moves. Take the AND over all i and j. Small point: for edges (e.g., state at 0), assume invisible symbols are blank. 27

Running Time We have to generate O(p 2 (n)) Boolean formulas, but each is constructed from the moves of the NTM M, which is fixed in size, independent of the input w. Takes time O(p 2 (n)) and generates an output of that length. Times log n, because variables must be coded in a fixed alphabet. 28

Cook s Theorem Finale In time O(p 2 (n) log n) the transducer produces a boolean formula, the AND of the four components: Unique, Starts, Finishes, and Moves Right. If M accepts w, the ID sequence gives us a satisfying truth assignment. If satisfiable, the truth values tell us an accepting computation of M. 29

Picture So Far We have one NP-complete problem: SAT. In the future, we shall do polytime reductions of SAT to other problems, thereby showing them NP-complete. Why? If we polytime reduce SAT to X, and X is in P, then so is SAT, and therefore so is all of NP. 30

Conjunctive Normal Form A Boolean formula is in Conjunctive Normal Form (CNF) if it is the AND of clauses. Each clause is the OR of literals. A literal is either a variable or the negation of a variable. Problem CSAT : is a Boolean formula in CNF satisfiable? 31

Example: CNF (x + -y + z)(-x)(-w + -x + y + z) ( 32

NP-Completeness of CSAT The proof of Cook s theorem can be modified to produce a formula in CNF. Unique is already the AND of clauses. Starts Right is the AND of clauses, each with one variable. Finishes Right is the OR of variables, i.e., a single clause. 33

NP-Completeness of CSAT (2) Only Moves Right is a problem, and not much of a problem. It is the product of formulas for each i and j. Those formulas are fixed, independent of n. 34

NP-Completeness of CSAT (3) You can convert any formula to CNF. It may exponentiate the size of the formula and therefore take time to write down that is exponential in the size of the original formula, but these numbers are all fixed for a given NTM M and independent of n. 35

k-sat If a boolean formula is in CNF and every clause consists of exactly k literals, we say the boolean formula is an instance of k-sat. Say the formula is in k-cnf. Example: 3-SAT formula (x + y + z)(x + -y + z)(x + y + -z)(x + -y + -z) 36

k-sat Facts Every boolean formula has an equivalent CNF formula. But the size of the CNF formula may be exponential in the size of the original. Not every boolean formula has a k-sat equivalent. 2SAT is in P; 3SAT is NP-complete. 37

Proof: 2SAT is in P (Sketch) Pick an assignment for some variable, say x = true. Any clause with x forces the other literal to be true. Example: (-x + -y) forces y to be false. Keep seeing what other truth values are forced by variables with known truth values. 38

Proof (2) One of three things can happen: 1. You reach a contradiction (e.g., z is forced to be both true and false). 2. You reach a point where no more variables have their truth value forced, but some clauses are not yet made true. 3. You reach a satisfying truth assignment. 39

Proof (3) Case 1: (Contradiction) There can only be a satisfying assignment if you use the other truth value for x. Simplify the formula by replacing x by this truth value and repeat the process. Case 3: You found a satisfying assignment, so answer yes. 40

Proof (4) Case 2: (You force values for some variables, but other variables and clauses are not affected). Adopt these truth values, eliminate the clauses that they satisfy, and repeat. In Cases 1 and 2 you have spent O(n 2 ) time and have reduced the length of the formula by > 1, so O(n 3 ) total. 41

3SAT This problem is NP-complete. Clearly it is in NP, since SAT is. It is not true that every Boolean formula can be converted to an equivalent 3-CNF formula, even if we exponentiate the size of the formula. 42

3SAT (2) But we don t need equivalence. We need to reduce every CNF formula F to some 3-CNF formula that is satisfiable if and only if F is. Reduction involves introducing new variables into long clauses, so we can split them apart. 43

Reduction of CSAT to 3SAT Let (x 1 + +x n ) be a clause in some CSAT instance, with n > 4. Note: the x s are literals, not variables; any of them could be negated variables. Introduce new variables y 1,,y n-3 that appear in no other clause. 44

CSAT to 3SAT (2) Replace (x 1 + +x n ) by (x 1 +x 2 +y 1 )(x 3 +y 2 + -y 1 ) (x i +y i-1 + -y i-2 ) (x n-2 +y n-3 + -y n-4 )(x n-1 +x n + -y n-3 ) If there is a satisfying assignment of the x s for the CSAT instance, then one of the literals x i must be made true. Assign y j = true if j < i-1 and y j = false for larger j. 45

CSAT to 3SAT (3) We are not done. We also need to show that if the resulting 3SAT instance is satisfiable, then the original CSAT instance was satisfiable. 46

CSAT to 3SAT (4) Suppose (x 1 +x 2 +y 1 )(x 3 +y 2 + -y 1 ) (x n-2 +y n-3 + -y n-4 )(x n-1 +x n + -y n-3 ) is satisfiable, but none of the x s is true. The first clause forces y 1 = true. Then the second clause forces y 2 = true. And so on all the y s must be true. But then the last clause is false. 47

CSAT to 3SAT (5) There is a little more to the reduction, for handling clauses of 1 or 2 literals. Replace (x) by (x+y 1 +y 2 ) (x+y 1 + -y 2 ) (x+ -y 1 +y 2 ) (x+ -y 1 + -y 2 ). Replace (w+x) by (w+x+y)(w+x+ -y). Remember: the y s are different variables for each CNF clause. 48

CSAT to 3SAT Running Time This reduction is surely polynomial. In fact it is linear in the length of the CSAT instance. Thus, we have polytime-reduced CSAT to 3SAT. Since CSAT is NP-complete, so is 3SAT. 49