Recap Decision Problems: Problems that take a yes/no answer Examples: Language Recognition: Is a given string a member of language L? Graph properties: Does a given graph have a Hamiltonian cycle? Language equality: Do two given NFAs recognize the same language? Questions about Turing Machines: Does the given TM halt when executed on the given string?
Recap: A decision problem is decidable (solvable, recursive) if there exists a TM that a) halts on every input, and b) always gives the right answer. A decision problem is partially decidable (partially solvable, recursively enumerable) if there exists a TM that a) halts and accepts on every YES instance* (it may loop on NO instances), and b) always gives the right answer, i.e., does not accept any NO instances. *Note: a "YES instance" ("NO instance") is an instance of a problem for which the correct answer is "YES" ("NO")
Language Recognition as paradigm Any decision problem can be cast as a language recognition problem: 1. Define a way to encode problem instances as strings over some finite alphabet. 2. Define the language L to be the set of strings that encode instances that have a yes answer. Thus a method of recognizing L can be used to solve instances of the problem: given an instance i, encode it as a string s; give s to the recognizer; answer yes if the recognizer says s ∈ L, else no. The point: language recognition gives us a way to talk about the hardness of all kinds of problems
Example: Traveling Salesman
The problem: Given N cities, a maximum cost B, and the one-way airfare between each pair of cities, is there a tour that goes to each city exactly once and costs at most B?
Instance: N, B, list of N^2 costs (N of them = 0)
N = 5, B = 1800

      DEN  ORD  SFO  ATL  MSP
DEN     0  267  129  558  386
ORD   267    0  678  777  357
SFO   129  678    0  598  462
ATL   558  777  598    0  433
MSP   386  357  462  433    0
Example: Traveling Salesman Instance encoded as a string over {0,1,#}, with fields in the order N, B, then the N^2 costs row by row (DEN-DEN, DEN-ORD, ...), each written in binary: 101#11100001000#0#100001011#10000001#1000101110#110000010#100001011#0#... Language = all strings corresponding to instances having tours with total cost ≤ B (same instance as before: N=5, B=1800)
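The encoding step can be sketched in a few lines of Python. This is only an illustration under assumptions: the field order (N, then B, then the costs row by row) is read off the example string, and the function names `encode_instance` and `decode_instance` are hypothetical.

```python
# Cost matrix from the example, in the order DEN, ORD, SFO, ATL, MSP.
COSTS = [
    [0, 267, 129, 558, 386],
    [267, 0, 678, 777, 357],
    [129, 678, 0, 598, 462],
    [558, 777, 598, 0, 433],
    [386, 357, 462, 433, 0],
]

def encode_instance(n, bound, costs):
    """Encode (N, B, costs) as binary fields separated by '#'."""
    fields = [n, bound] + [c for row in costs for c in row]
    return "#".join(format(value, "b") for value in fields)

def decode_instance(s):
    """Invert encode_instance: recover N, B, and the N x N cost matrix."""
    values = [int(field, 2) for field in s.split("#")]
    n, bound, flat = values[0], values[1], values[2:]
    costs = [flat[i * n:(i + 1) * n] for i in range(n)]
    return n, bound, costs

encoded = encode_instance(5, 1800, COSTS)
print(encoded[:60])
```

The encoded length grows only polynomially with the instance (N^2 fields, each logarithmic in its value), which is what "efficiently encoded" asks for.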
Classes of Languages We have discussed classes of languages that require machines with increasingly sophisticated capabilities. Each successive class contains all the languages in the earlier classes. Let's review them in order: simplest to most complex (also "smallest" to "largest")
Regular Languages Example: strings with no a after a b. Defined by: regular expressions, e.g., a*b*; regular grammars, e.g., S → aS | B | Λ; B → bB | Λ. Accepted by: Deterministic Finite Automata (DFAs), Nondeterministic Finite Automata (NFAs)
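As a concrete illustration, a DFA for a*b* can be simulated with a transition table. The state names below are made up for this sketch, and the input is assumed to contain only the symbols a and b.

```python
# States: "A" = still reading a's, "B" = reading b's, "dead" = saw an a after a b.
DELTA = {
    ("A", "a"): "A",
    ("A", "b"): "B",
    ("B", "b"): "B",
    ("B", "a"): "dead",
    ("dead", "a"): "dead",
    ("dead", "b"): "dead",
}
ACCEPTING = {"A", "B"}

def dfa_accepts(s, start="A"):
    """Run the DFA on s (a string over {a, b}); one table lookup per symbol."""
    state = start
    for ch in s:
        state = DELTA[(state, ch)]
    return state in ACCEPTING

print(dfa_accepts("aabbb"), dfa_accepts("aba"))
```

The machine makes exactly one move per input symbol and needs no memory beyond its current state.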
Deterministic Context-Free Languages Example: {a^n b^n | n ∈ N}. Defined by: LR(k) grammars (we didn't discuss these), e.g., S → aSb | Λ. Accepted by: Deterministic Pushdown Automata (DPDAs)
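A rough sketch of how a deterministic pushdown recognizer for {a^n b^n} behaves: push one stack symbol per a, pop one per b, and accept if the stack is empty at the end. The function name is hypothetical, and the sketch assumes N includes 0, so the empty string is accepted.

```python
def accepts_anbn(s):
    """Stack-based check for a^n b^n (n >= 0), reading s left to right once."""
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:          # an a after a b can never be matched
                return False
            stack.append("a")   # remember one unmatched a
        elif ch == "b":
            seen_b = True
            if not stack:       # more b's than a's
                return False
            stack.pop()         # match this b against an earlier a
        else:
            return False        # symbol outside the alphabet {a, b}
    return not stack            # every a must have been matched

print(accepts_anbn("aabb"), accepts_anbn("aab"))
```

Unlike the finite automaton case, the amount of memory used here grows with the input, which is why no DFA recognizes this language.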
Context-Free Languages Example: palindromes over {a,b}. Defined by: Context-Free Grammars, e.g., S → aSa | bSb | a | b | Λ. Accepted by: [Nondeterministic] Pushdown Automata (PDAs)
Context-Sensitive Languages Example: {a^n b^n c^n | n ≥ 1}. Defined by: grammars with productions of the form v → w where 1 ≤ |v| ≤ |w| (note: such languages cannot contain Λ). Example grammar: S → abc | aAbc; A → abC | aAbC; Cb → bC; Cc → cc. Accepted by: Linear-Bounded Automata (LBAs), i.e., nondeterministic TMs that use no more than n+2 cells of the tape, where n = size of the input string (see Example 13.2 in text)
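Because no production shortens the string, membership can be tested by searching only sentential forms no longer than the target. The sketch below assumes the example grammar reads S → abc | aAbc, A → abC | aAbC, Cb → bC, Cc → cc (uppercase letters are nonterminals); the function name `derives` is hypothetical.

```python
from collections import deque

# Assumed productions of the example grammar, as (left side, right side) pairs.
RULES = [
    ("S", "abc"), ("S", "aAbc"),
    ("A", "abC"), ("A", "aAbC"),
    ("Cb", "bC"), ("Cc", "cc"),
]

def derives(target, start="S"):
    """Breadth-first search over sentential forms of length <= len(target)."""
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in RULES:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                # Productions never shrink a form, so longer forms are useless.
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False

print(derives("aabbcc"), derives("aabbc"))
```

The length bound is what guarantees termination, and it mirrors the linear space bound of an LBA.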
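Recursively Enumerable Languages Example: {a^n | M_n(n) halts}, where M_n denotes the n-th TM in a fixed enumeration. Defined by: Unrestricted Grammars. Accepted by: Turing Machines that may fail to halt on strings outside the language; no always-halting TM need exist (the example is not decidable). Can be enumerated by some TM
Arbitrary Languages Example: {a^n | M_n halts on every input}. Not defined by any grammar (non-grammatical). Not recursively enumerable: no TM can enumerate the language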
COMPLEXITY CLASSES
The Concept of Complexity Needed: a way to talk about how hard a (solvable!) problem is. How to quantify hardness? Focus on decidable problems. Solution = Algorithm: a Turing Machine that always halts. Intuition: the harder a problem, the more costly it is to solve. Time cost: number of steps required. Space cost: maximum number of tape cells used. Intuition: cost should be measured relative to the size of the instance; larger instances should cost more before we consider them hard. Example: we expect a TSP instance with 100 cities to be costlier than one with 5 cities, but it's the same problem!
Deterministic Time Complexity A deterministic Turing Machine's time complexity is the worst-case number of instructions it executes before halting, expressed as a function of input size. Note: input is assumed to be efficiently encoded. "TM M has time complexity f(n)" means f(n) = max over all inputs s of size n of (number of steps M takes with s as input). Example: Add-1 TM has time complexity 2n+3
Nondeterministic Time Complexity A nondeterministic Turing Machine's time complexity is f(n) if f(n) is the maximum number of instructions the TM executes in any of its possible computations on any input of length n.
Deterministic Space Complexity A deterministic Turing Machine's space complexity is the worst-case number of tape cells it visits, expressed as a function of input size. "TM M has space complexity f(n)" means f(n) = max over all inputs s of size n of (number of tape cells M visits with s as input). Example: Add-1 TM has space complexity n+2
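To make the worst-case definitions concrete, here is a small simulator that counts steps and visited cells for a hypothetical binary increment machine. This machine is an assumption made for illustration and is not necessarily the Add-1 TM from the lecture, so its exact step count can differ from the figure quoted for that machine, although both are linear in n.

```python
from itertools import product

BLANK = "_"

# Hypothetical binary increment machine:
# (state, symbol read) -> (symbol to write, head move, next state)
ADD1 = {
    ("right", "0"): ("0", +1, "right"),
    ("right", "1"): ("1", +1, "right"),
    ("right", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", BLANK): ("1", 0, "halt"),
}

def run(delta, tape_input, start="right", halt="halt"):
    """Simulate the machine; return (output, steps taken, cells visited)."""
    tape = dict(enumerate(tape_input))
    head, state, steps = 0, start, 0
    visited = {0}
    while state != halt:
        write, move, state = delta[(state, tape.get(head, BLANK))]
        tape[head] = write
        head += move
        visited.add(head)
        steps += 1
    cells = range(min(tape), max(tape) + 1)
    output = "".join(tape.get(i, BLANK) for i in cells).strip(BLANK)
    return output, steps, len(visited)

def worst_case(n):
    """Maximum steps and maximum visited cells over all inputs of size n."""
    runs = [run(ADD1, "".join(bits)) for bits in product("01", repeat=n)]
    return max(r[1] for r in runs), max(r[2] for r in runs)

print(run(ADD1, "111"), worst_case(3))
```

For this particular machine the worst case is the all-ones input, which forces a carry across the whole string. Note that the number of visited cells never exceeds the number of steps plus one, since each step moves the head by at most one cell.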
Nondeterministic Space Complexity The space complexity of a nondeterministic Turing Machine, each of whose possible computations halts, is the maximum number of tape cells it scans on any computation for an input of a given length
Which is greater: Time or Space? For every TM: Time complexity ≥ Space complexity... because it takes at least one step to write one cell on the tape
Expressing Complexity Typically, we use big-oh notation for complexity. It ignores constant factors and lower-order terms: we say "M's time complexity is O(n)" instead of "M's time complexity is 2n+1". Recall from elsewhere (CS 215): f(x) = O(g(x)) means ∃x_0 ∃c ∀x (x > x_0 → f(x) < c·g(x)) [Figure: plot of f(x) lying below c·g(x)]
Expressing Complexity Big-Oh notation hides constant factors. In discussing hardness we typically use even coarser measures: Linear cost: O(n). Polynomial cost: O(p(n)) where p(n) is some polynomial (we don't care which). Exponential cost: O(C^n) for some constant C (we don't care what). Example: Add-1 TM has linear cost
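For the 2n+1 example, one possible pair of witnesses is c = 3 and x_0 = 1, since 2n+1 < 3n exactly when n > 1. The helper below (a hypothetical name) spot-checks a candidate pair over a finite range; it is a sanity check, not a proof.

```python
def witnesses_big_oh(f, g, c, x0, upto=10_000):
    """Check f(x) < c * g(x) for every integer x with x0 < x <= upto."""
    return all(f(x) < c * g(x) for x in range(x0 + 1, upto + 1))

def steps(n):
    return 2 * n + 1   # the cost function from the example

def linear(n):
    return n           # the bounding function g(n) = n

print(witnesses_big_oh(steps, linear, c=3, x0=1))
print(witnesses_big_oh(steps, linear, c=2, x0=1))
```

The second call fails because 2n+1 is never below 2n, which shows that the constant c has to absorb the lower-order term.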
Polynomial Cost and Tractability A problem is considered tractable if there exists a deterministic TM that solves the problem (i.e., always gives the correct answer and always halts) and has polynomial time and space costs. Note: there might be lots of TMs that solve the same problem at higher time or space cost; the hardness of the problem is determined by the least-cost TM that solves it
Complexity Classes 3 important classes of problems: P: problems for which a deterministic polynomial-time solution exists NP: problems for which a nondeterministic polynomial-time solution exists PSPACE: problems for which a deterministic polynomial-space solution exists These classes define the boundaries of what is considered to be tractable
The Class P Problems in P can be solved in polynomial time (therefore polynomial space) by a deterministic TM. Examples: Finding a name in a list: O(n). Sorting a list of names: O(n log n). Finding all the shortest paths in a graph: O(n^3). Recognizing a regular language: O(n)
The Class NP Problems in NP can be solved in polynomial time (and space) by a nondeterministic TM. NP: "Nondeterministic Polynomial-time". Two ways to understand this: 1. Think of the NTM as executing all of its possible computations in parallel: accept as soon as any of them accepts; reject when all of them reject. 2. Think of the NTM as magically "guessing" a computation that will correctly accept, and carrying it out
The Guess-and-Check Model Another way of thinking about it (per Hein): Guess a solution S (in one step), then run a deterministic algorithm to check (in polynomial time) whether S actually is a solution. Example: TSP. Guess a tour (say, ORD-DEN-MSP-SFO-ATL), then check whether its cost exceeds the bound by adding up the costs of the legs
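The check step is easy to write down for the five-city instance from the TSP example. This sketch assumes a tour returns to its starting city, which the problem statement does not spell out, and the function names are hypothetical. A deterministic program cannot guess, so the last function simply tries every permutation, which takes factorial time.

```python
from itertools import permutations

CITIES = ["DEN", "ORD", "SFO", "ATL", "MSP"]
COST = [
    [0, 267, 129, 558, 386],
    [267, 0, 678, 777, 357],
    [129, 678, 0, 598, 462],
    [558, 777, 598, 0, 433],
    [386, 357, 462, 433, 0],
]
INDEX = {name: i for i, name in enumerate(CITIES)}

def tour_cost(tour):
    """Add up the legs of the tour, including the leg back to the start."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(COST[INDEX[a]][INDEX[b]] for a, b in legs)

def check(tour, bound):
    """Polynomial-time check: each city exactly once and cost within bound."""
    visits_each_once = sorted(tour) == sorted(CITIES)
    return visits_each_once and tour_cost(tour) <= bound

def decide_by_exhaustive_guessing(bound):
    """Deterministic stand-in for guessing: try every ordering of the cities."""
    return any(check(list(p), bound) for p in permutations(CITIES))

guess = ["ORD", "DEN", "MSP", "SFO", "ATL"]
print(tour_cost(guess), check(guess, 1800), decide_by_exhaustive_guessing(1800))
```

Under the round-trip assumption the guessed tour ORD-DEN-MSP-SFO-ATL fails the check for B = 1800, but a different guess, DEN-SFO-ATL-MSP-ORD, passes, so the instance is still a YES instance.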
The Class PSPACE Problems in PSPACE can be solved in polynomial space (but not necessarily polynomial time) by a deterministic TM. PSPACE: "Polynomial space". Example: Quantified Boolean Formulas. QBF: a logical formula of the form Q_1 x_1 Q_2 x_2 ... Q_n x_n E, where each Q_i is either ∀ or ∃, and E is a propositional calculus expression that uses only ¬, ∧, ∨ and parentheses. Problem: given a QBF, is its value true?
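A recursive evaluator shows why polynomial space suffices for QBF: it tries both truth values for each quantified variable in turn, so the recursion depth is n even though the running time can be exponential. The representation below (a list of quantifier/variable pairs plus a Python function for E) is an assumption made for this sketch.

```python
def eval_qbf(prefix, matrix, assignment=None):
    """Evaluate Q_1 x_1 ... Q_n x_n E.

    prefix: list of ("forall" | "exists", variable name) pairs
    matrix: function from a dict of truth values to the value of E
    """
    assignment = {} if assignment is None else assignment
    if not prefix:
        return matrix(assignment)
    (quantifier, var), rest = prefix[0], prefix[1:]
    # Try both truth values for var; only one branch is alive at a time.
    results = (
        eval_qbf(rest, matrix, {**assignment, var: value})
        for value in (False, True)
    )
    return all(results) if quantifier == "forall" else any(results)

def exactly_one(v):
    """E = (x OR y) AND (NOT x OR NOT y)."""
    return (v["x"] or v["y"]) and (not v["x"] or not v["y"])

print(eval_qbf([("forall", "x"), ("exists", "y")], exactly_one))
print(eval_qbf([("exists", "y"), ("forall", "x")], exactly_one))
```

Swapping the quantifier order changes the answer, which is a reminder that the prefix is part of the instance and not just decoration.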
Complexity Class Relationships Clearly P ⊆ NP ⊆ PSPACE. Nobody knows whether the containment is proper. That is, no one has ever been able to show there exists a problem in either NP − P or PSPACE − NP. On the other hand, nobody has ever proved NP = P or NP = PSPACE!