5 Boolean Decision Trees (February 11)



Similar documents
Asymptotic Growth of Functions

Infinite Sequences and Series

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

Soving Recurrence Relations

Lecture 2: Karger s Min Cut Algorithm

Department of Computer Science, University of Otago

Your organization has a Class B IP address of Before you implement subnetting, the Network ID and Host ID are divided as follows:

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Incremental calculation of weighted mean and variance

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Properties of MLE: consistency, asymptotic normality. Fisher information.

CS103X: Discrete Structures Homework 4 Solutions

THE ABRACADABRA PROBLEM

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments

Measures of Spread and Boxplots Discrete Math, Section 9.4

1 Computing the Standard Deviation of Sample Means

Lesson 15 ANOVA (analysis of variance)

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

How To Understand The Theory Of Coectedess

4. Trees. 4.1 Basics. Definition: A graph having no cycles is said to be acyclic. A forest is an acyclic graph.

CS85: You Can t Do That (Lower Bounds in Computer Science) Lecture Notes, Spring Amit Chakrabarti Dartmouth College

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

Convexity, Inequalities, and Norms

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis

Lecture 4: Cheeger s Inequality

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

The Power of Free Branching in a General Model of Backtracking and Dynamic Programming Algorithms


Elementary Theory of Russian Roulette

Concept: Types of algorithms

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 8


Lecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k.

THE HEIGHT OF q-binary SEARCH TREES

Output Analysis (2, Chapters 10 &11 Law)

CHAPTER 3 DIGITAL CODING OF SIGNALS

Factoring x n 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu>

CS100: Introduction to Computer Science

1. MATHEMATICAL INDUCTION

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics

Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling

CHAPTER 3 THE TIME VALUE OF MONEY

2. Degree Sequences. 2.1 Degree Sequences

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

Modified Line Search Method for Global Optimization

SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx

Lecture 5: Span, linear independence, bases, and dimension

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find

3. Greatest Common Divisor - Least Common Multiple

Repeating Decimals are decimal numbers that have number(s) after the decimal point that repeat in a pattern.

2-3 The Remainder and Factor Theorems

Chapter 7 Methods of Finding Estimators

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

Solving Logarithms and Exponential Equations

A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

5: Introduction to Estimation

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

Present Value Factor To bring one dollar in the future back to present, one uses the Present Value Factor (PVF): Concept 9: Present Value

A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length

3 Basic Definitions of Probability Theory

Domain 1: Designing a SQL Server Instance and a Database Solution

FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix

How To Solve The Homewor Problem Beautifully

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

I. Chi-squared Distributions

INFINITE SERIES KEITH CONRAD

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.

A probabilistic proof of a binomial identity

Sequences and Series

Confidence Intervals for One Mean

Section 11.3: The Integral Test

1. C. The formula for the confidence interval for a population mean is: x t, which was

Maximum Likelihood Estimators.

Chapter 5: Inner Product Spaces

Overview of some probability distributions.

Basic Elements of Arithmetic Sequences and Series

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

Hypothesis testing. Null and alternative hypotheses

4.3. The Integral and Comparison Tests

Universal coding for classes of sources

SEQUENCES AND SERIES CHAPTER

Integer Factorization Algorithms

Math 113 HW #11 Solutions

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Chapter 14 Nonparametric Statistics

Betting on Football Pools

Hypergeometric Distributions

Determining the sample size

Math C067 Sampling Distributions

Normal Distribution.

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE The absolute value of the complex number z a bi is

Transcription:

5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected by a edge. How hard is it to decide whether G is coected? Specifically, how may etries i the adjacecy matrix do we have to examie? s usual, we wat the worst-case ruig time of the best possible algorithm, as a fuctio of, the umber of vertices: D() = mi T (, G) G Here I ll use D() istead of T () to emphasize that we are lookig at determiistic algorithms. Clearly D() (, sice the adjacecy matrix is symmetric ad has zeros o the diagoal. Oe way to derive a lower boud for ay decisio problem is to cosider the size of the smallest proof or certificate that verifies the result. To prove that a graph is coected, it is clearly both ecessary ad sufficiet to reveal the edges i a arbitrary spaig tree of G. This tree costitutes a certificate that G is coected, or a 1-certificate for short. Coversely, if we wat to prove that a graph is discoected, we eed to demostrate a cut with o crossig edges, that is, a partitio of the vertices ito two disjoit subsets, such that o edge has oe edpoit i each subset. Such a cut is called a 0-certificate. Clearly, ay algorithm that tests coectedess must check all the edges i some 1-certificate before returig True ad all the edges i a 0-certificate before returig False. Let C 0 () ad C 1 () respectively deote the imum size of a 0- or 1-certificate i a -vertex graph, ad let C() = {C 0 (), C 1 ()}. We immediately have the followig geeral result. Theorem 1. D() C(). I the case of graph coectivity, we have C 0 () = / / = ( ( mod )/4 ad C 1 () = 1, so D() C() = ( ( mod )/4 = Ω( ). I other words, the trivial algorithm check every edge is optimal up to a small costat factor. Surprisigly, this algorithm is actually exactly optimal! Theorem. D() = ( ) Proof: I ll describe a adversary strategy that requires ay algorithm to examie every edge. The adversary maitais two graphs, Y ad M, each with vertices; iitially, Y is empty ad M is a clique. Y ( yes ) cotais the edges that the algorithm has examied ad foud to be preset i the fictioal iput graph. M ( maybe ) cotais ay edge that might be i the fictioal iput graph; i other words, a edge (i, j) is abset from M if the algorithm kows that a ij = 0. Note that Y is a subgraph of M. The adversary uses the followig simple strategy whe the algorithm examies a potetial edge: retur False uless that aswer would force the fictioal iput graph to be discoected. 1

Examie(i, j): if (i, j) Y derisively retur True else if (i, j) M mockigly retur False else if M \ (i, j) is discoected add (i, j) to Y grudgigly retur True else remove (i, j) from M sigh ad retur False We easily observe that with this strategy M is always coected ad Y is always acyclic. Moreover, wheever the adversary adds a edge betwee two compoets of Y, every other edge joiig those two compoets of Y has already bee queried ad removed from M. It follows that Y becomes coected oly after ( queries; at that momet, both Y ad M cosist of the same spaig tree, ad the algorithm ca safely retur True. Before the ( th query, M is coected ad Y is discoected. Sice both graphs are cosistet with the adversary s aswers, either could serve as the fictioal iput graph, which meas the algorithm caot possibly determie the correct output. graph property like coectivity that requires lookig at every possible edge to detect is called evasive. We will retur to evasive graph properties i a future lecture. 5. Strig Properties (Boolea fuctios, laguages, whatever... ) Let s look at these ideas i a little more geerality. strig property 1 is ay fuctio of the form F : {0, 1} {0, 1}. Let T (, s) deote the umber of bits i a -bit iput strig s that a algorithm examies before correctly returig F (s). The determiistic decisio tree complexity of a strig property F is, as usual, the worst-case ruig time of ay algorithm to compute it, as a fuctio of the iput size : D() = mi T (, s) s = The model of computatio is the same boolea decisio tree that we saw i the very first lecture. For each iput size, we ca model ay algorithm by a rooted biary tree, where each iteral ode stores the idex of the ext bit to examie, ad each leaf stores a output value. I this model, T (, s) is the depth of the path traversed by the iput strig s i tree, ad the determiistic complexity of ay strig property is the miimum depth of ay tree that correctly computes it. strig property is evasive if D() =. We ca defie the certificate complexity of a strig property F as follows. Ituitively, a 1- certificate is a subset of the bits i the iput s that forces F (s) = 1, ad a 0-certificate is a subset of bits that forces F (s) = 0. The certificate complexity C(s) of a strig s is the size of the smallest certificate cosistet with s. Fially, the certificate complexity C() of F is the imum certificate complexity of ay -bit iput strig. Equivaletly, we have C() = mi T (, s) s = 1 dmittedly, this is a rather bizarre ame, but it is aalogous to graph properties such as coectedess (which we just saw), acyclicity, plaarity, ad the like. strig property is just a boolea fuctio with argumets, or equivaletly, a set of -bit strigs, or equivaletly, a laguage.

where (as above) the mi is take over all algorithms that correctly compute F for all iputs. Origially, certificate complexity was kow as o-determiistic decisio tree complexity, sice it correspods to a odetermiistic variat of the boolea decisio tree model. The best way to thik of a odetermiistic decisio tree is as a family of determiistic decisio trees, where for each iput, we use the best decisio tree i the set for that iput. We ve see oe example of these defiitios already. For the fuctio F : {0, 1} ( {0, 1} that idicates the coectivity of a -vertex graph, we have D( ( ( ) = ( ad C( ) ) = mod. s aother example, let F : {0, 1} {0, 1} deote the exactly half fuctio: F (s) = 1 if ad oly if s cotais exactly 1s ad exactly 0s. For this fuctio, we easily observe that D() = C() =. 5.3 Blum s Theorem Oe of the most geeral results about decisio tree complexity was proved by Mauel Blum. Theorem 3 (Blum). For ay strig property, C() D() C(). Proof: The first iequality C() D() is almost trivial, sice ay algorithm must examie all the bits i a certificate before returig its output. lterately, the iequality x mi y f(x, y) mi y x f(x, y) is easy to prove for ay fuctio f(x, y). To prove the other iequality, we describe a algorithm that examies C() bits. Let π s deote the smallest certificate for ay iput strig s. Let S 0 = {π s F (s) = 0} ad S 1 = {π s F (s) = 1} be the sets of all miimal 0- ad 1-certificates, respectively. To simplify otatio, let f = C(). Our algorithm works i k phases, examiig at most k bits i each phase. t each phase, we keep oly the certificates that are cosistet with the bits we ve examied so far. Let S0 i deote the subset of S 0 cosistet with the bits see i the first i phases, ad defie S1 i similarly. I particular, we have S0 0 = S 0 ad S1 0 = S 1. However, sice it is poitless to carry aroud ay bit whose value we already kow, the algorithm oly maitais the bits i each certificate that have ot yet bee queried. Thus, after each iput bit x j is queried, ay certificate π that is defied i that bit positio is either removed (if π j x j ) or its legth is decreased by oe (if π j = x j ). I the ith phase, the algorithm simply chooses a arbitrary 0-certificate π S i 1 0 ad queries all its (previously uqueried) bits. If all the queries agree with π, the algorithm correctly returs False; otherwise, it cotiues with the ext phase. Now I claim that each phase reduces the legth of every survivig 1-certificate by at least 1. Let σ be a arbitrary 1-certificate i S i 1 1. Sice every iput strig cotais either a 0-certificate or a 1-certificate, but ot both, there must be at least oe bit positio that appears i both σ ad the chose 0-certificate π. Moreover, this bit positio was ot queried i ay earlier phase, because otherwise, at most oe of them would have survived. If the iput strig agrees with π at that commo bit positio, σ is discarded; otherwise, the legth of σ decreases, as claimed. It follows that after at most k phases, we are left with either o 1-certificates, i which case the output must be False, or a sigle empty 1-certificate (i.e., a 1-certificate that matches the iput i every bit positio), i which case the output must be True. Sice each phase examies at most k bits, the algorithm examies at most k bits altogether. Nodetermiism meas ever havig to admit you re wrog. 3

We have already see examples where the first iequality is tight. The secod iequality ca be tight as well. Cosider the fuctio x 1 if k = 0 F k (x 1, x,..., x k) = F k 1 (x 1, x,..., x k 1) F k 1 (x k 1 +1, x k 1 +,..., x k) if k is eve F k 1 (x 1, x,..., x k 1) F k 1 (x k 1 +1, x k 1 +,..., x k) if k is odd This fuctio F k models a d-or tree of depth k: a boolea circuit i the form of a complete biary tree, where the leaves are iputs, gates alterate betwee d ad Or at each level, ad the value at the root is the output. (Do t cofuse the d-or tree with the decisio tree that evaluates it!) x 1 x 1 x x 1 x x 3 x 4 x 1 x x 3 x 4 x 5 x 6 x 7 x 8 d-or trees of depth 0 through 3 It s easy to show that this fuctio is evasive, but that it has much smaller certificate complexity: D(F k ) = k C(F k ) = k/ I fact, it s so easy that I ll leave it as a homework exercise. 5.4 Radomized complexity Recall that a radomized decisio tree is just a probability distributio over a set of determiistic decisio trees, ad that the radomized complexity of a strig property is the worst-case expected ruig time of the fastest radomized decisio tree: R() = mi P s Pr P [] T (, s) gai, I m usig R() istead of the earlier otatio T () to emphasize the radomizatio. We ve already see examples where radomess helps a little bit, but a more extreme example might be more helpful. Let M : {0, 1} 3 {0, 1} deote the boolea media (or majority) fuctio x + y + z M(x, y, z) =. The we ca defie the iterated media fuctio M k : {0, 1} 3k {0, 1} as M 0 (x 1 ) = x 1 ad M k (x 1,..., x 3 k) = M(M k 1 (x 1,..., x 3 k 1), M k 1 (x 3 k 1 +1,..., x 3 k 1), M k 1 (x 3 k 1 +1,..., x 3 k)) for all k 1. 3 3 Douglas Hofstadter defied a three-player game called Hruska based o iterated medias. I a level-0 Hruska game, each player chooses a iteger betwee 0 ad 5; the player with the media choice wis the game, ad adds his chose umber to his level-1 score. For ay i 1, a level-i game cosists of six level-(i 1) games, startig with level-i scores of zero; the wier is the player with the media level-i score at the ed of the game, which is the added to that player s score at level i + 1. I order to make the game fair (ad well-defied), each roud uses a differet permutatio of the players to break ties. I oce wo a level-3 Hruska game by havig the media level-3 score, by havig the lowest level- score, by havig the media level-1 score, by choosig the umber 5 o my 16th level-0 tur. I do t recommed playig a level-4 game, uless you wat your brai to explode. 4

It is ot hard to prove that D(M k ) = 3 k ad C(M k ) = k ; this is a good warm-up for the previous homework exercise. The radomized complexity fits betwee these two bouds. Cosider the followig radomized algorithm for computig M(x, y, z): Query two of x, y, z at radom. If they are equal, retur their commo value. Otherwise, query the remaiig variable ad retur its value. The probability of two variables that agree is at least 1/3, so the expected ruig time of this algorithm is at most 8/3. We ca use this idea recursively to evaluate M k quickly as well. Radomly choose two of the three istaces of M k 1 ad evaluate them recursively. If they retur the same value, we re doe; otherwise (with probability at most /3) we have to evaluate the third istace. We have the recurrece R(M k ) 8 3 R(M k 1), which has the obvious solutio R(M k ) (8/3) k. s it turs out, this algorithm is optimal so R(M k ) = (8/3) k but we do t have the tools to prove this yet. Soo. Promise. Notice that i this case, the radomized complexity falls strictly betwee the certificate complexity ad the determiistic complexity. This should ot be a surprise; eve radomized algorithms have to ru log eough to certify their aswers. I geeral, we have the followig trivial extesio of Blum s theorem for ay boolea fuctio. C() R() T () C() For the d-or tree fuctio F k, the radomized complexity is ( 1+ 33 4 ) k 1.68614 k 0.753. The upper boud follows from a careful radomized algorithm, which I will leave as the third part of the homework exercise. The matchig lower boud is due to Saks ad Widgerso 4. I fact, Saks ad Widgerso cojecture that this is a lower boud o the radomized complexity of ay evasive boolea fuctio. s far as I kow, this cojecture is still ope. 4 Proc. STOC 1986 5