CS85: You Ca t Do That () Lecture Notes, Sprig 2008 Amit Chakrabarti Dartmouth College Latest Update: May 9, 2008
Lecture 1 Compariso Trees: Sortig ad Selectio Scribe: William Che 1.1 Sortig Defiitio 1.1.1 (The Sortig Problem). Give items (x 1, x 2,, x ) from a ordered uiverse, output a permutatio σ : [] [] such that x σ (1) < x σ (2) < x σ (). At preset we do ot kow how to prove superliear lower bouds for the Turig machie model for ay problem i NP, so we istead restrict ourselves to a more limited model of computatio, the compariso tree. We begi by cosiderig determiistic sortig algorithms. Determiistic Sortig Algorithms Whereas the Turig machie is allowed to move heads, read ad write symbols at every time step, i a compariso tree T we are oly allowed to make comparisos ad read the result. Without loss of geerality we ll assume that the x i s are distict, so every compariso results i either a < or >. At each iteral ode of T we make a compariso, the result of which gives tells us the ext compariso to make by directig us to either the left or the right child. At each leaf ode we must have eough iformatio to be able to output the sorted list of items. The cost of a sortig algorithm i this model is the just the height of such a compariso tree T, i.e., the maximum depth of a leaf of T. Now ote that T is i fact a biary tree, so we have l 2 h, h lg l, where l is the umber of leaves, ad h is the height of T. But ow we ote that sice there are! possible permutatios, ay correct tree must have at least! leaves, so the, by Stirlig s formula, we have h lg! = ( lg ). Ope Problem. Prove that some laguage L NP caot be decided i O() time o a Turig machie. Radomized Sortig Algorithms Now we tur our attetio to radomized compariso-based sortig algorithms ad prove that i this case access to a costat time radom umber geerator does ot give us ay additioal power. That is to say, it does ot chage our lower boud. A example of such a algorithm is the variat of quicksort where, at each recursive call, the pivot is 1
LECTURE 1. COMPARISON TREES: SORTING AND SELECTION chose uiformly at radom. Note that the radomess i ay radomized algorithm ca be captured as a ifiite biary radom strig, chose at the begiig, such that at every call of the radom umber geerator, we simply read off the ext bit (or the ext few bits) o the radom strig. The, we ca say that a radomized algorithm is just a probability distributio over determiistic algorithms (each obtaied by fixig the radom strig). I other words, whe callig the radomized algorithm, oe ca move all the radomess to the begiig, ad simply pick at radom a determiistic algorithm to call. With this view of radomized algorithms i mid, we describe a extremely useful lemma due to Adrew Chi-Chih Yao (1979). But first, some prelimiaries. Suppose we have some otio of cost of a algorithm (for some fuctio/problem f ) o a iput, i.e., a (oegative) fuctio cost : A X R +, where A is a space of (determiistic) algorithms ad X is a space of iputs. A example of a useful cost measure is ruig time. I what follows, we will cosider radom algorithms A λ, where λ is a probability distributio o A. Note that, per our above remarks, a radomized algorithm is specified by such a distributio λ. We will also cosider a radom iput X µ, for some distributio µ o X. Defiitio 1.1.2 (Distributioal, worst-case ad radomized complexity). I the above settig, the term E µ [cost(a, X)] is the just the expected cost of a particular algorithm a A o a radom iput X µ. The quatity D µ ( f ) = mi a E µ [cost(a, X)], where f is the fuctio we wat to compute, is called the distributioal complexity of f accordig to iput distributio µ. We costrast this to the worst case complexity of f, which we ca write as follows C( f ) = mi a max cost(a, x). x Fially, we defie R( f ), the radomized complexity of f as follows: R( f ) = mi λ From the defiitios above it follows easily that max E λ [cost(a, x)]. x µ D µ ( f ) C( f ) ad R( f ) C( f ). To obtai the latter iequality, cosider the trivial distributios λ that are each supported o a sigle algorithm a A. Theorem 1.1.3 (Yao s Miimax Lemma). We have 1. (The Easy Half) For all iput distributios µ, D µ ( f ) R( f ). 2. (The Difficult Half) max µ { Dµ ( f ) } = R( f ). The proof of part (2) is somewhat o-trivial ad requires the use of the liear programmig duality theorem. Moreover, we do ot eed part (2) at this poit, so we shall oly prove part (1). We ote i passig that part (2) ca be writte as max mi E µ [cost(a, X)] = mi max E λ [cost(a, x)]. µ a λ x Proof of Part (1). To visualize this proof, it helps to cosider the followig table, where X = {x 1, x 2, x 3,...}, A = {a 1, a 2, a 3,...} ad c i j = cost(a i, x j ): a 1 a 2 a 3 x 1 c 11 c 12 c 13 x 2 c 21 c 22 c 23 x 3 c 31 c 32 c 33....... 2
LECTURE 1. COMPARISON TREES: SORTING AND SELECTION While readig the argumet below, thik of row averages ad colum averages i the above table. Defie R λ ( f ) = max x E λ [cost(a, x)]. The, for all x, we have whece E λ [cost(a, x)] R λ ( f ), E µ [E λ [cost(a, X)]] R λ ( f ). Sice expectatio is essetially just summatio, we ca switch the order of summatio to get E λ [ Eµ [cost(a, X)] ] R λ ( f ), which implies that there must be a algorithm a such that E µ [cost(a, X)] R λ ( f ). Sice the distributioal complexity D µ ( f ) takes the miimum such a, it s clear that D µ ( f ) R λ ( f ). Sice this holds for ay λ, we have D µ ( f ) mi λ R λ ( f ) = R( f ). Before we returig to the issue of a lower boud for radomized sortig, we itroduce aother lemma. Lemma 1.1.4 (Markov s Iequality). If X is a radom variable takig oly oegative values, the for all t > 0, we have Pr[X t] E[X]/t. Proof. A easy oe-lier: E[X] = E[X X < t] Pr[X < t] + E[X X t] Pr[X t] 0 + t Pr[X t]. Corollary 1.1.5. Uder the above coditios, for ay costat c > 0, Pr[X > c E[X]] 1/c. Let m deote the umber of comparisos made o ay particular ru, or equivaletly its rutime. Cosider a radomized sortig algorithm a (i.e., a distributio over compariso trees), where we set C = max iput {E λ[m]} ote here that m is a fuctio of both the algorithm ad the iput. Theorem 1.1.6 (Radomized sortig lower boud). Let T (x) deote the expected umber of comparisos made by a radomized -elemet sortig algorithm o iput x. Let C = max x T (x). The C = ( lg ). Proof. Let A deote the space of all (determiistic) compariso trees that sort a -elemet array ad let X deote the space of all permutatios of []. Cosider a radomized sortig algorithm give by a probability distributio λ o A. As it stads, a radomized sortig algorithm is always correct o every iput, but has a radom rutime (betwee () ad O( 2 )) that depeds o its iput. (Such algorithms are called Las Vegas algorithms.) We wat to covert each a A ito a correspodig algorithm a that has a otrivially bouded rutime, but may sometimes spits out garbage. We do this by allowig at most 10C comparisos, ad outputtig some garbage if the sorted permutatio is still ukow. Let A deote a radom algorithm correspodig to a radom A λ. Corollary 1.1.5 implies Now defie We ca the rewrite the above iequality as Pr[A is wrog o x] = Pr[NumComps(A, x) > 10C] 1 10. { 1, if a is wrog o iput x cost(a, x) = 0, otherwise max E λ [cost(a, x)] 1 x 10. 3
LECTURE 1. COMPARISON TREES: SORTING AND SELECTION By Yao s miimax lemma, for all iput distributios µ, we have D µ (sortig with 10C comparisos) 1 10. Now pick µ to be the uiform distributio o X. Pick ay determiistic algorithm a with 10C comparisos that achieves the miimum i the defiitio of D µ. The, for X µ, we have Pr[a is wrog o X] 1/10. Now the height of a s tree is 10C, so the umber of leaves of a s tree is 2 10C, ad thus the umber of distict permutatios that a ca output is 2 10C. But ow we see that Pr[X is oe of the permutatios output by a] > 9 10, so 2 10C (9/10)! ad hece, by Stirlig s formula, C 1 ( ( )) 9 lg 10 10! = ( lg ). 1.2 Selectio Defiitio 1.2.1 (The Selectio Problem). Give items (x 1, x 2,, x ) from a ordered uiverse ad a iteger k [], the selectio problem is to retur the kth smallest elemet, i.e., x i such that { j : x j < x i } = k 1. Let V (, k) deote the height of the shortest compariso tree that solves this problem. Special Cases. The miimum (resp. maximum) selectio problems, where k = 1 (resp. k = ), ad the media problem, where k = 2. Miimum ad maximum ca clearly be doe i O() time. Other selectio problems are less trivial. 1.2.1 Miimum Selectio Theorem 1.2.2. Let T be a compariso tree that fids the miimum of elemets. The every leaf of T has depth at least 1. Sice T is biary, it must therefore have at least 2 1 leaves. Proof. Cosider ay leaf λ of T. If x i is the output at λ, the every x j ( j i) must have wo at least oe compariso o the path leadig to λ (if ot, we caot kow for sure that x j is ot the miimum). Sice each compariso ca be wo by at most oe elemet, we have depth(λ) 1. Corollary 1.2.3. V (, 1) = 1. 1.2.2 Geeral Selectio By a adversarial argumet, Hyafil [Hya76] proved that V (, k) k + (k 1) lg k 1. This was stregtheed by Fusseegger ad Gabow [FG79] to the boud stated i the ext theorem. Both these results give poor bouds for k. However, observe that V (, k) = V (, k). Note. For the special case where k = 2, we have V (, 2) = 2 + lg. It is a good exercise to prove this: both the upper ad the lower boud are iterestig. Also, k {1, 2, 1, } are the oly values for which we kow V (, k) exactly. Theorem 1.2.4 (Fusseegger & Gabow). V (, k) >= k + lg ( k 1). 4
LECTURE 1. COMPARISON TREES: SORTING AND SELECTION Proof. Let T be a compariso tree for selectig the kth smallest elemet. At a leaf λ of T, if we output x i, the every x j ( j i) must be settled with respect to x i, i.e., we must kow whether or ot x j < x i. I particular, we must kow the set S λ = {x j : x j < x i }. Fix a particular set U {x 1,..., x }, with U = k 1. Now cosider the followig set of leaves: L U = {λ : S λ = U}. Notice that the output at ay leaf i L U is the miimum of the elemets i U. Therefore, if we treat U as give, ad prue T accordigly, by elimiatig ay odes that perform comparisos ivolvig oe or more elemets of U, the the residual tree is oe that has leaf set L U ad fids the miimum of the k + 1 elemets of U. By Theorem 1.2.2, the prued tree must have at least 2 k leaves. Therefore, L U 2 k. The umber of leaves of T is the sum of L U over all ( k 1) choices for the set U. Therefore, the height of T must be This proves the theorem. 1.2.3 Media Selectio height(t ) (( ) lg k 1 2 k ) = k + ( ) lg. k 1 Turig ow to the media selectio special case, recall that by Stirlig s formula, we have ( ) ( 2 ) =, /2 which already gives us a good itial lower boud for media selectio: ( V, ) ( ) 2 2 + lg 2 1 2 + 1 2 lg O(1) = 3 2 o(). Usig more sophisticated techiques, oe ca prove the followig stroger lower bouds: V (, ) 2 2 o(). [BJ85] V (, ) 2 (2 + 2 50 ) o(). [DZ01] O the upper boud side, we agai have otrivial results, startig with the famous groups of five algorithm: V (, ) 2 6 + o(). [BFP + 73] V (, ) 2 3 + o(). [SPP76] V (, ) 2 2.995. [DZ99] 5
Lecture 2 Boolea Decisio Trees Scribe: William Che 2.1 Defiitios ad Basic Theorems Defiitio 2.1.1 (Decisio Tree Complexity). Let f : {0, 1} {0, 1} be some Boolea fuctio. We sometimes write f whe we wat to emphasize the iput size. A decisio tree T models a computatio that adaptively reads the bits of a iput x, brachig based o each bit read. Thus, at each iteral ode of T, we read a specific bit (idicated by the label of the ode) ad brach left or right depedig upo whether the bit read was a 0 or a 1. At each leaf ode we must have eough iformatio to determie f (x). We defie the determiistic decisio tree complexity D( f ) of f to be D( f ) = mi{height(t ) : T is a decisio tree that evaluates f }. For example, easy adversarial argumets show that for the or, ad ad parity fuctios, we have D(OR) = D(AND) = D(PAR) =. It is coveiet to itroduce a term for fuctios that are maximally hard to evaluate i the decisio tree model. Defiitio 2.1.2 (Evasiveess). A fuctio f : {0, 1} {0, 1} is said to be evasive (a.k.a. elusive) if D( f ) =. We shall try to relate this algorithmically defied quatity D( f ) to several more combiatorial measure of the hardess of f, that we ow defie. Defiitio 2.1.3 (Certificate Complexity). For a fuctio f : {0, 1} {0, 1}, defie a 0-certificate to be a strig α {0, 1, } such that x {0, 1} : x matches α f (x) = 0. That is to say, for ay iput x that matches α for all o- bits, we have f (x) = 0, o matter the settigs of the free variables give by the s. Defie the size of a certificate α to be the umber of exposed (i.e., o- ) bits. The, defie the 0-certificate complexity of f as C 0 ( f ) = max mi{size(α) : α is a 0-certificate that matches x}. x We defie 1-certificates ad C 1 ( f ) similarly. We use the covetio that C 0 ( f ) = if f 1, ad similarly for C 1 ( f ). Fially, we defie the certificate complexity C( f ) of a fuctio f to be the larger of the two, that is C( f ) = max{c 0 ( f ), C 1 ( f )}. Example 2.1.4. C 1 (OR ) = 1, whereas C 0 (OR ) =, so C(OR ) =. The quatity C( f ) is sometimes called the odetermiistic decisio tree complexity. Note that for ay decisio tree for f, ay leaf givig a aswer p {0, 1} must be of depth at least C p ( f ), so we have C( f ) D( f ). 6
LECTURE 2. BOOLEAN DECISION TREES Defiitio 2.1.5 (Sesitivity). For a fuctio f : {0, 1} {0, 1}, defie the sesitivity s x ( f ) for f o a particular iput x as follows s x ( f ) = {i [] : f (x (i) ) f (x)}, where we write x (i) to deote the strig x with the ith bit flipped. We the defie the sesitivity of f as s( f ) = max{s x ( f )}. x Defiitio 2.1.6 (Block Sesitivity). For a block B [] of bit positios ad = x, let x (B) be the strig x with all bits idexed by B flipped. Defie the block sesitivity bs x ( f ) for f o a particular iput x as bs x ( f ) = max{b : {B 1, B 2,..., B b } is a set of disjoit blocks such that i [b] : f (x) f (x (B i ) )}. That is to say, bs x ( f ) is the maximum umber of disjoit blocks that are each sesitive for f o iput x. We the defie the block sesitivity of f as bs( f ) = max bs x ( f ). x Note that sice block sesitivity is just a geeralizatio of sesitivity (which requires that the blocks be of size 1 each), we have bs x ( f ) s x ( f ), ad thus bs( f ) s( f ). Before we go o to develop some of the theory behid this, cosider the followig example. Example 2.1.7. Cosider a fuctio f (x) defied as follows, where the iput x is of legth ad is arraged i a matrix. Assume is a eve iteger. Defie { 1, if some row of the matrix matches 0 f (x) = 110 0, otherwise. Sice each row ca have at most 2 sesitive bits, we have s( f ) = 2 = O( ). However, for block sesitivity we have bs( f ) bs z ( f ) 2 where z = 0 ad we have 2 blocks each cosistig of 2 cosecutive bits, which makes every block sesitive. This gives a quadratic gap betwee s( f ) ad bs( f ). Theorem 2.1.8. s( f ) bs( f ) C( f ) D( f ). Proof. The leftmost ad rightmost iequalities have already bee established above. To see that the middle iequality holds, observe that every certificate must expose at least oe bit per sesitive block. Here we ote that there could be at least a quadratic gap betwee s( f ) ad bs( f ) as evideced by example 2.1.7. It turs out that there is a maximum of a quadratic gap betwee C( f ) ad D( f ), which leads us to the ext theorem, whose proof we will see as a corollary later i this sectio. Theorem 2.1.9 (Blum-Impagliazzo). D( f ) C( f ) 2 Example 2.1.10. Cosider a fuctio g = AND OR. That is to say, g divides the iput x with x = ito groups, feeds each group ito the subfuctio OR, ad the feeds the results of each OR ito AND. To certify 0, we eed oly expose all of the iputs i oe OR-subtree to show that they are all 0. To certify 1, we eed oly expose oe iput i each of the OR-subtrees to show that there is at least a sigle 1 goig ito every OR. The, we have C(g) = however, it s easy to see that give ay subset of the iput, the fial aswer still depeds o the rest of the iput, ad thus D(g) =. Therefore, g is evasive. 7
LECTURE 2. BOOLEAN DECISION TREES 2.1.1 Degree of a Boolea Fuctio Defiitio 2.1.11 (Represetative Polyomial). For a fuctio f : {0, 1} {0, 1}, we say that a polyomial p(x 1, x 2,..., x ) represets f if for all a {0, 1}, we have p( a) = f ( a). Defiitio 2.1.12 (Degree of a Boolea Fuctio). For a fuctio f : {0, 1} {0, 1} ad a represetative polyomial p for f, the degree of f is the just the degree of p. Theorem 2.1.13. For ay such f, there exists a uique multiliear polyomial that represets f. Proof. Let { a 1, a 2,..., a k } be the set of all iputs that make f = 1. The, we claim that for all b, c {0, 1}, there exists a polyomial p b (x 1,..., x ) such that { 1 if c = pb ( c) = b 0 otherwise To see this, for ay such vector b, cosider the polyomial p b (x 1,..., x ) = (x i + b i 1)) Clearly this polyomial is 1 if ad oly if every x i = b i. The, it follows that the polyomial that represets f is just the polyomial k p = i=1 p ai i=1 Sice each a i is distict, at most oe p ai will evaluate to 1, but sice that implies that the iput is oe of the acceptig iputs for f, p behaves as desired. To prove uiqueess, suppose 2 multiliear polyomials p, q both represet f, the the polyomial r = p q satisfies r( a) = 0 for all a {0, 1}. Cosider the smallest moomial x i1, x i2,..., x ik i r. FINISH UNIQUENESS PROOF. Theorem 2.1.14. For a Boolea fuctio f, let p be its represetative polyomial. The we have deg(p) D( f ) Proof. For ay leaf λ of a decisio tree, without loss of geerality let x 1, x 2,..., x r be the queries made alog the path to that leaf, ad let b 1, b 2,..., b r be the results of the queries. The for every leaf λ of a decisio tree for f, write a polyomial p λ give as follows p λ = x i (x i 1) i:b i =1 Clearly each p λ has degree r D( f ), the the polyomial i:b i =0 p = λ p λ represets f ad also has degree D( f ). Example 2.1.15. We preset a example to show that deg( f ) ca be much smaller tha s( f ). Cosider the fuctio o a iput vector x with size { NAE(R-NAE( x 1,..., x R-NAE( x) = 3 ), R-NAE( x 3,..., x 2 3 ), R-NAE( x 2,..., x )) if x > 3 3 NAE( x 1, x 2, x 3 ) if x = 3 where we assume that = 3 k for some iteger k, ad NAE is defied as follows { 1 if x, y, z are ot all equal NAE(x, y, z) = 0 if x = y = z 8
LECTURE 2. BOOLEAN DECISION TREES I other words, this is the k-layer tree of NAE computatios applied recursively to 3 k iitial iputs. For x = 0, we have s x (R-NAE) = 3 k, sice chagig ay sigle bit will flip the aswer. Now observe that we ca alteratively write the NAE(x, y, z) fuctio as a degree 2 polyomial NAE(x, y, z) x + y + z xy yz zx The, we ca see that at the secod lowest level we have 3 k 1 polyomials, each of degree 2 (lowest level cotais 3 k polyomials of degree 1 - coordiates of the iput vector), but at the root, we have a sigle polyomial of degree 2 k, but this is polyomially distict to the sesitivity s x (R-NAE) = 3 k. Alteratively, writig this i terms of, we have deg( f ) = log 3 2 s( f ) = Now returig to the example g = AND OR, we have deg( f ) = s( f ) = C( f ) To see why deg( f ) =, simply recall that o matter the iput, we must always examie every bit of the iput to get a defiitive aswer. I other words, the decisio tree has costat height D(g) = (refer to example 2.1.10). Theorem 2.1.16 (Beals-Buhrma-Cleve-Marca-deWolf). D( f ) C 1 ( f )bs( f ) C 0 ( f )bs( f ) Proof. Without loss of geerality we oly cosider the first case, that D( f ) C 1 ( f )bs( f ), as the case for C 0 ( f )bs( f ) is symmetric. We proceed to iduct o the umber of variables, or size of the iput. The base case = 1 is trivial. Before we cotiue with the iductio step, we defie a subfuctio of f as follows f α : {0, 1} k {0, 1}, with bits i α set to their values i α where α {0, 1, } ad exposes k bits, or has exactly k characters {0, 1}. If f has o 1-certificate, the f = 0, ad D( f ) = 0, so we re doe. FINISH BEALS BUHRMAN PROOF Theorem 2.1.17 (Nisa). C( f ) s( f )bs( f ) Proof. Try it yourself. Corollary 2.1.18. D( f ) C( f )bs( f ) s( f )bs( f ) 2 bs( f ) 3 Proof. Direct result of theorems 2.1.17 ad 2.1.16. Corollary 2.1.19. D( f ) C( f ) 2 Proof. D( f ) mi{c 0 ( f ), C 1 ( f )} bs( f ) mi{c 0 ( f ), C 1 ( f )} C( f ) = mi{c 0 ( f ), C 1 ( f )} max{c 0 ( f ), C 1 ( f )} = C 0 ( f ) C 1 ( f ) C( f ) 2 9
Lecture 3 Degree of a Boolea Fuctio Scribe: Ragaath Kodapally Theorems: 1. S( f ) bs( f ) C( f ) D( f ) 2. deg( f ) D( f ) 3. C( f ) S( f )bs( f ) bs( f ) 2 [Nisa] 4. D( f ) C 1 ( f )bs( f ) [Beals et al] 5. Corollary: D( f ) C 0 ( f )C 1 ( f ) C( f ) 2 [Blum-Impagliazzo] 6. Corollary: D( f ) bs( f ) 3 7. bs( f ) 2 deg( f ) 2 8. D( f ) bs( f ) deg( f ) 2 2 deg( f ) 4 Examples: 1. f : S( f ) = ( ), bs( f ) = () 2. f : S( f ) = bs( f ) = C( f ) =, deg( f ) = D( f ) = 3. f : deg( f ) = log 3 2, S( f ) = Theorem 7: Proof. Pick a iput a ( {0, 1} ) ad blocks B 1, B 2,..., B b that achieve the max i bs( f ) = b. Let P(x 1,..., x ) be a multi-liear polyomial represetatio of f. Create a ew polyomial Q(y 1, y 2,..., y b ) from P by settig x i s as follows If i B 1 B 2... B b, set x i = a i 10
LECTURE 3. DEGREE OF A BOOLEAN FUNCTION If i B j, If a i = 1, set x i = y j If a i = 0, set x i = 1 y j Multi-liearize the polyomial deg(q) deg(p) (Note: deg(q) b) Q(1, 1, 1,..., 1) = P( a ) = f ( a ) Q(0, 1, 1,..., 1) = Q(1, 0, 1,..., 1) = Q(1, 1,..., 0) = 1 f ( a ) (Because each B j is sesitive) Q( c ) {0, 1} c {0, 1} b [Trick by Misky-Papert] Defie Q sym (k) as follows: Q sym (k) = a =k Q( a ) ( b k), k {0, 1, 2,..., b} where a (weight of a ) is the umber of oes i a. Note: deg(q sym ) b These defiitios defie Q sym uiquely. Markov s Iequality: Q sym (k) = f ( a ), if k = b = 1 f ( a ), if k = b 1 [0, 1], else Defiitio 3.0.20 (Iterval Norm of a fuctio). Defie f S = sup x S f (x), for f : R R, S R. Suppose f is a polyomial of degree, the f [ 1,1] 2 f [ 1,1] Corollary: If a x b, c f (x) d, the f [a,b] 2 b a d c Let λ = (Q sym ) [0,b] Let κ = max{(sup Q sym (x)) 1, if Q sym (x), 0} The Q sym (x), for x [0, b] maps to [ k, 1 + k]. If κ = 0, Q sym maps [0, b] to [0, 1]. Markov s Iequality implies (Q sym ) [0,b] deg(qsym ) 2 (1 0) (b 0) deg(qsym ) 2 bs( f ) However, Q sym (b 1) = 1 f ( a ) ad Q sym (b) = f ( a ). By Mea-Value theorem, α [b 1, b] such that (Q sym ) (α) = 1. so, (Q sym ) [0,b] 1 bs( f ) deg(q sym ) 2 If κ > 0 If κ = Q sym (x) 1 for some x = x 0. Suppose x 0 [i, i + 1]. Cosiderig the smaller of the two itervals [i, x 0 ], [x 0, i + 1] ad applyig mea-value theorem, (Q sym ) [0,b] κ 1/2 = 2κ 11
LECTURE 3. DEGREE OF A BOOLEAN FUNCTION Q sym maps [0, b] to [ k, 1 + k]. Markov s Iequality implies, 2κ (Q sym ) [0,b] deg(qsym ) 2 (1 + κ ( κ)) b 0 So, 2κ deg(qsym ) 2 (1 + 2κ) bs( f ) bs( f ) deg(q sym ) 2 ( 1 + 1) (1) 2κ Usig α such that (Q sym ) (α) = 1, we have 1 (Q sym ) [0,b] deg(qsym ) 2 (1 + 2κ) bs( f ) bs( f ) deg(q sym ) 2 (1 + 2κ) (2) (1)&(2) bs( f ) deg(q sym ) 2 (1 + mi{ 1 2κ, 2κ}) 2 deg(q sym ) 2 2 deg(q) 2 2 deg(p) 2 = 2 deg( f ) 2 The proof is similar for the case where κ = Q sym (x) for some x. Theorem 8: D( f ) bs( f ) deg( f ) 2 Proof. Cosider polyomial represetatio P of f. Defiitio 3.0.21. Defie maxoomial = maximum degree moomial of f There exists a set (size bs( f ) deg( f )) of variables that itersects every maxoomial. Query all these variables to get subfuctio f α with deg( f α ) deg( f ) 1. Sice all the maxoomials of f vaish after exposig the variables, the degree of the resultig polyomial represetatio ( f α ) decreases at least by 1. If the above two statemets hold, the we ca prove the theorem by iductio o the degree of the fuctio f. If f is a costat fuctio, the the theorem holds trivially. Assume that it is true for all fuctios whose degree is less tha = deg( f ). D( f ) bs( f ) deg( f ) + D( f α ) The algorithm to get those variables is bs( f ) deg( f ) + bs( f α ) deg( f α ) 2 bs( f )(deg( f ) + (deg( f ) 1) 2 ) By iductio hypothesis bs( f )deg( f ) 2 if deg( f ) 1 12
LECTURE 3. DEGREE OF A BOOLEAN FUNCTION Start with a empty set S Pick all variables i ay maxoomial that is ot yet itersected ad add these variables to S. (Number of variables added is less tha or equal to deg( f ).) If the curret set itersects every maxoomial stop else repeat the above step. Claim: Number of iteratios i the algorithm is at most bs( f ) Proof: We claim that iside every maxoomial, there is a sesitive block for 0. Pick ay maxoomial. Set x i (i maxoomial) to 0. The resultig fuctio s polyomial represetatio still cotais the maxoomial. Thus, the resultig boolea fuctio is ot a costat. This implies that there exists a sub-block iside the maxoomial such that flippig it chages the value of the fuctio. Thus, the umber of maxoomials is at most bs( f ). Hece, the umber of iteratios which ca be at most the umber of maxoomials is at most bs( f ). 13
Lecture 4 Symmetric Fuctios ad Mootoe Fuctios Scribe: Ragaath Kodapally Defiitio 4.0.22. f : {0, 1} {0, 1} is said to be symmetric if x {0, 1} π S (Group of all permutatios o []) f (x) = f (x π(1), x π(2),..., x π() ) This is the same as sayig f (x) depeds oly o x (umber of oes i x). Recall the proof of bs( f ) 2 deg( f ) 2. We used symmetrizig trick. Ca prove: For ay symmetric f, deg( f ) = O( α ) where α(= 0.548) < 1 is a costat. [Vo Zur Gathe & Roche] The, D( f ) deg( f ) = O() For x, y {0, 1} write x y if i(x i y i ) Defiitio 4.0.23. f is mootoe if x y f (x) f (y) Example of mootoe fuctios: O R, AN D, AN D O R. No-examples: P AR, O R. Theorem 4.0.24. If f is mootoe, the S( f ) = bs( f ) = C( f ) Proof. We already have S( f ) bs( f ) C( f ). Remais to prove: C( f ) S( f ). Let x {0, 1}. Let α be a miimal 0-certificate matchig x (suppose f (x) = 0). Claim: All exposed bits of α are zeros. Proof: If ot, we ca chage a exposed bit (which is 1) i α ad still have f (x α ) = 0 as f is mootoe. So, we do t eed to expose that bit. So, α ca t be miimal 0-certificate. Claim: Every exposed bit i α is sesitive for [α : 1 ] = y (the iput obtaied by settig *=1 everywhere i α) Proof: Suppose i th bit is exposed by α but f (y (i) ) = f (y) = 0. By mootoicity, ay other settig of * s i α causes f = 0. So, i th bit is ot ecessary, cotradictig the miimality of α. By secod claim: S( f ) umber of exposed bits i α. Thus, S( f ) C 0 ( f ). Similarly, S( f ) C 1 ( f ). Therefore, S( f ) max{c 0 ( f ), C 1 ( f )} = C( f ) Theorem 4.0.25. If f is mootoe, D( f ) bs( f ) 2 = S( f ) 2 [Usig a earlier theorem: D( f ) C( f )bs( f ) = S( f ) 2 ] The above bouds are tight, cosiderig f = AN D O R. 14
LECTURE 4. SYMMETRIC FUNCTIONS AND MONOTONE FUNCTIONS The oly possible symmetric mootoe fuctios are T H R,k where T H R,k (x) =1, if x k D(T H R,k ) = (k 0) (by a simple adversarial argumet). Defiitio 4.0.26. f : {0, 1} {0, 1} is evasive if D( f ) =. 0, Otherwise Defiitio 4.0.27. A boolea fuctio f (x 1,2, x 1,3,..., x 1, ) is a graph property if f ( x ) depeds oly o the graph described by x. Every π S iduces a permutatio o ( []) 2. f is a graph property iff it is ivariat with respect to these permutatios. Theorem 4.0.28. Every mootoe o-costat graph property f o -vertex graph has D( f ) = ( 2 ) [Rivest & Vuillemi] Cojecture: D( f ) = ( 2), i.e f is evasive! [Richard Karp]. Theorem 4.0.29. If = p α for a prime p, α N, the f is evasive. [Kah-Saks-Sturtevat] Theorem 4.0.30. If f is ivariat uder edge-deletio or cotractio (mior closed property) ad 0, the f is evasive. [Amit-Khot-Shi] Theorem 4.0.31. Ay mootoe bipartite graph property is evasive. [Yao] Proof. Let f : {0, 1} N {0, 1} where N = ( 2) be a o-costat mootoe graph property. f is ivariat uder a group G S N of permutatios (G = S ). G is a trasitive o {1,..., N} i.e, for ay i, j [N], π G such that π(i) = j. Lemma 1: Suppose f : {0, 1} d {0, 1} is mootoe ( costat), ivariat uder a trasitive group ad d = 2 α for some α N, the f is evasive. Corollary to Lemma 1: If = 2 k, the D( f ) 2 /4. Hierarchically cluster [] ito subclusters of size /2 ad recurse. Cosider the subfuctio of f obtaied the followig way: g i : A {0, 1} ad g(x) = f (x), where A = {x is a graph o [] where there is a edge betwee l, j if l 1 i 0 k = j 1 i 0 k ad o edge if l 01 i 1 0 k j 01 i 1 0 k } Number of iputs to subfuctio = (2 k 1 ) 2 = 2 /4. By Lemma 1, D( f ) D(subfuctio) = 2 /4. There are k subfuctios, depedig o how deep we take the clusterig: g 1 (stop after oe level of clusterig),g 2,..., g k (go dow to sigleto clusters). g k ( 0 ) = f ( 0 ) = 0 g 1 ( 1 ) = f ( 1 ) = 1 g i ( 1 ) = g i 1 ( 0 ) i such that g i ( 0 ) = 0 ad g i ( 1 ) = 1 [i.e g i is o-costat] Corollary 2: For all N, D( f ) 2 /16 Proof of Lemma 1: Cosider S k = {x {0, 1} : x = k ad f (x) = 1} Defiitio 4.0.32. orb(x) = {y : y is obtaied from x by applyig some π G} S k is partitioed ito orbits. Claim: If k 0, k d( i.e x 0, x 1 ), the every orbit has eve size. Therefore, S k is eve. Number of oes i resultig matrix = k orb(x) = d (#1 s i ay colum) orb(x) = d ( #1 s i ay col) k 15 = 2α (some iteger) k = eve
LECTURE 4. SYMMETRIC FUNCTIONS AND MONOTONE FUNCTIONS as 0 < k < d. By Claim: {x : f (x) = 1} = S 0 + S 1 +... + S d 1 + S d S 0 = 0 as f ( 0 ) = 0 ad S d = 1. = 0 + eve umber + 1 = odd Cosider all x {0, 1} that reach leaf λ of decisio tree at depth < d. A eve umber of x s reach here. Therefore, if all leaves whose output is f (x) = 1 have depth < d, the umber of x : f (x) = 1, is eve. This is a cotradictio ad hece, there exists a leaf whose depth is d. Thus, D( f ) = d ad f is evasive. 16
Lecture 5 Radomized Decisio Tree Complexity: Boolea fuctios We saw this type of complexity earlier for the sortig problem. Scribe: Priya Nataraja Defiitio 5.0.33 (Cost of a radomized decisio tree o iput x). cost(t, x) = umber of bits of x queried by t. Let radomized decisio tree T λ (probability distributio over decisio trees). From Yao s miimax lemma we have: R( f ) D µ ( f ), for ay iput distributio µ. So, it is eough to prove a lower boud o D µ ( f ). Ulike what we did i sortig, here we wo t covert to a Mote Carlo algorithm. We will strictly use Las Vegas algorithm. Theorem 5.0.34. Let f be a o-costat mootoe graph property o N = ( 2) bits, i.e., vertices. The, R( f ) = ( 4/3 ) = (N 2/3 ). This is sayig somethig more ivolved tha the easy thig you ca prove: for ay mootoe f : R( f ) D( f ). Last time we showed that D( f ) 2 /16 = ( 2 ) [Rivest-Vuiellemi]. Yao s cojecture: R( f ) = ( 2 ) = (N). Theorem 5.0.34 is ot the best result kow, we state below (without proof) the best result kow: R( f ) = ( 4/3 log 1/3 ) [Chakrabarti-Khot], this is a stregtheig of Hajal s proof. Today we will prove somethig that implies Theorem 5.0.34. Recall that graph properties are ivariat uder certai permutatios (ot all permutatios) ad these permutatios form a trasitive group. Note: From ow o, umber of iputs = (ot N). Theorem 5.0.35. If f : {0, 1} is ivariat uder a trasitive group of permutatios, the R( f ) = ( 2/3 ). Match the R( f ) boud i Theorem 5.0.34 to that i Theorem 5.0.35! Proof of Theorem 5.0.35 usig probabilistic aalysis. Proof. Defiitio 5.0.36 (Ifluece of a coordiate o a fuctio). Let f : {0, 1} {0, 1}. Defie the ifluece of the i th coordiate o f as: I i ( f ) = Pr x [ f (x) f (x (i) )]. I words: choose a radom x ad what is the chace that flippig the i th bit will chage the output. Note that I i ( f ) looks like s x ( f ). Because of ivariace uder the trasitive group, we have: I 1 ( f ) = I 2 ( f ) = = I ( f ) = I ( f )/, where I ( f ) = i=1 I i ( f ). We will also make use of aother techique: 17
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS Defiitio 5.0.37 (Great Notatioal Shift). TRUE: 1 1; FALSE: 0 1 So, old x ew 1 2x, ad old 1 y 2 ew y Why are we doig this? Give fuctio f : { 1, 1} { 1, 1}, this is a sig vector istead of a bit vector. I this otatio also we have a uique multiliear polyomial. It is importat to ote that this otatio does ot chage the degree of the multiliear polyomial, though it might chage the coefficiets ad the umber of moomials. Example 5.0.38. AND(x, y, z) = x yz (old) I the ew otatio we have: Notice that the coefficiets of the liear terms = 2. ( ) ( ) ( ) 1 x 1 y 1 z AND(x, y, z) = 1 2 2 2 2 } {{ } ew old } {{ } old ew = 3 4 + x 4 + y 4 + z 4 xy 4 yz 4 xz 4 + xyz 4 Give that x takes ±1 values, we ca say somethig specific about I i ( f ): If f : { 1, 1} { 1, 1}, the: [ ] 1 I i ( f ) = E x 2 f (x x i = 1) f (x x i = 1) = 1 2 E x[ D i f (x) ] Let φ(x, y, z) = 3 4 + x 4 + y 4 + 4 z xy 4 yz 4 xz 4 + xyz 4 D 1 φ(y, z) = 2. 1 4 + higher degree terms E y,z [D 1 φ(y, z)] = 2. 1 4 + 0 1 2 E[φ(y, z)] = coefficiet of x For mootoe f : I i ( f ) = 1 2 E x[d i f (x)] = coefficiet of x i i polyomial represetatio of f Lemma1: For ay f : { 1, 1} { 1, 1} ad determiistic decisio tree T evaluatig f, we have: Var x [ f ] i=1 δ i I i ( f ), where δ i = Pr x [T queries x i o iput x]. Var x [ f ] = E x [ f (x) 2 ] (E[ f (x)]) 2 = 1 E x [ f (x)] 2 Note: If f is balaced (i.e., if Pr[ f (x) = 1] = Pr[ f (x) = 1]), the Var[ f ] = 1. For example, PARITY fuctio is balaced, AND is ot. We will prove Theorem 5.0.35 assumig f is balaced; we will fix this later. Recall that Lemma1 makes o assumptios about the fuctio f. If f is ivariat uder a trasitive group, the I i ( f ) = I ( f ), so from Lemma1 we get: Var[ f ] I ( f ) δ i i=1 = I ( f ) E[# variables queried] = I ( f ) D uiform( f )[by pickig optimal determiistic decisio tree for uiform distributio]. 18
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS Lemma2 [O Doell-Servedio]: For ay mootoe f : { 1, 1} { 1, 1}, I ( f ) D uif ( f ). From Lemma1 ad Lemma2, we get that if f is mootoe, trasitive, ad balaced, the: So ow we will prove lemmas 1 ad 2. Proof of Lemma1: 1 I ( f ) D uif( f ) D uif( f ) 3/2 R( f ) D uif ( f ) 2/3 Var[ f ] δ i I i ( f ) i=1 Var[ f ] = 1 E[ f (x)] 2 = 1 E x [ f (x)].e y [ f (y)] = 1 E x,y [ f (x) f (y)] = E x,y [1 f (x) f (y)] = E x,y [ f (x) f (y) ] That is, for ay sequece of radom variables {0, 1}, x = u [0], u [1],..., u [d] = y: Var[ f ] d 1 i=0 E[ f (u[i] ) f (u [i+1] ) ] (by triagle iequality) For example: Start: f (x 1, x 2, x 3, x 4, x 5 ) [= x = u [0] ] f (x 1, y 2, x 3, x 4, x 5 ) [tree T read x 2 i the above step] f (x 1, y 2, x 3, y 4, x 5 ) [tree T read x 4 above] Ad ow suppose the tree has reached a leaf so it outputs a value of f ad stops. Whe the tree stops, replace the remaiig x s with y s. So we get f (y 1, y 2, y 3, y 4, y 5 ) Note that goig from u [i] to u [i+1] is just re-radomizig, there is o coditioig ivolved because the variable which was queried (whose value got kow) got replaced by a radom variable. Expressio i summatio above is just E u [ f (u) f (u ( j) ) ] where x j is the variable queried by T at this step. Note that this looks a lot like the defiitio of ifluece, without the 1/2. E u [ f (u) f (u ( j) ) ]: read this as take a radom variable u ad re-radomize u by replacig the j th variable by tossig a coi agai. Sice with probability 1/2, the value of the j th variable got flipped or remaied the same, we get: E u [ f (u) f (u ( j) ) ] = 1 2 E u[ f (u) f (u ( j) ) ] = I j (u) So, if x j1, x j2,..., x jd is the radom sequece of variables read (ote that d is also radom), the: Var[ f ] (I j1 ( f ) + I j2 ( f ) +... + I jd ( f )) Pr[j 1, j 2,..., j d occurs] = I i ( f ) Pr[T queries x i ] i=1 Proof of Lemma2: Recall that for mootoe f, I i ( f ) = 2 1 E x[d i f (x)], which is also the coefficiet of x i i the represetig polyomial. Let T be a determiistic decisio tree for f ad suppose T outputs ±1 values. Let L = {leaves of T } ad L + = {leaves of T that output 1}, i.e., acceptig leaves. For each leaf λ L, we have a polyomial p λ (x 1, x 2,..., x ) such that { 1, if (x1, x p λ (x 1, x 2,..., x ) = 2,..., x ) reaches λ 0, otherwise 19
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS deg(p λ ) = depth of λ. coefficiet of x i i p λ =. 2 depth(λ) 1 { +1, if upo readig xi o path to λ, we brach right Sig of coefficiet = 1, otherwise Notatios: d(λ) = depth(λ) s(λ) = skew of λ Usig the above defiitios, we have: 1 = (#left braches #right braches) o path to λ I (p λ ) = (coefficiet of x i i p λ ) i=1 = s(λ) 2 d(λ) 1 Observe that represetig polyomial of f = 1 2 λ L + p λ. Therefore, I ( f ) = (coefficiet of x i i represetig polyomial of f) i=1 = λ L + λ L s(λ) 2 d(λ) s(λ) 2 d(λ) = E radom brachig [ s(λ) ] I ( f ) λ d(λ) 2 d(λ) = D ui f ( f ). The first iequality is due to drukard s walk i probability theory. Lecture dated 04/25/2008 follows: Today we will prove somethig for graph properties, which will sometimes be stroger tha what we proved last time: 1 E[ f (x)] 2 = V ar[ f (x)] i=1 δi T I i ( f ) I ( f ) D ui f ( f ) D ui f ( f ) 3/2. The first iequality is due to Lemma1 of last time, the secod iequality is due to symmetry, ad the last iequality is due to mootoicity ad Lemma2. If f is balaced, we get D ui f ( f ) 2/3. I the proof we did last time, we did ot use the full power of uiform distributio; we just used the fact that idividual poits are idepedet. If f is ot balaced, we get D ui f ( f ) (some value close to 0), which is ot a good lower boud. For geeral f, istead of assumig X ui f, we use X µ p, where µ p meas choose each x i = 1 with probability p / 1 with probability 1 p idepedetly. µ p (x 1, x 2,..., x ) = p N1(x) (1 p) N 1(x) Now, we will eed to have the followig modificatios: p(1 p) Dµp ( f ) 3/2 1 E[ f (x)] 2 = V ar[ f (x)] i=1 δi,p T I i,p( f ) I ( f ) D µp ( f ), where we get the last iequality due to geeralized drukard walk. If p = 0, E[ f (x)] = 1; if p = 1, E[ f (x)] = 1. So, itermediate value theorem tells us that p such that E[ f (x)] = 0. So we pick this p ad we get a boud for D µp 2/3. Although this boud is for D µp, Yao s miimax lemma tells us that for ay distributio, D µp is a lower boud for R( f ). This probability p where E[ f (x)] = 0 is called critical probability of f. Every mootoe fuctio has a well-defied critical probability. Today s theorem will be i terms of this critical probability. 20
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS Theorem 5.0.39. If f is a o-costat, mootoe graph property o -vertex graphs, the R( f ) = (mi{ p, 2 log }), where p is the critical probability of f. Note that we ca always assume p 1 2. Why? Because if the critical probability of f > 1/2, we ca always cosider g( x ) = 1 f (1 x ), the p (g) = 1 p ( f ). Critical probability p is the probability that makes E[ f (x)] = 1/2. Let us ow defie the key graph theoretic property graph packig that we eed for the proof. Graph packig is the basis for a lot of lower bouds for graph properties. Defiitio 5.0.40 (Graph Packig). Give graphs G, H with V (G) = V (H), we say that G ad H pack if bijectio φ : V (G) V (H) such that {u, v} E(G), {φ(u), φ(v)}otie(h). φ is called a packig. We state the followig theorem, but we will ot prove it: Theorem 5.0.41 (Sauer-Specer theorem). If V (G) = V (H) = ad (G) (H) 2, the G ad H pack. Here (G) = max degree i G. Note: you ca ever pack a 0-certificate ad a 1-certificate. What is the graph of a 0-certificate? It is a buch of thigs that are ot edges i the iput graph. Ituitio: Sauer-Specer theorem says that if two graphs are small, you ca pack them. Sice we caot pack a 0-certificate ad a 1-certificate, it implies that either of the two has to be big so that certificate complexity is big which gives you a hadle o the lower boud. Cosider a decisio tree; pick a radom iput x {0, 1} accordig to µ p ad ru the decisio tree o x. Let Y = #1 s read i the iput, ad Z = #0 s read i the iput. The, cost of the tree o radom iput = Y + Z. D µp ( f ) = E[cost] = E[Y ] + E[Z] By Yao s miimax lemma, it is eough to prove that either E[Y ] or E[Z] is (mi{ p, log 2 }). Note that Y ad Z are o-egative radom variables (sice they represet the umber of 0 s ad 1 s read), so it is eough to prove the lower boud o oe of them. Proof outlie (remember we ca assume p 1/2): Claim 1: E[Z] E[Y ] = 1 p p. if E[Y ] 64 the: E[Z] = 1 p p 64 128p = (/p) (mi{ p, 2 log }) we may assume E[Y ] 64. E[Y f (x) = 1 (G x ) p+ E[Y ] 4p log ], where G x = graph represeted Pr[ f (x)=1 (G x ) p+ 4p log ] by 1 s i the iput x (so other edges are off ). Theorem 5.0.42. E[X c] E[X] Pr[c] Proof. E[X] = E[X c] Pr[c] + E[X c] Pr[ c] E[X c] Pr[c] 21
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS Pr[A B] = 1 Pr[A B] 1 Pr[A] Pr[B] = Pr[A] Pr[B] Usig our assumptio, ad the iequality above, we get: E[Y f (x) = 1 (G x ) p + E[Y ] 4p log ] Pr[ f (x)=1...] /64 1 /32 + o() 2 1 Later: The costat 4 above is chose so that Pr[ (G x ) p + 4p log ] 1 1 (Cheroff boud) This implies that there exists a iput, x, satisfyig f (x) = 1 ad (G x ) p + 4p log, such that y /32 + o(). Therefore, there exists a 1-certificate with p + 4p log ad #edges /32 + o(). This meas that we are touchig at most /16 vertices, ad so at least /2 vertices are isolated. Claim 2: All 0-certificates must have at least 2 edges. 16(p+ 4p log ) Now we have a lower boud o the umber of edges of ay 0-certificate. As we did E[Y ], we ca say E[Z] E[Z f (x) = 0] Pr[ f (x) = 0] 2 16(p + 4p log ) 1 2 = 2 32(p + 4p log ) { } 2 mi 64p, 2 256 log { } mi 64p, 2 256 log Claim 2 Proof: This proof amouts to showig that if the claim were false, we could pack a 1-certificate ad 0-certificate. We will show that if (G) k ad G has more that /2 isolated vertices ad E(H) 16 (G) 2, the G ad Hpack. Number of edges i a graph equals half the sum of the degrees of the vertices, i.e., E(H) = 1 2 deg(v) v H = d avg(h) 2 d avg (H) = 2 E(H) = 8 (G) If H is a subgraph of H spaed by /2 lowest degree vertices, the (H ) 2 d avg (H).. Now, if you have a upper boud o the average, you ca boud the max of /2 smallest items. 22
LECTURE 5. RANDOMIZED DECISION TREE COMPLEXITY: BOOLEAN FUNCTIONS By packig H H with the isolated vertices of G, we ca exted a packig of H ad G to oe oe H ad G. We ca get a packig of H ad G because, (G ) (H ) (G) (H ) Apply Sauer-Specer theorem. Claim 1 Proof: Defie a buch of idicator variables. Let, 4 = /2 2. Y i = Z i = { 1, if xi is read as a 1. 0, otherwise { 1, if xi is read as a 0. 0, otherwise The, ( 2) ( 2) Y = Y i ; Z = Z i i=1 i=1 The it is eough to show that: i : { either E[Yi ] = E[Z i ] = 0 or E[Z i ] E[Y i ] = 1 p p Eough to show the same uder a arbitrary coditioig o x j ( j i) because x i s are idepedet. Apply a arbitrary coditioig of x j ( j i). If this coditioig implies that the i-th coordiate is ot read, it implies Y i = Z i = 0. Otherwise this is the oly coordiate that is left to be read; ad we kow Pr[1] = p ad Pr[0] = 1 p, i.e., E[Z i ] E[Y i ] = Pr[X i = 0] Pr[X i = 1] = 1 p. p What do you mea by eough to show uder arbitrary coditioig? If you have coditioig o rest of N 1 variables the we have C 1, C 2,..., C 2 N 1 coditios. This implies, E[Z i ] = E[Z i C 1 ] Pr[C 1 ] + E[Z i C 2 ] Pr[C 2 ] +... E[Y i ] = E[Y i C 1 ] Pr[C 1 ] + E[Y i C 2 ] Pr[C 2 ] +... We used a property of decisio trees that whether or ot you read ith variable is idepedet of the value of the ith variable; it may deped o what the values of other variables are. 23
Lecture 6 Liear ad Algebraic Decisio Trees Scribe: Umag Bhaskar 6.1 Models for the Elemet Distictess Problem Defiitio 6.1.1 (The Elemet Distictess Problem). Give items (x 1, x 2,, x ) from a uiverse U ot ecessarily ordered, does i j [] : x i x j? The complexity of the problem depeds o the model of computatio used. Model 0. Equality testig oly. This is a very weak model with o o-trivial solutios. The elemet distictess problem would require ( 2 ) comparisos. Model 1. Assume uiverse U is totally ordered. We ca compare ay two elemets x i ad x j ad take either of 3 braches depedig o whether x i > x j, x i = x j, or x i > x j. Note that i a computer sciece cotext this makes sese, as it is always possible to compare two objects by comparig their biary represetatios. I this model, the problem reduces to sortig, hece the elemet distictess problem ca be solved i O( log ) comparisos. I fact we have a matchig lower boud ( log ) for this problem, through a complicated adversarial argumet. Istead of describig the adversarial argumet we describe a stroger model. We the use this model to give us a lower boud, sice a lower boud for the problem i the stroger model is also a lower boud i this model. Model 2. We assume w.l.o.g. that the uiverse U = R with the usual orderig. The the compariso of two elemets x i ad x j is equivalet to decidig if x i x j is <, > or = 0. For this model we allow more geeral tests, such as is x 1 2x 3 + 4x 5 x 7 > 0, = 0, < 0 I geeral at a iteral ode v i the decisio tree, we ca decide whether p v (x 1, x 2,, x ) brach accordigly. Here p v (x 1, x 2,, x ) = a (v) 0 + 24 i=1 a (v) i x i.? > 0 = 0 < 0, ad
LECTURE 6. LINEAR AND ALGEBRAIC DECISION TREES 6.2 Liear Decisio Trees For model 2 as described above, we have the followig defiitio. Defiitio 6.2.1 (Liear Decisio Trees). A liear decisio tree is a 3-ary tree such that each iteral ode v ca > 0 decide whether p v (x 1, x 2,, x ) = 0, ad brach accordigly. The depth of such a tree is defied as usual. < 0 If we assume that our tree tests membership i a set, we ca cosider the output to be {0, 1}. The we ca thik of two subsets W 0, W 1 R where: 1. W i = { x R : tree outputs i} 2. W 0 W1 = 3. W 0 W1 = R Cosider a leaf λ of the tree. The the set S λ = {x R : x reaches λ} is described by a set of liear equality / iequality costraits, correspodig to the odes o the path from the root to the leaf. The S λ is a covex polytope, sice each liear equality / iequality defies a covex set, ad the itersectio of covex sets is itself a covex set. Covex sets are coected, i the followig sese: Defiitio 6.2.2 (Coected Sets). A set S R is said to be (path-) coected if a, b S, a fuctio p ab : [0, 1] S such that: p ab is cotiuous p ab (0) = a p ab (1) = b Suppose a liear decisio tree T has L 1 leaves that output 1 ad L 0 leaves that 0. The W 1 = λ outputs 1 For a set S, defie #S to be the umber of coected compoets of S. The, sice each leaf outputs at most a sigle coected compoet, S λ ad similarly, #W 1 L 1 #W 0 L 0 For a give fuctio f : R {0, 1} (for example, elemet distictess) we defie f 1 (1) = W 1 = { x : f (x) = 1}, ad f 1 (0) is similarly defied. The, ad, L 1 # f 1 (1) L 0 # f 1 (0) This implies for ay decisio tree that computes f, the umber of leaves L satisfies L = L 1 + L 0 # f 1 (1) + # f 1 (0) 25
LECTURE 6. LINEAR AND ALGEBRAIC DECISION TREES ad hece, height(t ) log 3 (# f 1 (1) + # f 1 (0)) We ow retur to the elemet distictess problem. Remember that { 1, if i j, xi x f (x 1, x 2,, x ) = j 0, o.w. It turs out that f 1 (1) has at least! coected compoets. Actually the correct umber is!, but we are oly iterested i the lower boud which we will prove. Lemma 6.2.3. σ, π S, the group of all permutatios, σ π, the poits x σ = (σ (1), σ (2),..., σ ()) ad x π = (π(1), π(2),..., π()) lie i distict coected compoets of f 1 (1). Proof. The proof is by cotradictio. Suppose ot, the there exists a path from x σ to x π lyig etirely iside f 1 (1). The, for permutatios σ, π, i j s.t. σ (i) < σ ( j) π(i) > π( j) Now x (σ ) ad x (π) lie o opposite sides of the hyperplae x i = x j. Hece ay path from x (σ ) to x (π) must cotai a poit x s.t. xi = x j, by the Itermediate Value Theorem ad f (x ) = 0, which gives us the required cotradictio. Suppose a liear decisio tree T solves elemet distictess. The D(elemet distictess) height(t ) log 3 (# f 1 (0) + # f 1 (1)) log 3 (!) = ( log ) Note that very little of this proof actually uses the liearity of the odes. We ca as well use other fuctios e.g. quadratic fuctios ad get the same results. 6.3 Algebraic Decisio Trees A algebraic decisio tree of order d is similar to a liear decisio tree, except the iteral odes evaluate polyomials of degree d, ad brach accordigly. The problem ow is that the compoets may o loger be coected, e.g. (x + y)(x y) > 0. However, it turs out that low degree polyomials are t too bad, as evideced by the followig theorem: Theorem 6.3.1 (Milor-Thom). Let p 1 (x 1, x 2,..., x m ), p 2 (x 1, x 2,..., x m ),..., p t (x 1, x 2,..., x m ) be polyomials with real coefficiets of degree k each. Let V = { x R m : p 1 ( x) = p 2 ( x) = = p t ( x) = 0} (called a algebraic variety. The the umber of coected compoets i V = #V k(2k 1) m 1. Milor cojectured that the lower boud for the umber of coected compoets i V should be k m. As a example, cosider the equalities p 1 (x) = (x 1 1)(x 1 2)... (x 1 k) = 0 p 2 (x) = (x 2 1)(x 2 2)... (x 2 k) = 0... p m (x) = (x m 1)(x m 2)... (x m k) = 0 The p 1 (x) = p 2 (x) = = p m (x) is satisfied at exactly k m poits, each of which forms a coected compoet. It follows from the theorem that each leaf gives at most k(2k 1) m 1 compoets. However, iteral odes are ot restricted to equalities ad ca evaluate iequalities. I the ext lecture we itroduce Be-Or s lemma, which hadles this case. We use this to get a lower boud for tree depth i such a model, i.e. where the iteral odes are allowed to perform algebraic computatios ad compute iequalities. 26
Lecture 7 Algebraic Computatio Trees ad Be-Or s Theorem Scribe: Umag Bhaskar 7.1 Itroductio Last time we saw the Milor-Thom theorem, which says that the followig Theorem 7.1.1 (Milor-Thom). Let p 1 (x 1, x 2,..., x m ), p 2 (x 1, x 2,..., x m ),..., p t (x 1, x 2,..., x m ) be polyomials with real coefficiets of degree k each. Let W = { x R m : p 1 ( x) = p 2 ( x) = = p t ( x) = 0}. The the umber of coected compoets i W = #W k(2k 1) (m 1). If istead of costat-degree polyomials we allowed arbitrary degree polyomials the it turs out that the elemet distictess problem could be doe i O(1) time! Cosider the expressio x = i< j (x i x j ) The x = 0 i j : x i = x j. Hece, this is obviously too powerful a model. I geeral such problems ca be posed as set recogitio problems, e.g. the elemet distictess ca be posed as recogisig the set W where W = { x R m : i, j (i j x i x j )}. Defie D li (W ) as the height of the shortest liear decisio tree that recogizes W. We saw i the last lecture that for this particular problem, D li (W ) log 3 (#W ). 7.2 Algebraic Computatio Trees Defiitio 7.2.1 (Algebraic Computatio Trees). A algebraic computatio tree (abbreviated ACT ) is oe with two types of iteral odes: 1. Brachig Nodes: labelled v:0 where v is a acestor of the curret ode 2. Computatio Nodes: labelled v op v where v, v are (a) acestors of the curret ode, (b) iput variables, or (c) costats 27
LECTURE 7. ALGEBRAIC COMPUTATION TREES AND BEN-OR S THEOREM ad op is oe of +,,,. The leaves are labelled ACCEPT or REJECT. The computatio odes i a algebraic computatio tree have 1 child, while brachig odes have 3 childre. For a set W R, we defie D alg (W ) = miimum height of a algebraic computatio tree recogizig W. Note that i this model, every computatio is beig idividually charged, hece a operatio such as computig i< j (x i x j ) would be charged O( 2 ). The Milor-Thom theorem is applicable to polyomials of low degree, ad it turs out that we still do ot exactly have that. For example, cosider a path of legth h i a tree. The the degree of the polyomial computed ca be 2 h. w 1 = x 1 x 1 w 2 = w 1 w 1 w 3 = w 2 w 2... w h = w h 1 w h 1 We eed to somehow boud the degree of the polyomial obtaied from such a path. Let v be a ode i a ACT ad let w deote its paret. I order to do this, with each ode v of a ACT, we associate a polyomial equality / iequality as follows: 1. If w = paret(v) is a brachig ode, the p v = p w 0, where the iequality cotais <, =, or > depedig o the brach take from w to reach v. 2. If w = paret(v) is a computatio ode of the form v 1 op v 2, the (a) If op = +,, or, the p v = p v1 op p v2 (b) if op =, the p v p v2 = p v2. At this poit, we ca eve itroduce the square root operator for our algebraic computatio tree so that if the ode is labelled v, the the correspodig polyomial iequality is pv 2 = p v. Let λ be a leaf of the ACT, ad we defie as usual S λ = { x R : x reaches λ}. Let w i deote the variable itroduced at ode i i the tree for obtaiig the polyomial equalities ad iequalities. The we ca write S λ = x R : w 1, w 2,..., w h R s.t. ( coditio at v) v acestor of λ where coditio at v is of the form p( x, w) = 0, > 0 or < 0 for some polyomial p of degree 2. I fact we do ot require iequalities of the form p( x, w) < 0, sice we ca always egate them. Now that we have bouded-degree polyomials, i order to apply Milor-Thom s theorem we eed to remove the iequalities. For this, coditios of the form p( x, w) 0 ca be rewritte as y : p( x, w) = y 2 This does ot icrease the degree of the polyomial, although ow we are movig the equatio to a higher dimesio space (through the itroductio of more variables). We ca move back to the lower dimesio space by simply projectig the solutio space to the lower dimesio. Note that this projectio fuctio is a cotiuous map. But we still do t kow how to hadle strict iequalities, which is all we have. I order to hadle strict iequalities, we proceed as follows. Suppose that the subset v R m is defied by a umber of polyomial equatios ad strict iequalities i.e. V = { z R m : p 1 ( z) = p 2 ( z) = = p t ( z) q 1 ( z) > 0, q 2 ( z) > 0,..., q u ( z) > 0)} The S λ = { x R : w R h ( x, w) V }, where V is of the the above form. Further, S λ is the projectio of V oto the first coordiates. Sice projectio is a cotiuous fuctio, 28
LECTURE 7. ALGEBRAIC COMPUTATION TREES AND BEN-OR S THEOREM #S λ #V Suppose V = V 1 V 2 V l, where the V i s are the coected compoets of V. Pick a poit z i V i, for each V i, the we have a poit i each coected compoet of V. Each z i is also i V, so it satisfies the equality ad iequality costraits, ad q 1 ( z i ), q 2 ( z i ),..., q u ( z i ) are all > 0. Let ε = mi q j ( z i ). Let i, j V ε = { z R m : p 1 ( z) = p 2 ( z) = = p t ( z) = 0 q 1 ( z) ε, q 2 ( z) ε,..., q u ( z) ε} Clearly V ε V. Also, V ε cotais every ( z i ), i.e., V ε cotais a poit i each compoet V i of V. Therefore, #V ε #V Now we ca replace the iequalities, ad we obtai the set V ε as V = { z R m : y R u (polyomial equatios of degree at most 2) } ad ote that V ε is the projectio of ( z, u) R m+u oto R m. The, #S λ #V #V ε #V 2(2.2 1) +h+u 1 dimesio: + h + h + h + u ad the last iequality is obtaied by applyig the Milor-Thom theorem, sice the costraits i V are equatios of degree 2, as required by Milor-Thom. Also, h = umber of computatio odes, ad u umber of brachig odes, hece h + u height of the tree. So, #S λ 2.3 +ht(t ) 1 3 +ht(t ) Usig the fact that #(A B) #A + #B ad W = S λ, we get that λ accepts #W #S λ λ accepts ad sice the umber of leaves 3 ht(t ), or, #W 3 ht(t ) 3 +ht(t ) +2 ht(t ) = 3 ht(t ) log 3 (#W ) 2 which is Be-Or s theroem. From this we ca derive a umber of results, such as, = (log(#w ) ) For elemet distictess sice #W =!, D alg (W ) = (log! ) = ( log ) Set equality, set iclusio, covex hull are all ( log ) Kapsack is ( 2 ) 29
Lecture 8 Circuits Scribe: Chrisil Arackaparambil We have previously looked at decisio trees that allow us to prove lower bouds o data access for a problem. We saw that the best possible lower boud provable i this settig is () (or if you care about costats). We ow look at a differet model of computatio called a circuit, that allows us to prove lower bouds o data movemet. Loosely speakig, this is the amout of data we will eed to move aroud i order to solve a problem. We will see that i this model, a () lower boud is i fact trivial. Circuits attempt to address the issus of the lack of super-costat lower bouds with the Turig Machie (TM) model, where the oly techique available is simulatio ad diagoalizatio ad we do ot kow how to aalyze the computatio. Usig circuits we will be able to look ito the iards of computatio. A circuit is built out of AND ( ), OR ( ) ad NOT ( ) gates alogwith wires ad has special desigated iput ad output gates. More formally, Defiitio 8.0.2 (Boolea Circuit). A Boolea circuit o Boolea variables x 1, x 2,..., x is a directed acyclic graph (DAG) with a desigated type for each vertex: 1. Iput vertices with idegree (fa-i) = 0. These are labelled with: x 1, x 1, x 2, x 2,..., x, x. 2. A output vertex with outdegree (fa-out) = 0. 3. AND ( ), OR ( ) gates with idegree = 2, ad NOT ( ) gates with idegree = 1. Sice a circuit is a DAG, we ca have a topological sort x 1,..., x, g 1 = x +1, g 2 = x +2,..., g s = x +s = y of the graph with the iput vertices as the first few vertices i the sort ad the output gate as the last vertex i the sort. The we ca iductively defie the value computed by the circuit as follows: 1. Value of iput vertex x i is the assigmet to variable x i. 2. Value of x j for ( j > ) is: value(x j1 ) value(x j2 ), if x j is a AND gate with iputs x j1 ad x j2. value(x j1 ) value(x j2 ), if x j is a OR gate with iputs x j1 ad x j2. value(x j1 ) if x j is a NOT gate with iput x j1. 3. Value computed by the circuit o a give iput assigmet is value(y). Every Boolea circuit computes a Boolea fuctio ad provides a computatioally efficiet ecodig of the fuctio. To compute f : {0, 1} {0, 1} (i.e. laguage L f {0, 1} ), we eed a circuit family C =1, where C computes f o iputs of legth. 30
LECTURE 8. CIRCUITS Defiitio 8.0.3 (Size of a circuit). For a circuit C, we defie its size as size(c) = # edges (wires) i C = (# gates i C). Sice we have idegree 2, the umber of wires is withi a factor of 2 of the umber of gates. Defiitio 8.0.4 (Depth of a circuit). For a circuit C, we defie its depth as depth(c) = maximum legth of a iput-output path. We will see later that size ad depth ca be traded off. We will look at lower bouds o size, depth ad size-depth tradeoffs. Theorem 8.0.5. If L P, the L has a polyomial sized circuit family i.e. ad size(c ) = poly(). C =1 such that x {0, 1} C (x) = 1 x L, Proof. Proof idea Build a circuit to simulate the Turig Machie for L. I fact, we get a circuit of size O((rutime of TM) 2 ). (See Sipser [Sip06], Chapter 9, pages 355 356). With this we have a pla to show P NP: 1. Pick your favorite NP-complete problem L. 2. By the above theorem, L has polyomial-size circuits if P = NP. 3. Prove a super-polyomial circuit size lower boud for L. However, aroud 25 years of effort i this lie of work has ot produced eve a super-liear lower boud for a NP-complete problem. The best kow lower boud for a problem i NP is 5 o(). What we ca prove however is that polyomial circuits of some restricted types caot solve certai problems efficietly. Theorem 8.0.6 (Shao s Theorem). Almost all fuctios f : {0, 1} {0, 1} require circuit size ( ( 1 + 1 ))). boud ca be improved to 2 ( ) 2. (The Proof. We ca assume that the circuit does ot use NOT gates. We ca get the effect of a NOT gate somewhere i the circuit by usig DeMorga s Laws ad havig all iputs as well as their egatios. We ca lay dow ay circuit i a topological sort, so that ay edge is from a vertex u to a vertex v that is after u i the sorted order. With this i mid, we ca cout the umber of circuits (DAGs) of size s. For each gate we have two choices for the type of the gate ad s 2 choices for iputs that feed the gate. So the total umber of such circuits is (2s 2 ) s s 3s. Each circuit computes a uique Boolea fuctio, so that the umber of fuctios that have a circuit of size 10 2 is ( 2 ) 3 2 10 2 3 2 /10 = 10 (10) 3 2 /10 = 2 3 2 10 3 2 10 log 10 = 2 2 ( 10 3 3 log 10 10 ) 2 < 2 310 But the total umber of fuctios f : {0, 1} {0, 1} is 2 2 2, so that lim 2 10 3 2 = 0. Observe that every fuctio f : {0, 1} {0, 1} has a circuit of size O( 2 ). We ca get this by writig f as a OR of miterms (DNF). We will eed at most 2 miterms. ) It is possible to improve the above boud to O. as ( 2 We will ow prove our first cocrete lower boud. This boud is for the fuctio THR k : {0, 1} {0, 1} defied { THR k 1, if the umber of 1s i x is k (x) = 0, otherwise 2. 31
LECTURE 8. CIRCUITS Theorem 8.0.7. Ay circuit for THR 2 must have 2 3 gates. Proof. Note that we have the trivial lower boud of 1. The lower boud i the theorem is proved by the techique of gate elimiatio. We will prove the theorem by iductio o. For the base case ( = 2), we have THR 2 2 = AND 2 for which we eed a circuit of size 1 = 2 3. Now, let C be a miimal circuit for THR 2. The C does ot have ay gate that reads iputs (x i, x i ) or (x i, x i ) or ( x i, x i ) for i []. Pick a gate i C such that its iputs are z i = x i (or x i ) ad z j = x j (or x j ), for some i j. The we ca claim that either z i or z j must have fa-out at least 2. To see this, ote that by suitably settig z i ad z j, we ca get three differet subfuctios o the remaiig 2 variables: THR 2 2, THR1 2 ad THR0 2. But if both z i ad z j have fa-out oly 1, the settigs of z i ad z j ca create oly two differet subcircuits, which gives us a cotradictio. Suppose x i (or x i ) has fa-out 2. The, settig x i = 0 elimiates at least two gates. So we have a circuit for THR 2 1 (the resultig subfuctio) with size (origial size) 2. Now by our iductio hypothesis, (ew size) 2( 1) 3 = 2 5. This implies that the origial size was 2+ (ew size) 2 3. 8.1 Ubouded Fa-i Circuits We study the class of fuctios that are computable by ubouded fa-i circuits. Here the DAG represetig a circuit is allowed to have ubouded idegree for vertices represetig AND, OR ad NOT gates. Defiitio 8.1.1. AC 0 is the class of O(1)-depth, polyomial size circuits with ubouded fa-i for AND, OR ad NOT gates. 8.2 Lower Boud via Radom Restrictios There are some assumptios we ca make about AC 0 circuits: 1. We do ot have ay NOT gates. As before, if a NOT gate is ideed required somewhere i a circuit, we ca get a equivalet circuit usig DeMorga s Laws that does ot have ay NOT gates, but uses the egatios of variables as iput gates. 2. The vertices (gates) of the circuit are partitioed ito d + 1 layers 0, 1,..., d such that: ay edge (wire) (u, v) is from vertex u i layer i to vertex v i layer i + 1 for some i, iputs x i, x i, i [] are at layer 0, the output is at layer d, each layer cosists of either all AND gates or all OR gates ad layers alterate betwee AND gates ad OR gates. d is the depth of the circuit. We ca covert the DAG for the circuit ito this form after a topological sort. First group the AND gates ad OR gates together ito layers. If there is a edge from a vertex u i layer i to a vertex v i layer j > i + 1, the we ca replace the edge with a path from u to v havig a vertex i each itermediate layer. Each ew vertex represets a appropriate gate for the layer it is i. For each edge i the origial circuit the umber of gates (wires) we add is at most the (costat) depth of the origial circuit. Note that costat-depth polyomial-sized circuits remai so eve after the above trasformatios, so we ca assume that AC 0 circuits have these properties. Now we are ready to prove our first lower boud o AC 0. Theorem 8.2.1 (Håstad [Hås86]). PAR / AC 0. Proof. We will prove this by iductio o the depth of the circuit. Suppose there exists a depth d, size c circuit for PAR. We will show that there exists a restrictio (partial assigmet) such that after applyig it 32
LECTURE 8. CIRCUITS the resultig subfuctio which is either PAR or PAR depeds o 1/4 variables. there exists a depth d 1, size O( c ) circuit for the resultig subfuctio. But by our iductio hypothesis this will ot be possible. For the base case, we will show that a circuit of depth 2 computig PAR or PAR must have > 2 1 gates. This will complete the proof. For the base, case let s assume without loss of geerality that level 1 is composed of OR gates ad level 2 of a AND gate producig the output. We claim that each OR gate must deped o every iput x i. For suppose this is ot the case. Some OR gate is idepedet of (say) x 1. The set x 2, x 3,..., x so as to make that OR gate output 0. The the circuit outputs 0, but the remaiig subfuctio is ot costat. This gives a cotradictio. Now, for each OR gate there is a uique iput that makes it output 0. For each 0-iput to PAR there must be a OR-gate that outputs 0. So the umber of OR gates is at least the umber of 0-iputs to PAR which is 2 1 To prove the iductio step, cosider the followig radom restrictio o the PAR fuctio. For each i [], idepedetly set x i 0, with probability 1 2 1 2 1, with probability 1 2 1, with probability 1 2. Here meas that the variable is ot assiged a value ad is free. Repeat this radom experimet with the resultig subfuctio. After oe restrictio we get that E[# of free variables] = 1 =. The by applyig a Cheroff boud we see that with probability 1 2 O(), we have 2 free variables. After two restrictios, with probability 1 exp ( ), we have 1/4 4 free variables. Let BAD 0 deote the evet that this is ot the case. We make two claims: ( ) Claim 1: After the first radom restrictio, with probability 1 O 1, every layer-1 OR gate depeds o 4c variables. Claim 2: If every layer-1 OR gate depeds o b variables, the after aother radom restrictio, with probability 1 O ( 1 ), every layer-2 AND gate depeds o l b variables (where l b is a costat depedig o b aloe). Now, we make the observatio that for a fuctio g : {0, 1} {0, 1} that depeds oly o l of its iputs, we have a depth-2 AND-of-OR s circuit ad a depth-2 OR-of-AND s circuit computig g both of size l 2 l. Let BAD ( 1 ) ad BAD 2 respectively deote the evets that Claim 1 ad Claim 2 do ot hold. The with probability 1 O 1 oe of the bad evets BAD 0, BAD 1 ad BAD 2 occur. So there exists a restrictio of the variables such that oe of the bad evets occur. That restrictio leaves us with a circuit o 1/4 4 variables ad by the above switchig argumet, we ca make layer-2 use oly OR gates ad layer-1 use oly AND gates. The we ca combie layers 2 ad 3 to reduce depth by 1, completig the iductio step. Proof of Claim 1. Cosider a layer-1 OR gate G. We have two cases: Fat case: Fa-i of G is 4c log. The with high probability G is set to be costat 1. To be precise, Pr[G is ot set to 1] Thi case: Fa-i of G is 4c log. ( 1 2 + 1 ) 4c log 2 ( ) 2 4c log 2 4c log = 3 = c log 81 16 < 2c. 3 ( ) ( ) 4c log 1 4c Pr[G depeds o 4c variables] (4c log ) 4c 2c 1.5c. 4c 33
LECTURE 8. CIRCUITS Applyig the Uio boud, Pr[ G that depeds o 4c variables] c 1.5c O ( ) 1. Proof of Claim 2. The proof is by iductio o b. For the base case b = 1, every layer-2 gate is really a layer-1 gate. So this case reduces to Claim 1 ad we ca set l 1 = 4c. For the iductio step, cosider a layer-2 AND gate G. Let M be a maximal set of o-iteractig OR-gates that are i G, where by iteractig we mea that two gates share a iput variable. Let V be the set of iputs read by gates i M. We agai have two cases: Fat case: V > a log. The I this case, M V b > a log b Pr[o OR-gate i M is set to 0] Thi case: V a log. I this case we have, so that Pr[a particular OR-gate i M is set to 0] ( 1 1 ) M 3 b (1 13 ) a log b b ( 1 2 1 ) b 2 ( ) 1 b. 3 a log e b 3 b a/(b 3b) O ( ) 1, for a = b 3 b. Pr[after restrictio V has i free variables] ( ) ( ) V 1 i i ( ) a log i/2 i (a log ) i i/2 i/3. Choose i = 4c. The with probability 1 4c/3, there are 4c free variables remaiig i V. This implies that for every oe of the 2 4c settigs of these free variables, the resultig circuit computes a ANDof-ORs with bottom fa-i b 1. This is because all OR gates that are ot i M iteract with V. This further implies that every oe of these 2 4c subfuctios will (after restrictio) deped o l b 1 variables. So the whole fuctio uder G depeds o 4c + 2 4c l b 1 variables. Set l b = 4c + 2 4c l b 1, to complete the iductio step. If we do our calculatios carefully through the steps of the iductio above, we could prove a lower boud of 2 (1/d). O the other had, we ca also costruct a depth-d circuit of size O( 2 1/(d 1) ). 8.3 Lower Boud via Polyomial Approximatios We will ow look at the Razborov-Smolesky Theorem which gives a alterate proof of PAR / AC 0 but also tells us more. This proof also takes a more holistic view of the circuit, rather tha lookig at just the gates at the lower layers. Theorem 8.3.1 (Razborov-Smolesky [Raz87, Smo87]). PAR / AC 0 (but we will show more). Proof. Sice p q = ( p q), we ca assume that our circuit cosists of OR ad NOT gates aloe. We will ot cout NOT gates towards depth. We will associate with each gate G of the depth-d circuit, a polyomial p G (x 1, x 2,..., x ) F 3 [x 1, x 2,..., x ] (F 3 is the field of elemets {0, 1, 2} uder additio ad multiplicatio modulo 3). p G will approximate the Boolea 34
LECTURE 8. CIRCUITS fuctio calculated by G i the followig sese: for all a {0, 1}, p G ( a) {0, 1} ad Pr a {0,1} [p G ( a) G( a)] small(l) 1, where l is a parameter to be fixed later. 2 Evetually, l we will get a polyomial that approximates parity ad has low degree, which we will see is a cotradictio. We defie a polyomial that approximates a OR gate i the followig maer. We wat to kow if the iput x to the OR gate is 0. Take a radom vector r R {0, 1} ad compute r x mod 3 = i [] r i x i mod 3 = i [],r i =1 x i mod 3. If x = 0, the r x mod 3 = 0 always. If x 0, the Pr r [ r x mod 3 = 0] 2 1. Pick S [], uiformly at radom (this is equivalet to pickig r R {0, 1} as above). Cosider the polyomial i S x i F 3 [x 1, x 2,..., x ]. If x = 0, this polyomial is always 0. Otherwise, this polyomial is ot 0 with probability 2 1, that is, the square of the polyomial is 1 with probability 1 2. If we pick S 1, S 2,..., S l [] uiformly ad idepedetly ad costruct the polyomial ( ) 2 ( ) 2 ( ) 2 1 1 1... 1, i S 1 x i i S 2 x i i S l x i the this is a polyomial of degree 2l ad for a particular iput, this choice is bad with probability 1 2 l. Now, by Yao s Miimax Lemma, we have that there exists a polyomial of degree 2l that agrees with OR except at 1/2 l fractio of the iputs. Now we topologically sort the circuit C to get the order: x 1 = g 1, x 2 = g 2,..., x = g, g +1, g +2,..., g s, where s = size(c). For i = 1 to s, we write a polyomial p gi correspodig to gate g i that approximates the Boolea fuctio computed at g i : For i, p gi = x i. If g i is a NOT gate with iput g j, the p gi = 1 p g j. If g i is a OR gate with iputs h 1, h 2,..., h k, the 2 l p gi = 1 1 p hm, m S j j=1 where S 1,..., S l are as obtaied above (usig Yao s Lemma). Let f = p gs be the polyomial correspodig to the output gate. The, 1 Pr f ( a) PAR( a)] (umber of OR gates) a {0,1} [ 2 l s 2 l. Also, deg(p gi ) (2l) depth(g i ). So, deg( f ) (2l) d. Set l such that (2l) d =, that is l = 1 2 1 2d. The we kow that f ( a) = PAR( a) for 2 ( 1 s 2 l ) vectors a {0, 1} ad deg( f ). Now we make the great otatioal chage: 0 FALSE +1 1 TRUE 1. The we have that there exists A { 1, 1} ad a polyomial ˆ f F 3 [x 1,..., x ] such that: 1. def( ˆ f ) = def( f ), ad 2. a A : ˆ f ( a) = ±1PAR( a) = i=1 a i. 35
LECTURE 8. CIRCUITS Cosider F3 A = {all fuctios : A F 3}, so that F3 A = F 3 A = 3 A. Pick a fuctio φ : A F 3. The φ has a multiliear polyomial represetatio g(x 1,..., x ) that agrees with φ o { 1, 1} (hece o A). The polyomial g(x 1,..., x ) is of the form (moomials) (coefficiets), where each moomial is of the form i I x i for some I []. Note that, i F 3, x i = x i x i = f ˆ(x 1,..., x ) i I i [] i []\I x i i []\I for x A (required for the last equality). The LHS above has degree I, while the RHS has degree + I. Applyig this to every moomial of degree > 2 +, we get a polyomial ĝ such that g(x 1,..., x ) = ĝ(x 1,..., x ) for x A with deg(ĝ) 2 +. Now, the umber of multiomial polyomials i F 3 [x 1,..., x ] of degree 2 + is 3 (#moomials of degree /2+ ) = 3 /2+ i=0 ( i) 3 0.98 2, where the fial iequality follows from cocetratio results. We have that for a radom variable X havig Biomial distributio with parameters ad p = 1 2, the iterval [ 2, 2 + ] is a 95% cofidece iterval. That is Pr [ 2 X 2 + ] = 0.95. The, because of the symmetric distributio aroud the mea 2, we have that [ 0, 2 + ] is a iterval of cofidece 97.5%. This gives us that 2 + ( ) i=0 i 0.98 2. ( Now, comparig the two expressios for the umber of fuctios, A 0.98 2. But A 2 1 ), s so 2 l that 1 s 2 l 0.98. This implies that s 0.02 2 l = 0.02 2 1 2 1 2d = 2 (1/d). I a circuit we ca also have MOD 3 gates. A MOD 3 gate takig iputs x 1,..., x produces output y = 1 if i x i mod 3 0 ad y = 0 otherwise. If we deote the class of such circuits as AC 0 [3] the the above proof also shows that PAR / AC 0 [3]. I geeral, if we have MOD m gates takig iputs x 1,..., x ad producig output y = 1 if i x i mod m 0 ad y = 0 otherwise the we have the followig defiitio. Defiitio 8.3.2. AC 0 [m] is the class of O(1)-depth, polyomial size circuits with ubouded fa-i usig AND, OR ad NOT ad MOD m gates. Note that PAR AC 0 [2]. The same proof as above also shows that MOD q / AC 0 [p], for primes p q. This proof is due to Smolesky [Smo87]. We do ot have ay idea of the computatioal power of AC 0 [6]. For istace we do ot kow whether or ot AC 0 [6] = NEXP. We also defie the class ACC 0 as Defiitio 8.3.3. ACC 0 is the class of O(1)-depth, polyomial size circuits with ubouded fa-i usig AND, OR ad NOT ad MOD m1, MOD m2,..., MOD mk gates, for some costats m 1, m 2,..., m k. We ca thik of ACC 0 as the class of AC 0 circuits but with couters. 36
Bibliography [BFP + 73] Mauel Blum, Robert W. Floyd, Vaugha R. Pratt, Roald L. Rivest, ad Robert Edre Tarja. Time bouds for selectio. JCSS, 7(4):448 461, 1973. [BJ85] Samuel W. Bet ad Joh W. Joh. Fidig the media requires 2 comparisos. I STOC, pages 213 216, 1985. [DZ99] Dorit Dor ad Uri Zwick. Selectig the media. SICOMP, 28(5):1722 1758, 1999. [DZ01] Dorit Dor ad Uri Zwick. Media selectio requires (2 + ε) comparisos. SIDMA, 14(3):312 325, 2001. [FG79] Frak Fusseegger ad Harold N. Gabow. A coutig approach to lower bouds for selectio problems. JACM, 26(2):227 238, 1979. [Hås86] Joha Håstad. Almost optimal lower bouds for small depth circuits. I STOC, pages 6 20, 1986. [Hya76] Lauret Hyafil. Bouds for selectio. SICOMP, 5(1):109 114, 1976. [Raz87] [Sip06] [Smo87] Alexader A. Razborov. Lower bouds o the size of bouded depth circuits over a complete basis with logical additio. Mathematical Notes of the Academy of Scieces of the USSR, 41(4):333 338, 1987. Michael Sipser. Itroductio to the Theory of Computatio, 2d editio. Thomso Course Techology, secod editio editio, 2006. Roma Smolesky. Algebraic methods i the theory of lower bouds for boolea circuit complexity. I STOC, pages 77 82, 1987. [SPP76] Arold Schöhage, Mike Paterso, ad Nicholas Pippeger. Fidig the media. JCSS, 13(2):184 199, 1976. 37