Which Networks Are Least Susceptible to Cascading Failures?


Larry Blume, David Easley, Jon Kleinberg, Robert Kleinberg, Éva Tardos

July 2011

Abstract. The resilience of networks to various types of failures is an undercurrent in many parts of graph theory and network algorithms. In this paper we study the resilience of networks in the presence of cascading failures: failures that spread from one node to another across the network structure. One finds such cascading processes at work in the kind of contagious failures that spread among financial institutions during a financial crisis, through nodes of a power grid or communication network during a widespread outage, or through a human population during the outbreak of an epidemic disease. A widely studied model of cascades in networks assumes that each node $v$ of the network has a threshold $\ell(v)$, and fails if it has at least $\ell(v)$ failed neighbors. We assume that each node selects a threshold $\ell(v)$ independently using a probability distribution $\mu$. Our work centers on a parameter that we call the $\mu$-risk of a graph: the maximum failure probability of any node in the graph, in this threshold cascade model parameterized by threshold distribution $\mu$. This defines a very broad class of models; for example, the large literature on edge percolation, in which propagation happens along edges that are included independently at random with some probability $p$, takes place in a small part of the parameter space of threshold cascade models, and one where the distribution $\mu$ is monotonically decreasing with the threshold. In contrast we want to study the whole space, including threshold distributions with qualitatively different behavior, such as those that are sharply increasing. We develop techniques for relating differences in $\mu$-risk to the structures of the underlying graphs. This is challenging in large part because, despite the simplicity of its formulation, the threshold cascade model has been very hard to analyze for arbitrary graphs $G$ and arbitrary threshold distributions $\mu$.
It turns out that when selecting among a set of graphs to minimize the $\mu$-risk, the result depends quite intricately on $\mu$. We develop several techniques for evaluating the $\mu$-risk of $d$-regular graphs. For $d = 2$ we are able to solve the problem completely: the optimal graph is always a clique (i.e. triangle) or tree (i.e. infinite path), although which graph is better exhibits a surprising non-monotonicity as the threshold parameters vary. When $d > 2$ we present a technique based on power-series expansions of the failure probability that allows us to compare graphs in certain parts of the parameter space, deriving conclusions including the fact that as $\mu$ varies, at least three different graphs are optimal among $d$-regular graphs. In particular, the set of optimal graphs here includes one which is neither a clique nor a tree.

Author notes (one per author):

Dept. of Economics, Cornell University, Ithaca NY 14853; IHS, Vienna; and the Santa Fe Institute. Supported in part by WWTF Grant Die Evolution von Normen and Konventionen in der Wirtschaft.

Dept. of Economics, Cornell University, Ithaca NY. Supported in part by NSF grant CCF.

Dept. of Computer Science, Cornell University, Ithaca NY. Supported in part by the MacArthur Foundation, a Google Research Grant, a Yahoo Research Alliance Grant, and NSF grants IIS, CCF, and IIS.

Dept. of Computer Science, Cornell University, Ithaca NY. Supported in part by NSF awards CCF and CCF, AFOSR grant FA, a Google Research Grant, an Alfred P. Sloan Foundation Fellowship, and a Microsoft Research New Faculty Fellowship.

Dept. of Computer Science, Cornell University, Ithaca NY. Supported in part by NSF grants CCF and CCF, ONR grant N, a Yahoo! Research Alliance Grant, and a Google Research Grant.
1 Introduction

The resilience of networks to various types of failures is an undercurrent in many parts of graph theory and network algorithms. For example, the definitions of cuts and expansion each capture types of robustness in the presence of worst-case edge or node deletion, while the study of network reliability is based on the question of connectivity in the presence of probabilistic edge failures, among other issues. In this paper we are interested in the resilience of networks in the presence of cascading failures: failures that spread from one node to another across the network structure. One finds such cascading processes at work in the kind of contagious failures that spread among financial institutions during a financial crisis [1], in the breakdowns that spread through nodes of a power grid or communication network during a widespread outage [3], or in the course of an epidemic disease as it spreads through a human population [2].

To represent cascading failures we use the following basic threshold cascade model, which has been studied extensively both in the context of failures and also in other settings involving social or biological contagion [6, 8, 9, 10, 11, 12, 13, 14] (footnote 1). We are given a graph $G$, and each node $v$ chooses a threshold $\ell(v)$ independently from a distribution $\mu$ on the natural numbers, choosing threshold $\ell(v) = j$ with probability $\mu(j)$. The quantity $\ell(v)$ represents the number of failed neighbors that $v$ can withstand before $v$ fails as well; thus we can think of $\mu$ as determining the distribution of levels of "health" of the nodes in the population, and hence implicitly controlling the way the failure process spreads on $G$. To determine the outcome of the failure process, we first declare all nodes with threshold 0 to have failed. We then repeatedly check whether any node $v$ that has not yet failed has at least $\ell(v)$ failed neighbors; if so, we declare $v$ to have failed as well, and we continue iterating.
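The failure process just described can be sketched in a few lines of Python. This is only an illustrative sketch for finite graphs: the function name `cascade`, the adjacency-dictionary representation, and the example thresholds are assumptions made here, not notation from the paper.

```python
from collections import deque


def cascade(adj, thresholds):
    """Return the set of failed nodes under the threshold cascade model.

    adj: dict mapping each node to a list of its neighbors (finite, undirected,
         no self-loops or repeated edges). Hypothetical representation.
    thresholds: dict mapping each node v to its threshold l(v) >= 0.
    """
    # Step 1: every node with threshold 0 fails immediately.
    failed = {v for v in adj if thresholds[v] == 0}
    failed_neighbors = {v: 0 for v in adj}
    queue = deque(failed)
    # Step 2: propagate; a node fails once it has at least l(v) failed neighbors.
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in failed:
                continue
            failed_neighbors[w] += 1
            if failed_neighbors[w] >= thresholds[w]:
                failed.add(w)
                queue.append(w)
    return failed


# Usage: the clique K_4 with thresholds 0, 1, 2, 3 fails completely.
k4 = {v: [w for w in range(4) if w != v] for v in range(4)}
print(sorted(cascade(k4, {0: 0, 1: 1, 2: 2, 3: 3})))  # [0, 1, 2, 3]
```

Because a node's count of failed neighbors can only grow, the order in which failed nodes are processed does not change the final set, which is why a simple queue suffices here.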
For example, Figure 1 shows the outcome of this process on two different graphs $G$ with particular choices of node thresholds. For a given node $r$ in $G$, we define its failure probability $f_\mu(G, r)$ to be the probability it fails when node thresholds $\ell(v)$ are drawn independently from $\mu$ and then the threshold cascade model is run with these thresholds. Now we let $f^*_\mu(G) = \sup_{r \in V(G)} f_\mu(G, r)$, namely, the maximum failure probability in $G$. We view $f^*_\mu(G)$ as our measure of the resilience of $G$ against cascading failures that operate under the threshold distribution $\mu$; accordingly, we refer to $f^*_\mu(G)$ as the $\mu$-risk of $G$, and we seek graphs of low $\mu$-risk.

Figure 1: The spread of failures on two graphs, (a) a clique and (b) a tree, according to the threshold cascade model. On each graph, the thresholds are drawn inside the nodes, and the nodes with thick borders are those that fail as a result of the process.

A Motivating Contrast: Cliques and Trees. How do different network structures compare in their resilience to a cascading failure? Because the failure probability clearly goes up as we add edges to a given node set, we take the top-level issue of edge density out of consideration by posing this question over the set of all (finite or infinite) connected $d$-regular graphs (footnote 2), for a fixed choice of $d$. We use $\mathcal{G}_d$ to denote this set of graphs, and for graphs in $\mathcal{G}_d$ we ask how they compare according to their $\mu$-risk. When we consider $\mathcal{G}_d$, we will also restrict the threshold distributions to the set of all distributions supported on $\{0, 1, 2, \ldots, d\}$, a set which we denote by $\Gamma_d$.

Footnote 1. The threshold cascade model is also related to the nonlinear voter model [7], though somewhat different in its specifics.

Footnote 2. Unless explicitly noted otherwise, all quantification over graphs in this paper takes place over the set of connected graphs only. This does not come at any real loss of generality, since the $\mu$-risk of a disconnected graph is simply the supremum of the $\mu$-risk in each connected component.

As a first concrete example of the kind of results to come, we consider a comparison between two basic $d$-regular graphs; the analysis justifying this comparison will follow from the framework developed in the paper. To begin with, for conjecturing structures that produce low $\mu$-risk, we can draw on intuitions from the motivating domains discussed above. A standard notion in epidemic disease is that it is dangerous to belong to a large connected component, and this suggests the clique $K_{d+1}$ as a resilient network. On the other hand, a principle in financial networks is that it is important to have diversity among one's neighbors (in the present context, a lack of edges among one's neighbors) so that shocks are uncorrelated. This suggests the infinite complete $d$-ary tree $T_d$ as a resilient network. (By way of illustration, note that if we were to continue the tree in Figure 1(b) indefinitely downward, we would have the complete 3-ary tree $T_3$.)

An intriguing point, of course, is that these two sources of intuition point in completely opposite directions. But as one consequence of the framework we develop here (in Section 4) we will see that both intuitions are essentially correct: each of $K_{d+1}$ or $T_d$ can be better than the other, for different choices of the threshold distribution. Specifically, we will show that there exist $\mu, \nu \in \Gamma_d$ such that $f^*_\mu(K_{d+1}) < f^*_\mu(T_d)$ and $f^*_\nu(T_d) < f^*_\nu(K_{d+1})$.

In fact, this trade-off between cliques and trees shows up in instructive ways on very simply parametrized subsets of the space $\Gamma_d$. For example, suppose we choose a very small value $\varepsilon > 0$, and for a variable $x$ we define $(\mu(0), \mu(1), \mu(2)) = (\varepsilon, x, 1 - \varepsilon - x)$ with $\mu(j) = 0$ for $j > 2$. Then when $x = 1 - \varepsilon$, so that all thresholds are either 0 or 1, a node's failure probability is strictly increasing in the size of the component it belongs to, and so $K_{d+1}$ uniquely minimizes the $\mu$-risk. At the other extreme, when $x = 0$, a short argument shows that $K_{d+1}$ uniquely optimizes the $\mu$-risk here too.
But as we prove in Section 5.1, it is possible to choose a value of $x$ strictly between 0 and $1 - \varepsilon$ for which $T_d$ has strictly lower $\mu$-risk than $K_{d+1}$. Moreover, the value of $x$ where $T_d$ has lower $\mu$-risk accords with the financial intuition about the value of diversity: it occurs when $x$ is very small, but significantly larger than $\varepsilon$, so that thresholds of 1 are much more numerous than thresholds of 0. In this case, failures are still rare, but if a node $u$ has connected neighbors $v$ and $w$, then there is a non-trivial risk that $v$ will have threshold 0 and $w$ will have threshold 1, at which point $v$'s failure will ricochet off $w$ and bring down $u$ as well, even if $u$ has the maximum (and most likely) threshold of 2. In this region of the space $\Gamma_d$ of threshold distributions, it is safer to have no links among your neighbors, even at the expense of producing very large connected components.

There is also an important qualitative message underlying this contrast: the question of which graph is more resilient against cascading failures depends sensitively on the way in which failure moves through the graph (via the mixture of thresholds determined by $\mu$).
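To make the contrast concrete in the smallest case, the sketch below evaluates the family $\mu = (\varepsilon, x, 1 - \varepsilon - x)$ for $d = 2$, where the clique is the triangle $K_3$ and the tree is the infinite path. It is an illustration under stated assumptions, not the paper's proof: the function names are hypothetical, the triangle is handled by enumerating all 27 labelings, and the path uses the closed form $s + t(1 - (1 - q)^2) + u q^2$ with $q = s/(1 - t)$, which follows from the case analysis the paper carries out for $d = 2$ in Section 3.

```python
from fractions import Fraction
from itertools import product


def triangle_risk(s, t, u):
    """Exact root failure probability on K_3, by enumerating all 27 labelings."""
    prob = {0: s, 1: t, 2: u}
    total = Fraction(0)
    for labels in product((0, 1, 2), repeat=3):
        failed = {v for v in range(3) if labels[v] == 0}
        changed = True
        while changed:
            changed = False
            for v in range(3):
                # In K_3 every other node is a neighbor, so the number of
                # failed neighbors of a surviving node is len(failed).
                if v not in failed and len(failed) >= labels[v]:
                    failed.add(v)
                    changed = True
        if 0 in failed:
            total += prob[labels[0]] * prob[labels[1]] * prob[labels[2]]
    return total


def path_risk(s, t, u):
    """Root failure probability on the infinite path (closed form, d = 2)."""
    if t == 1:
        return Fraction(0)  # every label is 1, so no node ever fails
    q = s / (1 - t)  # chance that the first non-1 label in a direction is 0
    return s + t * (1 - (1 - q) ** 2) + u * q ** 2


def risks(eps, x):
    """Return (K_3 risk, path risk) for mu = (eps, x, 1 - eps - x)."""
    s, t, u = eps, x, 1 - eps - x
    return triangle_risk(s, t, u), path_risk(s, t, u)


eps = Fraction(1, 1000)
for x in (Fraction(0), Fraction(1, 20), 1 - eps):
    clique, path = risks(eps, x)
    print(f"x = {x}: K_3 risk = {float(clique):.6g}, path risk = {float(path):.6g}")
```

In this $d = 2$ computation the path has strictly lower risk at the intermediate value $x = 1/20$, the triangle has strictly lower risk at $x = 1 - \varepsilon$, and the two graphs tie exactly at $x = 0$. Both graphs are vertex-transitive, so the root failure probability equals the $\mu$-risk.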
This contrast, and the reasons behind it, suggest that the space $\Gamma_d$ has a rich structure when viewed in terms of the $\mu$-risk it induces on graphs. Indeed, as we've just seen, even monotonic trade-offs between simple parameters of $\mu \in \Gamma_d$ can produce non-monotonic transitions between graphs: for example, with $K_{d+1}$ first being better, then worse, then better again compared to $T_d$ as we vary $x$ above.

Our overall plan in this paper is thus to develop techniques for relating differences in $\mu$-risk to the structures of the underlying graphs. This is challenging in large part because, despite the simplicity of its formulation, the threshold cascade model has been very hard to analyze for arbitrary graphs $G$ and arbitrary threshold distributions $\mu$. Existing results have made the strong assumption either that $\mu$ obeys a diminishing property (that threshold probabilities exhibit some form of monotonic decrease in the threshold size) [10, 12] or that the underlying graph $G$ is a tree [8, 14], a lattice [7], or a complete graph [9, 13]. In fact, even the existing techniques developed specifically for cliques and trees do not appear strong enough to identify the contrast discussed above, which emerges from our framework in Section 5.1. And for comparing graphs outside these special cases, very few tools are available; one of our motivating goals is to develop tools of this type.

It is also worth noting that the large literature on edge percolation, in which propagation happens along edges that are included independently at random with some probability $p$, deals with a particular class of models that, when viewed in terms of thresholds, have the diminishing property discussed above. This includes the large literature on $G_{n,p}$, viewed as random edge sets of the complete graph [5]; the authors' own recent work on network formation in the presence of contagion exclusively used a model based on this type of edge percolation [4].
The point is that for this special case, component size is the dominant effect, and so the graphs of minimum $\mu$-risk are essentially cliques; working in this part of the space thus does not enable one to look at trade-offs between "open" and "closed" neighborhoods as in our motivating discussion of $K_{d+1}$ vs. $T_d$. (As we will see, the constructions of $\mu \in \Gamma_d$ that favor $T_d$ indeed involve thresholds with a sharply increasing property over part of the support set; for certain applications, this increasing property is often viewed as crucial, which accords with the intuition discussed earlier.) Hence we need to look beyond models with an edge percolation structure to see things that even qualitatively resemble the phenomena we are trying to study.

Summary of Results. The contrast between $K_{d+1}$ and $T_d$ establishes that there is no single graph $H$ such that $H$ achieves the minimum $\mu$-risk for all distributions $\mu \in \Gamma_d$. It is thus natural to ask whether $K_{d+1}$ and $T_d$ are sufficient to jointly "cover" the space $\Gamma_d$, in the sense that at least one of them is optimal at each $\mu \in \Gamma_d$. More generally, we say that a (finite or infinite) set of graphs $\mathcal{H} = \{H_1, H_2, \ldots\} \subseteq \mathcal{G}_d$ is a sufficient set for $\Gamma_d$ if for each $\mu \in \Gamma_d$, at least one member of $\mathcal{H}$ achieves the minimum $\mu$-risk over all graphs in $\mathcal{G}_d$. In this terminology, our question becomes:

($*$) Does $\{K_{d+1}, T_d\}$ form a sufficient set for $\Gamma_d$?

One consequence of the results in the paper is a complete answer to Question ($*$). We find, in fact, that the answer to this question depends on the value of $d$. We begin with a fairly complete analysis of $\mu$-risk for the case of degree $d = 2$, answering Question ($*$) affirmatively in this case. While the set of graphs in $\mathcal{G}_2$ is clearly very simple (cycles of each length $\geq 3$, and the infinite path), the behavior of $\mu$-risk on $\mathcal{G}_2$ is still rich enough that the non-monotonic phenomenon discussed above takes place even between $K_3$ and $T_2$. (Observe that $T_2$, the infinite 2-ary tree, is better known as the infinite path.) We find in fact that at each $\mu$ with $0 < \mu(0) < 1$, at least one of $K_3$ or $T_2$ achieves strictly lower $\mu$-risk than every other graph in $\mathcal{G}_2 \setminus \{K_3, T_2\}$.

When $d > 2$, the behavior of $\mu$-risk on $\mathcal{G}_d$ becomes much more complicated. Here we establish that for each $d > 2$, the two graphs $\{K_{d+1}, T_d\}$ do not form a sufficient set for $\Gamma_d$. We do this by considering a graph that we call the ($d$-regular) tree of triangles $\Delta_d$, consisting essentially of a collection of disjoint triangles attached according to the structure of an infinite regular tree. ($\Delta_d$ is specified precisely in Section 5.2, and depicted schematically for the case $d = 3$ in Figure 2.) We construct a distribution $\mu \in \Gamma_d$ for which $\Delta_d$ has strictly lower $\mu$-risk than both $K_{d+1}$ and $T_d$. Intuitively, the tree of triangles "interpolates" between the complete neighborhood diversification of $T_d$ and the complete neighborhood closure of $K_{d+1}$, and hence points toward a further structural dimension to the problem of minimizing $\mu$-risk.

Despite the complex structure of $\mu$-risk when $d > 2$, we have a set of results making it possible to compare the $\mu$-risk of certain specific graphs to the $\mu$-risk of arbitrary graphs. In addition to the comparisons among $K_{d+1}$, $T_d$, and $\Delta_d$ described above, we establish the following further results for $K_{d+1}$ and $T_d$. First, as noted above, it is not hard to show that there are distributions $\mu \in \Gamma_d$ for which $K_{d+1}$ has strictly lower $\mu$-risk than any other $G \in \mathcal{G}_d$. A much more intricate argument establishes a type of optimality property for $T_d$ as well: for each graph $G \in \mathcal{G}_d$, we construct a distribution $\mu_G \in \Gamma_d$ for which $T_d$ has strictly lower $\mu_G$-risk than $G$. This is a broad generalization of the $T_d$ vs. $K_{d+1}$ comparison, in that it says that such a comparison is possible for every $G \in \mathcal{G}_d$: in other words, $T_d$ is more resilient than every other connected $d$-regular graph at some point in $\Gamma_d$.
Our analysis in fact establishes a strengthening of this result for $T_d$: for every finite set $\mathcal{H}$ of connected $d$-regular graphs, there is a distribution $\mu_{\mathcal{H}} \in \Gamma_d$ on which $T_d$ achieves strictly lower $\mu_{\mathcal{H}}$-risk than each member of $\mathcal{H}$. And this in turn yields a negative answer to a more general version of Question ($*$): when $d > 2$, there is no two-element sufficient set of graphs for $\Gamma_d$.

Our results for $d > 2$ are based on a unifying technique, motivated by the construction of the distribution $\mu = (\varepsilon, x, 1 - \varepsilon - x)$ used to compare $K_{d+1}$ and $T_d$ above. The technique is based on using power series approximations to study the $\mu$-risk for $\mu$ in the vicinity of particular threshold distributions; roughly speaking, it works as follows. We focus on cases in which the distribution $\mu$ concentrates almost all of its probability on a single threshold $\ell_{\max}$ and the remaining probability is divided up over values $j < \ell_{\max}$. The random draw of a threshold from $\mu$ in this case can be treated as a small perturbation of the fixed threshold distribution in which every node gets threshold $\ell_{\max}$ and no nodes fail. A given node's failure probability can then be expressed using a power series in the variables $\{\mu(j) \mid j < \ell_{\max}\}$, and the power series coefficients for different graphs provide enough information to compare them according to $\mu$-risk when the probabilities $\{\mu(j) \mid j < \ell_{\max}\}$ are sufficiently close to zero. The computation of the power series coefficients then reduces to a counting problem involving certain partial assignments of thresholds to nodes of $G$.

In addition to their role in our analyses, we believe that small perturbations of a single fixed threshold are a very natural special case to consider for the threshold cascade model. Specifically, let $\Gamma_d^h(x) \subseteq \Gamma_d$ be the set of distributions in $\Gamma_d$ such that $\mu(0) > 0$, $\mu(j) < x$ for $j < h$, and $\mu(j) = 0$ for $j > h$. (In other words, most of the probability mass is concentrated on $h$, and the rest is on values below $h$.)
Threshold distributions in $\Gamma_d^h(x)$ for small $x > 0$ correspond to scenarios in which all nodes begin with a fixed level of "health" $h$, and then a shock to the system causes a small fraction of nodes to fail, and a small fraction of others to be weakened, with positive thresholds below $h$. The study of $\mu$-risk on $\Gamma_d^h(x)$ corresponds simply to the question of which networks are most resilient to the effect of such shocks.

Overall, then, we believe that the techniques developed here suggest avenues for further progress on a set of basic questions involving the threshold cascade model, including sharper comparisons of the $\mu$-risk between different graphs, and how these comparisons depend both on $\mu$ and on the underlying graph structure.

2 Definition of the model

In the threshold cascade model, there is a graph $G$ (possibly infinite) in which each node $v$ randomly samples a label $\ell(v) \in \mathbb{N}$. Given a labeling $\ell$ of graph $G$, we define a subset $S \subseteq V(G)$ to be failure-stable if every node $v \notin S$ has strictly fewer than $\ell(v)$ neighbors in $S$. We define the set of failed nodes $\Phi(G, \ell)$ to be the intersection of all failure-stable node sets. Given a graph $G$ with root vertex $r$, and a distribution $\mu$ on node labels, we define the root failure probability to be the probability that $r \in \Phi(G, \ell)$ when $\ell$ is randomly sampled by assigning each node an independent label with distribution $\mu$. We denote the root failure probability by $f_\mu(G, r)$.

It is not hard to see that this definition of $\Phi(G, \ell)$ is equivalent to the one we used in the introduction, as stated by the following lemma.

Lemma 2.1. The set $\Phi(G, \ell)$ is failure-stable. It is also equal to the union of the infinite sequence of sets $\Phi_0(G, \ell) \subseteq \Phi_1(G, \ell) \subseteq \cdots$ defined inductively by specifying that $\Phi_0(G, \ell) = \{v \mid \ell(v) = 0\}$ and $\Phi_{i+1}(G, \ell) = \{v \mid \Phi_i(G, \ell) \text{ contains at least } \ell(v) \text{ neighbors of } v\}$. It is also equal to the set of all nodes $v \in V(G)$ such that $v \in \Phi(G_0, \ell)$ for some finite subgraph $G_0 \subseteq G$.

Proof. It is easy to see that the intersection of failure-stable sets is failure-stable, hence $\Phi(G, \ell)$ is failure-stable. The containment $\bigcup_i \Phi_i(G, \ell) \subseteq \Phi(G, \ell)$ is obvious from the definition of $\Phi_i(G, \ell)$. To establish the reverse containment, it suffices to show that $\bigcup_i \Phi_i(G, \ell)$ is failure-stable.
This holds because any $v$ having $\ell(v)$ or more neighbors in $\bigcup_i \Phi_i(G, \ell)$ must also have that same number of neighbors in $\Phi_k(G, \ell)$ for some sufficiently large $k$; it then follows that $v \in \Phi_{k+1}(G, \ell)$. If $G_0$ is any finite subgraph of $G$ and $v \in \Phi(G_0, \ell)$, then $v \in \Phi(G, \ell)$ since $\Phi(G_0, \ell) \subseteq \Phi(G, \ell)$. To prove the converse, we will show that the set of all $v$ such that $v \in \Phi(G_0, \ell)$ for some finite $G_0$ is a failure-stable set. Indeed, suppose that $v$ has neighbors $w_1, \ldots, w_{\ell(v)}$, each belonging to a finite subgraph $G_0(w_i)$ such that $w_i \in \Phi(G_0(w_i), \ell)$. Then $v \in \Phi\left(\bigcup_{i=1}^{\ell(v)} G_0(w_i), \ell\right)$, as desired.

3 The case $d = 2$

In this section, we specialize to 2-regular undirected graphs $G$. For any such graph, one can define a permutation $R$ of the vertex set such that for every $v \in V(G)$, the set of neighbors of $v$ is $\{R(v), R^{-1}(v)\}$. The following algorithm RootFail processes a labeling $\ell$ of $G$ and outputs "fail" if and only if the root vertex $r$ belongs to $\Phi(G, \ell)$.

The algorithm works as follows. First it inspects the label $\ell(r)$: if this is not equal to 1 or 2, then it halts instantly and outputs "fail" if and only if $\ell(r) = 0$. Otherwise, find the least $i > 0$ such that $\ell(R^i(r)) \neq 1$ and the least $j > 0$ such that $\ell(R^{-j}(r)) \neq 1$. Let $\ell_+ = \ell(R^i(r))$ and $\ell_- = \ell(R^{-j}(r))$. If $i$ is undefined, then set $i = \infty$ and $\ell_+ = \infty$. Similarly, if $j$ is undefined then set $j = \infty$ and $\ell_- = \infty$. Now, if $\ell(r) = 1$, output "fail" if and only if $\ell_+ = 0$ or $\ell_- = 0$. If $\ell(r) = 2$, output "fail" if and only if $\ell_+ = 0$ and $\ell_- = 0$. Define the length of an execution of this algorithm to be equal to $i + j$. (Note that if $i = \infty$ or $j = \infty$, the algorithm RootFail will not actually halt. For this reason, an actual implementation of RootFail would have to be more careful to inspect the vertices in interleaved order $R(r), R^{-1}(r), R^2(r), R^{-2}(r), \ldots$ until it can prove that the root must fail. Such an implementation is not guaranteed to halt, but when processing any labeling $\ell$ such that $r \in \Phi(G, \ell)$ it is guaranteed to halt after a finite number of steps and output "fail".)

The key to analyzing the root failure probability in 2-regular graphs is the following observation: there is a probabilistic coupling of the labelings $\ell_P$ of the infinite path $P$ and the labelings $\ell_C$ of the $n$-cycle $C = C_n$, such that for every sample point at which RootFail($P, \ell_P$) has execution length less than $n$, RootFail($C, \ell_C$) also has execution length less than $n$ and the two executions are identical.

We now define some events on the sample space of this coupling. For any $k$, let $E_k$ denote the event that RootFail($P, \ell_P$) has execution length at least $k$. Let $F_P$ denote the event that $r \in \Phi(P, \ell_P)$ and let $F_C$ denote the event that $r \in \Phi(C, \ell_C)$. Since the executions of RootFail($P, \ell_P$) and RootFail($C, \ell_C$) are identical on the complement of $E_n$, we find that

$$\Pr(F_P) - \Pr(F_C) = \Pr(E_n) \left[ \Pr(F_P \mid E_n) - \Pr(F_C \mid E_n) \right].$$

We now proceed to compute each of the conditional probabilities on the right-hand side. Let $s, t, u$ denote the label probabilities $\mu(0), \mu(1), \mu(2)$, respectively. Let $q = \frac{s}{1 - t}$, which is the conditional probability that the label of any node is 0, given that its label is not 1. Then we have

$$\Pr(F_P \mid E_n) = \frac{t}{t + u} \left( 1 - (1 - q)^2 \right) + \frac{u}{t + u} \, q^2.$$

The first term on the right accounts for the case that $\ell(r) = 1$ and the second term accounts for the case that $\ell(r) = 2$.
After some manipulation (pulling out $\frac{t}{t+u} q$ from the first term and $\frac{u}{t+u} q$ from the second one) we obtain the formula

$$\Pr(F_P \mid E_n) = q + \frac{t - u}{t + u} \left( q - q^2 \right).$$

To compute $\Pr(F_C \mid E_n)$, note that when $E_n$ occurs, the root's label is either 1 or 2, and at most one of the remaining labels is not equal to 1. Furthermore, in any such labeling of $C$, the root fails if and only if one of the other $n - 1$ nodes has label 0. Thus,

$$\Pr(E_n) = (t + u) \left[ t^{n-1} + (n - 1)(1 - t) t^{n-2} \right]$$

$$\Pr(E_n \cap F_C) = (t + u)(n - 1) s t^{n-2}$$

$$\Pr(F_C \mid E_n) = \frac{(n - 1) s}{t + (n - 1)(1 - t)} = q - \frac{q t}{t + (n - 1)(1 - t)}$$

$$\Pr(F_P \mid E_n) - \Pr(F_C \mid E_n) = \frac{t - u}{t + u} \left( q - q^2 \right) + \frac{q t}{t + (n - 1)(1 - t)}$$

$$\Pr(F_P) - \Pr(F_C) = \Pr(E_n) \left[ \frac{t - u}{t + u} \left( q - q^2 \right) + \frac{q t}{t + (n - 1)(1 - t)} \right].$$

On the last line, both factors are decreasing functions of $n$. Consequently, when they are both positive, their product is a decreasing function of $n$. In other words, if an $n$-cycle is better than an infinite path, then an $(n - 1)$-cycle is better still. We have thus proved Theorem 3.1, stated below.
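As a numerical sanity check on the identity for $\Pr(F_P) - \Pr(F_C)$ derived above, the following sketch compares it against a brute-force computation on small cycles using exact rational arithmetic. The function names are hypothetical, the brute force enumerates all $3^n$ labelings (so it is only practical for small $n$), and the path probability uses $\Pr(F_P) = s + t(1 - (1 - q)^2) + u q^2$, which is the conditional formula above combined with the case $\ell(r) = 0$.

```python
from fractions import Fraction
from itertools import product


def cycle_root_failure(n, s, t, u):
    """Exact Pr(F_C) on the n-cycle (n >= 3) by enumerating all 3**n labelings."""
    prob = (s, t, u)
    total = Fraction(0)
    for labels in product((0, 1, 2), repeat=n):
        failed = [lab == 0 for lab in labels]
        changed = True
        while changed:
            changed = False
            for v in range(n):
                if not failed[v]:
                    # Count failed neighbors on the cycle (True counts as 1).
                    k = failed[(v - 1) % n] + failed[(v + 1) % n]
                    if k >= labels[v]:
                        failed[v] = True
                        changed = True
        if failed[0]:  # node 0 plays the role of the root r
            weight = Fraction(1)
            for lab in labels:
                weight *= prob[lab]
            total += weight
    return total


def path_root_failure(s, t, u):
    """Pr(F_P) on the infinite path, assuming t < 1."""
    q = s / (1 - t)
    return s + t * (1 - (1 - q) ** 2) + u * q ** 2


def predicted_gap(n, s, t, u):
    """Right-hand side of the identity: Pr(E_n) times the bracketed factor."""
    q = s / (1 - t)
    pr_en = (t + u) * (t ** (n - 1) + (n - 1) * (1 - t) * t ** (n - 2))
    bracket = (t - u) / (t + u) * (q - q * q) + q * t / (t + (n - 1) * (1 - t))
    return pr_en * bracket


s, t, u = Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)
for n in (3, 4, 5):
    gap = path_root_failure(s, t, u) - cycle_root_failure(n, s, t, u)
    print(n, gap == predicted_gap(n, s, t, u))
```

Using `Fraction` rather than floats means the two sides can be compared for exact equality instead of within a tolerance.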
Theorem 3.1. For each $\mu \in \Gamma_2$, at least one of the 3-cycle or the infinite path has minimum $\mu$-risk over all graphs in $\mathcal{G}_2$.

4 Computing Failure Probabilities Via Power Series

When $d > 2$, the method of the preceding section does not appear to be applicable. In effect, since the breadth-first search of such a graph builds a tree which, at any stage of the search, may have more than two leaves (in fact, an unbounded number of them), there are many more opportunities for correlation as different leaves of the tree are discovered to refer to the same node of $G$. For this reason, an analysis along the lines of Section 3 seems hopeless. Instead we specialize to cases in which the distribution $\mu$ concentrates almost all of its probability on a single label $\ell_{\max}$ and the remaining probability is divided up over labels $j < \ell_{\max}$. We then express the $\mu$-risk as a power series in the probabilities $\{\mu(j) \mid j < \ell_{\max}\}$, which allows us to compare different graphs according to their low-degree power series coefficients.

4.1 Definitions

We now present the definitions that we need, followed by a description of the power series for the root failure probability and its convergence properties. Throughout this section, we will illustrate the definitions on a very simple graph: a 3-node path, with the root $r$ placed at the middle node, and we let $v$ and $w$ be the two other (leaf) nodes of the path.

Throughout this section and the following ones, we will assume that labels take values in the set $\{0, \ldots, \ell_{\max}\}$ for some fixed positive integer $\ell_{\max}$. For purposes of our example, we assume that $\ell_{\max}$, where most of the probability is concentrated, is equal to 2: $\mu(0) = s$ and $\mu(1) = t$ are small positive numbers, and $\mu(2) = 1 - s - t$ is close to 1.

We will compute failure probabilities by working with partial node labelings $\lambda$, in which labels are assigned to only some of the nodes, i.e., a partial function $\lambda$ from $V(G)$ to $\{0, \ldots, \ell_{\max}\}$.
Its domain of definition, $\mathrm{Dom}(\lambda)$, is the set of all $v \in V(G)$ such that $\lambda(v)$ is defined; when $\mathrm{Dom}(\lambda) = V(G)$ we refer to $\lambda$ as a full labeling or simply a labeling. We say that a partial labeling $\lambda$ is an explanation of root failure (ERF) if the root fails in every full labeling of $G$ that agrees with $\lambda$ on $\mathrm{Dom}(\lambda)$. We say that $\lambda$ is a minimal explanation of root failure (MERF) if it is an ERF, and every proper sublabeling of $\lambda$ is not an ERF. Note that $\mathrm{Dom}(\lambda)$ is a finite set whenever $\lambda$ is a MERF, by Lemma 2.1.

Thus, on the three-node path with $r$ in the middle, there are four MERFs:

(a) assigning 0 to $r$;
(b) assigning 1 to $r$ and 0 to $v$;
(c) assigning 1 to $r$ and 0 to $w$; and
(d) assigning 0 to $v$ and $w$.

We can think of partial labelings as events in the full sample space of labelings, and (a)-(d) are thus four events that cover the event that $r$ fails. Hence the probability that $r$ fails is bounded above by the sum of the probabilities of these four events, which is $s + 2st + s^2$.

To get the precise failure probability of $r$, we need to incorporate inclusion-exclusion terms arising from overlaps in these four MERFs. In our example, there are two distinct labelings that correspond to such overlaps:

(i) assigning 0 to all three nodes: this arises when events (a) and (d) both occur, so it contributes $-s^3$ to the probability.
(ii) assigning 1 to $r$ and 0 to both $v$ and $w$: this arises when any two out of (b), (c), and (d) occur, and also when all three occur. By the inclusion-exclusion formula, this contributes $-3s^2 t + s^2 t = -2s^2 t$ to the probability, with the first term coming from two-way overlaps and the second term coming from the three-way overlap.

Putting all this together, we get the root failure probability for the small example: $s + 2st + s^2 - s^3 - 2s^2 t$.

MERFs give rise to such overlaps when they are compatible. We say that two partial labelings $\lambda_1, \lambda_2$ are compatible if $\lambda_1(v) = \lambda_2(v)$ for every $v \in \mathrm{Dom}(\lambda_1) \cap \mathrm{Dom}(\lambda_2)$. The union of two compatible partial labelings $\lambda_1, \lambda_2$ is the unique partial function $\lambda$ such that

$$\{(v, \lambda(v)) \mid v \in \mathrm{Dom}(\lambda)\} = \{(v, \lambda_1(v)) \mid v \in \mathrm{Dom}(\lambda_1)\} \cup \{(v, \lambda_2(v)) \mid v \in \mathrm{Dom}(\lambda_2)\}.$$

For notational reasons, it will be convenient to make the union operation into a binary operation that is defined for any pair of partial labelings, not only for compatible pairs. To do so, we define the set $\Lambda$ to be a set consisting of all partial labelings, together with one special element denoted $\bot$ that is interpreted to be incompatible with every element of $\Lambda$, including itself. We extend the union operation to a binary operation on $\Lambda$ by specifying that $\lambda_1 \cup \lambda_2 = \bot$ when $\lambda_1$ and $\lambda_2$ are incompatible. For a partial labeling $\lambda$, we define $E(\lambda)$ to be the set of all full labelings that extend $\lambda$; note that $E(\bot) = \emptyset$, and that for every two partial labelings $\lambda_1, \lambda_2$ we have the relation $E(\lambda_1) \cap E(\lambda_2) = E(\lambda_1 \cup \lambda_2)$.

For the inclusion-exclusion formula, we'll need to think about finite unions of MERFs, which we'll call UMERFs. For a graph $G$ with root vertex $r$, we will denote the set of all MERFs by $\mathcal{M}(G, r)$ and the set of all UMERFs by $\mathcal{U}(G, r)$. We will sometimes abbreviate these to $\mathcal{M}, \mathcal{U}$ when the identity of the graph and root vertex are obvious from context.

We can now describe the plan for arbitrary graphs, including infinite ones, when $\mu(j) = s_j$ are small numbers for $j < \ell_{\max}$, and $\mu(\ell_{\max}) = 1 - \sum_{j=0}^{\ell_{\max} - 1} s_j$. We first show that when $\ell_{\max} > d/2$, for any vector of natural numbers $\mathbf{i} = (i_0, i_1, \ldots, i_{\ell_{\max} - 1})$, there are only finitely many MERFs that assign $i_k$ nodes a label of $k$, for $k = 0, \ldots, \ell_{\max} - 1$.
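The three-node example is small enough to check by exhaustive search. The sketch below enumerates all partial labelings of the path $v - r - w$, keeps the minimal explanations of root failure, and compares the exact failure probability with the polynomial $s + 2st + s^2 - s^3 - 2s^2 t$. All names (`root_fails`, `is_erf`, `merfs`, and so on) are hypothetical helpers introduced for illustration; minimality is tested by deleting one label at a time, which suffices because any extension of an ERF is again an ERF.

```python
from fractions import Fraction
from itertools import product

NODES = ("r", "v", "w")
ADJ = {"r": ("v", "w"), "v": ("r",), "w": ("r",)}  # path v - r - w
LABELS = (0, 1, 2)  # l_max = 2


def root_fails(labeling):
    """Run the threshold cascade on a full labeling; report whether r fails."""
    failed = {x for x in NODES if labeling[x] == 0}
    changed = True
    while changed:
        changed = False
        for x in NODES:
            if x not in failed and sum(y in failed for y in ADJ[x]) >= labeling[x]:
                failed.add(x)
                changed = True
    return "r" in failed


def is_erf(partial):
    """True if the root fails in every full labeling that extends `partial`."""
    free = [x for x in NODES if x not in partial]
    for values in product(LABELS, repeat=len(free)):
        full = dict(partial)
        full.update(zip(free, values))
        if not root_fails(full):
            return False
    return True


def merfs():
    """All minimal explanations of root failure, as dicts node -> label."""
    found = []
    for values in product((None,) + LABELS, repeat=len(NODES)):
        partial = {x: a for x, a in zip(NODES, values) if a is not None}
        if is_erf(partial) and all(
            not is_erf({y: b for y, b in partial.items() if y != x})
            for x in partial
        ):
            found.append(partial)
    return found


def exact_failure_probability(s, t):
    """Sum the probability of every full labeling in which the root fails."""
    prob = {0: s, 1: t, 2: 1 - s - t}
    total = Fraction(0)
    for values in product(LABELS, repeat=len(NODES)):
        if root_fails(dict(zip(NODES, values))):
            total += prob[values[0]] * prob[values[1]] * prob[values[2]]
    return total


for m in merfs():
    print(m)
s, t = Fraction(1, 10), Fraction(1, 20)
print(exact_failure_probability(s, t) == s + 2*s*t + s**2 - s**3 - 2*s**2*t)
```

The search recovers exactly the four MERFs (a)-(d) listed in the text, and the exact probability agrees with the inclusion-exclusion polynomial at the sample point.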
Moreover, we can write the root's failure probability as a multivariate power series of the form $\sum_{\mathbf{i}} a_{\mathbf{i}} \, s_0^{i_0} s_1^{i_1} \cdots s_{\ell_{\max} - 1}^{i_{\ell_{\max} - 1}}$, and this power series has a positive radius of convergence. We use this to compare failure probabilities in different graphs by enumerating a finite set of terms in the power series until we identify a difference between them.

4.2 A power series for computing the root failure probability

We make the set of all labelings $\ell$ into a probability space by declaring the labels $\{\ell(v) \mid v \in V(G)\}$ to be independent random variables with common distribution $\mu$. The measurable sets in this probability space are the $\sigma$-field generated by the sets $E(\lambda)$, where $\lambda$ ranges over all partial labelings of $G$. By Lemma 2.1, whenever the root fails there is a MERF that explains the failure, i.e. the event $r \in \Phi(G, \ell)$ is the union of the events $E(\lambda)$ for $\lambda \in \mathcal{M}$. Since $\mathcal{M}$ is a countable set, we can choose an arbitrary one-to-one correspondence $m : \mathbb{N} \to \mathcal{M}$. Then

$$\Pr(r \in \Phi(G, \ell)) = \Pr\left( \bigcup_{i=1}^{\infty} E(m(i)) \right) = \lim_{n \to \infty} \Pr\left( \bigcup_{i=1}^{n} E(m(i)) \right). \tag{1}$$

Each of the probabilities on the right-hand side can be expanded using the inclusion-exclusion formula:

$$\Pr\left( \bigcup_{i=1}^{n} E(m(i)) \right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} \Pr\left( E(m(i_1)) \cap \cdots \cap E(m(i_k)) \right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} \Pr\left( E(m(i_1) \cup \cdots \cup m(i_k)) \right). \tag{2}$$

The right-hand side of (2) is easy to evaluate: using variables $s_i$ ($i = 0, \ldots, \ell_{\max}$) to denote the values $s_i = \mu(i)$, the probability of the event $E(\lambda)$ for any partial labeling is given by

$$\Pr(E(\lambda)) = \prod_{v \in \mathrm{Dom}(\lambda)} s_{\lambda(v)} = s^{\lambda}, \tag{3}$$

where this is taken as the definition of $s^{\lambda}$. Combining (2) and (3), and regrouping the terms, we get the following lemma.

Lemma 4.1.

$$\Pr\left( \bigcup_{i=1}^{n} E(m(i)) \right) = \sum_{\lambda \in \mathcal{U}} \sum_{k=1}^{n} (-1)^{k+1} a^{k,n}_{\lambda} \, s^{\lambda}. \tag{4}$$

Here, $a^{k,n}_{\lambda}$, for a UMERF $\lambda$ and integers $1 \le k \le n$, is defined to be the number of distinct $k$-tuples $(i_1, \ldots, i_k)$ such that $1 \le i_1 < \cdots < i_k \le n$ and $\lambda = m(i_1) \cup \cdots \cup m(i_k)$.

4.3 Convergence of the power series

To take the limit as $n \to \infty$ and obtain a well-defined power series, it is necessary to have a finiteness theorem that justifies that the coefficient of $s^{\lambda}$ eventually stabilizes as $n$ grows. In fact, in order for the power series to have positive radius of convergence the coefficients must grow no faster than exponentially. Proving such bounds requires bounding the number of UMERFs of a given size. In general this is not possible: for some graphs and some settings of the parameter $\ell_{\max}$, the number of UMERFs of a specified size is not even finite. As a simple example, consider an infinite path and $\ell_{\max} = 1$; there are infinitely many MERFs $\lambda$ consisting of a single node labeled with 0. This example generalizes to any positive even degree $d$: the graph $G$ is formed from an infinite sequence of independent sets of size $d/2$, with every two consecutive such independent sets being joined by a complete bipartite graph. When $\ell_{\max} = d/2$, there are infinitely many MERFs obtained by taking one of the independent sets in the sequence and labeling all of its nodes with 0. Each of these MERFs $\lambda$ has $\mathbf{i}(\lambda) = (d/2, 0, \ldots, 0)$.
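The infinite-path example with $\ell_{\max} = 1$ can be illustrated on a finite window of the path. The sketch below is only an illustration under assumptions made here: the function name and the window radius are hypothetical, and the window stands in for the infinite path, which is harmless for showing failure because, by Lemma 2.1, a node that fails in a finite subgraph also fails in the whole graph.

```python
def root_fails(radius, zero_at=None):
    """Cascade on path nodes -radius..radius with root 0 and default label 1.

    If zero_at is given, that single node is labeled 0 instead of 1.
    """
    labels = {v: 1 for v in range(-radius, radius + 1)}
    if zero_at is not None:
        labels[zero_at] = 0
    failed = {v for v in labels if labels[v] == 0}
    changed = True
    while changed:
        changed = False
        for v in labels:
            if v in failed:
                continue
            # Neighbors of v inside the window.
            neighbors = [w for w in (v - 1, v + 1) if w in labels]
            if sum(w in failed for w in neighbors) >= labels[v]:
                failed.add(v)
                changed = True
    return 0 in failed


radius = 10
# Labeling any single node with 0 already forces the root to fail,
# while the empty partial labeling (all defaults) causes no failure.
print(all(root_fails(radius, k) for k in range(-radius, radius + 1)))
print(root_fails(radius))
```

Each single-node labeling is therefore a minimal explanation of root failure of size 1, and widening the window produces as many of them as desired, which is the obstruction to finiteness described above.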
The remainder of this section is devoted to specifying some sufficient conditions under which the right-hand side of Equation (4) can be rewritten as a power series with positive radius of convergence. For any partial labeling $\lambda$, we define its size $|\lambda| = |\mathrm{Dom}(\lambda)|$ to be the number of nodes it labels. We begin by identifying some sufficient conditions under which we can assert that for every partial labeling $\lambda$, the number of nodes that are guaranteed to fail in every labeling extending $\lambda$ is at most $O(|\lambda|)$.

Lemma 4.2. Suppose we are given a graph $G$, a default threshold $\ell_{\max}$, and a partial labeling $\lambda$. Let $\bar\lambda$ be the full labeling that extends $\lambda$ by assigning label $\ell_{\max}$ to each node not labeled by $\lambda$, and let $F = \Phi(G, \bar\lambda)$.
1. If $G$ is $d$-regular and $d < 2\ell_{\max}$ then $|F| \le (d+1)|\lambda|$.
2. Suppose that for every node $v$ of $G$, every connected component of $G \setminus \{v\}$ contains strictly fewer than $\ell_{\max}$ neighbors of $v$. Then $|F| < 2|\lambda|$.

Proof. Arrange the elements of $F$ into a sequence $v_1, v_2, \ldots$ such that each of the sets $\Phi_i(G, \bar\lambda)$ is an initial segment of the sequence. Thus, each $v \in F$ has at least $\bar\lambda(v)$ neighbors that precede it in the sequence. We can think of the sequence $v_1, v_2, \ldots$ as specifying a possible order in which the nodes of $F$ failed in an execution of the threshold cascade model. To prove both parts of the lemma we will define a potential function that maps vertex sets to non-negative integers, then evaluate the potential function on each initial segment of the sequence, and consider how the value of the potential function changes every time a new node fails (i.e., is added to the initial segment). We will use two different potential functions corresponding to the two parts.

For Part 1 define $\varphi(S)$, for any vertex set $S$, to be the number of edges of $G$ having one endpoint in $S$ and the other in its complement. Each time a new node $v_k$ fails, it increases the value of $\varphi$ by at most $d$ since it has only $d$ neighbors. Furthermore, if $v_k \notin \mathrm{Dom}(\lambda)$ then $v_k$ has at least $\ell_{\max}$ neighbors that precede it in the sequence and at most $d - \ell_{\max}$ that succeed it. Thus, the net change in $\varphi$ is bounded above by $(d - \ell_{\max}) - \ell_{\max}$, which is at most $-1$ by our assumption that $d < 2\ell_{\max}$. The potential function $\varphi$ thus starts at 0, increases by at most $d|\lambda|$ over the whole sequence of failures, and is never negative; hence there can be at most $d|\lambda|$ steps of the sequence when it strictly decreases, and therefore at most $d|\lambda|$ nodes in $F \setminus \mathrm{Dom}(\lambda)$. Consequently $|F| \le (d+1)|\lambda|$.

For Part 2, we instead use the potential function $\psi(S)$ defined as the number of connected components in the induced subgraph $G[S]$. Each time a new node $v_k$ fails, it increases $\psi$ by at most 1. Now consider how $\psi$ changes when a node $w \notin \mathrm{Dom}(\lambda)$ fails.
Since $\bar\lambda(w) = \ell_{\max}$, we know that $w$ has at least $\ell_{\max}$ neighbors that precede it in the sequence. By our assumption on the structure of $G$, at least two of these neighbors belong to different connected components of $G \setminus \{w\}$. These components merge together when $w$ fails, causing $\psi$ to decrease by at least 1. Since the initial value of $\psi$ is 0 and its final value is strictly positive, and it increases by at most 1 in each step, we know that the number of steps in which $\psi$ increases must be greater than the number of steps in which it decreases. Hence, $|F \setminus \mathrm{Dom}(\lambda)| < |\lambda|$, implying $|F| < 2|\lambda|$ as claimed.

The next lemma provides a simple method for bounding the number of UMERFs of size $z$ by an exponential function of $z$.

Lemma 4.3. Suppose, for a given graph $G$ and default threshold $\ell_{\max}$, that there exists a constant $c$ such that every partial labeling $\lambda$ satisfies $|\Phi(G, \bar\lambda)| \le c|\lambda|$. Then for every $z$, the number of UMERFs of size $z$ is at most $(d+1)^{3cz}$. In particular, this upper bound is at most $(d+1)^{3(d+1)z}$ whenever one of the sufficient conditions in Lemma 4.2 holds.

Proof. Let $\lambda$ be a partial labeling and let $F = \Phi(G, \bar\lambda)$. If $\lambda$ is a MERF, then $F$ induces a connected subset of $G$, since otherwise we could remove the labels provided by $\lambda$ in any component of $G[F]$ not containing the root $r$ and arrive at a proper sublabeling of $\lambda$ that is also an ERF. This implies that if $\lambda$ is a UMERF, the set $G[F]$ must also be connected, since it is the union of a finite set of connected graphs all containing a common node $r$. We can describe any such $F$ uniquely by specifying the sequence of edge labels (each indexed from 1 to $d$) that are taken in the $2|F|$ steps of a depth-first search traversal of $G[F]$ starting from $r$. Hence there are at most $d^{2|F|} \le d^{2c|\lambda|}$ such sets. As each UMERF of size $|\lambda|$ is uniquely associated with such a set $F$ together with a labeling
of its nodes, we obtain an upper bound of $d^{2cz}(1 + \ell_{\max})^{cz}$ on the number of UMERFs of size $z$. The lemma follows because $1 + \ell_{\max} \le d + 1$.

Assume for the remainder of this section that $G$ and $\ell_{\max}$ satisfy one of the two sufficient conditions in Lemma 4.2; thus, the hypothesis of Lemma 4.3 holds with $c = d + 1$. The conclusion of Lemma 4.3 is already enough for us to be able to express the series on the right-hand side of Equation (4) via a more useful indexing. First, for any UMERF $\lambda$, let $\mathbf{i}(\lambda)$ denote the vector of natural numbers $\mathbf{i} = (i_0, i_1, \ldots, i_{\ell_{\max}})$ such that $\lambda$ assigns exactly $i_k$ nodes a label of $k$. The corresponding event $E(\lambda)$ has probability $s^{\lambda} = s_0^{i_0} s_1^{i_1} \cdots s_{\ell_{\max}}^{i_{\ell_{\max}}}$, a quantity we will abbreviate as $s^{\mathbf{i}}$. For any vector of natural numbers $\mathbf{i} = (i_0, i_1, \ldots, i_{\ell_{\max}})$, let $|\mathbf{i}| = \sum_{k=0}^{\ell_{\max}} i_k$; the number of UMERFs $\lambda$ with $\mathbf{i}(\lambda) = \mathbf{i}$ is bounded by the expression in Lemma 4.3, with $z = |\mathbf{i}|$ and $c = d + 1$. Moreover, any MERF $\lambda'$ that appears in a union of MERFs forming $\lambda$ must have a vector $\mathbf{i}(\lambda')$ that is coordinatewise dominated by $\mathbf{i}(\lambda)$, and hence Lemma 4.3 implies that only a finite set of MERFs can appear in unions that form $\lambda$. It follows that the sequence of coefficients $a^{k,n}_{\lambda}$ eventually stabilizes as $n \to \infty$; that is, for every $\lambda, k$ there is an integer $a^{k}_{\lambda}$ and a threshold $n_0$ such that $a^{k,n}_{\lambda} = a^{k}_{\lambda}$ for all $n \ge n_0$. Thus we can group together all UMERFs $\lambda$ with $\mathbf{i}(\lambda) = \mathbf{i}$ and write

$$\Pr\Big(\bigcup_{i=1}^{\infty} E(m(i))\Big) = \sum_{\mathbf{i}} \; \sum_{\substack{\lambda \in U \\ \mathbf{i}(\lambda) = \mathbf{i}}} \; \sum_{k} (-1)^{k+1} a^{k}_{\lambda}\, s^{\mathbf{i}} = \sum_{\mathbf{i}} a_{\mathbf{i}}\, s^{\mathbf{i}}, \tag{5}$$

where the right-hand side should be taken as the definition of $a_{\mathbf{i}}$, and the grouping by $\mathbf{i}$ in the sum on the right-hand side is justified by the fact that in the preceding triple summation, the sums over $\lambda$ and $k$ range over finite sets. If we can show that $|a_{\mathbf{i}}|$ depends only exponentially on $|\mathbf{i}|$, this will establish that the power series has a positive radius of convergence.
We observe that if the third summation weren't present in Equation (5), and instead we only were summing over $k = 1$ (corresponding to MERFs), then such an exponential upper bound would follow directly from Lemma 4.3. It follows that to show an exponential upper bound on $|a_{\mathbf{i}}|$, it is sufficient, for each fixed UMERF $\lambda$ with $\mathbf{i}(\lambda) = \mathbf{i}$, to show that $\big|\sum_{k} (-1)^{k+1} a^{k}_{\lambda}\big|$ is bounded above by an exponential function of $|\mathbf{i}|$. To do this, we consider the (potentially very large) set of all MERFs $\lambda_1, \ldots, \lambda_m$ that can appear in a union forming $\lambda$. Let $\mathrm{Dom}(\lambda) = D$, with $|D| = n$, and $\mathrm{Dom}(\lambda_j) = D_j$. For each subset of $k$ of these MERFs whose union equals $D$, we get a term $(-1)^{k+1}$ in the sum we are bounding. We would like to show that the absolute sum of all these terms is bounded above by an exponential function of $n$, but since there could be many more than this many terms in the sum, we need an argument that actually exploits the cancellation among terms of the form $(-1)^{k+1}$, rather than naively treating each as potentially having the same sign. The upper bound we need follows from our next lemma.

Lemma 4.4. Let $D$ be an $n$-element set, and let $D_1, \ldots, D_m$ be (not necessarily distinct) subsets of $D$. Let $C$ be the collection of all subsets $J \subseteq \{1, \ldots, m\}$ for which $\bigcup_{j \in J} D_j = D$. Then

$$\Big| \sum_{J \in C} (-1)^{|J|} \Big| \le 2^{n}.$$

(The crucial point is that the right-hand side is independent of $m$.)
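The statement of Lemma 4.4 is easy to sanity-check by brute force on small instances. The sketch below is illustrative only (the function name `signed_cover_sum` is hypothetical, and it assumes every listed set really is a subset of the ground set): it enumerates every index set $J$, keeps those whose union is exactly $D$, and adds up the signs $(-1)^{|J|}$.

```python
from itertools import combinations


def signed_cover_sum(ground, subsets):
    """Sum of (-1)^|J| over all index sets J whose subsets have union exactly `ground`."""
    ground = frozenset(ground)
    total = 0
    for k in range(len(subsets) + 1):
        for J in combinations(range(len(subsets)), k):
            union = frozenset().union(*(subsets[j] for j in J))
            if union == ground:
                total += (-1) ** k
    return total


# Six copies of the same one-element set: 63 covering index sets, but the signs cancel.
print(signed_cover_sum({0}, [{0}] * 6))                # -1
print(signed_cover_sum({0, 1}, [{0}, {1}, {0, 1}]))    # 1
```

The first call illustrates the point made in the lemma: the number of covering index sets grows with $m$, yet the signed sum stays within $2^{n}$ in absolute value.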
Proof. We prove this by induction on $n$, with the case of $n = 1$ being easy. For $n > 1$, choose any element $x \in D$ and let $D' = D - x$. We define $C_0$ to be the collection of all $J \subseteq \{1, \ldots, m\}$ for which $\bigcup_{j \in J} D_j \supseteq D'$, and $C_1$ to be the collection of all $J \subseteq \{1, \ldots, m\}$ for which $\bigcup_{j \in J} D_j = D'$. Now, by the induction hypothesis applied to the sets $D'$ and $\{D_j - x : j = 1, 2, \ldots, m\}$, we have $\big|\sum_{J \in C_0} (-1)^{|J|}\big| \le 2^{n-1}$. By the induction hypothesis applied to the sets $D'$ and $\{D_j : x \notin D_j\}$, we have $\big|\sum_{J \in C_1} (-1)^{|J|}\big| \le 2^{n-1}$. Finally, $C_1 \subseteq C_0$ and $J \in C$ if and only if $J \in C_0 - C_1$, so we have

$$\sum_{J \in C} (-1)^{|J|} = \sum_{J \in C_0} (-1)^{|J|} - \sum_{J \in C_1} (-1)^{|J|},$$

from which it follows that $\big|\sum_{J \in C} (-1)^{|J|}\big| \le 2^{n}$.

Putting these bounds together, we see that $|a_{\mathbf{i}}|$ is bounded above by an exponential function of $|\mathbf{i}|$, and hence:

Theorem 4.5. If $d < 2\ell_{\max}$, the power series in Equation (5) has a positive radius of convergence. The power series also has a positive radius of convergence if for every node $v$, every connected component of $G \setminus \{v\}$ contains strictly fewer than $\ell_{\max}$ neighbors of $v$.

5 Comparing Cliques, Trees, and Trees of Triangles

5.1 Comparing $T_d$ to $K_{d+1}$

In the introduction, we noted that it is easy to identify two distinct settings of the parameters for $\mu$ for which $K_{d+1}$ has uniquely optimal $\mu$-risk among connected $d$-regular graphs. First, when $\ell_{\max} = 1$, the probability the root fails is monotonic in the size of the connected component that contains it, and $K_{d+1}$ uniquely minimizes this for connected $d$-regular graphs. But $K_{d+1}$ is also uniquely optimal for larger values of $\ell_{\max}$, when $\mu$ assigns every label to be either 0 or $\ell_{\max}$. Indeed, in this case, the only way the root can fail in $K_{d+1}$ is if at least $\ell_{\max}$ of its neighbors fail. This event also causes the root to fail in any connected $d$-regular graph $G$, but when $G \ne K_{d+1}$ there are other positive-probability events that also cause the root to fail, so again $K_{d+1}$ is uniquely optimal. As a first application of our power-series technique, we now show that there are parameter settings for which $T_d$ has lower root failure probability than $K_{d+1}$.
For this comparison, we consider $\mu$ such that $\ell_{\max} = 2$, and label 0 has probability $s$, while label 1 has probability $t$, where $s$ and $t$ are small quantities that will be defined precisely later. Observe that when $\ell_{\max} = 2$, $T_d$ satisfies the hypothesis of Lemma 4.2, Part 2, and hence its power series has a positive radius of convergence. The power series for $K_{d+1}$ is actually a polynomial in $s$ and $t$, since $K_{d+1}$ is a finite graph, so its radius of convergence is infinite.

Let us work out some of the low-degree terms for $T_d$ and for $K_{d+1}$. For $T_d$, the coefficient on the term $s$ is 1, corresponding to the MERF in which the root gets labeled 0. The coefficient on the term $st$ is $d$, corresponding to MERFs in which the root gets labeled 1 and any one of the root's neighbors gets labeled 0. There are no inclusion-exclusion corrections contributing to either of these coefficients. For $K_{d+1}$, the coefficient on the term $s$ is 1, as in $T_d$, corresponding to the root getting labeled 0. However, the coefficient on the term $st$ is $d^2$: there are $d$ MERFs in which the root gets labeled 1 and any one of the root's neighbors gets labeled 0; there are also $d(d-1)$ more MERFs in which one neighbor of the root gets labeled 0 and another gets labeled 1. Now, suppose we set $s = t^3$. Then the power series for the root failure probability in $T_d$ is $t^3 + d\,t^4 + O(t^5)$, whereas the power series for the root failure probability in $K_{d+1}$ is $t^3 + d^2 t^4 + O(t^5)$.
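Because $K_{d+1}$ is finite, the clique expansion above can be checked by exhaustive enumeration. The following is a rough sketch under stated assumptions rather than the authors' procedure: `failed_set` and `root_failure_probability` are hypothetical names, node 0 plays the role of the root, and exact rational arithmetic is used so that the $st$ coefficient can be read off from $(f - s)/(st)$ for a small $t$ with $s = t^3$.

```python
from fractions import Fraction
from itertools import product


def failed_set(adj, labels):
    """Run the threshold cascade to its fixed point and return the set of failed nodes."""
    failed = {v for v, label in labels.items() if label == 0}
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v not in failed and sum(w in failed for w in adj[v]) >= labels[v]:
                failed.add(v)
                changed = True
    return failed


def root_failure_probability(d, s, t):
    """Exact failure probability of node 0 in K_{d+1} with thresholds 0, 1, 2."""
    nodes = range(d + 1)
    adj = {v: [w for w in nodes if w != v] for v in nodes}
    mu = {0: s, 1: t, 2: 1 - s - t}
    total = Fraction(0)
    for assignment in product(mu, repeat=d + 1):  # every full labeling of the clique
        labels = dict(zip(nodes, assignment))
        if 0 in failed_set(adj, labels):
            weight = Fraction(1)
            for label in assignment:
                weight *= mu[label]
            total += weight
    return total


t = Fraction(1, 1000)
s = t ** 3
for d in (3, 4):
    f = root_failure_probability(d, s, t)
    # (f - s) / (s t) should be close to the st coefficient d**2 when t is small.
    print(d, float((f - s) / (s * t)))
```

The same enumeration does not apply directly to $T_d$, which is infinite; there the coefficients come from the MERF counts in the text.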
[Figure 2: The tree of triangles $\Delta_d$ for $d = 3$.]

The $O(t^5)$ estimate of the error term is valid inside the power series' radius of convergence. Hence, for $t$ sufficiently small and $s = t^3$, we find that $f_\mu(T_d) < f_\mu(K_{d+1})$. We have thus shown

Theorem 5.1. For each $d \ge 3$, there exists a $\mu \in \Gamma_d$ for which $T_d$ has strictly lower $\mu$-risk than $K_{d+1}$.

5.2 Comparing $\Delta_d$ to $K_{d+1}$ and $T_d$

We now show that when $d > 2$, the graphs $\{K_{d+1}, T_d\}$ do not form a sufficient set for $\Gamma_d$. We do this by establishing the following theorem.

Theorem 5.2. For each $d \ge 3$, there exists a $\mu \in \Gamma_d$ for which the $d$-regular tree of triangles $\Delta_d$ has strictly lower $\mu$-risk than either $T_d$ or $K_{d+1}$.

The $d$-regular tree of triangles $\Delta_d$ is a graph consisting of a collection of disjoint triangles connected according to the structure of $T_{3d-6}$: it is the graph obtained by starting from $T_{3d-6}$ and replacing each node $u$ (with neighbors $v_1, \ldots, v_{3d-6}$ in $T_{3d-6}$) by three nodes $\{u_1, u_2, u_3\}$. These three nodes $u_1, u_2, u_3$ are mutually connected into a triangle, and $u_i$ is also connected to one node in each of the triangles that replaces $v_j$, for $j = (i-1)(d-2) + 1, \ldots, i(d-2)$. We draw a small portion of $\Delta_d$'s repeating structure, in the case $d = 3$, in Figure 2.

We construct the distribution $\mu$ in Theorem 5.2 from a small perturbation of the fixed threshold $\ell_{\max} = 3$. To analyze the root failure probability in $\Delta_d$ in this case, we first observe that its power series has a positive radius of convergence for all $d \ge 3$, since $\Delta_d$ satisfies the hypothesis of Lemma 4.2, Part 2. (A connected component of $\Delta_d \setminus \{v\}$ can contain at most 2 neighbors of $v$.) Thus, we can compare the root failure probabilities in $\Delta_d$, $K_{d+1}$, and $T_d$ by comparing low-degree terms in their power series, as we did when we compared $K_{d+1}$ with $T_d$ in Section 5.1. Because the calculations are somewhat lengthy, we present them in Appendix A rather than in the main text.

6 Comparing $T_d$ to an arbitrary $d$-regular graph

In Section 5.1 we compared $f_\mu(T_d)$ with $f_\mu(K_{d+1})$, for $d \ge 3$, when $\mu$ is a small perturbation of $\ell_{\max} = 2$; that is, when $(\mu(0), \mu(1), \mu(2)) = (s, t, 1 - s - t)$.
We saw that the tree has strictly
lower $\mu$-risk than the clique when $t$ is sufficiently small and $s$ is sufficiently small relative to $t$. Generalizing this, the same power-series technique can be used to show that for any connected $d$-regular graph other than $T_d$, one can find a setting of $s, t > 0$ such that $f_\mu(T_d) < f_\mu(G)$. This will establish the following theorem, the proof of which is the main focus of the present section.

Theorem 6.1. For each $d \ge 3$ and each graph $G \in \mathcal{G}_d$, there exists a $\mu_G \in \Gamma_d$ for which $T_d$ has strictly lower $\mu_G$-risk than $G$.

Most of the proof applies to all values of $d \ge 3$. At the end of the analysis, we separately handle the cases of $d = 3$ and $d > 3$. Focusing on $d = 3$ first allows us to use the condition that $d < 2\ell_{\max} = 4$ and hence ensure that the root failure probability in the graph $G$ has a power series expansion with a positive radius of convergence. After analyzing the case of $d = 3$, we extend the proof to $d > 3$; this still depends on evaluating power series coefficients but requires some new techniques to handle the potential non-convergence of the power series for $G$.

But to begin with, we allow $d \ge 3$ to be arbitrary. Since $G$ is a connected $d$-regular graph that is not a tree, it has finite girth $L$. Let $r$ be a node of $G$ that belongs to an $L$-cycle, and let $r'$ be an arbitrary node of $T = T_d$. Applying the results of Section 4, we will be bounding the probabilities $f_\mu(G, r)$ and $f_\mu(T, r')$ using sums of monomials $s^{\lambda}$ indexed by UMERFs $\lambda$. Any such monomial $s^{\lambda} = s^i t^j$ has $i \ge 1$: all MERFs have at least one threshold-zero node, since otherwise the failed set is empty. We will be setting $s = t^{L}$, so that all the monomials whose magnitude is greater than $t^{2L-1}$ are of the form $s t^j$ ($0 \le j \le L-2$). Focusing, therefore, on UMERFs $\lambda$ having $\mathbf{i}(\lambda) = (1, j)$, we will establish the facts summarized in the following lemma.

Lemma 6.2. Let $G$ be any $d$-regular graph of girth $L$.
(1) If $\lambda$ is any UMERF in $G$ such that $\mathbf{i}(\lambda) = (1, j)$, where $0 \le j \le L-2$, then $\lambda$ is a MERF.
(2) When $0 \le j < L-2$, there is a one-to-one correspondence between MERFs $\lambda$ such that $\mathbf{i}(\lambda) = (1, j)$ in $G$ and in $T = T_d$.
(3) When $j = L-2$, $G$ has strictly more MERFs with $\mathbf{i}(\lambda) = (1, j)$ than does $T$.

Proof. Let $\lambda$ be a UMERF such that $\mathbf{i}(\lambda) = (1, j)$, $0 \le j \le L-2$, and let $v$ be the unique node in $\mathrm{Dom}(\lambda)$ such that $\ell(v) = 0$. If we extend $\lambda$ to a labeling $\ell$ by assigning threshold 2 to every node not in $\mathrm{Dom}(\lambda)$, then the failed set $\Phi(G, \ell)$ is, by Lemma 2.1, the union of an increasing sequence of sets $\Phi_0(G, \ell) \subseteq \Phi_1(G, \ell) \subseteq \cdots$. Each of these induces a connected subgraph containing $v$, since the initial set $\Phi_0(G, \ell)$ is the singleton $\{v\}$, and a node belonging to $\Phi_i(G, \ell)$ must have at least one neighbor in $\Phi_{i-1}(G, \ell)$. Now consider the smallest $i$ (if any) such that $\Phi_i(G, \ell)$ contains a node $w \notin \mathrm{Dom}(\lambda)$. As $\ell(w) = 2$, the node $w$ must have two neighbors $x, y \in \Phi_{i-1}(G, \ell)$. Combining the edges $(w, x)$, $(w, y)$ with a path from $x$ to $y$ in $\Phi_{i-1}(G, \ell)$ we obtain a cycle in $G$ whose vertex set belongs to $\{w\} \cup \mathrm{Dom}(\lambda)$, a set of cardinality $j + 2$. When $j < L - 2$, this contradicts our assumption that $G$ has girth $L$, and thus we may conclude that $\Phi(G, \ell) = \mathrm{Dom}(\lambda)$. In this case, the induced subgraph on $\mathrm{Dom}(\lambda)$ is a tree containing $v$ and $r$, and in fact it must be a path connecting $v$ to $r$. (Any threshold-1 nodes in $\mathrm{Dom}(\lambda)$ lying outside this path cannot belong to any MERF, contradicting our assumption that $\lambda$ is a UMERF.) It follows that $\lambda$ is a MERF, and that the number of such MERFs, for a specified value of $j$, is equal to the number of $j$-hop simple paths in $G$ terminating at $r$. Our assumption that $G$ is $d$-regular,
with girth greater than $j + 2$, implies that every non-backtracking walk of $j$ hops terminating at $r$ is a simple path, and that the vertex sets of all these simple paths are distinct. Consequently, in both $G$ and $T$ the number of MERFs $\lambda$ such that $\mathbf{i}(\lambda) = (1, j)$ is equal to the number of $j$-hop non-backtracking walks in a $d$-regular graph, i.e. $d(d-1)^{j-1}$.

When $j = L-2$, the set $\Phi(G, \bar\lambda)$ induces either a tree (in which case, by the same reasoning as before, it must be a $j$-hop path from $v$ to $r$) or an $L$-cycle containing a single node $w$ such that $\ell(w) = 2$, a single node $v$ with $\ell(v) = 0$, and all other nodes having label 1. In the latter case, our assumption that every node in $\mathrm{Dom}(\lambda)$ belongs to a MERF implies either that $w = r$, or that $v, w, r$ occur consecutively on the cycle $C$ and that $C \setminus \{w\}$ is a $j$-hop path from $v$ to $r$. Finally, it is easy to see that in all of these cases, $\lambda$ is a MERF. We have thus shown that every UMERF in $G$ with $\mathbf{i}(\lambda) = (1, L-2)$ is a MERF, and that the number of these MERFs is at least $d(d-1)^{L-3} + L - 1$. Here, $d(d-1)^{L-3}$ counts the number of $(L-2)$-hop paths terminating at the root, which is also the coefficient of $s t^{L-2}$ in the power series for $T$, and $L - 1$ counts the number of ways of labeling $C \setminus \{r\}$ with a single 0 and $L-2$ 1's.

Proof of Theorem 6.1, $d = 3$ case. As $d < 2\ell_{\max}$, the power series for $f_\mu(G, r)$ and $f_\mu(T, r')$ converge for sufficiently small $s$ and $t$. Thus the difference $f_\mu(G, r) - f_\mu(T, r')$ may be expressed as $\sum_{\mathbf{i} = (i, j)} (a^G_{ij} - a^T_{ij})\, s^i t^j$, where $a^G_{ij}$ and $a^T_{ij}$ are the power series coefficients in (5) for $G$ and $T$, respectively. Grouping the terms into those with $Li + j \le 2L - 2$ and those with $Li + j \ge 2L - 1$, we find that the first set of terms includes only pairs $(i, j)$ such that $i = 1$, $0 \le j \le L-2$, and by Lemma 6.2,

$$\sum_{Li + j \le 2L-2} (a^G_{ij} - a^T_{ij})\, s^i t^j = (a^G_{1,L-2} - a^T_{1,L-2})\, s t^{L-2} \ge (L-1)\, t^{2L-2}. \tag{6}$$

Recall, from Lemmas 4.3 and 4.4, that the number of UMERFs $\lambda$ such that $\mathbf{i}(\lambda) = (i, j)$ is bounded above by $(d+1)^{3(d+1)(i+j)}$ and that the coefficient $\sum_{k} (-1)^{k+1} a^{k}_{\lambda}$ for each of them is bounded by $2^{i+j}$ in absolute value.
Thus,

$$\begin{aligned}
\sum_{Li+j \ge 2L-1} \left| a^G_{ij} - a^T_{ij} \right| s^i t^j
&\le \sum_{k=2L-1}^{\infty} \; \sum_{Li+j=k} 2^{i+j+1} (d+1)^{3(d+1)(i+j)}\, t^{Li+j} \\
&< \sum_{k=2L-1}^{\infty} k\, 2^{k+1} (d+1)^{3(d+1)k}\, t^{k} \\
&< \sum_{k=2L-1}^{\infty} \left[ 4 (d+1)^{3(d+1)}\, t \right]^{k} \\
&= \frac{\left( 4 (d+1)^{3(d+1)} \right)^{2L-1} t^{2L-1}}{1 - 4 (d+1)^{3(d+1)}\, t},
\end{aligned}$$

where the last line is justified as long as the denominator is strictly positive. By choosing $t$ sufficiently small, we can ensure not only that the denominator is strictly positive but that the quantity on the last line is less than $t^{2L-2}$. Then, the positive $(L-1)\, t^{2L-2}$ contribution from the low-degree terms in the power series more than offsets the possibly negative contribution from the high-degree terms, and this proves that $f_\mu(G, r) > f_\mu(T, r')$, as claimed.

Proof of Theorem 6.1, $d > 3$ case. When comparing the infinite $d$-regular tree $T_d$ against another connected $d$-regular graph $G$ when $d \ge 2\ell_{\max}$, a tricky issue arises because the power series for $G$ need not converge. Recall, however, that the power series for $T$ still converges. Thus, to
compute the root failure probability $f_\mu(T, r')$ we will continue to use the full power series, and we will continue to denote it by $\sum_{i,j} a^T_{ij}\, s^i t^j$. To estimate $f_\mu(G, r)$ we must adopt a different approach. Specifically, we number MERFs $m(1), m(2), \ldots$ in order of increasing $Li + j$, where $(i, j) = \mathbf{i}(m)$. Instead of taking the union of the entire countable sequence of events $E(m(k))$ ($k = 1, 2, \ldots$) we truncate this sequence at the highest value of $n$ such that $\mathbf{i}(m(n)) = (i, j)$ with $Li + j \le 2L - 2$. We bound $f_\mu(G, r)$ from below by $\Pr\big(\bigcup_{i=1}^{n} E(m(i))\big)$ and evaluate this probability using (4). Denote the resulting polynomial by $\sum_{i,j} a^G_{ij}\, s^i t^j$. Note that it is a polynomial, not a power series, since we are taking a union of only finitely many events, each described by a MERF. With this revised interpretation of the coefficients $a^G_{ij}$, the bound in Equation (6) is once again justified by Lemma 6.2. To bound the remainder term $\sum_{Li+j \ge 2L-1} (a^G_{ij} - a^T_{ij})\, s^i t^j$ from above, we use

$$\Big| \sum_{Li+j \ge 2L-1} (a^G_{ij} - a^T_{ij})\, s^i t^j \Big| \le \Big| \sum_{Li+j \ge 2L-1} a^G_{ij}\, s^i t^j \Big| + \Big| \sum_{Li+j \ge 2L-1} a^T_{ij}\, s^i t^j \Big| \tag{7}$$

and deal with each term on the right-hand side separately. The second term, involving power series coefficients of $T$, is dealt with exactly as before, yielding an upper bound of order $O(t^{2L-1})$ for that term as $t \to 0$. The first term is a polynomial in $s$ and $t$, not a power series. Upon substituting $s = t^{L}$, it becomes a univariate polynomial in $t$, in which each monomial has an exponent of $2L - 1$ or higher. No matter how large the coefficients of this polynomial may be, they are finite, and so as $t \to 0$ the absolute value of the polynomial is $O(t^{2L-1})$. Thus, the right-hand side of (7) is $O(t^{2L-1})$ as $t \to 0$, and as before the proof finishes by observing that for small enough $t$ this error term cannot possibly offset the positive $(L-1)\, t^{2L-2}$ contribution from the low-degree terms. The conclusion is that

$$\Pr_G\Big(\bigcup_{i=1}^{n} E(m(i))\Big) - f_\mu(T, r') \ge (L-1)\, t^{2L-2} - O(t^{2L-1})$$

as $t \to 0$.
Recalling that $f_\mu(G, r) \ge \Pr_G\big(\bigcup_{i=1}^{n} E(m(i))\big)$, we conclude that the failure probability of $r$ in $G$ exceeds the failure probability of $r'$ in $T$.

6.1 A Connection to Sufficient Sets

A strengthening of Theorem 6.1 has a consequence for sufficient sets, as we now discuss. (Recall that a set of graphs $H \subseteq \mathcal{G}_d$ is a sufficient set for $\Gamma_d$ if for each $\mu \in \Gamma_d$, at least one member of $H$ achieves the minimum $\mu$-risk over all graphs in $\mathcal{G}_d$.) We first describe the relevant strengthening of the theorem. Notice that the proof of Theorem 6.1 in fact shows something stronger than was claimed. If we have any finite set of graphs $H \subseteq \mathcal{G}_d$, none of which is $T_d$, then we can define $L$ to be the maximum girth of any graph in $H$. Using this value of $L$, we can define a distribution $\mu$ just as before, and the analysis in the proof of Theorem 6.1 then directly establishes the following.

Theorem 6.3. For every finite set $H$ of connected $d$-regular graphs, there is a distribution $\mu_H \in \Gamma_d$ for which $T_d$ achieves strictly lower $\mu_H$-risk than each member of $H$.

In other words, rather than simply being more resilient than any single other graph $G$ at some point in $\Gamma_d$, the tree $T_d$ is in fact simultaneously more resilient than any finite set of other graphs at some point in $\Gamma_d$.
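The only graph-dependent input to the construction behind Theorem 6.3 is $L$, the maximum girth over the finite family $H$. As an illustration (a sketch with hypothetical helper names `girth`, `complete_graph`, and `petersen`, assuming simple undirected graphs stored as adjacency dicts), the girth can be computed with one breadth-first search per start node:

```python
from collections import deque


def girth(adj):
    """Length of a shortest cycle in a simple undirected graph, or None for a forest."""
    best = None
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # A non-tree edge closes a walk through the root of this length.
                    cycle = dist[u] + dist[v] + 1
                    if best is None or cycle < best:
                        best = cycle
    return best


def complete_graph(n):
    return {v: {w for w in range(n) if w != v} for v in range(n)}


def petersen():
    """The Petersen graph: 3-regular, with an outer 5-cycle, spokes, and an inner pentagram."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        for a, b in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            adj[a].add(b)
            adj[b].add(a)
    return adj


family = [complete_graph(4), petersen()]  # two connected 3-regular graphs
L = max(girth(g) for g in family)
print(L)  # 5
```

With this value of $L$, the distribution in the proof uses $s = t^{L}$ for sufficiently small $t$; how small $t$ must be depends on the bounds in the proof and is not computed here.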
From this stronger form of the result, we obtain the following immediate consequence.

Theorem 6.4. When $d \ge 3$, there is no sufficient set of size 2 for $\Gamma_d$.

Proof. If there were such a set $H \subseteq \mathcal{G}_d$ of size 2, then it would have to contain $K_{d+1}$, since $K_{d+1}$ uniquely minimizes the $\mu$-risk for some distributions $\mu \in \Gamma_d$. The other graph in $H$ can't be $T_d$, since by Theorem 5.2 there are $\mu$ for which $\Delta_d$ has strictly lower $\mu$-risk than both $K_{d+1}$ and $T_d$. But if the other graph in $H$ were some $G \ne T_d$, then by Theorem 6.3 we could find a $\mu$ for which $T_d$ has lower $\mu$-risk than both $K_{d+1}$ and $G$, and so this is not possible either.
A Power-Series Computations Comparing $\Delta_d$ to $K_{d+1}$ and $T_d$

In this appendix, we provide the details of the power-series computations that establish a distribution $\mu \in \Gamma_d$ for which the $d$-regular tree of triangles $\Delta_d$ has strictly lower $\mu$-risk than both $K_{d+1}$ and $T_d$.

To find a choice of parameters for which $\Delta_d$ has lower root failure probability than either $T_d$ or $K_{d+1}$, we consider distributions $\mu$ for which we have $(\mu(0), \mu(1), \mu(2), \mu(3)) = (s, t, u, 1 - s - t - u)$ for small $s$, $t$, and $u$, and $\mu(j) = 0$ for $j > 3$. This is a small perturbation of a default threshold of $\ell_{\max} = 3$. Recall that in this case, Theorem 4.5 justifies using a power series for the root failure probability in both $T_d$ and $\Delta_d$; the use of a power series for $K_{d+1}$ is justified because $K_{d+1}$ is a finite graph, hence its power series is really a polynomial.

We can proceed by comparing the monomials associated with these three graphs. It will turn out to be sufficient to go up to monomials of total degree at most 3 in order to identify choices for $s$, $t$, and $u$ defining our distribution $\mu$. We work out the coefficients on all monomials of total degree up to 3, as follows. We do this tersely, listing the MERFs and UMERFs that account for each. We also omit terms without a factor of $s$, since every MERF and UMERF must assign at least one label of 0.

For $K_{d+1}$:

$1 \cdot s$: The root gets label 0.

$0 \cdot s^2$: No MERF or UMERF can assign only two labels of 0, since no third node will fail.

$d \cdot st$: The root gets label 1 and one neighbor gets label 0.

$0 \cdot su$: No MERF or UMERF can assign just one 0 and one 2, since the node with label 2 will not fail.

$\binom{d}{3} s^3$: There is a MERF for each 3-tuple of neighbors who are assigned labels of 0.

$\binom{d}{2}(d-3)\, s^2 t$: There are $\binom{d}{2}(d-2)$ MERFs contributing to this term: in each, two of the root's neighbors are labeled 0, and a third is labeled 1. There are also $\binom{d}{2}$ UMERFs contributing to this term: in each, two neighbors of the root are labeled 0 and the root is labeled 1.
Each of these UMERFs can be written as the union of two MERFs (the root labeled 1 and a neighbor labeled 0), so each contributes $-1$. Thus the total is $\binom{d}{2}(d-3)$.

$\binom{d}{2}(d-1)\, s^2 u$: There are $\binom{d}{2}$ MERFs where the root gets labeled 2 and two neighbors are labeled 0; there are $\binom{d}{2}(d-2)$ more MERFs where two neighbors of the root are labeled 0 and a third is labeled 2.

$\binom{d}{2}(d-2)\, s t^2$: Two neighbors of the root get label 1 and a third gets label 0.

$0 \cdot s u^2$: A single 0 cannot cause any nodes of label 2 to fail.

$d(d-1)^2\, stu$: There are $d(d-1)$ MERFs where the root has label 2, one neighbor has label 0, and another has label 1. There are $d(d-1)(d-2)$ more MERFs where three of the neighbors have labels 0, 1, and 2 in some order.

For $T_d$, the terms of total degree up to 2, as well as the $s^3$ and $s u^2$ terms, are the same as for $K_{d+1}$, and by the same arguments. For the other degree-3 terms:
$-\binom{d}{2}\, s^2 t$: The $\binom{d}{2}(d-2)$ MERFs from $K_{d+1}$ for this term are not present in $T_d$, but the $\binom{d}{2}$ UMERFs each contributing $-1$ are.

$\binom{d}{2}\, s^2 u$: There are $\binom{d}{2}$ MERFs where the root gets labeled 2 and two neighbors are labeled 0. (The other $\binom{d}{2}(d-2)$ MERFs from $K_{d+1}$, where two neighbors of the root are labeled 0 and a third is labeled 2, are not present here.)

$d(d-1)\, s t^2$: A grandchild of the root gets label 0, and the parent of this child together with the root get label 1.

$0 \cdot stu$: There is no MERF or UMERF that assigns labels of 0, 1, and 2 to three nodes.

For $\Delta_d$, we think of the root $r$ as having neighbor set $Z = \{v_0, v_1, v_2, \ldots, v_{d-1}\}$, where the only edge among the nodes in $Z$ is between $v_0$ and $v_1$. We will refer to neighbors of $Z$ that do not belong to the set $Z \cup \{r\}$ as depth-two neighbors. Again the terms of total degree up to 2, as well as the $s^3$ and $s u^2$ terms, are the same as for $K_{d+1}$, and by the same arguments. For the other degree-3 terms:

$\left(2(d-2) - \binom{d}{2}\right) s^2 t$: There are $\binom{d}{2}$ UMERFs each contributing $-1$ as in $K_{d+1}$ and $T_d$, and there are also $2(d-2)$ MERFs in which one of $v_0$ or $v_1$ is labeled 0, the other is labeled 1, and a neighbor of the root in $Z - \{v_0, v_1\}$ is labeled 0.

$\binom{d}{2}\, s^2 u$: We get the $\binom{d}{2}$ MERFs that were present in $T_d$, but not the additional $\binom{d}{2}(d-2)$ that were present in $K_{d+1}$.

$(d+1)(d-2)\, s t^2$: There are $(d-2)(d-1)$ MERFs where the root and a node $w \in Z - \{v_0, v_1\}$ are labeled 1, and one of $w$'s depth-two neighbors is labeled 0. There are $2(d-2)$ more MERFs where, for $i \in \{0, 1\}$, the root and $v_i$ are labeled 1, and one of $v_i$'s depth-two neighbors is labeled 0.

$2\, stu$: The root is labeled 2, one of $v_0$ or $v_1$ is labeled 1, and the other is labeled 0.

For one of these graphs $G$, let $f_G(s, t, u)$ denote the root failure probability, and let $g_G(s, t, u)$ denote the sum of all terms in the power series of total degree 4 and higher.
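Then we have

$$\begin{aligned}
f_{K_{d+1}}(s,t,u) - f_{\Delta_d}(s,t,u)
&= \left[ \binom{d}{2}(d-3) - \left( 2(d-2) - \binom{d}{2} \right) \right] s^2 t + \left[ \binom{d}{2}(d-1) - \binom{d}{2} \right] s^2 u \\
&\quad + \left[ \binom{d}{2}(d-2) - (d+1)(d-2) \right] s t^2 + \left[ d(d-1)^2 - 2 \right] stu \\
&\quad + \big( g_{K_{d+1}}(s,t,u) - g_{\Delta_d}(s,t,u) \big) \\
&= \left[ \binom{d}{2} - 2 \right] (d-2)\, s^2 t + \left[ \binom{d}{2} \right] (d-2)\, s^2 u \\
&\quad + \left[ \binom{d}{2} - (d+1) \right] (d-2)\, s t^2 + \left[ d(d-1)^2 - 2 \right] stu \\
&\quad + \big( g_{K_{d+1}}(s,t,u) - g_{\Delta_d}(s,t,u) \big).
\end{aligned}$$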
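The $K_{d+1}$ coefficients listed earlier in this appendix can be cross-checked by brute force, since the clique's failure probability is a genuine polynomial in $s$, $t$, $u$. The sketch below is an independent check and not the authors' method: all names (`poly_mul`, `MU`, `root_fails`, `clique_coefficients`, `listed_coefficients`) are hypothetical, polynomials are stored as dicts from exponent triples to integer coefficients, and everything of total degree above 3 is discarded.

```python
from collections import Counter
from itertools import product
from math import comb

MAX_DEG = 3


def poly_mul(p, q):
    """Multiply polynomials in (s, t, u), stored as {(a, b, c): coeff}, dropping degree > MAX_DEG."""
    out = Counter()
    for (a1, b1, c1), x in p.items():
        for (a2, b2, c2), y in q.items():
            e = (a1 + a2, b1 + b2, c1 + c2)
            if sum(e) <= MAX_DEG:
                out[e] += x * y
    return out


# mu(0) = s, mu(1) = t, mu(2) = u, mu(3) = 1 - s - t - u, written as polynomials.
MU = {
    0: {(1, 0, 0): 1},
    1: {(0, 1, 0): 1},
    2: {(0, 0, 1): 1},
    3: {(0, 0, 0): 1, (1, 0, 0): -1, (0, 1, 0): -1, (0, 0, 1): -1},
}


def root_fails(d, labels):
    """Threshold cascade on K_{d+1}; in a clique every other node is a neighbor."""
    failed = set()
    changed = True
    while changed:
        changed = False
        for v in range(d + 1):
            if v not in failed and len(failed) >= labels[v]:
                failed.add(v)
                changed = True
    return 0 in failed


def clique_coefficients(d):
    """Coefficients, up to total degree 3, of the root failure probability in K_{d+1}."""
    total = Counter()
    for labels in product(range(4), repeat=d + 1):
        if root_fails(d, labels):
            weight = {(0, 0, 0): 1}
            for label in labels:
                weight = poly_mul(weight, MU[label])
            total.update(weight)
    return total


def listed_coefficients(d):
    """The K_{d+1} coefficients as listed in the appendix, keyed by (deg s, deg t, deg u)."""
    return {
        (1, 0, 0): 1, (2, 0, 0): 0, (1, 1, 0): d, (1, 0, 1): 0,
        (3, 0, 0): comb(d, 3), (2, 1, 0): comb(d, 2) * (d - 3),
        (2, 0, 1): comb(d, 2) * (d - 1), (1, 2, 0): comb(d, 2) * (d - 2),
        (1, 0, 2): 0, (1, 1, 1): d * (d - 1) ** 2,
    }


coeffs = clique_coefficients(4)
print(all(coeffs[e] == c for e, c in listed_coefficients(4).items()))
```

The enumeration visits $4^{d+1}$ labelings, so it is only practical for small $d$. The corresponding checks for $T_d$ and $\Delta_d$ would need a truncation of the infinite graph and are not attempted here.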
More informationDiscovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow
Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow Ashton Anderson Daniel Huttenlocher Jon Kleinberg Jure Leskovec Stanford University Cornell
More informationWHICH SCORING RULE MAXIMIZES CONDORCET EFFICIENCY? 1. Introduction
WHICH SCORING RULE MAXIMIZES CONDORCET EFFICIENCY? DAVIDE P. CERVONE, WILLIAM V. GEHRLEIN, AND WILLIAM S. ZWICKER Abstract. Consider an election in which each of the n voters casts a vote consisting of
More informationHow many numbers there are?
How many numbers there are? RADEK HONZIK Radek Honzik: Charles University, Department of Logic, Celetná 20, Praha 1, 116 42, Czech Republic radek.honzik@ff.cuni.cz Contents 1 What are numbers 2 1.1 Natural
More informationThe Gödel Phenomena in Mathematics: A Modern View
Chapter 1 The Gödel Phenomena in Mathematics: A Modern View Avi Wigderson Herbert Maass Professor School of Mathematics Institute for Advanced Study Princeton, New Jersey, USA 1.1 Introduction What are
More informationA mini course on additive combinatorics
A mini course on additive combinatorics 1 First draft. Dated Oct 24th, 2007 These are notes from a mini course on additive combinatorics given in Princeton University on August 2324, 2007. The lectures
More informationAn efficient reconciliation algorithm for social networks
An efficient reconciliation algorithm for social networks Nitish Korula Google Inc. 76 Ninth Ave, 4th Floor New York, NY nitish@google.com Silvio Lattanzi Google Inc. 76 Ninth Ave, 4th Floor New York,
More informationLearning Deep Architectures for AI. Contents
Foundations and Trends R in Machine Learning Vol. 2, No. 1 (2009) 1 127 c 2009 Y. Bengio DOI: 10.1561/2200000006 Learning Deep Architectures for AI By Yoshua Bengio Contents 1 Introduction 2 1.1 How do
More informationONEDIMENSIONAL RANDOM WALKS 1. SIMPLE RANDOM WALK
ONEDIMENSIONAL RANDOM WALKS 1. SIMPLE RANDOM WALK Definition 1. A random walk on the integers with step distribution F and initial state x is a sequence S n of random variables whose increments are independent,
More informationSubspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity
Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity Wei Dai and Olgica Milenkovic Department of Electrical and Computer Engineering University of Illinois at UrbanaChampaign
More informationGOSSIP: IDENTIFYING CENTRAL INDIVIDUALS IN A SOCIAL NETWORK
GOSSIP: IDENTIFYING CENTRAL INDIVIDUALS IN A SOCIAL NETWORK ABHIJIT BANERJEE, ARUN G. CHANDRASEKHAR, ESTHER DUFLO, AND MATTHEW O. JACKSON Abstract. Can we identify the members of a community who are bestplaced
More informationTruthful Mechanisms for OneParameter Agents
Truthful Mechanisms for OneParameter Agents Aaron Archer Éva Tardos y Abstract In this paper, we show how to design truthful (dominant strategy) mechanisms for several combinatorial problems where each
More informationFrom Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
SIAM REVIEW Vol. 51,No. 1,pp. 34 81 c 2009 Society for Industrial and Applied Mathematics From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images Alfred M. Bruckstein David
More informationDiscovering All Most Specific Sentences
Discovering All Most Specific Sentences DIMITRIOS GUNOPULOS Computer Science and Engineering Department, University of California, Riverside RONI KHARDON EECS Department, Tufts University, Medford, MA
More informationDecoding by Linear Programming
Decoding by Linear Programming Emmanuel Candes and Terence Tao Applied and Computational Mathematics, Caltech, Pasadena, CA 91125 Department of Mathematics, University of California, Los Angeles, CA 90095
More information