
Lecture Notes on an FPRAS for Network Unreliability

Eric Vigoda
Georgia Institute of Technology
Last updated for 7530 - Randomized Algorithms, Spring 2010

In this lecture, we consider the problem of calculating the probability that a given network fails, and give an FPRAS (Fully Polynomial Randomized Approximation Scheme) for it. The work on reliability is due to Karger [2], and the earlier work on #DNF is due to Karp and Luby [4], which was later improved by Karp, Luby and Madras [5].

1 Problem Statement

Fix an undirected graph G and 0 < p < 1. Let H be the graph obtained by deleting each edge of G independently with probability p. Let FAIL_G(p) denote the probability that the graph H obtained in this way is disconnected. The network reliability problem is to compute FAIL_G(p) given inputs G and p.

2 Overview

For an undirected graph H, let MINCUT(H) denote the size of its minimum cut. First observe that a graph H is disconnected iff MINCUT(H) = 0. If G is an undirected graph and H is obtained by deleting some edges of G, then H is disconnected iff the deleted edges contain some cut of G (not necessarily a minimum cut). Hence we have

    FAIL_G(p) = Pr(E(G) \ E(H) contains a cut of G),        (1)

where E(G) denotes the edge set of G.

Since calculating the exact value of FAIL_G(p) is #P-complete, we instead develop an FPRAS for it: an algorithm A such that for all ɛ > 0, δ > 0 and all inputs (G, p),

    (1 − ɛ) FAIL_G(p) ≤ A(G, p) ≤ (1 + ɛ) FAIL_G(p)

with probability at least 1 − δ, where the running time of A is polynomial in the input size, 1/ɛ and log(1/δ). Here the input size is O(|G| + log(1/p)).

One simple way to attack this problem is to keep generating random H's and observe how many of them turn out to be disconnected. For this to work, we need to generate at least 1/FAIL_G(p) random H's, and then some if we want to keep the error down. Since FAIL_G(p) can be exponentially small, this naive algorithm will not do in general. If FAIL_G(p) is only polynomially small, however, we proceed exactly as suggested above.
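As a concrete illustration of the naive sampler, here is a minimal Python sketch (my own code, not from the notes; the edge-list representation and function names are assumptions):

```python
import random

def is_connected(n, edges):
    """Union-find connectivity check on vertices 0..n-1."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)}) == 1

def estimate_fail(n, edges, p, trials, rng=random.Random(0)):
    """Naive Monte Carlo estimate of FAIL_G(p): delete each edge
    independently with probability p and report the fraction of
    trials in which the surviving graph H is disconnected."""
    disconnected = 0
    for _ in range(trials):
        surviving = [e for e in edges if rng.random() >= p]
        if not is_connected(n, surviving):
            disconnected += 1
    return disconnected / trials

# 4-cycle with p = 1/2: H stays connected iff at most one edge is
# deleted, so FAIL_G(1/2) = 1 - 5/16 = 11/16 = 0.6875.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
est = estimate_fail(4, cycle, p=0.5, trials=20000)
```

As the overview warns, this sketch is only useful when FAIL_G(p) is not too small; with 20000 trials the estimate above concentrates near 11/16.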
In the alternate case, we show that if in equation (1) we keep only the terms corresponding to small cuts, then the right-hand side is still a good enough approximation to FAIL_G(p). In the rest of this lecture we make the terms "polynomially small" and "small cuts" more precise, and show that the algorithm does indeed work as advertised. For the rest of this lecture, (G, p) denotes a fixed input to our algorithm. (Based on scribe notes first prepared by Murali Krishnan Ganapathy at the University of Chicago in the winter quarter, 2003.)

Similarly, we fix some ɛ > 0 and δ > 0. Let C := MINCUT(G); this is calculated using Karger's algorithm discussed in the previous lecture. Finally, put q := FAIL_G(p) and observe that

    q = FAIL_G(p) ≥ Pr(E(G) \ E(H) contains a specific min-cut) ≥ p^C.

Now we consider two cases depending on whether p^C is greater or smaller than n^{-4}. This defines "polynomially small".

3 The case p^C ≥ n^{-4}

In this case, we show that the Monte Carlo method discussed before actually works. Put l := (3n^4/ɛ^2) log(2/δ). All in all, we do l trials; in each trial we create a random H by tossing coins of bias p, once for each edge of G, and then check whether the H generated is connected. Finally, we output the fraction of the H's which were disconnected.

For i = 1, ..., l, set X_i equal to 1 if the i-th trial resulted in a disconnected graph and 0 otherwise, and put X = Σ_i X_i. Since Pr(X_i = 1) = FAIL_G(p) = q, we have E(X) = ql. Since the X_i's are independent, by the Chernoff bound,

    Pr(|X − E(X)| ≥ ɛ E(X)) ≤ 2 exp(−ɛ^2 ql/3) ≤ 2 exp(−ɛ^2 n^{-4} l/3) = 2 exp(−log(2/δ)) = δ.

Each trial involves generating the random H and testing it for connectedness. Checking connectedness takes time O(|G|). Generating the random H requires time O(|G| log(1/p)), assuming that we only have access to unbiased coins. So the total time is O(l |G| log(1/p)), which is polynomial in the size of the input, 1/ɛ and log(1/δ).

4 Enumerating small cuts

In this section we consider the problem of enumerating the small cuts. This will be used as a subroutine to solve the main problem at hand. The algorithm is a variation of Karger's min-cut algorithm [1] to find the cuts of size at most αC, where α > 1 is a parameter:

1. Pick a random edge and contract it (eliminating self-loops and keeping multiple edges).
2. Repeat step 1 until there are k := ⌈2α⌉ meta-vertices left.
3. Choose a random subset T of these k meta-vertices.
4. If the cut defined by T and its complement has size at most αC, output it.
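The contraction procedure above can be sketched as follows (my own illustrative Python; the graph representation, function names, and the trial count are assumptions, not part of the notes):

```python
import math
import random

def enumerate_small_cuts(n, edges, alpha, C, trials, rng=random.Random(1)):
    """Sketch of the Section 4 procedure: contract random edges down to
    k = ceil(2*alpha) meta-vertices, pick a random subset T of them, and
    keep the induced edge cut if it has at most alpha*C edges.  Returns
    the set of cuts found over `trials` independent runs, each cut
    represented as a frozenset of its two vertex sides."""
    k = max(2, math.ceil(2 * alpha))
    vertices = frozenset(range(n))
    found = set()
    for _ in range(trials):
        group = {v: frozenset([v]) for v in range(n)}   # meta-vertices
        live = list(edges)                              # non-self-loop edges
        while len(set(group.values())) > k and live:
            u, v = rng.choice(live)                     # contract (u, v)
            merged = group[u] | group[v]
            for w in merged:
                group[w] = merged
            # discard edges that became self-loops; parallel edges remain
            live = [(a, b) for a, b in live if group[a] != group[b]]
        # step 3: random bipartition of the surviving meta-vertices
        T = frozenset(v for g in set(group.values())
                      if rng.random() < 0.5 for v in g)
        if not T or T == vertices:
            continue
        cut_size = sum((u in T) != (v in T) for u, v in edges)
        if cut_size <= alpha * C:                       # step 4
            found.add(frozenset([T, vertices - T]))
    return found

# 4-cycle: C = 2, and the cuts of size <= 2 are exactly the six pairs
# of edges, consistent with the n^{2*alpha} = 16 bound proved below.
cuts = enumerate_small_cuts(4, [(0, 1), (1, 2), (2, 3), (3, 0)],
                            alpha=1, C=2, trials=500)
```

With enough repetitions every small cut is found with high probability, which is exactly what the analysis below quantifies.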

Fix a cut S of size at most αC. For the above algorithm to output S, it must not contract any edge of S during steps 1-2, and in step 3 the random subset T should correspond to one side of the cut defined by S. Since every vertex of every intermediate graph has degree at least C (because C = MINCUT(G)), an intermediate graph on n − t vertices has at least (n − t)C/2 edges. Hence

    Pr(S survives until step 3) ≥ Π_{t=0}^{n−k−1} (1 − αC/((n − t)C/2))        (2)
                                = Π_{t=0}^{n−k−1} (1 − 2α/(n − t)) ≥ 1/binom(n, k).   (3)

Hence

    Pr(output = S) ≥ binom(n, k)^{−1} · 2^{−(k−1)},

which is n^{−2α} up to factors depending only on α (take, say, α > 2). Since S was an arbitrary cut of size at most αC, the number of cuts of size at most αC is at most n^{2α}, up to the same factors. This bound can be improved to exactly n^{2α} by replacing the right-hand side of equation (2) with binom(k, 2α)/binom(n, 2α), whose definition for non-integral 2α involves Γ functions.

In order to enumerate all small cuts with high probability, we repeat the above algorithm r := θ log(θ/δ) times, where θ := n^{2α}. For a specific cut S, Pr(S was not the result of the i-th trial) ≤ 1 − 1/θ. Hence the probability that S was not the result of any of the r trials is bounded above by (1 − 1/θ)^r. This implies that the probability that some small cut did not appear in any of the r trials is bounded above by θ(1 − 1/θ)^{θ log(θ/δ)} ≤ δ. This gives us an algorithm which, for any α > 1 and δ > 0, enumerates all cuts of size at most αC with probability at least 1 − δ, in running time O(r n^2).

5 An FPRAS for #DNF

In this section, we consider the problem of estimating the number of satisfying assignments of a DNF formula, and its weighted version. Let Φ be a DNF formula with M clauses θ_1, ..., θ_M. Let r_i be the number of literals appearing in θ_i, and let N denote the total number of distinct variables in Φ. Without loss of generality, assume Φ is satisfiable (checking this can be done in time O(Σ_i r_i)). Let 0 < p < 1 be given, and consider a random assignment in which each variable is independently set to true with probability p. We wish to calculate the probability, f(Φ), that this random assignment satisfies Φ. In the case p = 1/2, this reduces to the problem of counting the number of satisfying assignments (divided by 2^N).
The obvious trick of choosing a random assignment (with the required bias) and checking whether it satisfies Φ is out of the question, since f(Φ) can be exponentially small. Let w_i denote the probability that a random assignment satisfies the i-th clause. If p_i and n_i denote the number of positive and negative literals appearing in θ_i, then w_i = p^{p_i} (1 − p)^{n_i}. Finally, put W := Σ_i w_i. Consider the following algorithm:

1. Choose a random clause θ_i, with the probability of selecting θ_i equal to w_i/W.
2. Choose a random assignment σ satisfying θ_i (the variables not appearing in θ_i are set independently with bias p).
3. Calculate S = W/N(σ), where N(σ) is the number of clauses satisfied by σ.
4. Repeat steps 1-3 t := 4M/ɛ^2 times, and output the observed mean.

First we show that E(S) = f(Φ). For any assignment σ, let w(σ) denote the probability that a random assignment equals σ, i.e. w(σ) = p^a (1 − p)^b, where a is the number of variables set to true and b = N − a is the number of variables set to false. Then

    Pr(choosing σ in step 2 | θ_i in step 1) = w(σ)/w_i,

so

    E(S | θ_i in step 1) = Σ_{σ ⊨ θ_i} (W/N(σ)) · (w(σ)/w_i).

This gives

    E(S) = Σ_i (w_i/W) Σ_{σ ⊨ θ_i} (W/N(σ)) (w(σ)/w_i)
         = Σ_i Σ_{σ ⊨ θ_i} w(σ)/N(σ)
         = Σ_{σ ⊨ Φ} Σ_{i: σ ⊨ θ_i} w(σ)/N(σ)
         = Σ_{σ ⊨ Φ} w(σ) = f(Φ).

Similarly,

    E(S^2) = Σ_i (w_i/W) Σ_{σ ⊨ θ_i} (W/N(σ))^2 (w(σ)/w_i)
           = W Σ_i Σ_{σ ⊨ θ_i} w(σ)/N(σ)^2
           = W Σ_{σ ⊨ Φ} w(σ)/N(σ)
           ≤ W f(Φ),

the last inequality following from the fact that N(σ) ≥ 1 whenever σ ⊨ Φ. Now

    Var(S) = E(S^2) − E(S)^2 ≤ (W/f(Φ) − 1) f(Φ)^2 ≤ (M − 1) f(Φ)^2.

To see that W ≤ M f(Φ), define A_i to be the event that a random σ satisfies θ_i and note that W = Σ_i w_i = Σ_i Pr(A_i) ≤ M Pr(∪_i A_i) = M f(Φ).
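Steps 1-4 can be sketched in Python as follows (my own illustrative code; representing each clause as a {variable: required value} dict is an assumption, not part of the notes):

```python
import random

def karp_luby(clauses, nvars, p, t, rng=random.Random(2)):
    """Sketch of the Section 5 estimator.  Each clause is a dict
    {var: required_bool} over variables 1..nvars; a bias-p assignment
    sets each variable to True with probability p.  Returns the mean
    of t samples of S = W / N(sigma)."""
    # w_i = p^{p_i} (1-p)^{n_i}: chance a random assignment satisfies clause i
    weights = []
    for c in clauses:
        w = 1.0
        for required in c.values():
            w *= p if required else (1.0 - p)
        weights.append(w)
    W = sum(weights)
    total = 0.0
    for _ in range(t):
        # Step 1: pick clause i with probability w_i / W.
        i = rng.choices(range(len(clauses)), weights=weights)[0]
        # Step 2: random assignment conditioned on satisfying clause i.
        sigma = {v: rng.random() < p for v in range(1, nvars + 1)}
        sigma.update(clauses[i])
        # Step 3: S = W / N(sigma), N(sigma) = # clauses sigma satisfies.
        n_sigma = sum(all(sigma[v] == req for v, req in c.items())
                      for c in clauses)
        total += W / n_sigma
    return total / t

# Phi = x1 OR x2 with p = 1/2: f(Phi) = 1 - 1/4 = 0.75.
est = karp_luby([{1: True}, {2: True}], nvars=2, p=0.5, t=20000)
```

Note that N(σ) ≥ 1 always holds here, since σ is forced to satisfy the chosen clause; the variance bound above is what makes t = 4M/ɛ^2 samples suffice.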

Now let S_1, ..., S_t denote the observed values during the run of the algorithm, and let Ŝ be the observed mean, i.e. Ŝ = (Σ_i S_i)/t. Clearly E(Ŝ) = E(S) = f(Φ) and Var(Ŝ) = Var(S)/t ≤ (M − 1) f(Φ)^2 / t. Now applying Chebyshev's inequality, we have

    Pr(|Ŝ − f(Φ)| ≥ ɛ f(Φ)) ≤ Var(Ŝ)/(ɛ f(Φ))^2 ≤ (M − 1)/(ɛ^2 t) ≤ 1/4.

To reduce the error probability to less than δ, we repeat the above procedure O(log(1/δ)) times and output the median of all the observed means.

6 The case p^C < n^{-4}

Now back to the original network reliability problem. In the case when q = FAIL_G(p) is small, we use equation (1) and consider only the terms on the right-hand side corresponding to small cuts. We now show that one can choose α > 1 so that, if we restrict ourselves to only those terms which correspond to cuts of size at most αC, the error we introduce is small. Then we enumerate all small cuts and approximate the probability using the FPRAS for #DNF.

Choose ζ ≥ 2 such that p^C = n^{−(2+ζ)}. Let H be a random graph obtained from G by removing edges in the prescribed manner. Since there are at most n^{2β} cuts of size at most βC, and a fixed cut of size βC fails with probability p^{βC} = n^{−(2+ζ)β}, we get

    Pr(some cut of size ≥ αC fails in H) ≤ ∫_{β ≥ α} n^{2β} p^{βC} dβ = ∫_{β ≥ α} n^{−ζβ} dβ = O(1) · n^{−ζα}.

Setting α := 2 + ln(O(1)/ɛ)/(ζ ln n), we see that

    Pr(some large cut fails in H) ≤ ɛ n^{−2ζ} ≤ ɛ p^C ≤ ɛ q,

where we used n^{−2ζ} ≤ n^{−(2+ζ)} = p^C because ζ ≥ 2. Therefore, by ignoring all large cuts (i.e. those of size > αC), we calculate FAIL_G(p) to within a multiplicative factor of 1 ± ɛ.

Now we enumerate all the small cuts using the algorithm from Section 4. Let C_1, ..., C_r be the small cuts. Define a DNF formula Φ over the variables {x_e}_{e ∈ E(G)} as follows. Set x_e to true if the edge e is deleted (i.e., does not appear in H) and false otherwise, so each x_e is independently true with probability p. For each cut C_i, define the clause θ_i = ∧_{e ∈ C_i} x_e. Then θ_i evaluates to true exactly when the cut C_i fails in H, and hence Φ evaluates to true iff some small cut fails in H. So we are left with the problem of evaluating the probability that a random assignment of truth values to the x_e (each true with probability p) satisfies Φ. This is solved using the FPRAS of Section 5.
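The reduction from small cuts to a DNF instance can be sketched as follows (illustrative Python of my own; for a tiny graph we can brute-force f(Φ) exactly as a stand-in for the Section 5 FPRAS, to sanity-check the construction):

```python
from itertools import product

def cuts_to_dnf(cuts):
    """Section 6 construction: one clause per small cut C_i, namely the
    conjunction of x_e over e in C_i, where x_e = True means edge e was
    deleted (each independently with probability p)."""
    return [{e: True for e in cut} for cut in cuts]

def f_brute_force(clauses, edges, p):
    """Exact f(Phi) by summing over all 2^|E| deletion patterns
    (feasible only for tiny graphs)."""
    f = 0.0
    for pattern in product([False, True], repeat=len(edges)):
        sigma = dict(zip(edges, pattern))
        if any(all(sigma[e] for e in c) for c in clauses):
            weight = 1.0
            for deleted in pattern:
                weight *= p if deleted else (1.0 - p)
            f += weight
    return f

# 4-cycle: every cut contains one of the six 2-edge cuts, so here the
# DNF over small cuts captures FAIL_G(p) exactly: 11/16 at p = 1/2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
small_cuts = [{(0, 1), (1, 2)}, {(1, 2), (2, 3)}, {(2, 3), (3, 0)},
              {(3, 0), (0, 1)}, {(0, 1), (2, 3)}, {(1, 2), (3, 0)}]
f = f_brute_force(cuts_to_dnf(small_cuts), edges, p=0.5)
```

In the toy example the small cuts happen to account for all failures; in general the truncation to cuts of size ≤ αC loses only an ɛ fraction, as shown above.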
7 Putting it all together

Combining all the previous discussion, the final algorithm is:

1. Compute C = MINCUT(G).
2. If p^C > n^{-4}, go to step 3; else go to step 4.
3. (Monte Carlo) Using l := (3n^4/ɛ^2) log(2/δ) trials, estimate FAIL_G(p) to within 1 ± ɛ with probability at least 1 − δ, and end the program.
4. Calculate α = 2 + ln(O(1)/ɛ)/(ζ ln n), where ζ satisfies p^C = n^{−(2+ζ)}.
5. Using the extension of Karger's min-cut algorithm described in Section 4, enumerate all cuts of size at most αC, with probability at least 1 − δ.
6. Construct the DNF formula Φ as described in Section 6.
7. Using the FPRAS of Section 5, evaluate f(Φ) to within a multiplicative error of 1 ± ɛ with probability at least 1 − δ.
8. Output the estimate of f(Φ).

This algorithm calculates FAIL_G(p) to within a multiplicative error of (1 ± ɛ)^2 with probability at least (1 − δ)^2. Each step takes time polynomial in its input, 1/ɛ and log(1/δ). Composing all this, we get an algorithm whose running time is polynomial in |G|, log(1/p), 1/ɛ and log(1/δ), with cumulative multiplicative error bounded by (1 ± ɛ)^2 and failure probability bounded by 1 − (1 − δ)^2; adjusting ɛ and δ by constant factors yields a true FPRAS.

8 Open problems

An interesting open problem is to devise a deterministic approximation algorithm for FAIL_G(p). Karger [3] showed that his approach can be efficiently derandomized when FAIL_G(p) ≤ 1/n^4. However, it is open how to derandomize the case FAIL_G(p) > 1/n^4, which was the somewhat trivial case for the randomized algorithm. The high-level idea of the deterministic algorithm for small failure probabilities is the following: there is an efficient deterministic algorithm to list all small cuts via max-flow computations, and FAIL_G(p) can then be estimated via inclusion-exclusion, where only the first few terms are needed for a sufficient estimate.

References

[1] D. R. Karger. Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (Austin, TX, 1993), pages 21-30, New York, 1993. ACM.

[2] D. R. Karger. A randomized fully polynomial time approximation scheme for the all-terminal network reliability problem. SIAM J. Comput., 29(2):492-514, 1999.

[3] D. R. Karger. A randomized fully polynomial time approximation scheme for the all-terminal network reliability problem. SIAM Rev., 43(3):499-522, 2001. Reprint of SIAM J. Comput. 29 (1999), no. 2, 492-514.

[4] R. M. Karp and M. Luby. Monte Carlo algorithms for the planar multiterminal network reliability problem. J. Complexity, 1(1):45-64, 1985.

[5] R. M. Karp, M. Luby, and N. Madras. Monte Carlo approximation algorithms for enumeration problems. J. Algorithms, 10(3):429-448, 1989.