Randomized algorithms



March 10, 2005

1 What are randomized algorithms?

Randomized algorithms use random numbers to make decisions during their execution. Why would we want to do this?

A deterministic algorithm, on a fixed input, always gives the same output and always has the same running time. We want our algorithms to be correct and fast, so a deterministic algorithm must always produce the correct answer and must always run within the stated time bound.

Isn't this overkill? How about this instead: for a fixed input, we usually produce the correct answer, and usually run within the stated time bound.

Advantages:

- Maybe we can get faster algorithms? (yes!)
- Maybe we can get simpler algorithms? (yes!)
- Maybe we can tackle harder problems? (open)

Two kinds of randomized algorithms

1. Monte Carlo algorithms: may sometimes produce an incorrect solution; we bound the probability of failure.
2. Las Vegas algorithms: always give the correct solution; the only variation is the running time, so we study the distribution of the running time.

Example: Nerds and Doofuses

m students take an exam which has n questions. Every student is either a nerd or a doofus. Nerds get all n answers right. Doofuses get fewer than n/2 answers right.

Problem: Grade all the exams, giving all nerds an A and all doofuses a D.

Deterministic algorithm:

1. For each student, grade at most the first n/2 questions in order, stopping as soon as you see a wrong answer.
2. If you've seen a wrong answer, give grade D. Otherwise give grade A.

Correctness: Always correct. (A doofus has more than n/2 wrong answers, so at least one wrong answer appears among the first n/2 questions.)
Running time: nm/2

Monte Carlo algorithm:

1. For each student, choose 10 questions at random and grade them.
2. If you've seen a wrong answer, give grade D. Otherwise give grade A.

Correctness: Nerds always get an A, but a doofus may get an A too. (What is the probability of this?)
Running time: 10m
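To make the two graders concrete, here is a minimal Python sketch (the exam representation as a list of booleans and the function names are illustrative assumptions, not from the notes). Since a doofus answers fewer than n/2 questions correctly, each independently sampled question is wrong with probability greater than 1/2, so the Monte Carlo grader gives a doofus an A with probability less than (1/2)^10, roughly 0.001.

```python
import random

def grade_deterministic(exam):
    """exam[i] is True iff question i was answered correctly.
    Grade at most the first n/2 questions, stopping at the first wrong answer."""
    n = len(exam)
    for answer in exam[: n // 2]:
        if not answer:
            return "D"
    return "A"

def grade_monte_carlo(exam, samples=10):
    """Grade `samples` randomly chosen questions; may wrongly give a doofus an A."""
    n = len(exam)
    for _ in range(samples):
        if not exam[random.randrange(n)]:
            return "D"
    return "A"

if __name__ == "__main__":
    n = 100
    nerd = [True] * n
    doofus = [True] * (n // 2 - 1) + [False] * (n // 2 + 1)
    random.shuffle(doofus)
    print(grade_deterministic(nerd), grade_deterministic(doofus))  # always A, D
    print(grade_monte_carlo(nerd), grade_monte_carlo(doofus))      # A; doofus is usually D
```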

Las Vegas algorithm:

1. For each student, repeatedly choose a question at random and grade it, until you have graded n/2 correct answers or seen a wrong answer.
2. If you've seen a wrong answer, give grade D. Otherwise give grade A.

Correctness: Always correct, provided it terminates!
Running time: Variable; the algorithm may not terminate!

How about computing the average running time? Let RT(e) = the running time of the Las Vegas algorithm on exam e. The running time RT(e) is a random variable. We want to compute the expected running time E[RT(e)].

Definition 1.1 T(n) = max_{input e, |e| = n} E[RT(e)] is the expected running time of the algorithm.

Important: Note that the variation in running time is due to the random choices made by the algorithm; we do NOT assume a distribution on the inputs. Compare this with the definition of worst-case running time for a deterministic algorithm:

Definition 1.2 T(n) = max_{input e, |e| = n} RT(e) is the worst-case running time of the deterministic algorithm.

Caveat: A randomized algorithm might have exponential (or even unbounded) running time in the worst case while having good expected running time.

Definition 1.3 The running time of a randomized algorithm A is O(f(n)) with high probability if Pr[RT(A) ≥ c·f(n)] ≤ 1/n^d, where c and d are appropriate constants. For technical reasons, we also require that A has expected running time O(f(n)), i.e. E[RT(A)] = O(f(n)).
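A matching sketch of the Las Vegas grader, under the same illustrative representation as the earlier sketch; reading "graded n/2 correct answers" as n/2 distinct correctly answered questions is an assumption made here so that the grader is always correct.

```python
import random

def grade_las_vegas(exam):
    """Keep grading randomly chosen questions until n/2 distinct correct answers
    have been seen (grade A) or a wrong answer is found (grade D).
    Returns (grade, number of questions graded)."""
    n = len(exam)
    correct_seen = set()
    graded = 0
    while len(correct_seen) < n // 2:
        q = random.randrange(n)
        graded += 1
        if not exam[q]:
            return "D", graded
        correct_seen.add(q)
    return "A", graded

if __name__ == "__main__":
    n = 100
    nerd = [True] * n
    doofus = [True] * (n // 2 - 1) + [False] * (n // 2 + 1)
    random.shuffle(doofus)
    print(grade_las_vegas(nerd))    # always 'A'; the count of graded questions varies
    print(grade_las_vegas(doofus))  # always 'D'; a wrong answer is hit before n/2 distinct correct ones exist
```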

2 Some Probability

Definition 2.1 (Random variable) A random variable is a real-valued function over the set of events [plus some technical conditions].

Definition 2.2 (Conditional probability) Pr[X = x | Y = y] (read as "the probability that X = x given that Y = y") is defined as

    Pr[X = x | Y = y] = Pr[X = x ∧ Y = y] / Pr[Y = y].

For example, let's roll a die. Let Y = 1 if the number we get is even, and let X be the number we get. Then

    Pr[X = 2 | Y = 1] = Pr[number is 2 ∧ number is even] / Pr[number is even] = (1/6) / (1/2) = 1/3.

Lemma 2.3 If X and Y are random variables, then Pr[X = x] = Σ_y Pr[X = x | Y = y] · Pr[Y = y].

Definition 2.4 (Expectation) The expectation of a random variable X is

    E[X] = Σ_x x · Pr[X = x].

If Y is also a random variable, then the expectation of X given that Y = y is

    E[X | Y = y] = Σ_x x · Pr[X = x | Y = y].

Corollary 2.5 (Follows from Lemma 2.3) If X and Y are random variables, E[X] = Σ_y E[X | Y = y] · Pr[Y = y].

Quick question: What is the expected running time of the Las Vegas algorithm given earlier if all students are doofuses?

Lemma 2.6 (Linearity of expectation) For any two random variables X and Y, we have E[X + Y] = E[X] + E[Y].

Definition 2.7 (Independence) X and Y are independent random variables if Pr[X = x | Y = y] = Pr[X = x] for all x and y.

Lemma 2.8 If X and Y are independent random variables, then E[XY] = E[X] · E[Y].
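As a quick sanity check of Definition 2.2 and conditional expectation, here is a small simulation of the die example (illustrative only, not part of the notes). The estimates should come out near Pr[X = 2 | Y = 1] = 1/3 and E[X | Y = 1] = (2 + 4 + 6)/3 = 4.

```python
import random

trials = 200_000
even_count = 0   # number of trials with Y = 1 (the roll is even)
two_count = 0    # number of trials with X = 2 (and hence Y = 1)
even_sum = 0     # sum of X over trials with Y = 1

for _ in range(trials):
    x = random.randint(1, 6)
    if x % 2 == 0:
        even_count += 1
        even_sum += x
        if x == 2:
            two_count += 1

print("Pr[X = 2 | Y = 1] ~", two_count / even_count)  # close to 1/3
print("E[X | Y = 1]      ~", even_sum / even_count)   # close to 4
```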

3 Sorting Nuts and Bolts

Problem 3.1 You are given n nuts and n bolts. Every nut has a matching bolt, and all the n pairs of nuts and bolts have different sizes. Unfortunately, you get the nuts and bolts separated from each other, and you have to match the nuts to the bolts. Furthermore, given a nut and a bolt, all you can do is try to match the bolt against the nut (i.e., you cannot compare two nuts to each other, or two bolts to each other). When comparing a nut to a bolt, either they match, or one is smaller than the other. How do we match the n nuts to the n bolts quickly?

Answer: Simulate QuickSort:

MatchNutsAndBolts(N, B):
    Pick a random nut α_n from N (the pivot)
    Find its matching bolt α_b in B
    B_L ← all bolts in B smaller than α_n; B_R ← all bolts in B larger than α_n
    N_L, N_R ← defined similarly, by comparing the nuts in N against α_b
    MatchNutsAndBolts(N_R, B_R)
    MatchNutsAndBolts(N_L, B_L)
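A runnable Python sketch of MatchNutsAndBolts (representing nuts and bolts as numbers, where equality means "they fit", is an illustrative assumption; the code only ever compares a nut with a bolt, never two nuts or two bolts).

```python
import random

def match_nuts_and_bolts(nuts, bolts):
    """Return a dict mapping each nut to its matching bolt.
    Only nut-vs-bolt comparisons are used (equality = fit, < / > = size order)."""
    if not nuts:
        return {}
    pivot_nut = random.choice(nuts)
    pivot_bolt = next(b for b in bolts if b == pivot_nut)   # find the matching bolt

    bolts_smaller = [b for b in bolts if b < pivot_nut]     # compare each bolt to the pivot nut
    bolts_larger  = [b for b in bolts if b > pivot_nut]
    nuts_smaller  = [x for x in nuts if x < pivot_bolt]     # compare each nut to the pivot bolt
    nuts_larger   = [x for x in nuts if x > pivot_bolt]

    matching = {pivot_nut: pivot_bolt}
    matching.update(match_nuts_and_bolts(nuts_larger, bolts_larger))
    matching.update(match_nuts_and_bolts(nuts_smaller, bolts_smaller))
    return matching

if __name__ == "__main__":
    sizes = list(range(20))
    nuts, bolts = sizes[:], sizes[:]
    random.shuffle(nuts)
    random.shuffle(bolts)
    pairing = match_nuts_and_bolts(nuts, bolts)
    assert all(nut == bolt for nut, bolt in pairing.items())
    print("matched", len(pairing), "pairs")
```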

Theorem 3.2 The expected running time of this algorithm is T(n) = O(n log n), where n is the number of nuts and bolts. The worst-case running time of this algorithm is O(n^2).

Proof: Let RT(n) = the maximum running time of MatchNutsAndBolts over instances of size n (a random variable). The rank of a bolt is its location in the sorted order of bolts. Let P = the rank of the first pivot chosen by MatchNutsAndBolts (a random variable). We want to compute T(n):

    T(n) = E[RT(n)]                                   by definition
         = Σ_{k=1}^{n} E[RT(n) | P = k] · Pr[P = k]   by Corollary 2.5
         = (1/n) Σ_{k=1}^{n} E[RT(n) | P = k]         because for any k, Pr[P = k] = 1/n

What is (RT(n) | P = k)? We have

    (RT(n) | P = k) ≤ RT(k-1) + RT(n-k) + O(n).

So we have

    T(n) = (1/n) Σ_{k=1}^{n} E[RT(n) | P = k]
         ≤ (1/n) Σ_{k=1}^{n} E[RT(k-1) + RT(n-k) + O(n)]
         = (1/n) Σ_{k=1}^{n} (E[RT(k-1)] + E[RT(n-k)] + O(n))   by linearity of expectation
         = (1/n) Σ_{k=1}^{n} (T(k-1) + T(n-k)) + O(n).

We have the recurrence

    T(n) = O(n) + (1/n) Σ_{k=1}^{n} (T(k-1) + T(n-k)).

Solution? T(n) = O(n log n). Proof? Guess and verify (by induction).

4 Analyzing QuickSort

The previous analysis also works for QuickSort. However, there is an alternative analysis which is also very interesting.

Let a_1, ..., a_n be the n given numbers, listed in sorted order (as they appear in the output). It is enough to bound the number of comparisons performed by QuickSort in order to bound its running time. Let

    X_ij = 1 if we compared a_i to a_j, and 0 otherwise.

X_ij is an indicator variable (i.e., a random variable that only takes the values 0 and 1). The number of comparisons performed by QuickSort is exactly Σ_{i<j} X_ij. Hence, the expected running time of QuickSort is

    E[Σ_{i<j} X_ij] = Σ_{i<j} E[X_ij] = Σ_{i<j} Pr[X_ij = 1].

So what is Pr[X_ij = 1]?

Key observation: Two elements a_i and a_j are compared by QuickSort at most once, and this happens if and only if both a_i and a_j are in the same sub-problem and one of them is selected as the pivot. If the pivot is smaller than min(a_i, a_j) or larger than max(a_i, a_j), then a_i and a_j remain in the same sub-problem. So X_ij = 1 iff a_i or a_j is the first element among a_i, a_{i+1}, ..., a_j to be chosen as a pivot. The probability of this happening is clearly

    2 / (j - i + 1).

Thus, the expected running time of QuickSort is

    Σ_{i<j} Pr[X_ij = 1] = Σ_{i<j} 2 / (j - i + 1) = O(n log n).

In fact, the running time of QuickSort is O(n log n) with high probability.
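As an empirical illustration (not part of the notes), the following sketch counts the comparisons made by a simple randomized QuickSort and compares the average over many runs with the sum Σ_{i<j} 2/(j−i+1) computed above; the array size and number of trials are arbitrary choices.

```python
import random

def quicksort_comparisons(arr):
    """Randomized QuickSort that returns the number of element-vs-pivot comparisons made."""
    if len(arr) <= 1:
        return 0
    pivot = random.choice(arr)
    left  = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    # Each of the other len(arr) - 1 elements is compared with the pivot exactly once.
    return (len(arr) - 1) + quicksort_comparisons(left) + quicksort_comparisons(right)

n = 200
predicted = sum(2.0 / (j - i + 1) for i in range(1, n + 1) for j in range(i + 1, n + 1))
trials = 200
observed = sum(quicksort_comparisons(list(range(n))) for _ in range(trials)) / trials

print("predicted E[#comparisons]:", round(predicted, 1))
print("observed average         :", round(observed, 1))
```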

5 QuickSort with High Probability

Given a set of numbers S, QuickSort randomly chooses a pivot from S and splits the set into two sub-problems, S_L and S_R. Every element in a sub-problem S of size 2 or more is compared once (to the pivot). So an element takes part in k comparisons iff it occurs in k sub-problems.

We want to show: each element occurs in O(log n) sub-problems with high probability. More specifically, let X_i be an indicator variable such that X_i = 1 iff the element a_i occurs in more than 20 log n sub-problems. Suppose we know that Pr[X_i = 1] ≤ 1/n^3. Then, if C is the number of comparisons performed by QuickSort, we have

    Pr[C ≥ 20 n log n] ≤ Pr[⋃_{i=1}^{n} {X_i = 1}] ≤ Σ_{i=1}^{n} Pr[X_i = 1] ≤ n · (1/n^3) = 1/n^2.

(We used the union rule above: Pr[A ∪ B] ≤ Pr[A] + Pr[B].) So we will be done if we can show that Pr[X_i = 1] ≤ 1/n^3. How to prove such a thing???

5.1 Proving that an element occurs in few sub-problems

We say that a sub-problem S is lucky if it splits evenly into sub-problems S_L and S_R. More specifically, both S_L and S_R must have size at most (3/4)|S|. Every time an element is in a lucky sub-problem S, it will next be in a sub-problem of size at most (3/4)|S|. So an element can occur in at most M = log_{4/3} n lucky sub-problems.

Let Y_k be an indicator variable such that Y_k = 1 iff the k-th sub-problem containing our element is lucky. A sub-problem is lucky iff the pivot is picked from the middle half of S in sorted order, so Pr[Y_k = 1] = 1/2. Also, if j ≠ k then Y_j and Y_k are independent. So the Y_k's are independent indicator variables that take the value 1 with probability 1/2. Think of these as coin flips, where heads = lucky sub-problem.

We want to know: how many times do we need to toss a fair coin before we get at least M heads, with high probability? Answer: Next time. But could we also use this to prove that the Las Vegas algorithm for the Nerds and Doofuses problem runs within a certain time T with high probability?
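The answer is left for the next lecture, but here is a quick empirical preview of the coin-flip question (illustrative only; it simulates the coin-flip abstraction rather than QuickSort itself, and reading "20 log n" with log base 2 is an assumption). It flips a fair coin until M = log_{4/3} n heads appear and reports how often more than 20 log n flips were needed.

```python
import math
import random

def tosses_until_heads(target_heads):
    """Flip a fair coin until `target_heads` heads have appeared; return the number of flips."""
    heads = flips = 0
    while heads < target_heads:
        flips += 1
        heads += random.random() < 0.5
    return flips

n = 1024
M = math.ceil(math.log(n, 4 / 3))   # maximum number of lucky sub-problems an element can see
threshold = 20 * math.log2(n)       # the "more than 20 log n sub-problems" cutoff
trials = 100_000

exceed = sum(tosses_until_heads(M) > threshold for _ in range(trials))
print(f"M = {M}, threshold = {threshold:.0f} tosses")
print("fraction of trials needing more than the threshold:", exceed / trials)
```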