CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis

Size: px
Start display at page:

Download "CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis"

Transcription

1 CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis Eli Upfal Eli Office: 319 TA s: Ahmad Mahmoody and Justin Sung-ho Oh

2 It is remarkable that this science, which originated in the consideration of games and chances, should have become the most important object of human knowledge... The most important questions of life are, for the most part, really only problems of probability Pierre Simons, Marquis de Laplace ( ).

3 Why Probability in Computing? The ideal naive computation paradigm: Deterministic algorithm computes efficiently a correct output for any input. In reality : Algorithms are often randomized (no deterministic algorithm, or randomized algorithm is simpler and more efficient) Input is assumed from a distribution - probabilistic analysis. (No efficient worst case solution or iterative computation on unknown future input). Correct Output is not known, need to learn from examples.

4 Why Probability in Computing? Almost any advance computer application today has some randomization/statistical/machine learning components: Network security Cryptography Web search and Web advertising Spam filtering Social network tools Recommendation systems: Amazon, Netfix,.. Communication protocols Computational finance System biology DNA sequencing and analysis Data mining

5 Probability and Computing Randomized algorithms - random steps help! - cryptography and security, fast algorithms, simulations Probabilistic analysis of algorithms - Why hard to solve problems in theory are often not that hard in practice. Statistical inference - Machine learning, data mining... All are based on the same (mostly discrete) probability theory principles and techniques

6 Course Details - Main Topics Review basic probability theory through analysis of randomized algorithms. Large deviation: Chernoff and Hoeffding bounds The probabilistic method Martingale (in discrete space) The theory of learning, PAC learning and VC-dimension Monte Carlo methods, Metropolis algorithm,... Convergence of Monte Carlo Markov Chains methods. The Poisson process and some queueing theory....

7 Course Details Pre-requisite: CS 145, CS 45 or equivalent (first three chapters in the textbook). Textbook:

8 Homeworks, Midterm and Final: Weekly assignments. Typeset in Latex (or readable like typed) - template on the website Concise and correct proofs. Can work together - but write in your own words. Work must be submitted on time. Midterm and final: take home exams, absolute no collaboration, cheaters get C.

9 Verifying Matrix Multiplication Given three n n matrices A, B, and C in a Boolean field, we want to verify AB = C. Standard method: Matrix multiplication - takes Θ(n 3 ) (Θ(n 2.37 )) operations.

10 Randomized algorithm: 1 Chooses a random vector r = (r 1, r 2,..., r n ) {0, 1} n. 2 Compute B r; 3 Compute A(B r); 4 Computes C r; 5 If A(B r) C r return AB C, else return AB = C. The algorithm takes Θ(n 2 ) time. Theorem If AB C, and r is chosen uniformly at random from {0, 1} n, then Pr(AB r = C r) 1 2.

11 Probability Space Definition A probability space has three components: 1 A sample space Ω, which is the set of all possible outcomes of the random process modeled by the probability space; 2 A family of sets F representing the allowable events, where each set in F is a subset of the sample space Ω; 3 A probability function Pr : F R, satisfying the definition below. An element of Ω is a simple event. In a discrete probability space we use F = 2 Ω.

12 Probability Function Definition A probability function is any function Pr : F R that satisfies the following conditions: 1 For any event E, 0 Pr(E) 1; 2 Pr(Ω) = 1; 3 For any finite or countably infinite sequence of pairwise mutually disjoint events E 1, E 2, E 3,... Pr E i = Pr(E i ). i 1 i 1 The probability of an event is the sum of the probabilities of its simple events.

13 Independent Events Definition Two events E and F are independent if and only if Pr(E F ) = Pr(E) Pr(F ). More generally, events E 1, E 2,... E k are mutually independent if and only if for any subset I [1, k], ( ) Pr E i = Pr(E i ). i I i I

14 Verifying Matrix Multiplication Randomized algorithm: 1 Chooses a random vector r = (r 1, r 2,..., r n ) {0, 1} n. 2 Compute B r; 3 Compute A(B r); 4 Computes C r; 5 If A(B r) C r return AB C, else return AB = C. The algorithm takes Θ(n 2 ) time. Theorem If AB C, and r is chosen uniformly at random from {0, 1} n, then Pr(AB r = C r) 1 2.

15 Lemma Choosing r = (r 1, r 2,..., r n ) {0, 1} n uniformly at random is equivalent to choosing each r i independently and uniformly from {0, 1}. Proof. If each r i is chosen independently and uniformly at random, each of the 2 n possible vectors r is chosen with probability 2 n, giving the lemma.

16 Proof: Let D = AB C 0. AB r = C r implies that D r = 0. Since D 0 it has some non-zero entry; assume d 11. For D r = 0, it must be the case that or equivalently n d 1j r j = 0, j=1 r 1 = n j=2 d 1jr j d 11. (1) Here we use d 11 0.

17 Principle of Deferred Decision Assume that we fixed r 2,..., r n. The RHS is already determined, the only variable is r 1. n j=2 r 1 = d 1jr j. (2) d 11 Probability that r 1 = RHS is no more than 1/2.

18 More formally, summing over all collections of values (x 2, x 3, x 4,..., x n ) {0, 1} n 1, we have = = = Pr(AB r = C r) (x 2,...,x n) {0,1} n 1 Pr (AB r = C r (r 2,..., r n) = (x 2,..., x n)) Pr ((r 2,..., r n) = (x 2,..., x n)) Pr ((AB r = C r) ((r 2,..., r n) = (x 2,..., x n))) (x 2,...,x n) {0,1} n 1 (( n j=2 r 1 = d ) ) 1jr j ((r 2,..., r n) = (x 2,..., x n)) (x 2,...,x n) {0,1} n 1 Pr (x 2,...,x n) {0,1} n 1 Pr d 11 ( n j=2 r 1 = d ) 1jr j Pr ((r 2,..., r n) = (x 2,..., x n)) d 11 1 Pr((r2,..., rn) = (x2,..., xn)) 2 (x 2,...,x n) {0,1} n 1 = 1 2.

19 Theorem (Law of Total Probability) Let E 1, E 2,..., E n be mutually disjoint events in the sample space Ω, and n i=1 E i = Ω, then Pr(B) = n Pr(B E i ) = i=1 n Pr(B E i ) Pr(E i ). i=1 Proof. Since the events E i, i = 1,..., n are disjoint and cover the entire sample space Ω, Pr(B) = n Pr(B E i ) = i=1 n Pr(B E i ) Pr(E i ). i=1

20 Computing Conditional Probabilities Definition The conditional probability that event E 1 occurs given that event E 2 occurs is Pr(E 1 E 2 ) = Pr(E 1 E 2 ). Pr(E 2 ) The conditional probability is only well-defined if Pr(E 2 ) > 0. By conditioning on E 2 we restrict the sample space to the set E 2. Thus we are interested in Pr(E 1 E 2 ) normalized by Pr(E 2 ).

21 Bayes Law Theorem (Bayes Law) Assume that E 1, E 2,..., E n are mutually disjoint sets such that n i=1 E i = Ω, then Pr(E j B) = Pr(E j B) Pr(B) = Pr(B E j ) Pr(E j ) n i=1 Pr(B E i) Pr(E i ).

22 NO CLASS ON THURSDAY FEBRUARY 4! First homework will be posted on the website

23 Min-Cut

24 Min-Cut Algorithm Input: An n-node graph G. Output: A minimal set of edges that disconnects the graph. 1 Repeat n 2 times: 1 Pick an edge uniformly at random. 2 Contract the two vertices connected by that edge, eliminate all edges connecting the two vertices. 2 Output the set of edges connecting the two remaining vertices.

25 Theorem The algorithm outputs a min-cut set of edges with probability 2 n(n 1). Lemma Vertex contraction does not reduce the size of the min-cut set. (Contraction can only increase the size of the min-cut set.) Proof. Every cut set in the new graph is a cut set in the original graph.

26 Analysis of the Algorithm Assume that the graph has a min-cut set of k edges. We compute the probability of finding one such set C. Lemma If no edge of C was contracted, no edge of C was eliminated. Proof. Let X and Y be the two set of vertices cut by C. If the contracting edge connects two vertices in X (res. Y ), then all its parallel edges also connect vertices in X (res. Y ).

27 Let E i = the edge contracted in iteration i is not in C. Let F i = i j=1 E j = no edge of C was contracted in the first i iterations. We need to compute Pr(F n 2 )

28 Since the minimum cut-set has k edges, all vertices have degree k, and the graph has nk/2 edges. There are at least nk/2 edges in the graph, k edges are in C. Pr(E 1 ) = Pr(F 1 ) 1 2k nk = 1 2 n.

29 Assume that the first contraction did not eliminate an edge of C (conditioning on the event E 1 = F 1 ). After the first vertex contraction we are left with an n 1 node graph, with minimum cut set, and minimum degree k. The new graph has at least k(n 1)/2 edges. Pr(E 2 F 1 ) 1 Similarly, Pr(E i F i 1 ) 1 k k(n 1)/2 1 2 n 1. k k(n i+1)/2 = 1 2 n i+1.

30 We need to compute Pr(F n 2 ) We use Pr(A B) = Pr(A B)Pr(B) Pr(F n 2 ) = Pr(E n 2 F n 3 ) = Pr(E n 2 F n 3 )Pr(F n 3 ) = Pr(E n 2 F n 3 )Pr(E n 3 F n 4 )...Pr(E 2 F 1 )Pr(F 1 ) ( n 2 = n ( Π n 2 i=1 1 ) ( n 3 n 1 ) 2 n i + 1 )... ) ( n 4 n 2 = Π n 2 i=1 ( 4 6 ) ( 3 5 ( ) n i 1 n i + 1 ) ( ) ( ) 2 1 = n(n 1).

31 Useful identities: Pr(A B) = Pr(A B) Pr(B) Pr(A B) = Pr(A B)Pr(B) Pr(A B C) = Pr(A B C)Pr(B C) = Pr(A B C)Pr(B C)Pr(C) Let A 1,..., A n be a sequence of events. Let E i = i j=1 A i Pr(E n ) = Pr(A n E n 1 )Pr(E n 1 ) = Pr(A n E n 1 )Pr(A n 1 E n 2 )...P(A 2 E 1 )Pr(A 1 )

32 Theorem Assume that we run the randomized min-cut algorithm n(n 1) log n times and output the minimum size cut-set found in all the iterations. The probability that the output is not a min-cut set is bounded by ( ) 2 n(n 1) log n 1 e 2 log n = 1 n(n 1) n 2. Proof. The algorithm has a one side error: the output is never smaller than the min-cut value.

33 The Taylor series expansion of e x gives e x = 1 x + x 2 2!... Thus, for x < 1, 1 x e x.

CMSC 858T: Randomized Algorithms Spring 2003 Handout 8: The Local Lemma

CMSC 858T: Randomized Algorithms Spring 2003 Handout 8: The Local Lemma CMSC 858T: Randomized Algorithms Spring 2003 Handout 8: The Local Lemma Please Note: The references at the end are given for extra reading if you are interested in exploring these ideas further. You are

More information

Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

2.1 Complexity Classes

2.1 Complexity Classes 15-859(M): Randomized Algorithms Lecturer: Shuchi Chawla Topic: Complexity classes, Identity checking Date: September 15, 2004 Scribe: Andrew Gilpin 2.1 Complexity Classes In this lecture we will look

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

Polarization codes and the rate of polarization

Polarization codes and the rate of polarization Polarization codes and the rate of polarization Erdal Arıkan, Emre Telatar Bilkent U., EPFL Sept 10, 2008 Channel Polarization Given a binary input DMC W, i.i.d. uniformly distributed inputs (X 1,...,

More information

Randomized algorithms

Randomized algorithms Randomized algorithms March 10, 2005 1 What are randomized algorithms? Algorithms which use random numbers to make decisions during the executions of the algorithm. Why would we want to do this?? Deterministic

More information

Markov random fields and Gibbs measures

Markov random fields and Gibbs measures Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).

More information

THE CENTRAL LIMIT THEOREM TORONTO

THE CENTRAL LIMIT THEOREM TORONTO THE CENTRAL LIMIT THEOREM DANIEL RÜDT UNIVERSITY OF TORONTO MARCH, 2010 Contents 1 Introduction 1 2 Mathematical Background 3 3 The Central Limit Theorem 4 4 Examples 4 4.1 Roulette......................................

More information

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge, The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge, cheapest first, we had to determine whether its two endpoints

More information

Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

More information

Chapter 6: Episode discovery process

Chapter 6: Episode discovery process Chapter 6: Episode discovery process Algorithmic Methods of Data Mining, Fall 2005, Chapter 6: Episode discovery process 1 6. Episode discovery process The knowledge discovery process KDD process of analyzing

More information

Lecture 22: November 10

Lecture 22: November 10 CS271 Randomness & Computation Fall 2011 Lecture 22: November 10 Lecturer: Alistair Sinclair Based on scribe notes by Rafael Frongillo Disclaimer: These notes have not been subjected to the usual scrutiny

More information

Graph Security Testing

Graph Security Testing JOURNAL OF APPLIED COMPUTER SCIENCE Vol. 23 No. 1 (2015), pp. 29-45 Graph Security Testing Tomasz Gieniusz 1, Robert Lewoń 1, Michał Małafiejski 1 1 Gdańsk University of Technology, Poland Department of

More information

SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH

SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH 31 Kragujevac J. Math. 25 (2003) 31 49. SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH Kinkar Ch. Das Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, W.B.,

More information

8.1 Min Degree Spanning Tree

8.1 Min Degree Spanning Tree CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Complexity Theory IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Outline Goals Computation of Problems Concepts and Definitions Complexity Classes and Problems Polynomial Time Reductions Examples

More information

How To Prove The Dirichlet Unit Theorem

How To Prove The Dirichlet Unit Theorem Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if

More information

Lecture 13 - Basic Number Theory.

Lecture 13 - Basic Number Theory. Lecture 13 - Basic Number Theory. Boaz Barak March 22, 2010 Divisibility and primes Unless mentioned otherwise throughout this lecture all numbers are non-negative integers. We say that A divides B, denoted

More information

LECTURE 4. Last time: Lecture outline

LECTURE 4. Last time: Lecture outline LECTURE 4 Last time: Types of convergence Weak Law of Large Numbers Strong Law of Large Numbers Asymptotic Equipartition Property Lecture outline Stochastic processes Markov chains Entropy rate Random

More information

CSC2420 Fall 2012: Algorithm Design, Analysis and Theory

CSC2420 Fall 2012: Algorithm Design, Analysis and Theory CSC2420 Fall 2012: Algorithm Design, Analysis and Theory Allan Borodin November 15, 2012; Lecture 10 1 / 27 Randomized online bipartite matching and the adwords problem. We briefly return to online algorithms

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

High degree graphs contain large-star factors

High degree graphs contain large-star factors High degree graphs contain large-star factors Dedicated to László Lovász, for his 60th birthday Noga Alon Nicholas Wormald Abstract We show that any finite simple graph with minimum degree d contains a

More information

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

More information

(67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7

(67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7 (67902) Topics in Theory and Complexity Nov 2, 2006 Lecturer: Irit Dinur Lecture 7 Scribe: Rani Lekach 1 Lecture overview This Lecture consists of two parts In the first part we will refresh the definition

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg martin.lauer@kit.edu

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

Conductance, the Normalized Laplacian, and Cheeger s Inequality

Conductance, the Normalized Laplacian, and Cheeger s Inequality Spectral Graph Theory Lecture 6 Conductance, the Normalized Laplacian, and Cheeger s Inequality Daniel A. Spielman September 21, 2015 Disclaimer These notes are not necessarily an accurate representation

More information

The positive minimum degree game on sparse graphs

The positive minimum degree game on sparse graphs The positive minimum degree game on sparse graphs József Balogh Department of Mathematical Sciences University of Illinois, USA jobal@math.uiuc.edu András Pluhár Department of Computer Science University

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete

More information

Cycles and clique-minors in expanders

Cycles and clique-minors in expanders Cycles and clique-minors in expanders Benny Sudakov UCLA and Princeton University Expanders Definition: The vertex boundary of a subset X of a graph G: X = { all vertices in G\X with at least one neighbor

More information

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Why? A central concept in Computer Science. Algorithms are ubiquitous. Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

More information

Primes in Sequences. Lee 1. By: Jae Young Lee. Project for MA 341 (Number Theory) Boston University Summer Term I 2009 Instructor: Kalin Kostadinov

Primes in Sequences. Lee 1. By: Jae Young Lee. Project for MA 341 (Number Theory) Boston University Summer Term I 2009 Instructor: Kalin Kostadinov Lee 1 Primes in Sequences By: Jae Young Lee Project for MA 341 (Number Theory) Boston University Summer Term I 2009 Instructor: Kalin Kostadinov Lee 2 Jae Young Lee MA341 Number Theory PRIMES IN SEQUENCES

More information

Nimble Algorithms for Cloud Computing. Ravi Kannan, Santosh Vempala and David Woodruff

Nimble Algorithms for Cloud Computing. Ravi Kannan, Santosh Vempala and David Woodruff Nimble Algorithms for Cloud Computing Ravi Kannan, Santosh Vempala and David Woodruff Cloud computing Data is distributed arbitrarily on many servers Parallel algorithms: time Streaming algorithms: sublinear

More information

SOLUTIONS TO EXERCISES FOR. MATHEMATICS 205A Part 3. Spaces with special properties

SOLUTIONS TO EXERCISES FOR. MATHEMATICS 205A Part 3. Spaces with special properties SOLUTIONS TO EXERCISES FOR MATHEMATICS 205A Part 3 Fall 2008 III. Spaces with special properties III.1 : Compact spaces I Problems from Munkres, 26, pp. 170 172 3. Show that a finite union of compact subspaces

More information

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

More information

Lecture 4 Online and streaming algorithms for clustering

Lecture 4 Online and streaming algorithms for clustering CSE 291: Geometric algorithms Spring 2013 Lecture 4 Online and streaming algorithms for clustering 4.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

More information

160 CHAPTER 4. VECTOR SPACES

160 CHAPTER 4. VECTOR SPACES 160 CHAPTER 4. VECTOR SPACES 4. Rank and Nullity In this section, we look at relationships between the row space, column space, null space of a matrix and its transpose. We will derive fundamental results

More information

Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization

Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Archis Ghate a and Robert L. Smith b a Industrial Engineering, University of Washington, Box 352650, Seattle, Washington,

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Joschka Bödecker AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg jboedeck@informatik.uni-freiburg.de

More information

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With

More information

A Sublinear Bipartiteness Tester for Bounded Degree Graphs

A Sublinear Bipartiteness Tester for Bounded Degree Graphs A Sublinear Bipartiteness Tester for Bounded Degree Graphs Oded Goldreich Dana Ron February 5, 1998 Abstract We present a sublinear-time algorithm for testing whether a bounded degree graph is bipartite

More information

5 Directed acyclic graphs

5 Directed acyclic graphs 5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

More information

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

More information

17.3.1 Follow the Perturbed Leader

17.3.1 Follow the Perturbed Leader CS787: Advanced Algorithms Topic: Online Learning Presenters: David He, Chris Hopman 17.3.1 Follow the Perturbed Leader 17.3.1.1 Prediction Problem Recall the prediction problem that we discussed in class.

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Globally Optimal Crowdsourcing Quality Management

Globally Optimal Crowdsourcing Quality Management Globally Optimal Crowdsourcing Quality Management Akash Das Sarma Stanford University akashds@stanford.edu Aditya G. Parameswaran University of Illinois (UIUC) adityagp@illinois.edu Jennifer Widom Stanford

More information

Connectivity and cuts

Connectivity and cuts Math 104, Graph Theory February 19, 2013 Measure of connectivity How connected are each of these graphs? > increasing connectivity > I G 1 is a tree, so it is a connected graph w/minimum # of edges. Every

More information

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

More information

1 The Brownian bridge construction

1 The Brownian bridge construction The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

More information

Lecture 13: Factoring Integers

Lecture 13: Factoring Integers CS 880: Quantum Information Processing 0/4/0 Lecture 3: Factoring Integers Instructor: Dieter van Melkebeek Scribe: Mark Wellons In this lecture, we review order finding and use this to develop a method

More information

Small Maximal Independent Sets and Faster Exact Graph Coloring

Small Maximal Independent Sets and Faster Exact Graph Coloring Small Maximal Independent Sets and Faster Exact Graph Coloring David Eppstein Univ. of California, Irvine Dept. of Information and Computer Science The Exact Graph Coloring Problem: Given an undirected

More information

Notes from Week 1: Algorithms for sequential prediction

Notes from Week 1: Algorithms for sequential prediction CS 683 Learning, Games, and Electronic Markets Spring 2007 Notes from Week 1: Algorithms for sequential prediction Instructor: Robert Kleinberg 22-26 Jan 2007 1 Introduction In this course we will be looking

More information

Introduction to Online Learning Theory

Introduction to Online Learning Theory Introduction to Online Learning Theory Wojciech Kot lowski Institute of Computing Science, Poznań University of Technology IDSS, 04.06.2013 1 / 53 Outline 1 Example: Online (Stochastic) Gradient Descent

More information

E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

More information

Practice Problems for Homework #8. Markov Chains. Read Sections 7.1-7.3. Solve the practice problems below.

Practice Problems for Homework #8. Markov Chains. Read Sections 7.1-7.3. Solve the practice problems below. Practice Problems for Homework #8. Markov Chains. Read Sections 7.1-7.3 Solve the practice problems below. Open Homework Assignment #8 and solve the problems. 1. (10 marks) A computer system can operate

More information

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014 Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

More information

HOMEWORK 5 SOLUTIONS. n!f n (1) lim. ln x n! + xn x. 1 = G n 1 (x). (2) k + 1 n. (n 1)!

HOMEWORK 5 SOLUTIONS. n!f n (1) lim. ln x n! + xn x. 1 = G n 1 (x). (2) k + 1 n. (n 1)! Math 7 Fall 205 HOMEWORK 5 SOLUTIONS Problem. 2008 B2 Let F 0 x = ln x. For n 0 and x > 0, let F n+ x = 0 F ntdt. Evaluate n!f n lim n ln n. By directly computing F n x for small n s, we obtain the following

More information

General Sampling Methods

General Sampling Methods General Sampling Methods Reference: Glasserman, 2.2 and 2.3 Claudio Pacati academic year 2016 17 1 Inverse Transform Method Assume U U(0, 1) and let F be the cumulative distribution function of a distribution

More information

1. Prove that the empty set is a subset of every set.

1. Prove that the empty set is a subset of every set. 1. Prove that the empty set is a subset of every set. Basic Topology Written by Men-Gen Tsai email: b89902089@ntu.edu.tw Proof: For any element x of the empty set, x is also an element of every set since

More information

Today s Topics. Primes & Greatest Common Divisors

Today s Topics. Primes & Greatest Common Divisors Today s Topics Primes & Greatest Common Divisors Prime representations Important theorems about primality Greatest Common Divisors Least Common Multiples Euclid s algorithm Once and for all, what are prime

More information

Random graphs with a given degree sequence

Random graphs with a given degree sequence Sourav Chatterjee (NYU) Persi Diaconis (Stanford) Allan Sly (Microsoft) Let G be an undirected simple graph on n vertices. Let d 1,..., d n be the degrees of the vertices of G arranged in descending order.

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm. Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

More information

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory

More information

CS 301 Course Information

CS 301 Course Information CS 301: Languages and Automata January 9, 2009 CS 301 Course Information Prof. Robert H. Sloan Handout 1 Lecture: Tuesday Thursday, 2:00 3:15, LC A5 Weekly Problem Session: Wednesday, 4:00 4:50 p.m., LC

More information

Page 1. CSCE 310J Data Structures & Algorithms. CSCE 310J Data Structures & Algorithms. P, NP, and NP-Complete. Polynomial-Time Algorithms

Page 1. CSCE 310J Data Structures & Algorithms. CSCE 310J Data Structures & Algorithms. P, NP, and NP-Complete. Polynomial-Time Algorithms CSCE 310J Data Structures & Algorithms P, NP, and NP-Complete Dr. Steve Goddard goddard@cse.unl.edu CSCE 310J Data Structures & Algorithms Giving credit where credit is due:» Most of the lecture notes

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

STAT 830 Convergence in Distribution

STAT 830 Convergence in Distribution STAT 830 Convergence in Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2011 Richard Lockhart (Simon Fraser University) STAT 830 Convergence in Distribution STAT 830 Fall 2011 1 / 31

More information

Lecture 6 Online and streaming algorithms for clustering

Lecture 6 Online and streaming algorithms for clustering CSE 291: Unsupervised learning Spring 2008 Lecture 6 Online and streaming algorithms for clustering 6.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

More information

Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

More information

Find-The-Number. 1 Find-The-Number With Comps

Find-The-Number. 1 Find-The-Number With Comps Find-The-Number 1 Find-The-Number With Comps Consider the following two-person game, which we call Find-The-Number with Comps. Player A (for answerer) has a number x between 1 and 1000. Player Q (for questioner)

More information

Chapter ML:IV. IV. Statistical Learning. Probability Basics Bayes Classification Maximum a-posteriori Hypotheses

Chapter ML:IV. IV. Statistical Learning. Probability Basics Bayes Classification Maximum a-posteriori Hypotheses Chapter ML:IV IV. Statistical Learning Probability Basics Bayes Classification Maximum a-posteriori Hypotheses ML:IV-1 Statistical Learning STEIN 2005-2015 Area Overview Mathematics Statistics...... Stochastics

More information

The Ergodic Theorem and randomness

The Ergodic Theorem and randomness The Ergodic Theorem and randomness Peter Gács Department of Computer Science Boston University March 19, 2008 Peter Gács (Boston University) Ergodic theorem March 19, 2008 1 / 27 Introduction Introduction

More information

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students

Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students Eastern Washington University Department of Computer Science Questionnaire for Prospective Masters in Computer Science Students I. Personal Information Name: Last First M.I. Mailing Address: Permanent

More information

Part 2: Community Detection

Part 2: Community Detection Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

More information

MTAT.07.003 Cryptology II. Sigma Protocols. Sven Laur University of Tartu

MTAT.07.003 Cryptology II. Sigma Protocols. Sven Laur University of Tartu MTAT.07.003 Cryptology II Sigma Protocols Sven Laur University of Tartu Formal Syntax Sigma protocols α R α β γ = γ sk (α, β) β B sk V pk (α, β, γ) A sigma protocol for an efficiently computable relation

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014

Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014 Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about

More information

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation: CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm

More information

An Approximation Algorithm for Bounded Degree Deletion

An Approximation Algorithm for Bounded Degree Deletion An Approximation Algorithm for Bounded Degree Deletion Tomáš Ebenlendr Petr Kolman Jiří Sgall Abstract Bounded Degree Deletion is the following generalization of Vertex Cover. Given an undirected graph

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Exponential time algorithms for graph coloring

Exponential time algorithms for graph coloring Exponential time algorithms for graph coloring Uriel Feige Lecture notes, March 14, 2011 1 Introduction Let [n] denote the set {1,..., k}. A k-labeling of vertices of a graph G(V, E) is a function V [k].

More information

Probability and statistics; Rehearsal for pattern recognition

Probability and statistics; Rehearsal for pattern recognition Probability and statistics; Rehearsal for pattern recognition Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception

More information

Breaking Generalized Diffie-Hellman Modulo a Composite is no Easier than Factoring

Breaking Generalized Diffie-Hellman Modulo a Composite is no Easier than Factoring Breaking Generalized Diffie-Hellman Modulo a Composite is no Easier than Factoring Eli Biham Dan Boneh Omer Reingold Abstract The Diffie-Hellman key-exchange protocol may naturally be extended to k > 2

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice,

More information

1 The Line vs Point Test

1 The Line vs Point Test 6.875 PCP and Hardness of Approximation MIT, Fall 2010 Lecture 5: Low Degree Testing Lecturer: Dana Moshkovitz Scribe: Gregory Minton and Dana Moshkovitz Having seen a probabilistic verifier for linearity

More information

Hydrodynamic Limits of Randomized Load Balancing Networks

Hydrodynamic Limits of Randomized Load Balancing Networks Hydrodynamic Limits of Randomized Load Balancing Networks Kavita Ramanan and Mohammadreza Aghajani Brown University Stochastic Networks and Stochastic Geometry a conference in honour of François Baccelli

More information

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

More information

Ph.D. Thesis. Judit Nagy-György. Supervisor: Péter Hajnal Associate Professor

Ph.D. Thesis. Judit Nagy-György. Supervisor: Péter Hajnal Associate Professor Online algorithms for combinatorial problems Ph.D. Thesis by Judit Nagy-György Supervisor: Péter Hajnal Associate Professor Doctoral School in Mathematics and Computer Science University of Szeged Bolyai

More information

B490 Mining the Big Data. 0 Introduction

B490 Mining the Big Data. 0 Introduction B490 Mining the Big Data 0 Introduction Qin Zhang 1-1 Data Mining What is Data Mining? A definition : Discovery of useful, possibly unexpected, patterns in data. 2-1 Data Mining What is Data Mining? A

More information

Load Balancing and Switch Scheduling

Load Balancing and Switch Scheduling EE384Y Project Final Report Load Balancing and Switch Scheduling Xiangheng Liu Department of Electrical Engineering Stanford University, Stanford CA 94305 Email: liuxh@systems.stanford.edu Abstract Load

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes. 1. The advantage of.. is that they solve the problem if sequential storage representation. But disadvantage in that is they are sequential lists. [A] Lists [B] Linked Lists [A] Trees [A] Queues 2. The

More information

The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

More information

Load Balancing. Load Balancing 1 / 24

Load Balancing. Load Balancing 1 / 24 Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

9th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS) Primal-Dual Algorithms for Online Optimization: Lecture 1

9th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS) Primal-Dual Algorithms for Online Optimization: Lecture 1 9th Max-Planck Advanced Course on the Foundations of Computer Science (ADFOCS) Primal-Dual Algorithms for Online Optimization: Lecture 1 Seffi Naor Computer Science Dept. Technion Haifa, Israel Introduction

More information