# COS513 LECTURE 2 CONDITIONAL INDEPENDENCE AND FACTORIZATION


HAIPENG ZHENG, JIEQI YU

Let $\{X_1, X_2, \ldots, X_n\}$ be a set of random variables. We have a series of questions about them: What are the conditional probabilities (answered by normalization / marginalization)? What are the independencies (answered by factorization of the joint distribution)?

For now, we assume that all random variables are discrete. Then the joint distribution is a table:

(0.1) $p(x_1, x_2, \ldots, x_n)$.

A graphical model (GM) is an economical representation of a joint distribution that takes advantage of the local relationships between the random variables.

## 1. DIRECTED GRAPHICAL MODELS (DGM)

A directed GM is a Directed Acyclic Graph (DAG) $G(V, E)$, where $V$, the set of vertices, represents the random variables, and $E$, the set of edges, encodes the "parent of" relationships. We also define

(1.1) $\Pi_i = \text{the set of parents of } X_i$.

For example, in the graph shown in Figure 1.1, the random variables are

FIGURE 1.1. Example of a graphical model.

$\{X_1, X_2, \ldots, X_6\}$, and

(1.2) $\Pi_6 = \{X_2, X_5\}$.

This DAG represents the following joint distribution:

(1.3) $p(x_{1:6}) = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_3)\,p(x_6 \mid x_2, x_5)$.

In general,

(1.4) $p(x_{1:n}) = \prod_{i=1}^{n} p(x_i \mid x_{\pi_i})$

specifies a particular joint distribution. Note that here $\pi_i$ stands for the set of indices of the parents of $X_i$. For those who are familiar with the term Bayesian network, it is worth pointing out that a DGM is basically a Bayesian network. Why are cycles not allowed in a DGM? We will address the reason for this acyclic requirement in a later lecture.

A GM is essentially a family of distributions: the family of those distributions that respect the factorization implied by the graph. If we assume that $X_{1:6}$ are all binary random variables, then the naive representation (the complete table of the joint distribution) has $2^6 = 64$ entries. Yet if we notice that the representation of $p(x_3 \mid x_1)$ has only 4 entries, then the GM representation of the joint distribution has only $\sum_{i=1}^{6} 2^{|\pi_i|+1} = 26$ entries. Thus, we replace growth that is exponential in $n$, the total number of nodes, with growth that is exponential in $|\pi_i|$, the number of parents. The GM representation therefore provides an eminent saving in space.

### 1.1. Conditional Independence.

First we define independence and conditional independence:

- Independence: $X_A \perp X_B \iff p(x_A, x_B) = p(x_A)\,p(x_B)$.
- Conditional independence: $X_A \perp X_B \mid X_C \iff p(x_A, x_B \mid x_C) = p(x_A \mid x_C)\,p(x_B \mid x_C) \iff p(x_A \mid x_B, x_C) = p(x_A \mid x_C)$.

Independence is akin to factorization, hence akin to examination of the structure of the graph.
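The saving is easy to see concretely. The sketch below is an illustration, not part of the notes: it reads the parent sets off the Figure 1.1 example, fills the conditional probability tables with made-up random numbers, tabulates the factorized joint, and counts table entries.

```python
import itertools
import random

random.seed(0)

# Parent sets read off the DAG of Figure 1.1; the CPT numbers are made up.
parents = {1: (), 2: (1,), 3: (1,), 4: (2,), 5: (3,), 6: (2, 5)}

def random_cpt(n_parents):
    """One normalized distribution over x_i per parent configuration."""
    cpt = {}
    for conf in itertools.product((0, 1), repeat=n_parents):
        q = random.random()
        cpt[conf] = (q, 1.0 - q)
    return cpt

cpts = {i: random_cpt(len(pa)) for i, pa in parents.items()}

def joint(x):
    """The factorized joint (1.4): p(x_1:6) = prod_i p(x_i | x_pi_i)."""
    prod = 1.0
    for i, pa in parents.items():
        prod *= cpts[i][tuple(x[j] for j in pa)][x[i]]
    return prod

total = sum(joint(dict(zip(range(1, 7), xs)))
            for xs in itertools.product((0, 1), repeat=6))
entries = sum(2 ** (len(pa) + 1) for pa in parents.values())
print(total, entries)   # the joint sums to 1 (up to rounding); 26 entries vs 2**6 = 64
```

The sum telescopes to 1 because each CPT is normalized over its own variable, which is why any setting of the tables yields a valid joint distribution.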

### 1.2. Basic Independencies.

The chain rule of probability is:

(1.5) $p(x_{1:n}) = \prod_{i=1}^{n} p(x_i \mid x_{1:i-1})$.

For example,

(1.6) $p(x_{1:6}) = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1, x_2) \cdots p(x_6 \mid x_1, \ldots, x_5)$.

This is suggestive of independencies. Conditional independencies are embedded in the graph. Comparing equations (1.3) and (1.6) suggests

(1.7) $p(x_6 \mid x_1, \ldots, x_5) = p(x_6 \mid x_2, x_5)$,

which is equivalent to

(1.8) $X_6 \perp \{X_1, X_3, X_4\} \mid \{X_2, X_5\}$.

By direct computation, we can show that this is indeed true (see Appendix A.1).

Let $I$ be a topological ordering, which is equivalent to requiring that $\pi_i$ appear before $i$ for every $i$. Let $\nu_i$ be the set of indices appearing before $i$ in $I$, excluding $\pi_i$. Then, given a topological ordering, we have a series of conditional independencies:

(1.9) $\{X_i \perp X_{\nu_i} \mid X_{\pi_i}\}$.

These conditional independencies are called basic independencies. For the GM in Figure 1.1, a possible topological ordering is

(1.10) $I = \{1, 2, 3, 4, 5, 6\}$,

and then $X_2 \perp \emptyset \mid X_1$, $X_3 \perp X_2 \mid X_1$, $X_4 \perp \{X_1, X_3\} \mid X_2$, $X_5 \perp \{X_4, X_2, X_1\} \mid X_3$.

Basic independencies are attached to a topological ordering. However, they are not all of the conditional independencies implied by the graph. By simple computation (see Appendix A.2), we can confirm that

(1.11) $p(x_4 \mid x_1, x_2, x_3) = p(x_4 \mid x_2)$.
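The basic independencies can be read off mechanically from the parent sets and an ordering. A minimal sketch (the parent sets below are those of the Figure 1.1 example; the loop is an illustration, not part of the notes):

```python
# Read the basic independencies (1.9) off a topological ordering.
parents = {1: set(), 2: {1}, 3: {1}, 4: {2}, 5: {3}, 6: {2, 5}}
order = [1, 2, 3, 4, 5, 6]               # one valid topological ordering I

basic = []
for k, i in enumerate(order):
    nu = set(order[:k]) - parents[i]     # nu_i: predecessors of i other than its parents
    if nu:
        basic.append((i, tuple(sorted(nu)), tuple(sorted(parents[i]))))
        print(f"X{i} ⊥ {sorted(nu)} | {sorted(parents[i])}")
```

Running this prints exactly the list above, including $X_6 \perp \{X_1, X_3, X_4\} \mid \{X_2, X_5\}$ from (1.8).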

## 2. BAYES BALL ALGORITHM

### 2.1. Three Simple Graphs.

(1) A little sequence. As shown in Figure 2.1, $X$, $Y$ and $Z$ can be viewed as the past, the present and the future. Based on the graph, we have the joint distribution

(2.1) $p(x, y, z) = p(x)\,p(y \mid x)\,p(z \mid y)$.

By a simple derivation, we know that

(2.2) $X \perp Z \mid Y$.

This is illustrated in Figure 2.2. Note that shading denotes conditioning.

FIGURE 2.2

Here we have the following observations:

(a) This is the only conditional independency implied by this graph.
(b) It suggests that graph separation can be related to conditional probabilities.
(c) The interpretation of the graph is that the past is independent of the future given the present, which is the famous Markov assumption. The above GM is a simplified version of a Markov chain.

Remark: "the only conditional independency" does not mean that other independencies cannot hold. For some settings of the conditional probability tables, other independencies may hold, but they do not hold for all joint distributions represented by the GM. So the arrow in the graph between $X$ and $Y$ does not mean that $Y$ has to depend on $X$: for some settings of $p(y \mid x)$ we may have $X \perp Y$, but this independency does not hold for all settings.
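A quick numeric sanity check of (2.2) for the chain, with made-up CPT numbers (any setting would do): conditionally on $Y$ the joint factorizes, while marginally $X$ and $Z$ remain dependent.

```python
import itertools

# Hypothetical CPTs for the chain X -> Y -> Z (all numbers made up).
p_x = {0: 0.3, 1: 0.7}
p_y_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_z_y = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}

# The joint (2.1): p(x, y, z) = p(x) p(y | x) p(z | y).
joint = {(x, y, z): p_x[x] * p_y_x[x][y] * p_z_y[y][z]
         for x, y, z in itertools.product((0, 1), repeat=3)}

def p(**fixed):
    """Marginal probability of the assignment in `fixed`."""
    return sum(pr for (x, y, z), pr in joint.items()
               if all(dict(x=x, y=y, z=z)[k] == v for k, v in fixed.items()))

# X ⊥ Z | Y: p(x, z | y) = p(x | y) p(z | y) for every configuration...
cond_indep = all(
    abs(p(x=x, y=y, z=z) / p(y=y)
        - (p(x=x, y=y) / p(y=y)) * (p(y=y, z=z) / p(y=y))) < 1e-12
    for x, y, z in itertools.product((0, 1), repeat=3))

# ...but X and Z are marginally dependent for these CPTs.
marg_indep = all(
    abs(p(x=x, z=z) - p(x=x) * p(z=z)) < 1e-12
    for x, z in itertools.product((0, 1), repeat=2))

print(cond_indep, marg_indep)   # True False
```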

(2) A little tree.

FIGURE 2.3

The joint distribution implied by the GM shown in Figure 2.3 is

(2.3) $p(x, y, z) = p(y)\,p(x \mid y)\,p(z \mid y)$.

Notice that

(2.4) $p(x, z \mid y) = \dfrac{p(x, y, z)}{p(y)} = \dfrac{p(y)\,p(x \mid y)\,p(z \mid y)}{p(y)} = p(x \mid y)\,p(z \mid y)$,

so we have $X \perp Z \mid Y$, as illustrated in Figure 2.4.

FIGURE 2.4

Again, we have the following observations:

(a) This is the only conditional independency implied by this graph.
(b) It suggests that graph separation can be related to conditional probabilities.
(c) An interesting interpretation of this conditional independence is the following. Obviously, the shoe size (represented by $X$) and the amount of gray hair (represented by $Z$) of a person are highly dependent, because a boy, with no gray hair, wears small shoes, while an old man, with much gray hair, wears large

shoes. However, when the age (represented by $Y$) is given, the correlation between shoe size and amount of gray hair suddenly disappears, since age provides all the information that shoe size can provide for inferring the amount of gray hair; the same is true of the amount of gray hair. Thus, given age, shoe size and amount of gray hair are independent. This GM provides us with the intuition for a hidden cause.

(3) Inverse tree.

FIGURE 2.5

FIGURE 2.6

For the graph shown in Figure 2.5, we observe:

(a) $X \perp Z$ is the only implied independency.
(b) Graph separation is opposed in this case.
(c) It is not necessarily true that $X \perp Z \mid Y$, as illustrated in Figure 2.6.

This looks a little less intuitive than the previous cases, yet we have a wonderful example. Let us define:

$X$: lost my watch
$Z$: alien abduction
$Y$: late for lunch
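With hypothetical numbers for these three events (made up for illustration; $X$ and $Z$ independent a priori, both capable of causing $Y$), the effect can be checked directly:

```python
import itertools

# Explaining-away sketch: X = lost my watch, Z = alien abduction,
# Y = late for lunch, with edges X -> Y <- Z. All numbers are made up.
p_x = {0: 0.9, 1: 0.1}
p_z = {0: 0.999, 1: 0.001}
p_y1 = {(0, 0): 0.05, (0, 1): 0.99, (1, 0): 0.7, (1, 1): 0.99}  # p(y=1 | x, z)

joint = {(x, y, z): p_x[x] * p_z[z] * (p_y1[x, z] if y == 1 else 1.0 - p_y1[x, z])
         for x, y, z in itertools.product((0, 1), repeat=3)}

def p(**fixed):
    """Marginal probability of the assignment in `fixed`."""
    return sum(pr for (x, y, z), pr in joint.items()
               if all(dict(x=x, y=y, z=z)[k] == v for k, v in fixed.items()))

prior      = p(z=1)                            # baseline belief in abduction
late       = p(z=1, y=1) / p(y=1)              # David is late: abduction more likely
late_watch = p(z=1, y=1, x=1) / p(y=1, x=1)    # the lost watch explains it away
print(prior, late, late_watch)
```

Learning that David is late raises the abduction probability, and then learning he lost his watch pulls it most of the way back down, even though the watch and the abduction are independent a priori.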

Here, given $Y$, $X$ and $Z$ "explain away" each other: when $Y$ is given, each decreases the other's probability, hence they are not conditionally independent. To put it more concretely, if we know that David is late for lunch, then the two seemingly independent events, losing the watch and alien abduction, suddenly explain each other away. If David comes to lunch and tells us that he lost his watch, then the probability of an alien abduction is very slim; but if we can find neither David nor his watch, then the probability of an alien abduction increases.

### 2.2. Bayes Ball Algorithm.

Description: the Bayes ball algorithm is a notion of separability that lets us determine the validity of an independency statement in a GM. The Bayes ball algorithm is not meant for actual implementation, but we use it quite often in our minds.

Is there any way to get all the conditional independencies? The only way is to try all configurations on the GM and run the Bayes ball algorithm to verify each independency.

FIGURE 2.7

The basic idea of the Bayes ball algorithm is as follows. Say we want to verify the independency $X_1 \perp X_6 \mid \{X_2, X_3\}$, as illustrated in Figure 2.7. We start a ball at $X_1$ and try to bounce it to $X_6$ according to the three rules shown in Figures 2.8, 2.9 and 2.10. If the ball can bounce from $X_1$ to $X_6$, then the conditional independency does not hold; otherwise, the conditional independency is verified.
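Although the algorithm is meant to be run in one's head, the separation check it computes can be sketched in code. The sketch below is an assumption-laden illustration, not the notes' procedure: it enumerates undirected paths and applies the blocking rules rather than literally bouncing a ball, and its collider rule includes the standard refinement that an observed descendant of $Y$ also unblocks a path, which the simple three-rule picture omits. The graph is the parent structure of Figure 1.1.

```python
# A small d-separation checker in the spirit of the Bayes ball rules
# (path enumeration; fine for tiny graphs like Figure 1.1).
parents = {1: set(), 2: {1}, 3: {1}, 4: {2}, 5: {3}, 6: {2, 5}}
children = {i: {j for j, pa in parents.items() if i in pa} for i in parents}

def descendants(v):
    out, stack = set(), [v]
    while stack:
        for c in children[stack.pop()]:
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

def d_separated(a, b, given):
    """True iff every undirected path from a to b is blocked given `given`."""
    def paths(u, visited):
        if u == b:
            yield visited
            return
        for w in parents[u] | children[u]:
            if w not in visited:
                yield from paths(w, visited + [w])
    for path in paths(a, [a]):
        active = True
        for left, v, right in zip(path, path[1:], path[2:]):
            collider = left in parents[v] and right in parents[v]
            if collider:
                if v not in given and not (descendants(v) & given):
                    active = False   # Rule 3: an unobserved collider blocks the ball
                    break
            elif v in given:
                active = False       # Rules 1 and 2: observing v blocks the path
                break
        if active:
            return False             # an unblocked path: the ball reaches b
    return True

print(d_separated(6, 1, {2, 5}))   # the basic independency X6 ⊥ X1 | {X2, X5}
print(d_separated(1, 6, {2, 3}))   # the query X1 ⊥ X6 | {X2, X3} of Figure 2.7
```

Both queries come back separated for this graph, while `d_separated(1, 6, set())` does not, since the unblocked path $X_1 \to X_2 \to X_6$ lets the ball through.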

FIGURE 2.8. Rule 1 for the Bayes ball algorithm. A ball can bounce between $X$ and $Z$ if $Y$ is not given. However, when $Y$ is given, it blocks the way between $X$ and $Z$, and the ball can no longer bounce between them.

FIGURE 2.9. Rule 2 for the Bayes ball algorithm. A ball can bounce between $X$ and $Z$ if $Y$ is not given. However, when $Y$ is given, it blocks the way between $X$ and $Z$, and the ball can no longer bounce between them.

FIGURE 2.10. Rule 3 for the Bayes ball algorithm. A ball cannot bounce between $X$ and $Z$ if $Y$ is not given. However, when $Y$ is given, it connects $X$ and $Z$, and the ball can bounce between them.

## 3. CONCLUSION

The punch line of this lecture (the H-C theorem). Consider two families:

(1) the family of joint distributions obtained by ranging over all conditional probability tables associated with $G$ (via factorization);

(2) the family of all joint distributions that respect all conditional independence statements implied by $G$ and d-separation (via the Bayes ball algorithm).

**Theorem (H-C Theorem).** Families 1 and 2 are the same.

We will try to prove this theorem in the next lecture.

## APPENDIX A. PROOFS IN THIS LECTURE

### A.1. Proof of (1.8).

We only need to show that $p(x_6 \mid x_{1:5}) = p(x_6 \mid x_2, x_5)$:

$$
\begin{aligned}
p(x_6 \mid x_{1:5}) &= \frac{p(x_{1:6})}{p(x_{1:5})} \\
&= \frac{p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_3)\,p(x_6 \mid x_2, x_5)}{\sum_{x_6} p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_3)\,p(x_6 \mid x_2, x_5)} \\
&= \frac{p(x_6 \mid x_2, x_5)}{\sum_{x_6} p(x_6 \mid x_2, x_5)} \\
&= p(x_6 \mid x_2, x_5).
\end{aligned}
$$

### A.2. Proof of (1.11).

Here is the derivation:

$$
\begin{aligned}
p(x_4 \mid x_1, x_2, x_3) &= \frac{p(x_1, x_2, x_3, x_4)}{p(x_1, x_2, x_3)} = \frac{\sum_{x_5}\sum_{x_6} p(x_{1:6})}{\sum_{x_4}\sum_{x_5}\sum_{x_6} p(x_{1:6})} \\
&= \frac{\sum_{x_5}\sum_{x_6} p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_3)\,p(x_6 \mid x_2, x_5)}{\sum_{x_4}\sum_{x_5}\sum_{x_6} p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_3)\,p(x_6 \mid x_2, x_5)} \\
&= \frac{p(x_4 \mid x_2)\sum_{x_5} p(x_5 \mid x_3)\sum_{x_6} p(x_6 \mid x_2, x_5)}{\sum_{x_4} p(x_4 \mid x_2)\sum_{x_5} p(x_5 \mid x_3)\sum_{x_6} p(x_6 \mid x_2, x_5)} \\
&= \frac{p(x_4 \mid x_2)}{\sum_{x_4} p(x_4 \mid x_2)} = p(x_4 \mid x_2).
\end{aligned}
$$
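The algebra in A.2 can also be spot-checked numerically. The sketch below is an illustration: for randomly chosen CPTs respecting the Figure 1.1 factorization, the two conditionals agree to machine precision for every configuration.

```python
import itertools
import random

random.seed(2)

# Random CPTs respecting the Figure 1.1 factorization; any setting works.
parents = {1: (), 2: (1,), 3: (1,), 4: (2,), 5: (3,), 6: (2, 5)}
cpts = {}
for i, pa in parents.items():
    cpts[i] = {}
    for conf in itertools.product((0, 1), repeat=len(pa)):
        q = random.random()
        cpts[i][conf] = (q, 1.0 - q)

def joint(x):
    """p(x_1:6) = prod_i p(x_i | x_pi_i)."""
    prod = 1.0
    for i, pa in parents.items():
        prod *= cpts[i][tuple(x[j] for j in pa)][x[i]]
    return prod

def marginal(fixed):
    """Sum the joint over every variable not in `fixed`."""
    free = [i for i in range(1, 7) if i not in fixed]
    return sum(joint({**fixed, **dict(zip(free, vals))})
               for vals in itertools.product((0, 1), repeat=len(free)))

# p(x4 | x1, x2, x3) should equal p(x4 | x2) for every configuration.
ok = all(
    abs(marginal({1: a, 2: b, 3: c, 4: d}) / marginal({1: a, 2: b, 3: c})
        - marginal({2: b, 4: d}) / marginal({2: b})) < 1e-12
    for a, b, c, d in itertools.product((0, 1), repeat=4))
print(ok)   # True
```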


ECON 459 Game Theory Lecture Notes Auctions Luca Anderlini Spring 2015 These notes have been used before. If you can still spot any errors or have any suggestions for improvement, please let me know. 1

### Load Balancing and Switch Scheduling

EE384Y Project Final Report Load Balancing and Switch Scheduling Xiangheng Liu Department of Electrical Engineering Stanford University, Stanford CA 94305 Email: liuxh@systems.stanford.edu Abstract Load

### LAMC Beginners Circle: Parity of a Permutation Problems from Handout by Oleg Gleizer Solutions by James Newton

LAMC Beginners Circle: Parity of a Permutation Problems from Handout by Oleg Gleizer Solutions by James Newton 1. Take a two-digit number and write it down three times to form a six-digit number. For example,

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2 Proofs Intuitively, the concept of proof should already be familiar We all like to assert things, and few of us

### Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22

CS 70 Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice, roulette

### Review for Calculus Rational Functions, Logarithms & Exponentials

Definition and Domain of Rational Functions A rational function is defined as the quotient of two polynomial functions. F(x) = P(x) / Q(x) The domain of F is the set of all real numbers except those for

### CS5314 Randomized Algorithms. Lecture 16: Balls, Bins, Random Graphs (Random Graphs, Hamiltonian Cycles)

CS5314 Randomized Algorithms Lecture 16: Balls, Bins, Random Graphs (Random Graphs, Hamiltonian Cycles) 1 Objectives Introduce Random Graph Model used to define a probability space for all graphs with

### Homework 5 Solutions

Homework 5 Solutions 4.2: 2: a. 321 = 256 + 64 + 1 = (01000001) 2 b. 1023 = 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = (1111111111) 2. Note that this is 1 less than the next power of 2, 1024, which

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 5

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 5 Modular Arithmetic One way to think of modular arithmetic is that it limits numbers to a predefined range {0,1,...,N

### B such that AB = I and BA = I. (We say B is an inverse of A.) Definition A square matrix A is invertible (or nonsingular) if matrix

Matrix inverses Recall... Definition A square matrix A is invertible (or nonsingular) if matrix B such that AB = and BA =. (We say B is an inverse of A.) Remark Not all square matrices are invertible.