# Variable Elimination (Class #17: Inference in Bayes Nets, II)


## Transcription

## Variable Elimination

Class #17: Inference in Bayes Nets, II
Artificial Intelligence (CS 452/552): M. Allen, 16 Oct. 15

We want to know the probability of some value of a query variable X, given known evidence e, where the other background variables Y (together with X and e) make up the entire set of variables in the BN.

1. First, write the query in general form:

        P(x | e) = α P(x, e) = α Σ_Y P(x, e, Y) = α Σ_{y1} … Σ_{yk} Π_{v ∈ BN} P(v | Parents(v))

2. Next, loop iteratively to calculate the main result:
    - Move all irrelevant terms outside of the inner sum.
    - Add up the terms of the inner sum, getting a new term.
    - Put the new term back into the overall product.

3. Normalize when you are done, using α = 1 / P(e).

## Factors in Variable Elimination

- The terms we multiply and add are called factors.
- Initially, these are just the basic probabilities taken directly from the BN: P(v | Parents(v)).
- As we sum variables out to eliminate them, we replace these with new probabilities (marginal or conditional).
- For convenience, we write these new probabilities in factor notation: the factor is subscripted with the eliminated variable, and its arguments (in brackets) are the variables not yet eliminated. For instance, we could write f_V(U, Z) for the sum over all values of V of some probabilities involving U and Z.

## A More Complex Example

Lung diagnostics network: Visit (V) → Tuberculosis (T); Smoking (S) → Lung Cancer (L) and Bronchitis (B); T and L → Abnormality in Chest (A); A → X-Ray (X); A and B → Dyspnea (D).
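The general query form can be sketched directly for a tiny hypothetical chain A → B → C: sum the product of CPT entries over the hidden variable B, then normalize with α = 1 / P(c). All numbers and variable names below are made up for illustration; they are not from the slides.

```python
# Inference by enumeration on a hypothetical chain A -> B -> C:
# P(a | c) = alpha * sum_b P(a) P(b | a) P(c | b), with alpha = 1 / P(c).
P_A = {True: 0.2, False: 0.8}
P_B = {(True, True): 0.7, (True, False): 0.3,    # P_B[(a, b)] = P(B=b | A=a)
       (False, True): 0.1, (False, False): 0.9}
P_C = {(True, True): 0.6, (True, False): 0.4,    # P_C[(b, c)] = P(C=c | B=b)
       (False, True): 0.2, (False, False): 0.8}

c_obs = True                                     # the observed evidence
unnorm = {}
for a in (True, False):
    # Sum the full-joint product over the hidden variable B.
    unnorm[a] = sum(P_A[a] * P_B[(a, b)] * P_C[(b, c_obs)]
                    for b in (True, False))

alpha = 1.0 / sum(unnorm.values())               # alpha = 1 / P(c)
posterior = {a: alpha * p for a, p in unnorm.items()}
print(round(posterior[True], 3))                 # 0.333
```

The normalization step is what lets us skip ever computing P(e) directly: we just rescale the unnormalized sums so they total 1.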

## Variable Elimination: Worked Example

Say we wish to know P(d) in the lung-diagnostics network. We must eliminate V, S, X, T, L, A, and B. Our initial factors are just the probabilities from the network:

    P(V) P(S) P(T | V) P(L | S) P(B | S) P(A | T, L) P(X | A) P(d | A, B)

We eliminate V by summing: f_V(T) = Σ_v P(v) P(T | v). The new factor list is:

    f_V(T) P(S) P(L | S) P(B | S) P(A | T, L) P(X | A) P(d | A, B)

Note that f_V(T) here is just the same thing as P(T), but this is not always the case (a factor may be a complex combination of variables).

Next we eliminate S by summing: f_S(B, L) = Σ_s P(s) P(B | s) P(L | s). The new factor list is:

    f_V(T) f_S(B, L) P(A | T, L) P(X | A) P(d | A, B)

The new factor, f_S(B, L), combines multiple variables. Continuing in the same way to eliminate X, T, L, and A:

    f_V(T) f_S(B, L) f_X(A) P(A | T, L) P(d | A, B)
    f_S(B, L) f_X(A) f_T(A, L) P(d | A, B)
    f_L(A, B) f_X(A) P(d | A, B)
    f_A(B, d)

Finally, we get the answer by eliminating B: f_B(d) = Σ_b f_A(b, d).

## Summary of Variable Elimination

- Essentially, the VE algorithm is simply doing a lot of rewriting, rearranging, and calculation.
- The elimination steps, where we do the summing up, are where all the work is done.
- The actual amount of work depends upon the order in which we choose to eliminate.
- While we do O(n) work per factor (where n is the number of probability values the factor contains), there can be exponential blow-up in the size of the factors.
- Intelligent elimination-ordering policies help to reduce this, but the essential NP-hardness of exact inference means blow-up is sometimes unavoidable.
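The two factor operations the walkthrough relies on, pointwise product and summing a variable out, can be sketched as a minimal Python class. The representation (Boolean variables, dict-backed tables) and the CPT numbers in the usage example are illustrative assumptions, not values from the slides.

```python
from itertools import product

class Factor:
    """A factor over Boolean variables: maps value tuples to numbers."""
    def __init__(self, variables, table):
        self.variables = list(variables)   # ordered variable names
        self.table = dict(table)           # tuple of values -> number

    def multiply(self, other):
        """Pointwise product over the union of the two scopes."""
        vars_out = self.variables + [v for v in other.variables
                                     if v not in self.variables]
        table = {}
        for assignment in product([True, False], repeat=len(vars_out)):
            env = dict(zip(vars_out, assignment))
            a = self.table[tuple(env[v] for v in self.variables)]
            b = other.table[tuple(env[v] for v in other.variables)]
            table[assignment] = a * b
        return Factor(vars_out, table)

    def sum_out(self, var):
        """Eliminate `var` by summing it out, yielding the new factor f_var."""
        vars_out = [v for v in self.variables if v != var]
        table = {}
        for assignment, p in self.table.items():
            env = dict(zip(self.variables, assignment))
            key = tuple(env[v] for v in vars_out)
            table[key] = table.get(key, 0.0) + p
        return Factor(vars_out, table)

# Mirror the first step of the walkthrough: f_V(T) = sum_v P(v) P(T | v).
# The 0.01 / 0.05 numbers below are hypothetical.
pV   = Factor(["V"], {(True,): 0.01, (False,): 0.99})
pTgV = Factor(["T", "V"], {(True, True): 0.05, (False, True): 0.95,
                           (True, False): 0.01, (False, False): 0.99})
fV = pTgV.multiply(pV).sum_out("V")
print(fV.variables, fV.table[(True,)])   # ['T'], ≈ 0.05*0.01 + 0.01*0.99
```

Because every elimination step is just multiply-then-sum-out, the whole algorithm is a loop over an elimination ordering; the cost is driven by the size of the intermediate factors, exactly as the summary slide warns.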

## How Do We Avoid the Problems?

- There are many kinds of approximation techniques in CS. Almost all work on the basis of a trade-off:
  1. The answer returned is often not 100% correct.
  2. We get a faster, more efficient algorithm.
- What is the fastest, easiest approximation method you can think of for Bayes Nets? If asked, for a BN with evidence E, what P(X | E) is, what is a quick way to return a possible approximate answer?

## Better Approximations

- Of course, we can always just guess! Very quick, but often very poor.
- Guessing produces an unbounded approximation: there is no precise limit on how wrong we can be.
- We really want bounded approximation methods.
- TANSTAAFL ("there ain't no such thing as a free lunch"): the quality of the response is usually a function of the time it took.
- Note: some problems simply cannot be approximated in a bounded fashion.

## Stochastic Simulation

- (Pseudo-)randomly simulate probabilistic events, and use the observed occurrences to estimate the actual likelihoods; over time, the estimates converge to the actual probabilities.
- Example techniques:
  - Direct Sampling: basic samples, without evidence.
  - Rejection Sampling: samples that disagree with the evidence are thrown out before calculating probabilities.
  - Likelihood Weighting: use the evidence to weight samples.
  - Markov Chain Monte Carlo (MCMC): sample from a stochastic process whose stationary distribution is the true posterior (i.e., the value we are looking for).

## Basic Direct Sampling

- Take the empty network (no evidence).
- Sample the network in topological order (that is, top-down, from parents to children).
- At each step, base the probability distribution used to sample a node on the values already drawn for its parents.
- To find a probability, count the number of samples that include the event you are looking for, and divide by the total number of samples taken.
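A minimal sketch of direct sampling on the Cloudy/Sprinkler/Rain/WetGrass network used in the worked example: sample top-down, then estimate a probability by counting. The CPT values follow the standard Russell & Norvig sprinkler example, which matches the numbers the slides quote (0.5 for Cloudy, 0.9 for WetGrass given ¬Sprinkler and Rain); the remaining entries are assumed.

```python
import random

def sample(p):
    """Draw True with probability p."""
    return random.random() < p

def prior_sample():
    """One top-down pass through the network, parents before children."""
    c = sample(0.5)                               # P(Cloudy) = 0.5
    s = sample(0.1 if c else 0.5)                 # P(Sprinkler | Cloudy)
    r = sample(0.8 if c else 0.2)                 # P(Rain | Cloudy)
    p_w = {(True, True): 0.99, (True, False): 0.90,
           (False, True): 0.90, (False, False): 0.0}[(s, r)]
    w = sample(p_w)                               # P(WetGrass | Sprinkler, Rain)
    return c, s, r, w

# Estimate P(Rain) by counting: samples with Rain=True / total samples.
random.seed(0)
N = 100_000
rain = sum(1 for _ in range(N) if prior_sample()[2])
print(round(rain / N, 2))   # ≈ P(Rain) = 0.5*0.8 + 0.5*0.2 = 0.5
```

Counting any event works the same way; the catch, as the later slides note, is that rare events need very many samples before the counts are trustworthy.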

## Direct Sampling Algorithm

- Make sure children come after all of their parents in the variable ordering X_1, ..., X_n.
- Repeat the process to generate a database of samples; each sample is some combination of values (an event).
- Estimate P(x_1, ..., x_n), the prior probability of a particular event, by dividing the number of times that event was observed by the total number of samples.

## Direct Sampling: Worked Example

- We want to find out how likely different combinations of variables are in the sprinkler network.
- Start by sampling the top node based on its prior probability: that is, we choose Cloudy = True or Cloudy = False, each with probability 0.5.
- Suppose it comes up True. Now we base the samples of the children on this fact: since Cloudy is True, we use the corresponding conditional probabilities for Sprinkler and Rain.

- In this sample, Sprinkler = False (the most likely value, drawn with probability 0.9); further, Rain = True (again the most likely value).
- Since Sprinkler = False and Rain = True, that combination sets the probability of WetGrass = True to 0.9, and we sample that node in turn: suppose WetGrass = True as well.
- Thus we have one complete sampled event: (Cloudy = T, Sprinkler = F, Rain = T, WetGrass = T). We could then count this occurrence; with just this one sample we would say the probability of this event is 1.0, but the estimate will change over time as more samples come in.

## Problems: Sample Convergence Rates

Consider the burglary network (Burglary and Earthquake → Alarm → JohnCalls, MaryCalls), with:

    P(B) = .001, P(E) = .002
    P(A | b, e) = .95,  P(A | b, ¬e) = .94,  P(A | ¬b, e) = .29,  P(A | ¬b, ¬e) = .001
    P(J | a) = .90,  P(J | ¬a) = .05
    P(M | a) = .70,  P(M | ¬a) = .01

- Take 10 samples of this network, and almost all of them (9) are the same event.
- Rare events can be sampled so infrequently that they seem never to happen at all.
- It can take many samples to get a good approximation, and in general we may not know ahead of time how many we need.

## Other Sampling Methods: Markov Chain Simulation

- While direct sampling can be affected by the order in which we choose to sample things, other stochastic techniques avoid this limitation.
- Markov Chain sampling performs a random walk around the network, sampling combinations of variable values.
- The Markov Chain Monte Carlo (MCMC) algorithm converges to the same probabilities over time, no matter what order is chosen at random.
- In all cases, convergence to a good, relatively precise answer can be very slow, especially in the case of extremal probabilities (those close to 0 or 1).

## The Markov Blanket

- Another basic property of BNs: node X is conditionally independent of all other nodes in the network, given the values of:
  1. all of its parent nodes;
  2. all of its child nodes;
  3. all the other parents of its children.
- This set is known as the Markov Blanket of X, and the property is equivalent to d-separation.
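The three-part definition of the Markov Blanket can be computed mechanically from the graph. A minimal sketch, assuming the network is encoded as a node-to-parents map (the sprinkler network is used here purely for illustration):

```python
# Hypothetical encoding of the sprinkler network as {node: [parents]}.
parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}

def markov_blanket(x):
    """Parents of x, children of x, and the children's other parents."""
    children = [n for n, ps in parents.items() if x in ps]
    blanket = set(parents[x]) | set(children)
    for child in children:          # add the children's other parents
        blanket |= set(parents[child])
    blanket.discard(x)              # x is not in its own blanket
    return blanket

print(sorted(markov_blanket("Sprinkler")))   # ['Cloudy', 'Rain', 'WetGrass']
```

Here Sprinkler's blanket is its parent (Cloudy), its child (WetGrass), and the child's other parent (Rain); conditioned on those three, Sprinkler is independent of everything else in the network.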

## Markov Chain Monte Carlo (MCMC)

Take a random walk around the Bayes Network:

1. Set the BN to one particular event-state at random.
2. Pick a node X at random.
3. Randomly sample X, conditioned on the settings of its Markov Blanket nodes, tweaking its value.
4. Move to the next state based on the new state of X.

- The walk wanders around the state-space, randomly changing one variable value at a time and counting the states we see.
- Over time, it reaches a stationary distribution: the fraction of time spent in each state is proportional to its actual posterior probability.

## MCMC with Evidence

- If we have some evidence already, we do basically the same thing, except that we never change the evidence values.
- E.g., if we have Sprinkler and WetGrass both true, those two nodes stay fixed and we only ever resample Cloudy and Rain.

## Upcoming Events

- Homework 02: due Monday, 19 Oct., before class
- Midterm: Friday, 23 October
  - Practice exam handed out over the weekend
  - Covers everything through the end of this week
- Office Hours: Wing 210
  - Tuesday/Thursday, 11:30 AM to 1:00 PM
  - Tuesday/Thursday, 5:00 PM to 6:00 PM
  - Friday, 12:30 PM to 2:00 PM
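The four-step random walk amounts to Gibbs sampling. A minimal sketch on the sprinkler network with evidence Sprinkler = WetGrass = True, estimating P(Rain | s, w); the CPT numbers follow the standard Russell & Norvig example and are an assumption, while the node-picking and blanket-conditioned resampling mirror steps 2 and 3 above.

```python
import random

def bern(p):
    return random.random() < p

def p_cloudy_given_blanket(s, r):
    # P(c | s, r) ∝ P(c) P(s | c) P(r | c): Cloudy's blanket is {S, R}.
    num = {c: 0.5 * ((0.1 if c else 0.5) if s else (0.9 if c else 0.5))
                 * ((0.8 if c else 0.2) if r else (0.2 if c else 0.8))
           for c in (True, False)}
    return num[True] / (num[True] + num[False])

def p_rain_given_blanket(c, s, w):
    # P(r | c, s, w) ∝ P(r | c) P(w | s, r): Rain's blanket is {C, W, S}.
    pw = {(True, True): 0.99, (True, False): 0.90,
          (False, True): 0.90, (False, False): 0.0}
    num = {r: ((0.8 if c else 0.2) if r else (0.2 if c else 0.8))
                 * (pw[(s, r)] if w else 1 - pw[(s, r)])
           for r in (True, False)}
    return num[True] / (num[True] + num[False])

random.seed(1)
s, w = True, True                  # evidence: never changed
c, r = bern(0.5), bern(0.5)        # random initial state for the rest
count_rain, N = 0, 50_000
for _ in range(N):
    if bern(0.5):                  # pick a non-evidence node at random
        c = bern(p_cloudy_given_blanket(s, r))
    else:
        r = bern(p_rain_given_blanket(c, s, w))
    count_rain += r                # count the states we visit
print(round(count_rain / N, 2))    # ≈ P(Rain | s, w) ≈ 0.32
```

The fraction of visited states with Rain = True approaches the true posterior, which exact inference on these CPTs puts at about 0.32.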
