# Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Save this PDF as:

Size: px
Start display at page:

Download "Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) ="

## Transcription

1 Artificial Intelligence Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information Bayesian Inference with Boolean variables posterior P (D T ) = likelihood prior P (T D)P (D) P (T D)P (D) + P (T D)P ( D) normalizing constant Inferences combines sources of knowledge P (D T ) = = Inference is sequential P (D T 1, T 2 ) = P (T 2 D)P (T 1 D)P (D) P (T 2 )P (T 1 ) 2

2 Bayesian inference with continuous variables (recap) likelihood (Binomial) p(y!=0.05, n=5) p(θ y, n) = posterior p(θ y, n) likelihood ( ) n θ y (1 θ) n y y p(y!=0.2, n=5) prior p(y θ, n)p(θ n) p(y n) = normalizing constant p(y!=0.35, n=5) p(y θ, n)p(θ n)dθ p(y!=0.5, n=5) y y y y p(! y=1, n=5) posterior (beta) p(! y=0, n=0) prior (uniform) ! ! Today: Inference with more complex dependencies How do we represent (model) more complex probabilistic relationships? How do we use these models to draw inferences? 4

3 Probabilistic reasoning Suppose we go to my house and see that the door is open. - What s the cause? Is it a? Should we go in? Call the police? - Then again, it could just be my. Maybe she came home early. How should we represent these relationships? 5 Belief networks In Belief networks, causal relationships are represented in directed acyclic graphs. Arrows indicate causal relationships between the nodes. 6

4 Types of probabilistic relationships How do we represent these relationships? Direct cause Indirect cause Common cause Common effect A A A A B B B C B C C P(B A) P(B A) P(C B) P(B A) P(C A) P(C A,B) C is independent of A given B Are B and C independent? Are A and B independent? 7 Belief networks In Belief networks, causal relationships are represented in directed acyclic graphs. Arrows indicate causal relationships between the nodes. How can we determine what is happening before we go in? We need more information. What else can we observe? 8

5 Explaining away Suppose we notice that the car is in the garage. Now we infer that it s probably my, and not a. This fact explains away the hypothesis of a. Note that there is no direct causal link between and. Yet, seeing the car changes our beliefs about the. 9 Explaining away Suppose we notice that the car is in the garage. Now we infer that it s probably my, and not a. This fact explains away the hypothesis of a. We could also notice the door was damaged, in which case we reach the opposite conclusion. How do we make this inference process more precise? Let s start by writing down the conditional probabilities. 10

6 Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. 11 Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. Let s start with the. What do we specify? What else do we need to specify? The priors probabilities. W B P(O W,B) F F 0.01 F T 0.25 T F 0.05 T T 0.75 Check: Does this column have to sum to one? No! Only the full joint distribution does. This is a conditional distribution. But note that: P( O W,B) = 1- P(O W,B) 12

7 Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. Let s start with the. What do we specify? What else do we need to specify? The priors probabilities. P(W) 0.05 W B P(O W,B) F F 0.01 F T 0.25 T F 0.05 T T Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. Let s start with the. What do we specify? What else do we need to specify? W B P(O W,B) F F 0.01 F T 0.25 The priors probabilities. P(W) 0.05 T F 0.05 T T 0.75 P(B)

8 Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. Let s start with the. What do we specify? W B P(O W,B) F F 0.01 F T 0.25 Finally, we specify the remaining conditionals P(W) 0.05 T F 0.05 T T 0.75 P(B) W P(C W) F 0.01 T Defining the belief network Each link in the graph represents a conditional relationship between nodes. To compute the inference, we must specify the conditional probabilities. Let s start with the. What do we specify? W B P(O W,B) F F 0.01 F T 0.25 Finally, we specify the remaining conditionals P(W) 0.05 T F 0.05 T T 0.75 P(B) W P(C W) F 0.01 T 0.95 B P(D B) F T Now what?

9 Calculating probabilities using the joint distribution What the probability that the door is open, it is my and not a, we see the car in the garage, and the door is not damaged? Mathematically, we want to compute the expression: P(o,w, b,c, d) =? We can just repeatedly apply the rule relating joint and conditional probabilities. - P(x,y) = P(x y) P(y) 17 Calculating probabilities using the joint distribution The probability that the door is open, it is my and not a, we see the car in the garage, and the door is not damaged. P(o,w, b,c, d) = P(o w, b,c, d)p(w, b,c, d) = P(o w, b)p(w, b,c, d) = P(o w, b)p(c w, b, d)p(w, b, d) = P(o w, b)p(c w)p(w, b, d) = P(o w, b)p(c w)p( d w, b)p(w, b) = P(o w, b)p(c w)p( d b)p(w, b) = P(o w, b)p(c w)p( d b)p(w)p( b) 18

10 Calculating probabilities using the joint distribution P(o,w, b,c, d) = P(o w, b)p(c w)p( d b)p(w)p( b) = 0.05! 0.95! 0.999! 0.05! = This is essentially the probability that my is home and leaves the door open. P(W) 0.05 W B P(O W,B) F F 0.01 F T 0.25 T F 0.05 T T 0.75 P(B) W P(C W) F 0.01 T 0.95 B P(D B) F T Calculating probabilities in a general Bayesian belief network A C E Note that by specifying all the conditional probabilities, we have also specified the joint probability. For the directed graph above: P(A,B,C,D,E) = P(A) P(B C) P(C A) P(D C,E) P(E A,C) The general expression is: B D P (x 1,..., x n ) P (X 1 = x 1... X n = x n ) n = P (x i parents(x i )) i=1 " With this we can calculate (in principle) the probability of any joint probability. " This implies that we can also calculate any conditional probability. 20

11 Calculating conditional probabilities Using the joint we can compute any conditional probability too The conditional probability of any one subset of variables given another disjoint subset is P (S 1 S 2 ) = P (S 1 S 2 ) P (S 2 ) = p S1 S 2 p S2 where p S is shorthand for all the entries of the joint matching subset S. How many terms are in this sum? 2 N The number of terms in the sums is exponential in the number of variables. In fact, general querying Bayes nets is NP complete. 21 So what do we do? There are also many approximations: - stochastic (MCMC) approximations - approximations The are special cases of Bayes nets for which there are fast, exact algorithms: - variable elimination - belief propagation 22

12 Belief networks with multiple causes In the models above, we specified the joint conditional density by hand. This specified the probability of a variable given each possible value of the causal nodes. Can this be specified in a more generic way? Can we avoid having to specify every entry in the joint conditional pdf? For this we need to specify: P(X parents(x)) One classic example of this function is the Noisy-OR model. W B P(O W,B) F F 0.01 F T 0.25 T F 0.05 T T Defining causal relationships using Noisy-OR We assume each cause C j can produce effect Ei with probability fij. The noisy-or model assumes the parent causes of effect E i contribute independently. The probability that none of them caused effect E i is simply the product of the probabilities that each one did not cause Ei. The probability that any of them caused Ei is just one minus the above, i.e. P (E i par(e i )) = P (E i C 1,..., C n ) = 1 i (1 P (E i C j )) touch contaminated object (O) hit by viral droplet (D) f CD catch cold (C) eat contaminated food (F) f CO f CF = 1 i (1 f ij ) P (C D, O, F ) = 1 (1 f CD )(1 f CO )(1 f CF ) 24

13 A general one-layer causal network Could either model causes and effects Or equivalently stochastic binary features. Each input xi encodes the probability that the ith binary input feature is present. The set of features represented by #j is defined by weights fij which encode the probability that feature i is an instance of #j. The data: a set of stochastic binary patterns Each column is a distinct eight-dimensional binary feature. There are five underlying causal feature patterns. What are they?

14 The data: a set of stochastic binary patterns a Each column is a distinct eight-dimensional binary feature. b c true hidden causes of the data This is a learning problem, which we ll cover in later lecture. Hierarchical Statistical Models A Bayesian belief network: The joint probability of binary states is P (S W) = i P (S i pa(s i ), W) The probability S i depends only on its parents: pa(s i ) S i D P (S i pa(s i ), W) = { h( j S jw ji ) if S i = 1 1 h( j S jw ji ) if S i = 0 The function h specifies how causes are combined, h(u) = 1 exp( u), u > 0. Main points: hierarchical structure allows model to form high order representations upper states are priors for lower states weights encode higher order features 28

15 Next time Inference methods in Bayes Nets Example with continuous variables More general types of probabilistic networks 29

### Lecture 2: Introduction to belief (Bayesian) networks

Lecture 2: Introduction to belief (Bayesian) networks Conditional independence What is a belief network? Independence maps (I-maps) January 7, 2008 1 COMP-526 Lecture 2 Recall from last time: Conditional

### Logic, Probability and Learning

Logic, Probability and Learning Luc De Raedt luc.deraedt@cs.kuleuven.be Overview Logic Learning Probabilistic Learning Probabilistic Logic Learning Closely following : Russell and Norvig, AI: a modern

### CS 188: Artificial Intelligence. Probability recap

CS 188: Artificial Intelligence Bayes Nets Representation and Independence Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Conditional probability

### Machine Learning

Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 23, 2015 Today: Graphical models Bayes Nets: Representing distributions Conditional independencies

### Bayesian Networks. Read R&N Ch. 14.1-14.2. Next lecture: Read R&N 18.1-18.4

Bayesian Networks Read R&N Ch. 14.1-14.2 Next lecture: Read R&N 18.1-18.4 You will be expected to know Basic concepts and vocabulary of Bayesian networks. Nodes represent random variables. Directed arcs

### Bayesian Networks Tutorial I Basics of Probability Theory and Bayesian Networks

Bayesian Networks 2015 2016 Tutorial I Basics of Probability Theory and Bayesian Networks Peter Lucas LIACS, Leiden University Elementary probability theory We adopt the following notations with respect

### 13.3 Inference Using Full Joint Distribution

191 The probability distribution on a single variable must sum to 1 It is also true that any joint probability distribution on any set of variables must sum to 1 Recall that any proposition a is equivalent

### A crash course in probability and Naïve Bayes classification

Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s

### COS513 LECTURE 5 JUNCTION TREE ALGORITHM

COS513 LECTURE 5 JUNCTION TREE ALGORITHM D. EIS AND V. KOSTINA The junction tree algorithm is the culmination of the way graph theory and probability combine to form graphical models. After we discuss

### Lecture 10: Probability Propagation in Join Trees

Lecture 10: Probability Propagation in Join Trees In the last lecture we looked at how the variables of a Bayesian network could be grouped together in cliques of dependency, so that probability propagation

### Bayesian Networks. Mausam (Slides by UW-AI faculty)

Bayesian Networks Mausam (Slides by UW-AI faculty) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation & inference BNs provide

### The intuitions in examples from last time give us D-separation and inference in belief networks

CSC384: Lecture 11 Variable Elimination Last time The intuitions in examples from last time give us D-separation and inference in belief networks a simple inference algorithm for networks without Today

### Bayesian networks. Data Mining. Abraham Otero. Data Mining. Agenda. Bayes' theorem Naive Bayes Bayesian networks Bayesian networks (example)

Bayesian networks 1/40 Agenda Introduction Bayes' theorem Bayesian networks Quick reference 2/40 1 Introduction Probabilistic methods: They are backed by a strong mathematical formalism. They can capture

### Bayesian network inference

Bayesian network inference Given: Query variables: X Evidence observed variables: E = e Unobserved variables: Y Goal: calculate some useful information about the query variables osterior Xe MA estimate

### Variable Elimination 1

Variable Elimination 1 Inference Exact inference Enumeration Variable elimination Junction trees and belief propagation Approximate inference Loopy belief propagation Sampling methods: likelihood weighting,

### Machine Learning

Machine Learning 10-701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 8, 2011 Today: Graphical models Bayes Nets: Representing distributions Conditional independencies

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Probabilistic Graphical Models Final Exam

Introduction to Probabilistic Graphical Models Final Exam, March 2007 1 Probabilistic Graphical Models Final Exam Please fill your name and I.D.: Name:... I.D.:... Duration: 3 hours. Guidelines: 1. The

### Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1

Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

### Intelligent Systems: Reasoning and Recognition. Uncertainty and Plausible Reasoning

Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2015/2016 Lesson 12 25 march 2015 Uncertainty and Plausible Reasoning MYCIN (continued)...2 Backward

### Probability, Conditional Independence

Probability, Conditional Independence June 19, 2012 Probability, Conditional Independence Probability Sample space Ω of events Each event ω Ω has an associated measure Probability of the event P(ω) Axioms

### Bayesian Networks Chapter 14. Mausam (Slides by UW-AI faculty & David Page)

Bayesian Networks Chapter 14 Mausam (Slides by UW-AI faculty & David Page) Bayes Nets In general, joint distribution P over set of variables (X 1 x... x X n ) requires exponential space for representation

### Lecture 2: The Complexity of Some Problems

IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Basic Course on Computational Complexity Lecture 2: The Complexity of Some Problems David Mix Barrington and Alexis Maciel July 18, 2000

### Bayesian Statistics: Indian Buffet Process

Bayesian Statistics: Indian Buffet Process Ilker Yildirim Department of Brain and Cognitive Sciences University of Rochester Rochester, NY 14627 August 2012 Reference: Most of the material in this note

### Introduction to Discrete Probability Theory and Bayesian Networks

Introduction to Discrete Probability Theory and Bayesian Networks Dr Michael Ashcroft September 15, 2011 This document remains the property of Inatas. Reproduction in whole or in part without the written

### Graphical Models. CS 343: Artificial Intelligence Bayesian Networks. Raymond J. Mooney. Conditional Probability Tables.

Graphical Models CS 343: Artificial Intelligence Bayesian Networks Raymond J. Mooney University of Texas at Austin 1 If no assumption of independence is made, then an exponential number of parameters must

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### Compression algorithm for Bayesian network modeling of binary systems

Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the

### Artificial Intelligence

Artificial Intelligence ICS461 Fall 2010 Nancy E. Reed nreed@hawaii.edu 1 Lecture #13 Uncertainty Outline Uncertainty Probability Syntax and Semantics Inference Independence and Bayes' Rule Uncertainty

### Introduction to Machine Learning

Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 5: Decision Theory & ROC Curves Gaussian ML Estimation Many figures courtesy Kevin Murphy s textbook,

### 3. The Junction Tree Algorithms

A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

### MS&E 226: Small Data

MS&E 226: Small Data Lecture 16: Bayesian inference (v3) Ramesh Johari ramesh.johari@stanford.edu 1 / 35 Priors 2 / 35 Frequentist vs. Bayesian inference Frequentists treat the parameters as fixed (deterministic).

### DAGs, I-Maps, Factorization, d-separation, Minimal I-Maps, Bayesian Networks. Slides by Nir Friedman

DAGs, I-Maps, Factorization, d-separation, Minimal I-Maps, Bayesian Networks Slides by Nir Friedman. Probability Distributions Let X 1,,X n be random variables Let P be a joint distribution over X 1,,X

10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

### 1 Maximum likelihood estimation

COS 424: Interacting with Data Lecturer: David Blei Lecture #4 Scribes: Wei Ho, Michael Ye February 14, 2008 1 Maximum likelihood estimation 1.1 MLE of a Bernoulli random variable (coin flips) Given N

### Neural Networks for Machine Learning. Lecture 13a The ups and downs of backpropagation

Neural Networks for Machine Learning Lecture 13a The ups and downs of backpropagation Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed A brief history of backpropagation

### Lecture 9: Bayesian hypothesis testing

Lecture 9: Bayesian hypothesis testing 5 November 27 In this lecture we ll learn about Bayesian hypothesis testing. 1 Introduction to Bayesian hypothesis testing Before we go into the details of Bayesian

### Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data

Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data (Oxford) in collaboration with: Minjie Xu, Jun Zhu, Bo Zhang (Tsinghua) Balaji Lakshminarayanan (Gatsby) Bayesian

### CS 331: Artificial Intelligence Fundamentals of Probability II. Joint Probability Distribution. Marginalization. Marginalization

Full Joint Probability Distributions CS 331: Artificial Intelligence Fundamentals of Probability II Toothache Cavity Catch Toothache, Cavity, Catch false false false 0.576 false false true 0.144 false

### BIN504 - Lecture XI. Bayesian Inference & Markov Chains

BIN504 - Lecture XI Bayesian Inference & References: Lee, Chp. 11 & Sec. 10.11 Ewens-Grant, Chps. 4, 7, 11 c 2012 Aybar C. Acar Including slides by Tolga Can Rev. 1.2 (Build 20130528191100) BIN 504 - Bayes

### Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2015 CS 551, Fall 2015

### bayesian inference: principles and practice

bayesian inference: and http://jakehofman.com july 9, 2009 bayesian inference: and motivation background bayes theorem bayesian probability bayesian inference would like models that: provide predictive

### Relational Dynamic Bayesian Networks: a report. Cristina Manfredotti

Relational Dynamic Bayesian Networks: a report Cristina Manfredotti Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.) Università degli Studi Milano-Bicocca manfredotti@disco.unimib.it

### Probabilistic Graphical Models

Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences

### Chapter Prove or disprove: A (B C) = (A B) (A C). Ans: True, since

Chapter 2 1. Prove or disprove: A (B C) = (A B) (A C)., since A ( B C) = A B C = A ( B C) = ( A B) ( A C) = ( A B) ( A C). 2. Prove that A B= A B by giving a containment proof (that is, prove that the

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### Outline. Uncertainty. Uncertainty. Making decisions under uncertainty. Probability. Methods for handling uncertainty

Outline Uncertainty Chapter 13 Uncertainty Probability Syntax and Semantics Inference Independence and Bayes' Rule Uncertainty Let action A t = leave for airport t minutes before flight Will A t get me

### Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization

Introduction to Mobile Robotics Bayes Filter Particle Filter and Monte Carlo Localization Wolfram Burgard, Maren Bennewitz, Diego Tipaldi, Luciano Spinello 1 Motivation Recall: Discrete filter Discretize

### Querying Joint Probability Distributions

Querying Joint Probability Distributions Sargur Srihari srihari@cedar.buffalo.edu 1 Queries of Interest Probabilistic Graphical Models (BNs and MNs) represent joint probability distributions over multiple

### Bayesian Learning. CSL465/603 - Fall 2016 Narayanan C Krishnan

Bayesian Learning CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Outline Bayes Theorem MAP Learners Bayes optimal classifier Naïve Bayes classifier Example text classification Bayesian networks

### GeNIeRate: An Interactive Generator of Diagnostic Bayesian Network Models

GeNIeRate: An Interactive Generator of Diagnostic Bayesian Network Models Pieter C. Kraaijeveld Man Machine Interaction Group Delft University of Technology Mekelweg 4, 2628 CD Delft, the Netherlands p.c.kraaijeveld@ewi.tudelft.nl

### Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

### Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

### Chapter 14 Managing Operational Risks with Bayesian Networks

Chapter 14 Managing Operational Risks with Bayesian Networks Carol Alexander This chapter introduces Bayesian belief and decision networks as quantitative management tools for operational risks. Bayesian

### Lecture 6: The Bayesian Approach

Lecture 6: The Bayesian Approach What Did We Do Up to Now? We are given a model Log-linear model, Markov network, Bayesian network, etc. This model induces a distribution P(X) Learning: estimate a set

### Basics of Probability

Basics of Probability 1 Sample spaces, events and probabilities Begin with a set Ω the sample space e.g., 6 possible rolls of a die. ω Ω is a sample point/possible world/atomic event A probability space

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### Model-based Synthesis. Tony O Hagan

Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Lecture 3: Summations and Analyzing Programs with Loops

high school algebra If c is a constant (does not depend on the summation index i) then ca i = c a i and (a i + b i )= a i + b i There are some particularly important summations, which you should probably

### Bayesian Tutorial (Sheet Updated 20 March)

Bayesian Tutorial (Sheet Updated 20 March) Practice Questions (for discussing in Class) Week starting 21 March 2016 1. What is the probability that the total of two dice will be greater than 8, given that

### Bayesian Analysis for the Social Sciences

Bayesian Analysis for the Social Sciences Simon Jackman Stanford University http://jackman.stanford.edu/bass November 9, 2012 Simon Jackman (Stanford) Bayesian Analysis for the Social Sciences November

### Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify

### A Bayesian Approach for on-line max auditing of Dynamic Statistical Databases

A Bayesian Approach for on-line max auditing of Dynamic Statistical Databases Gerardo Canfora Bice Cavallo University of Sannio, Benevento, Italy, {gerardo.canfora,bice.cavallo}@unisannio.it ABSTRACT In

### CMPSCI 683 Artificial Intelligence Questions & Answers

CMPSCI 683 Artificial Intelligence Questions & s 1. General Learning Consider the following modification to the restaurant example described in class, which includes missing and partially specified attributes:

### Tracking Groups of Pedestrians in Video Sequences

Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal

### Lecture 7: NP-Complete Problems

IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Basic Course on Computational Complexity Lecture 7: NP-Complete Problems David Mix Barrington and Alexis Maciel July 25, 2000 1. Circuit

### Tutorial on variational approximation methods. Tommi S. Jaakkola MIT AI Lab

Tutorial on variational approximation methods Tommi S. Jaakkola MIT AI Lab tommi@ai.mit.edu Tutorial topics A bit of history Examples of variational methods A brief intro to graphical models Variational

### Artificial Intelligence. Conditional probability. Inference by enumeration. Independence. Lesson 11 (From Russell & Norvig)

Artificial Intelligence Conditional probability Conditional or posterior probabilities e.g., cavity toothache) = 0.8 i.e., given that toothache is all I know tation for conditional distributions: Cavity

### Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi

Neural Networks CAP5610 Machine Learning Instructor: Guo-Jun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector

### The Certainty-Factor Model

The Certainty-Factor Model David Heckerman Departments of Computer Science and Pathology University of Southern California HMR 204, 2025 Zonal Ave Los Angeles, CA 90033 dheck@sumex-aim.stanford.edu 1 Introduction

### Decision Trees and Networks

Lecture 21: Uncertainty 6 Today s Lecture Victor R. Lesser CMPSCI 683 Fall 2010 Decision Trees and Networks Decision Trees A decision tree is an explicit representation of all the possible scenarios from

### Finite Markov Chains and Algorithmic Applications. Matematisk statistik, Chalmers tekniska högskola och Göteborgs universitet

Finite Markov Chains and Algorithmic Applications Olle Häggström Matematisk statistik, Chalmers tekniska högskola och Göteborgs universitet PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE

### Lecture Notes on Spanning Trees

Lecture Notes on Spanning Trees 15-122: Principles of Imperative Computation Frank Pfenning Lecture 26 April 26, 2011 1 Introduction In this lecture we introduce graphs. Graphs provide a uniform model

### PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

### 1 A simple example. A short introduction to Bayesian statistics, part I Math 218, Mathematical Statistics D Joyce, Spring 2016

and P (B X). In order to do find those conditional probabilities, we ll use Bayes formula. We can easily compute the reverse probabilities A short introduction to Bayesian statistics, part I Math 18, Mathematical

### Artificial Intelligence II

Artificial Intelligence II Some supplementary notes on probability Sean B Holden, 2010-14 1 Introduction These notes provide a reminder of some simple manipulations that turn up a great deal when dealing

### The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters

DEFINING PROILISTI MODELS The Joint Probability Distribution (JPD) of a set of n binary variables involve a huge number of parameters 2 n (larger than 10 25 for only 100 variables). x y z p(x, y, z) 0

### Incorporating Evidence in Bayesian networks with the Select Operator

Incorporating Evidence in Bayesian networks with the Select Operator C.J. Butz and F. Fang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada SAS 0A2 {butz, fang11fa}@cs.uregina.ca

### Introduction to Bayesian tracking

Introduction to Bayesian tracking Doz. G. Bleser Prof. Stricker Computer Vision: Object and People Tracking Reminder: introductory eample Goal: Estimate posterior belief or state distribution based on

### Sufficient Statistics and Exponential Family. 1 Statistics and Sufficient Statistics. Math 541: Statistical Theory II. Lecturer: Songfeng Zheng

Math 541: Statistical Theory II Lecturer: Songfeng Zheng Sufficient Statistics and Exponential Family 1 Statistics and Sufficient Statistics Suppose we have a random sample X 1,, X n taken from a distribution

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Bayesian models of cognition

Bayesian models of cognition Thomas L. Griffiths Department of Psychology University of California, Berkeley Charles Kemp and Joshua B. Tenenbaum Department of Brain and Cognitive Sciences Massachusetts

### Shannon and Huffman-Type Coders

Shannon and Huffman-Type Coders A useful class of coders that satisfy the Kraft's inequality in an efficient manner are called Huffman-type coders. To understand the philosophy of obtaining these codes,

### CSPs: Arc Consistency

CSPs: Arc Consistency CPSC 322 CSPs 3 Textbook 4.5 CSPs: Arc Consistency CPSC 322 CSPs 3, Slide 1 Lecture Overview 1 Recap 2 Consistency 3 Arc Consistency CSPs: Arc Consistency CPSC 322 CSPs 3, Slide 2

### Learning from Data: Naive Bayes

Semester 1 http://www.anc.ed.ac.uk/ amos/lfd/ Naive Bayes Typical example: Bayesian Spam Filter. Naive means naive. Bayesian methods can be much more sophisticated. Basic assumption: conditional independence.

### Probability and statistics; Rehearsal for pattern recognition

Probability and statistics; Rehearsal for pattern recognition Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception

### Big Data, Machine Learning, Causal Models

Big Data, Machine Learning, Causal Models Sargur N. Srihari University at Buffalo, The State University of New York USA Int. Conf. on Signal and Image Processing, Bangalore January 2014 1 Plan of Discussion

### NOTES ON ELEMENTARY PROBABILITY

NOTES ON ELEMENTARY PROBABILITY KARL PETERSEN 1. Probability spaces Probability theory is an attempt to work mathematically with the relative uncertainties of random events. In order to get started, we

### Life of A Knowledge Base (KB)

Life of A Knowledge Base (KB) A knowledge base system is a special kind of database management system to for knowledge base management. KB extraction: knowledge extraction using statistical models in NLP/ML

### Lecture 9: Bayesian Learning

Lecture 9: Bayesian Learning Cognitive Systems II - Machine Learning SS 2005 Part II: Special Aspects of Concept Learning Bayes Theorem, MAL / ML hypotheses, Brute-force MAP LEARNING, MDL principle, Bayes

### CS 771 Artificial Intelligence. Reasoning under uncertainty

CS 771 Artificial Intelligence Reasoning under uncertainty Today Representing uncertainty is useful in knowledge bases Probability provides a coherent framework for uncertainty Review basic concepts in

### Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23

Probability Theory Uncertainty is key concept in machine learning. Probability provides consistent framework for the quantification and manipulation of uncertainty. Probability of an event is the fraction

### Part III: Machine Learning. CS 188: Artificial Intelligence. Machine Learning This Set of Slides. Parameter Estimation. Estimation: Smoothing

CS 188: Artificial Intelligence Lecture 20: Dynamic Bayes Nets, Naïve Bayes Pieter Abbeel UC Berkeley Slides adapted from Dan Klein. Part III: Machine Learning Up until now: how to reason in a model and

### Experimental Uncertainty and Probability

02/04/07 PHY310: Statistical Data Analysis 1 PHY310: Lecture 03 Experimental Uncertainty and Probability Road Map The meaning of experimental uncertainty The fundamental concepts of probability 02/04/07

### Lecture Note 1 Set and Probability Theory. MIT 14.30 Spring 2006 Herman Bennett

Lecture Note 1 Set and Probability Theory MIT 14.30 Spring 2006 Herman Bennett 1 Set Theory 1.1 Definitions and Theorems 1. Experiment: any action or process whose outcome is subject to uncertainty. 2.