# Lecture 2: Introduction to belief (Bayesian) networks


Outline:

- Conditional independence
- What is a belief network?
- Independence maps (I-maps)

## Recall from last time: Conditional probabilities

Our probabilistic models will compute and manipulate conditional probabilities. Given two random variables $X, Y$, we denote by $p(X = x \mid Y = y)$ the probability of $X$ taking value $x$ given that we know that $Y$ is certain to have value $y$.

This fits the situation when we observe something and want to make an inference about something related but unobserved:

- $p(\text{cancer recurs} \mid \text{tumor measurements})$
- $p(\text{gene expressed} > 1.3 \mid \text{transcription factor concentrations})$
- $p(\text{collision with obstacle} \mid \text{sensor readings})$
- $p(\text{word uttered} \mid \text{sound wave})$

## Recall from last time: Bayes rule

Bayes rule is very simple but very important for relating conditional probabilities:

$$p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}$$

Bayes rule is a useful tool for inferring the posterior probability of a hypothesis based on evidence and a prior belief in the probability of different hypotheses.

## Using Bayes rule for inference

Often we want to form a hypothesis about the world based on observable variables. Bayes rule is fundamental when viewed as stating the belief given to a hypothesis $H$ in light of evidence $e$:

$$p(H \mid e) = \frac{p(e \mid H)\, p(H)}{p(e)}$$

- $p(H \mid e)$ is sometimes called the posterior probability
- $p(H)$ is called the prior probability
- $p(e \mid H)$ is called the likelihood of the evidence (data)
- $p(e)$ is just a normalizing constant, which can be computed from the requirement that $\sum_h p(H = h \mid e) = 1$:

$$p(e) = \sum_h p(e \mid h)\, p(h)$$

Sometimes we simply write $p(H \mid e) \propto p(e \mid H)\, p(H)$.
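To make the normalization step concrete, here is a minimal Python sketch of Bayes-rule inference over a small hypothesis space. All numbers (hypotheses, priors, likelihoods) are invented for illustration:

```python
# A minimal sketch of Bayes-rule inference over a discrete hypothesis space.
# Priors and likelihoods are invented for illustration.
prior = {'flu': 0.10, 'cold': 0.30, 'healthy': 0.60}       # p(H)
likelihood = {'flu': 0.90, 'cold': 0.60, 'healthy': 0.05}  # p(e | H), e.g. e = "fever"

# p(e) is just the normalizing constant: sum_h p(e | h) p(h)
p_e = sum(likelihood[h] * prior[h] for h in prior)

posterior = {h: likelihood[h] * prior[h] / p_e for h in prior}
print(posterior)  # {'flu': 0.3, 'cold': 0.6, 'healthy': 0.1} -- sums to 1
```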

## Example: Medical diagnosis

A doctor knows that pneumonia causes a fever 95% of the time. She knows that if a person is selected randomly from the population, there is a $10^{-7}$ chance of the person having pneumonia. 1 in 100 people suffer from fever.

You go to the doctor complaining about the symptom of having a fever (evidence). What is the probability that pneumonia is the cause of this symptom (hypothesis)?

$$p(\text{pneumonia} \mid \text{fever}) = \frac{p(\text{fever} \mid \text{pneumonia})\, p(\text{pneumonia})}{p(\text{fever})} = \frac{0.95 \times 10^{-7}}{0.01} = 9.5 \times 10^{-6}$$

## Computing conditional probabilities

Typically, we are interested in the posterior joint distribution of some query variables $Y$ given specific values $e$ for some evidence variables $E$. Let the hidden variables be $Z = X \setminus (Y \cup E)$.

If we have a joint probability distribution, we can compute the answer by using the definition of conditional probability and marginalizing out the hidden variables:

$$p(y \mid e) = \frac{p(y, e)}{p(e)}, \qquad p(y, e) = \sum_z p(y, e, z)$$

This yields the same big problem as before: the joint distribution is too big to handle.
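The diagnosis arithmetic is a one-liner to check; a sketch (reading the garbled prior on the slide as $10^{-7}$):

```python
# Bayes rule for the pneumonia example; the prior is read as 10^-7
# from the (partially garbled) slide text.
p_fever_given_pneumonia = 0.95  # likelihood
p_pneumonia = 1e-7              # prior
p_fever = 0.01                  # 1 in 100 people have a fever

print(p_fever_given_pneumonia * p_pneumonia / p_fever)  # 9.5e-06
```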

## Independence of random variables revisited

We said that two r.v.'s $X$ and $Y$ are independent, denoted $X \perp Y$, if $p(x, y) = p(x)\, p(y)$. But we also know that $p(x, y) = p(x \mid y)\, p(y)$. Hence, two r.v.'s are independent if and only if:

$$p(x \mid y) = p(x) \quad \text{(and vice versa)}, \quad \forall x \in \Omega_X,\ y \in \Omega_Y$$

This means that knowledge about $Y$ does not change the uncertainty about $X$, and vice versa.

Is there a similar, but less stringent, requirement?

## Conditional independence

Two random variables $X$ and $Y$ are conditionally independent given $Z$ if:

$$p(x \mid y, z) = p(x \mid z), \quad \forall x, y, z$$

This means that knowing the value of $Y$ does not change the prediction about $X$ if the value of $Z$ is known. We denote this by $X \perp Y \mid Z$.
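The definition can be checked numerically from a full joint table. Below is a small sketch in which the joint over binary $X, Y, Z$ is built (with invented numbers) as $p(z)\,p(x \mid z)\,p(y \mid z)$, so $X \perp Y \mid Z$ holds by construction:

```python
from itertools import product

# Sketch: verify X ⊥ Y | Z numerically. The joint is constructed (invented
# numbers) as p(z) p(x|z) p(y|z), so the property holds by design.
pz = {0: 0.7, 1: 0.3}
px_z = {0: 0.2, 1: 0.7}   # p(X = 1 | z)
py_z = {0: 0.5, 1: 0.9}   # p(Y = 1 | z)

def bern(p1, v):
    """p(V = v) for a binary variable with p(V = 1) = p1."""
    return p1 if v == 1 else 1.0 - p1

joint = {(x, y, z): pz[z] * bern(px_z[z], x) * bern(py_z[z], y)
         for x, y, z in product([0, 1], repeat=3)}

def cond(x, given):
    """p(X = x | given), where given fixes 'y' and/or 'z'."""
    num = sum(p for (xx, yy, zz), p in joint.items()
              if xx == x and all({'y': yy, 'z': zz}[k] == v for k, v in given.items()))
    den = sum(p for (xx, yy, zz), p in joint.items()
              if all({'y': yy, 'z': zz}[k] == v for k, v in given.items()))
    return num / den

# p(x | y, z) == p(x | z) for every combination of values
for x, y, z in product([0, 1], repeat=3):
    assert abs(cond(x, {'y': y, 'z': z}) - cond(x, {'z': z})) < 1e-12
print("X ⊥ Y | Z holds for every x, y, z")
```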

## Example

Consider the medical diagnosis problem with three random variables: $P$ (patient has pneumonia), $F$ (patient has a fever), $C$ (patient has a cough).

The full joint distribution has $2^3 - 1 = 7$ independent entries.

If someone has pneumonia, we can assume that the probability of a cough does not depend on whether they have a fever:

$$p(C = 1 \mid P = 1, F) = p(C = 1 \mid P = 1) \tag{1}$$

The same equality holds if the patient does not have pneumonia:

$$p(C = 1 \mid P = 0, F) = p(C = 1 \mid P = 0) \tag{2}$$

Hence, $C$ and $F$ are conditionally independent given $P$.

## Example (continued)

The joint distribution can now be written as:

$$p(C, P, F) = p(C \mid P, F)\, p(F \mid P)\, p(P) = p(C \mid P)\, p(F \mid P)\, p(P)$$

Hence, the joint can be described using $2 + 2 + 1 = 5$ numbers instead of 7. Much greater savings happen with more variables.
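Concretely, the factored form lets us rebuild the full 8-entry joint from just 5 numbers; a sketch with invented values:

```python
# Sketch: the full joint over (P, F, C) from 5 numbers (invented values),
# using p(C, P, F) = p(C|P) p(F|P) p(P).
p_p = 0.001                     # p(P = 1)
p_f_given = {0: 0.01, 1: 0.95}  # p(F = 1 | P)
p_c_given = {0: 0.05, 1: 0.80}  # p(C = 1 | P)

joint = {}
for p in (0, 1):
    for f in (0, 1):
        for c in (0, 1):
            joint[(p, f, c)] = ((p_p if p else 1 - p_p)
                                * (p_f_given[p] if f else 1 - p_f_given[p])
                                * (p_c_given[p] if c else 1 - p_c_given[p]))

print(sum(joint.values()))  # 1.0 -- a full joint from 5 parameters instead of 7
```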

## Naive Bayesian model

A common assumption in early diagnosis is that the symptoms are independent of each other given the disease. Let $s_1, \ldots, s_n$ be the symptoms exhibited by a patient (e.g. fever, headache, etc.) and let $D$ be the patient's disease. Then by using the naive Bayes assumption, we get:

$$p(D, s_1, \ldots, s_n) = p(D)\, p(s_1 \mid D) \cdots p(s_n \mid D)$$

The conditional probability of the disease given the symptoms is:

$$p(D \mid s_1, \ldots, s_n) = \frac{p(D, s_1, \ldots, s_n)}{p(s_1, \ldots, s_n)} \propto p(D)\, p(s_1 \mid D) \cdots p(s_n \mid D)$$

because the denominator is just a normalization constant.
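A minimal sketch of this computation, with an invented two-disease, two-symptom model (here `flu`/`cold` and the symptom probabilities are illustrative assumptions, not slide values):

```python
import math

# Sketch of naive Bayes diagnosis: p(D | s_1..s_n) ∝ p(D) ∏_i p(s_i | D).
# All numbers are invented for illustration.
prior = {'flu': 0.2, 'cold': 0.8}                    # p(D)
p_sym = {'flu':  {'fever': 0.90, 'headache': 0.60},
         'cold': {'fever': 0.20, 'headache': 0.30}}  # p(s_i = 1 | D)
observed = ['fever', 'headache']

unnorm = {d: prior[d] * math.prod(p_sym[d][s] for s in observed) for d in prior}
z = sum(unnorm.values())                             # normalization constant
posterior = {d: unnorm[d] / z for d in unnorm}
print(posterior)  # flu ≈ 0.69, cold ≈ 0.31
```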

## Recursive Bayesian updating

The naive Bayes assumption also allows for a very nice, incremental updating of beliefs as more evidence is gathered. Suppose that after knowing symptoms $s_1, \ldots, s_n$, the probability of $D$ is:

$$p(D, s_1, \ldots, s_n) = p(D) \prod_{i=1}^{n} p(s_i \mid D)$$

What happens if a new symptom $s_{n+1}$ appears?

$$p(D, s_1, \ldots, s_n, s_{n+1}) = p(D) \prod_{i=1}^{n+1} p(s_i \mid D) = p(D, s_1, \ldots, s_n)\, p(s_{n+1} \mid D)$$

An even nicer formula can be obtained by taking logs:

$$\log p(D, s_1, \ldots, s_n, s_{n+1}) = \log p(D, s_1, \ldots, s_n) + \log p(s_{n+1} \mid D)$$

## A graphical representation of the naive Bayesian model

*(Figure: a node Diagnosis with an arc to each of the nodes Sympt 1, Sympt 2, ..., Sympt n.)*

- The nodes represent random variables
- The arcs represent influences
- The lack of arcs represents conditional independence relationships
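In code, the recursive update is one addition per disease in log space; a sketch reusing the same invented numbers as above:

```python
import math

# Sketch of recursive updating: keep log p(D, s_1..s_n) per disease and add
# log p(s_{n+1} | D) when a new symptom arrives. Numbers are invented.
log_score = {'flu': math.log(0.2), 'cold': math.log(0.8)}  # log p(D)
p_sym = {'flu':  {'fever': 0.90, 'headache': 0.60},
         'cold': {'fever': 0.20, 'headache': 0.30}}        # p(s | D)

def observe(symptom):
    """Incremental update: log p(D, s_1..s_{n+1}) = log p(D, s_1..s_n) + log p(s_{n+1} | D)."""
    for d in log_score:
        log_score[d] += math.log(p_sym[d][symptom])

observe('fever')      # update after the first symptom...
observe('headache')   # ...and again, incrementally, after the second

z = sum(math.exp(v) for v in log_score.values())
print({d: math.exp(v) / z for d, v in log_score.items()})  # same posterior as the batch computation
```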

## More generally: Bayesian networks

*(Figure: the earthquake network. Nodes E (earthquake), B (burglary), A (alarm), R (radio report), C (call), with arcs E → A, B → A, E → R, A → C; each node carries a conditional probability table — p(E), p(B), p(A | B, E), p(R | E), p(C | A). The numeric CPT entries are not preserved in this transcription.)*

Bayesian networks are a graphical representation of conditional independence relations, using graphs.

## A graphical representation for probabilistic models

Suppose the world is described by a set of r.v.'s $X_1, \ldots, X_n$.

- Define a directed acyclic graph such that each node $i$ corresponds to an r.v. $X_i$. Since this is a one-to-one mapping, we will use $X_i$ to denote both the node in the graph and the corresponding r.v.
- Let $X_{\pi_i}$ be the set of parents of node $X_i$ in the graph.
- We associate with each node the conditional probability distribution of the r.v. $X_i$ given its parents: $p(X_i \mid X_{\pi_i})$.

## Example: A Bayesian (belief) network

*(Figure: the same earthquake network, with a CPD attached to each node: p(E), p(B), p(R | E), p(A | B, E), p(C | A). The numeric entries are not preserved in this transcription.)*

- The nodes represent random variables
- The arcs represent influences
- At each node, we have a conditional probability distribution (CPD) for the corresponding variable given its parents

## Factorization

Let $G$ be a DAG over variables $X_1, \ldots, X_n$. We say that a joint probability distribution $p$ **factorizes** according to $G$ if $p$ can be expressed as a product:

$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_{\pi_i})$$

The individual factors $p(x_i \mid x_{\pi_i})$ are called local probabilistic models or conditional probability distributions (CPDs).
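To make the factorization concrete, here is a minimal Python sketch of the earthquake network above. Since the slide's CPT values were not preserved, all numbers below are invented for illustration; the reasoning examples on the next few slides reuse `joint_prob` and `query` from this sketch:

```python
import itertools

# A minimal sketch of the earthquake/burglary network. The CPT numbers are
# invented for illustration -- the slide's actual values were not preserved.
# Each variable is binary; each CPT maps parent values to p(X = 1 | parents).
parents = {
    'B': (),          # burglary
    'E': (),          # earthquake
    'A': ('B', 'E'),  # alarm
    'R': ('E',),      # radio report
    'C': ('A',),      # neighbor calls
}
cpt = {
    'B': {(): 0.01},
    'E': {(): 0.02},
    'A': {(0, 0): 0.01, (0, 1): 0.30, (1, 0): 0.94, (1, 1): 0.95},
    'R': {(0,): 0.001, (1,): 0.90},
    'C': {(0,): 0.05, (1,): 0.70},
}
order = ['B', 'E', 'A', 'R', 'C']

def joint_prob(assign):
    """p(x_1,...,x_n) as the product of local CPDs p(x_i | x_pi_i)."""
    p = 1.0
    for x in order:
        pa = tuple(assign[u] for u in parents[x])
        p1 = cpt[x][pa]
        p *= p1 if assign[x] == 1 else 1.0 - p1
    return p

def query(target, evidence):
    """p(target = 1 | evidence) by brute-force enumeration of the joint."""
    num = den = 0.0
    for values in itertools.product([0, 1], repeat=len(order)):
        assign = dict(zip(order, values))
        if any(assign[v] != val for v, val in evidence.items()):
            continue
        p = joint_prob(assign)
        den += p
        if assign[target] == 1:
            num += p
    return num / den

# Sanity check: the factorized joint sums to 1 over all 2^5 assignments.
print(sum(joint_prob(dict(zip(order, v)))
          for v in itertools.product([0, 1], repeat=5)))  # ~1.0
```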

## Bayesian network definition

A Bayesian network is a DAG $G$ over variables $X_1, \ldots, X_n$, together with a distribution $p$ that factorizes over $G$. $p$ is specified as the set of conditional probability distributions associated with $G$'s nodes.

## Using a Bayes net for reasoning (1)

Computing any entry in the joint probability table is easy because of the factorization property:

$$p(B=1, E=0, A=1, C=1, R=0) = p(B=1)\, p(E=0)\, p(A=1 \mid B=1, E=0)\, p(C=1 \mid A=1)\, p(R=0 \mid E=0)$$

Computing marginal probabilities is also easy. E.g., what is the probability that a neighbor calls?

$$p(C=1) = \sum_{e,b,r,a} p(C=1, e, b, r, a)$$
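Using the sketch after the factorization slide (with its invented CPTs), both computations are one-liners:

```python
# One entry of the joint table, directly from the factorization:
print(joint_prob({'B': 1, 'E': 0, 'A': 1, 'C': 1, 'R': 0}))  # ≈ 0.0064
# A marginal, by summing the joint over all the other variables:
print(query('C', {}))  # p(C = 1) ≈ 0.066
```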

## Using a Bayes net for reasoning (2)

One might want to compute the conditional probability of a variable given evidence that is upstream of it in the graph. E.g., what is the probability of a call in case of a burglary?

$$p(C=1 \mid B=1) = \frac{p(C=1, B=1)}{p(B=1)} = \frac{\sum_{e,r,a} p(C=1, B=1, e, r, a)}{\sum_{c,e,r,a} p(c, B=1, e, r, a)}$$

This is called causal reasoning or prediction.

## Using a Bayes net for reasoning (3)

We might have some evidence and need an explanation for it. In this case, we compute a conditional probability based on evidence that is downstream in the graph. E.g., suppose we got a call. What is the probability of a burglary? What is the probability of an earthquake?

$$p(B=1 \mid C=1) = \frac{p(C=1 \mid B=1)\, p(B=1)}{p(C=1)}, \qquad
p(E=1 \mid C=1) = \frac{p(C=1 \mid E=1)\, p(E=1)}{p(C=1)}$$

This is evidential reasoning or explanation.
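Continuing with the same sketch (and remembering its CPT numbers are invented), both directions of reasoning use the same enumeration:

```python
print(query('C', {'B': 1}))  # causal: p(C = 1 | B = 1) ≈ 0.66
print(query('B', {'C': 1}))  # evidential: p(B = 1 | C = 1) ≈ 0.10
print(query('E', {'C': 1}))  # evidential: p(E = 1 | C = 1) ≈ 0.075
```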

## Using a Bayes net for reasoning (4)

Suppose that you now gather more evidence, e.g. the radio announces an earthquake. What happens to the probabilities?

$$p(E=1 \mid C=1, R=1) \geq p(E=1 \mid C=1) \quad \text{and} \quad p(B=1 \mid C=1, R=1) \leq p(B=1 \mid C=1)$$

The radio report makes the earthquake a more likely cause of the call, and the burglary a less likely one. This is called explaining away.

## I-maps

A directed acyclic graph (DAG) $G$ whose nodes represent random variables $X_1, \ldots, X_n$ is an I-map (independence map) of a distribution $p$ if $p$ satisfies the independence assumptions:

$$X_i \perp \text{Nondescendants}(X_i) \mid X_{\pi_i}, \quad i = 1, \ldots, n$$

where $X_{\pi_i}$ are the parents of $X_i$.
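With the sketch's invented numbers, explaining away is easy to observe:

```python
# Earthquake becomes much more likely once the radio confirms it:
print(query('E', {'C': 1}), query('E', {'C': 1, 'R': 1}))  # ≈ 0.075 vs ≈ 0.99
# ...and the burglary is "explained away":
print(query('B', {'C': 1}), query('B', {'C': 1, 'R': 1}))  # ≈ 0.10 vs ≈ 0.028
```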

## Example

Consider all possible DAG structures over 2 variables. Which graph is an I-map for a distribution in which $X$ and $Y$ are dependent? *(The first joint table is not preserved in this transcription.)*

What about the following distribution, with $p(X = 0) = 0.2$, $p(Y = 0) = 0.4$, and $p(x, y) = p(x)\, p(y)$ for all entries?

| $x$ | $y$ | $p(x, y)$ |
|---|---|---|
| 0 | 0 | 0.08 |
| 0 | 1 | 0.12 |
| 1 | 0 | 0.32 |
| 1 | 1 | 0.48 |

## Example (continued)

In the first example, $X$ and $Y$ are not independent, so the only I-maps are the graphs $X \to Y$ and $Y \to X$, which assume no independence.

In the second example, we have $p(X = 0) = 0.2$, $p(Y = 0) = 0.4$, and for all entries $p(x, y) = p(x)\, p(y)$. Hence $X \perp Y$, and there are three I-maps for the distribution: the graph in which $X$ and $Y$ are not connected, and both graphs above. Note that independence maps may have extra arcs!
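The independence check for the second distribution is mechanical; a short sketch over the reconstructed table:

```python
# Check X ⊥ Y for the second distribution: compare each joint entry
# against the product of the marginals.
joint = {(0, 0): 0.08, (0, 1): 0.12, (1, 0): 0.32, (1, 1): 0.48}
px = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

print(all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-12
          for x in (0, 1) for y in (0, 1)))
# True: X ⊥ Y, so the empty graph (and both one-arc graphs) are I-maps.
```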
