Querying Joint Probability Distributions




Sargur Srihari, srihari@cedar.buffalo.edu

Queries of Interest

Probabilistic graphical models (BNs and MNs) represent joint probability distributions over multiple variables. Their use is to answer queries of interest; this is called inference. What types of queries are there? We illustrate them with the Student example.

Student Bayesian Network

The Student BN represents a joint probability distribution over multiple variables. BNs represent such distributions in terms of a graph and conditional probability distributions (CPDs), resulting in great savings in the number of parameters needed.

Joint Distribution from Student BN

CPDs: P(X_i | pa(X_i))

Joint distribution:

  P(X) = P(X_1, ..., X_n) = ∏_{i=1}^{n} P(X_i | pa(X_i))

For the Student network:

  P(D, I, G, S, L) = P(D) P(I) P(G | D, I) P(S | I) P(L | G)
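The factorization above can be sketched directly in code. The CPD numbers below are hypothetical stand-ins (the slides do not give them), chosen only so that each CPD row sums to 1; the point is that the joint is just a product of local CPD lookups.

```python
from itertools import product

# Hypothetical CPDs for the Student network (illustrative values only).
P_D = {0: 0.6, 1: 0.4}                      # Difficulty
P_I = {0: 0.7, 1: 0.3}                      # Intelligence
P_G = {(0, 0): {1: 0.3, 2: 0.4, 3: 0.3},    # Grade | Difficulty, Intelligence
       (0, 1): {1: 0.9, 2: 0.08, 3: 0.02},
       (1, 0): {1: 0.05, 2: 0.25, 3: 0.7},
       (1, 1): {1: 0.5, 2: 0.3, 3: 0.2}}
P_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}                       # SAT | Intelligence
P_L = {1: {0: 0.1, 1: 0.9}, 2: {0: 0.4, 1: 0.6}, 3: {0: 0.99, 1: 0.01}}  # Letter | Grade

def joint(d, i, g, s, l):
    """P(D,I,G,S,L) as the product of the CPDs (the BN factorization)."""
    return P_D[d] * P_I[i] * P_G[(d, i)][g] * P_S[i][s] * P_L[g][l]

# Because every CPD row is normalized, the factored joint sums to 1
# over all 2 * 2 * 3 * 2 * 2 = 48 assignments.
total = sum(joint(d, i, g, s, l)
            for d, i, g, s, l in product((0, 1), (0, 1), (1, 2, 3), (0, 1), (0, 1)))
print(round(total, 10))  # 1.0
```

The savings in parameters comes from exactly this structure: five small CPD tables instead of one table with 48 entries.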

Query Types

1. Probability queries: e.g., given L and S, give the distribution of I
2. MAP queries (maximum a posteriori probability), also called MPE (most probable explanation) queries: what is the most likely setting of D, I, G, S, L?
3. Marginal MAP queries: MAP queries in which some variables are known

Probability Queries

The most common type of query is a probability query. A query has two parts:
- Evidence: a subset E of variables and their instantiation e
- Query variables: a subset Y of the random variables in the network

Inference task: compute P(Y | E = e), the posterior probability distribution over the values y of Y, conditioned on the fact E = e. This can be viewed as the marginal over Y in the distribution obtained by conditioning on e.

Marginal probability estimation:

  P(Y = y_i | E = e) = P(Y = y_i, E = e) / P(E = e)

Example of Probability Query

  P(Y = y_i | E = e) = P(Y = y_i, E = e) / P(E = e)

(the left-hand side is a posterior marginal; the denominator is the probability of evidence)

Posterior marginal estimation: P(I = i_1 | L = l_0, S = s_1) = ?
Probability of evidence: P(L = l_0, S = s_1) = ?

Here we are asking for a specific probability rather than a full distribution.
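The mechanics of a probability query can be sketched on the small two-variable disease/symptom network used in the MAP examples later in these slides (A = disease, B = symptom): multiply the CPDs along the evidence, sum to get the probability of evidence, then renormalize.

```python
# Two-variable network from the diagnosis example: P(A) and P(B | A).
P_A = {"a0": 0.4, "a1": 0.6}
P_B_given_A = {"a0": {"b0": 0.1, "b1": 0.9},
               "a1": {"b0": 0.5, "b1": 0.5}}

def probability_query(evidence_b):
    """Posterior P(A | B = evidence_b) = P(A, B = b) / P(B = b)."""
    joint = {a: P_A[a] * P_B_given_A[a][evidence_b] for a in P_A}
    p_evidence = sum(joint.values())          # probability of evidence P(B = b)
    return {a: p / p_evidence for a, p in joint.items()}, p_evidence

posterior, p_e = probability_query("b1")
print(round(p_e, 4))              # 0.66  -> P(B = b1)
print(round(posterior["a1"], 4))  # 0.4545 -> P(a1 | b1) = 0.3 / 0.66
```

Note that the probability of evidence P(E = e) falls out as the normalizing constant of the posterior.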

MAP Queries (Most Probable Explanation)

The task is finding a high-probability assignment to some subset of variables: the most likely assignment to all non-evidence variables W = X − E:

  MAP(W | e) = arg max_w P(w, e)

i.e., the value of w for which P(w, e) is maximum. The difference from a probability query: instead of a probability, we get the most likely value of all remaining variables.

Example of MAP Queries

Medical diagnosis problem: a disease (A) causes a symptom (B). There are two possible diseases, Mono and Flu, and two possible symptoms, Headache and Fever.

P(A):
        a_0 (Mono)   a_1 (Flu)
           0.4          0.6

P(B | A):
        b_0 (Headache)   b_1 (Fever)
  a_0        0.1             0.9
  a_1        0.5             0.5

Q1: Most likely disease, MAP(A)?  A1: Flu (trivially determined for a root node)
Q2: Most likely disease and symptom, MAP(A, B)?
Q3: Most likely symptom, MAP(B)?

Example of MAP Query

  MAP(A) = arg max_a P(A) = a_1                                A1: Flu

  MAP(A, B) = arg max_{a,b} P(A, B) = arg max_{a,b} P(A) P(B | A)
            = arg max {0.04, 0.36, 0.3, 0.3} = (a_0, b_1)      A2: Mono and Fever

Note that the individually most likely value a_1 is not in the most likely joint assignment.
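The two MAP computations above amount to an argmax over the prior table and an argmax over the joint table built from P(A) P(B | A):

```python
from itertools import product

P_A = {"a0": 0.4, "a1": 0.6}
P_B_given_A = {"a0": {"b0": 0.1, "b1": 0.9},
               "a1": {"b0": 0.5, "b1": 0.5}}

# Joint P(A, B) = P(A) P(B | A) over all four assignments.
joint = {(a, b): P_A[a] * P_B_given_A[a][b]
         for a, b in product(P_A, ("b0", "b1"))}

map_a = max(P_A, key=P_A.get)        # MAP over A alone
map_ab = max(joint, key=joint.get)   # MAP over the joint (A, B)

print(map_a)   # a1 (Flu)
print(map_ab)  # ('a0', 'b1') (Mono and Fever)
```

This reproduces the slide's point: a_1 wins on its own, yet (a_0, b_1) wins jointly because 0.4 × 0.9 = 0.36 beats 0.6 × 0.5 = 0.30.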

Marginal MAP Query

Previously we looked for the highest-probability joint assignment of disease and symptom. We can instead look for the most likely assignment of the disease variable only: the query is not all remaining variables but a subset Y of them, with evidence E = e. The task is to find the most likely assignment to Y:

  MAP(Y | e) = arg max_y P(y | e)

If Z = X − Y − E:

  MAP(Y | e) = arg max_y Σ_z P(y, z | e)

Example of Marginal MAP Query

Joint distribution P(A, B):

         b_0     b_1
  a_0    0.04    0.36
  a_1    0.30    0.30

  MAP(B) = arg max_b P(B) = arg max_b Σ_a P(a, B)
         = arg max {0.34, 0.66} = b_1                A3: Fever
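The marginal MAP computation is a sum followed by a max: first sum out A from the joint table, then maximize over b.

```python
# Joint P(A, B) from the slides.
joint = {("a0", "b0"): 0.04, ("a0", "b1"): 0.36,
         ("a1", "b0"): 0.30, ("a1", "b1"): 0.30}

# Step 1: sum out A to get the marginal P(B).
P_B = {}
for (a, b), p in joint.items():
    P_B[b] = P_B.get(b, 0.0) + p      # P(b) = sum_a P(a, b)

# Step 2: maximize over b.
marginal_map_b = max(P_B, key=P_B.get)

print(round(P_B["b1"], 2))  # 0.66
print(marginal_map_b)       # b1 (Fever)
```

Contrast the order of operations with the full MAP query, which maximizes over the joint table directly without any summation.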

Marginal MAP Assignments

Marginal MAP assignments are not monotonic: the most likely assignment MAP(Y_1 | e) might be completely different from the assignment to Y_1 in MAP({Y_1, Y_2} | e).

Q1: Most likely disease, MAP(A)?  A1: Flu
Q2: Most likely disease and symptom, MAP(A, B)?  A2: Mono and Fever

Thus we cannot use a MAP query to give a correct answer to a marginal MAP query.

Marginal MAP Is More Complex than MAP

A marginal MAP query contains both summations (as in probability queries) and maximizations (as in MAP queries):

  MAP(B) = arg max_b P(B) = arg max_b Σ_a P(a, B) = arg max {0.34, 0.66} = b_1

Computing the Probability of Evidence

Probability distribution of the evidence (by the sum rule of probability, from the graphical model):

  P(L, S) = Σ_{D,I,G} P(D, I, G, L, S) = Σ_{D,I,G} P(D) P(I) P(G | D, I) P(L | G) P(S | I)

Probability of evidence:

  P(L = l_0, S = s_1) = Σ_{D,I,G} P(D) P(I) P(G | D, I) P(L = l_0 | G) P(S = s_1 | I)

More generally:

  P(E = e) = Σ_{X \ E} ∏_{i=1}^{n} P(X_i | pa(X_i)) |_{E = e}

This is an intractable problem in general (#P-complete). It is tractable when the tree-width is less than 25, but most real-world applications have higher tree-width, so approximations are usually sufficient (hence sampling): e.g., when P(Y = y | E = e) = 0.29292, an approximation yields 0.3.
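For a network this small, the sum can be carried out by brute-force enumeration. The sketch below reuses the same hypothetical Student-network CPDs as earlier (the slides give no numbers) and sums out the non-evidence variables D, I, G; the exponential cost of this enumeration is exactly why the general problem is #P-complete and why sampling is used in practice.

```python
from itertools import product

# Hypothetical Student-network CPDs (illustrative values only).
P_D = {0: 0.6, 1: 0.4}
P_I = {0: 0.7, 1: 0.3}
P_G = {(0, 0): {1: 0.3, 2: 0.4, 3: 0.3},
       (0, 1): {1: 0.9, 2: 0.08, 3: 0.02},
       (1, 0): {1: 0.05, 2: 0.25, 3: 0.7},
       (1, 1): {1: 0.5, 2: 0.3, 3: 0.2}}
P_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}
P_L = {1: {0: 0.1, 1: 0.9}, 2: {0: 0.4, 1: 0.6}, 3: {0: 0.99, 1: 0.01}}

def prob_evidence(l, s):
    """P(L = l, S = s) = sum over D, I, G of the factored joint."""
    return sum(P_D[d] * P_I[i] * P_G[(d, i)][g] * P_L[g][l] * P_S[i][s]
               for d, i, g in product((0, 1), (0, 1), (1, 2, 3)))

p = prob_evidence(l=0, s=1)
print(0.0 < p < 1.0)  # True: a valid probability
```

Summing prob_evidence over all four (l, s) pairs returns 1, a useful sanity check that the enumeration covers the whole joint.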