Outline. Probability. Random Variables. Joint and Marginal Distributions. Conditional Distribution. Product Rule, Chain Rule, Bayes' Rule.


Probability

Outline: Probability. Random Variables. Joint and Marginal Distributions. Conditional Distribution. Product Rule, Chain Rule, Bayes' Rule. Inference. Independence.

Inference in Ghostbusters. A ghost is in the grid somewhere. Sensor readings tell how close a square is to the ghost: on the ghost: red; 1 or 2 away: orange; 3 or 4 away: yellow; 5+ away: green. Sensors are noisy, but we know P(Color | Distance).

Uncertainty. General situation: there are observed variables (evidence): the agent knows certain things about the state of the world (e.g., sensor readings or symptoms). Unobserved variables: the agent needs to reason about other aspects (e.g., where an object is or what disease is present). Model: the agent knows something about how the known variables relate to the unknown variables. Probabilistic reasoning gives us a framework for managing our beliefs and knowledge.

Random Variables. A random variable is some aspect of the world about which we (may) have uncertainty: R = Is it raining? D = How long will it take to drive to work? L = Where am I? We denote random variables with capital letters and the values of random variables with lower-case letters. Like variables in a CSP, random variables have domains: R in {true, false} (sometimes written as {+r, -r}); D in [0, infinity); L in possible locations, maybe {(0,0), (0,1), ...}.

Probability Distributions. Unobserved random variables have distributions. A distribution is a TABLE of probabilities of values. A probability is a single number. Must have: P(x) >= 0 for every value x, and sum_x P(x) = 1.
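A minimal sketch of a distribution as a table; the weather values below are an assumption for illustration:

```python
# A distribution is a table: each value of the variable maps to a
# single probability. These weather values are illustrative assumptions.
P_W = {"sun": 0.6, "rain": 0.1, "fog": 0.3}

# Must have: every probability nonnegative, and the table sums to 1.
assert all(p >= 0 for p in P_W.values())
assert abs(sum(P_W.values()) - 1.0) < 1e-9
```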

Joint Distributions. A joint distribution over a set of random variables X_1, X_2, ..., X_n specifies a real number for each assignment (or outcome): P(X_1 = x_1, ..., X_n = x_n). Must obey: P(x_1, ..., x_n) >= 0, and the sum over all assignments is 1. Size of the distribution if there are n variables with domain sizes d? d^n. For all but the smallest distributions, it is impractical to write out.
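A small joint table in Python, showing the d^n size; the (Temperature, Weather) values are assumptions for illustration:

```python
from itertools import product

# A joint distribution: one real number per assignment (outcome).
# Illustrative joint over (Temperature, Weather).
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

# With n variables of domain size d there are d**n assignments --
# here n = 2, d = 2, so 2**2 = 4 rows.
domains = [("hot", "cold"), ("sun", "rain")]
assert len(list(product(*domains))) == len(P_TW) == 4

# Must obey: nonnegative entries that sum to 1.
assert all(p >= 0 for p in P_TW.values())
assert abs(sum(P_TW.values()) - 1.0) < 1e-9
```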

Probabilistic Models. A probabilistic model is a joint distribution over a set of random variables. Probabilistic models: (random) variables with domains; assignments are called outcomes; joint distributions say whether assignments (outcomes) are likely; normalized: sums to 1.0; ideally, only certain variables directly interact. Recall constraint satisfaction problems: variables with domains; constraints state whether assignments are possible; ideally, only certain variables directly interact.

Events. An event is a set E of outcomes. From a joint distribution, we can calculate the probability of any event. Probability that it's hot AND sunny? 0.4. Probability that it's hot? 0.4 + 0.1 = 0.5. Probability that it's hot OR sunny? 0.4 + 0.1 + 0.2 = 0.7. Typically, the events we care about are partial assignments, like P(T = hot).
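The event calculations above can be sketched as sums over a joint table; the table values are assumptions chosen to match the probabilities quoted on the slide:

```python
# Illustrative joint over (Temperature, Weather), matching the slide's numbers.
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def p_event(joint, holds):
    # An event is a set of outcomes; its probability is the sum of the
    # probabilities of the outcomes in it.
    return sum(p for outcome, p in joint.items() if holds(outcome))

hot_and_sun = p_event(P_TW, lambda o: o == ("hot", "sun"))            # 0.4
hot = p_event(P_TW, lambda o: o[0] == "hot")                          # ~0.5
hot_or_sun = p_event(P_TW, lambda o: o[0] == "hot" or o[1] == "sun")  # ~0.7
```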

Marginal Distributions. Marginal distributions are sub-tables which eliminate variables. Marginalization (summing out): combine collapsed rows by adding.
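Summing out can be sketched directly; the joint values are illustrative assumptions:

```python
from collections import defaultdict

# Illustrative joint over (Temperature, Weather).
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def marginal(joint, keep_index):
    # Summing out: collapse all rows that agree on the kept variable
    # by adding their probabilities.
    table = defaultdict(float)
    for outcome, p in joint.items():
        table[outcome[keep_index]] += p
    return dict(table)

P_T = marginal(P_TW, 0)   # hot ~0.5, cold ~0.5
P_W = marginal(P_TW, 1)   # sun ~0.6, rain ~0.4
```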

Conditional Probabilities. A simple relation between joint and conditional probabilities: P(a | b) = P(a, b) / P(b). In fact, this is taken as the definition of a conditional probability. Example: 0.3 / 0.5 = 0.6.
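The definition, applied to the slide's numbers:

```python
# Definition of conditional probability: P(a | b) = P(a, b) / P(b).
p_ab = 0.3            # P(a, b), the joint probability from the slide
p_b = 0.5             # P(b)
p_a_given_b = p_ab / p_b
assert abs(p_a_given_b - 0.6) < 1e-12
```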

Conditional Distributions. Conditional distributions are probability distributions over some variables given fixed values of others.

Normalization Trick. A trick to get a whole conditional distribution at once: select the joint probabilities matching the evidence, then normalize the selection (make it sum to one). Why does this work? The sum of the selection is P(evidence)! (P(r), here.)
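The select-then-normalize trick as a sketch; the joint values are illustrative assumptions:

```python
# Illustrative joint over (Temperature, Weather).
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def condition(joint, evidence_index, evidence_value):
    # 1) Select the joint entries matching the evidence.
    selected = {o: p for o, p in joint.items()
                if o[evidence_index] == evidence_value}
    # 2) Normalize: the selection sums to P(evidence), so dividing by
    #    that sum yields a proper conditional distribution.
    z = sum(selected.values())
    return {o: p / z for o, p in selected.items()}

P_T_given_sun = condition(P_TW, 1, "sun")   # hot ~2/3, cold ~1/3
P_T_given_sun = {t: p for (t, w), p in P_T_given_sun.items()}
```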

Probabilistic Inference. Probabilistic inference: compute a desired probability from other known probabilities (e.g., a conditional from the joint). We generally want conditional probabilities: P(on time | no reported accidents) = 0.90. These represent the agent's beliefs given the evidence. Probabilities change with new evidence: P(on time | no accidents, 5 a.m.) = 0.95; P(on time | no accidents, 5 a.m., raining) = 0.80. Observing new evidence causes beliefs to be updated.

Inference by Enumeration: Example. Given any joint probability table, we can compute any probability needed. P(sun)? .30 + .10 + .10 + .15 = .65. P(sun | winter)? (.10 + .15) / (.10 + .05 + .15 + .20) = .25 / .50 = .50. P(sun | winter, hot)? .10 / (.10 + .05) = .67.
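The slide only quotes the sums, so the joint table below is a reconstruction consistent with those numbers (an assumption, not the original figure):

```python
# Reconstructed joint over (Season, Temperature, Weather), consistent
# with the sums quoted on the slide.
P = {
    ("summer", "hot",  "sun"): 0.30, ("summer", "hot",  "rain"): 0.05,
    ("summer", "cold", "sun"): 0.10, ("summer", "cold", "rain"): 0.05,
    ("winter", "hot",  "sun"): 0.10, ("winter", "hot",  "rain"): 0.05,
    ("winter", "cold", "sun"): 0.15, ("winter", "cold", "rain"): 0.20,
}

# P(sun): sum every entry with W = sun.
p_sun = sum(p for (s, t, w), p in P.items() if w == "sun")

# P(sun | winter): restrict to winter entries, then normalize.
p_winter = sum(p for (s, t, w), p in P.items() if s == "winter")
p_sun_and_winter = sum(p for (s, t, w), p in P.items()
                       if s == "winter" and w == "sun")
p_sun_given_winter = p_sun_and_winter / p_winter

# P(sun | winter, hot): the same idea with more evidence.
p_winter_hot = sum(p for (s, t, w), p in P.items()
                   if s == "winter" and t == "hot")
p_sun_given_winter_hot = 0.10 / p_winter_hot   # ~0.67
```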

General Case: Inference by Enumeration. Evidence variables: E_1, ..., E_k = e_1, ..., e_k. Query variable(s): Q. Hidden variables: H_1, ..., H_r. We want the probability distribution of the query variable: P(Q | e_1, ..., e_k). First, select the entries consistent with the evidence. Second, sum out the hidden variables H to get the joint of the query and the evidence. Finally, normalize the remaining entries to conditionalize. Obvious problems: worst-case time complexity O(d^n); space complexity O(d^n) to store the joint distribution. Hidden variables are bad news. How can we handle hidden variables efficiently?
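The three steps above (select, sum out, normalize) can be sketched in one function; the joint values are illustrative assumptions:

```python
from collections import defaultdict

# Illustrative joint over (Temperature, Weather).
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def enumerate_inference(joint, query_index, evidence):
    """P(Q | e) by enumeration: select entries consistent with the
    evidence, sum out the hidden variables, normalize.
    evidence maps variable index -> observed value.
    Worst case touches every entry: O(d**n) time and space."""
    table = defaultdict(float)
    for outcome, p in joint.items():
        if all(outcome[i] == v for i, v in evidence.items()):
            table[outcome[query_index]] += p   # sums out hidden variables
    z = sum(table.values())                    # = P(evidence)
    return {q: p / z for q, p in table.items()}

P_T_given_sun = enumerate_inference(P_TW, 0, {1: "sun"})  # hot ~2/3
```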

The Product Rule. Sometimes we have conditional distributions but want the joint: P(x, y) = P(x | y) P(y).
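Building a joint from a conditional and a marginal with the product rule; the rain/grass numbers are assumptions for illustration:

```python
# Product rule: P(d, r) = P(d | r) * P(r).
P_R = {"rain": 0.1, "sun": 0.9}                   # marginal P(R)
P_D_given_R = {"rain": {"wet": 0.9, "dry": 0.1},
               "sun":  {"wet": 0.1, "dry": 0.9}}  # conditional P(D | R)

# Multiply each conditional row by the marginal to get the joint.
P_DR = {(d, r): P_D_given_R[r][d] * P_R[r]
        for r in P_R for d in P_D_given_R[r]}

assert abs(sum(P_DR.values()) - 1.0) < 1e-9   # a valid joint
```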

The Chain Rule. More generally, we can always write any joint distribution as an incremental product of conditional distributions: P(x_1, x_2, ..., x_n) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) ... Why is this always true? Each factor is a ratio of joints, so the product telescopes back to the full joint. The chain rule is true for any ordering of the variables.
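A numeric check of the two-variable chain rule, P(t, w) = P(t) P(w | t); the joint values are illustrative assumptions:

```python
# Illustrative joint over (Temperature, Weather).
P_TW = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
        ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def p_t(t):
    # Marginal P(T = t), by summing out W.
    return sum(p for (t2, w), p in P_TW.items() if t2 == t)

# The chain rule reconstructs every joint entry exactly:
# P(t, w) = P(t) * P(w | t), where P(w | t) = P(t, w) / P(t).
for (t, w), p in P_TW.items():
    p_w_given_t = P_TW[(t, w)] / p_t(t)
    assert abs(p_t(t) * p_w_given_t - p) < 1e-12
```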

Bayes' Rule. Two ways to factor a joint distribution over two variables: P(x, y) = P(x | y) P(y) = P(y | x) P(x). Dividing, we get: P(x | y) = P(y | x) P(x) / P(y). Why is this at all helpful? It lets us build one conditional from its reverse. Often one conditional is tricky but the other one is simple. It is the foundation of many systems (e.g., ASR, MT). This is in the running for most important AI equation!
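A minimal numeric sketch of the rule; the three input probabilities are assumptions for illustration:

```python
# Bayes' rule: P(x | y) = P(y | x) * P(x) / P(y),
# obtained by equating the two factorizations of P(x, y) and dividing.
p_y_given_x = 0.8    # the "easy" conditional (illustrative)
p_x = 0.25           # prior on x
p_y = 0.5            # evidence probability

# Build the tricky conditional from its reverse.
p_x_given_y = p_y_given_x * p_x / p_y   # 0.8 * 0.25 / 0.5 = 0.4
```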

Inference with Bayes' Rule. Example: diagnostic probability from causal probability: P(cause | effect) = P(effect | cause) P(cause) / P(effect). Example: m is meningitis, s is stiff neck: P(m | s) = P(s | m) P(m) / P(s). Note: the posterior probability of meningitis is still very small. Note: you should still get stiff necks checked out! Why?
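The slide does not give the numbers, so the values below are assumed textbook-style figures for illustration; the point is that a tiny prior keeps the posterior small:

```python
# Assumed illustrative numbers: P(m) = 0.0001, P(s|m) = 0.8, P(s|~m) = 0.01.
p_m = 0.0001
p_s_given_m = 0.8
p_s_given_not_m = 0.01

# Total probability gives the evidence term P(s).
p_s = p_s_given_m * p_m + p_s_given_not_m * (1 - p_m)

# Bayes' rule: diagnostic from causal.
p_m_given_s = p_s_given_m * p_m / p_s   # ~0.008

# Even given a stiff neck, meningitis stays very unlikely, because
# the prior P(m) is tiny.
assert p_m_given_s < 0.01
```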

Ghostbusters, Revisited. We have two distributions: a prior distribution over ghost locations, P(G); let's say this is uniform. A sensor reading model, P(R | G); this is given, since we know what our sensors do. R = reading color measured at (1,1), e.g., P(R = yellow | G = (1,1)) = 0.1. We can calculate the posterior distribution P(G | r) over ghost locations given a reading using Bayes' rule.

Independence. Two variables are independent in a joint distribution if P(x, y) = P(x) P(y) for all x, y. This says the joint distribution factors into a product of two simple ones. Usually variables aren't independent! We can use independence as a modeling assumption; independence can be a simplifying assumption. What could we assume for {Weather, Traffic, Cavity}? Empirical joint distributions are at best close to independent. Independence is like something from CSPs: what?
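An independence check as a sketch: compare every joint entry to the product of its marginals. Both example tables are illustrative assumptions:

```python
from collections import defaultdict

def is_independent(joint, tol=1e-9):
    # X and Y are independent iff P(x, y) = P(x) * P(y) for ALL x, y,
    # i.e. the joint factors into the product of its marginals.
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return all(abs(p - px[x] * py[y]) <= tol
               for (x, y), p in joint.items())

# A dependent joint: P(hot, sun) = 0.4, but P(hot)P(sun) = 0.5 * 0.6 = 0.3.
assert not is_independent({("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
                           ("cold", "sun"): 0.2, ("cold", "rain"): 0.3})

# An independent joint: every entry equals the product of its marginals.
assert is_independent({("hot", "sun"): 0.3, ("hot", "rain"): 0.3,
                       ("cold", "sun"): 0.2, ("cold", "rain"): 0.2})
```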

Example: Independence

Example: Independence. N fair, independent coin flips: P(X_1, ..., X_N) = P(X_1) ... P(X_N), so each of the 2^N outcomes has probability (1/2)^N.
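The coin-flip example can be sketched by building the joint as a product of marginals:

```python
from itertools import product

# N fair, independent coin flips: the joint is the product of the
# per-flip marginals, so every length-N sequence has probability (1/2)**N.
N = 3
p_flip = {"H": 0.5, "T": 0.5}

joint = {seq: 1.0 for seq in product("HT", repeat=N)}
for seq in joint:
    for outcome in seq:
        joint[seq] *= p_flip[outcome]   # independence: multiply marginals

assert len(joint) == 2 ** N
assert all(p == 0.5 ** N for p in joint.values())
```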