Discrete Mathematics & Mathematical Reasoning, Chapter 7 (Section 7.3): Conditional Probability & Bayes' Theorem

Kousha Etessami, U. of Edinburgh, UK

Reverend Thomas Bayes (1701-1761) studied logic and theology as an undergraduate student at the University of Edinburgh from 1719 to 1722.

Bayes' Theorem

Let A and B be two events from a (countable) sample space Ω, and let P : Ω → [0, 1] be a probability distribution on Ω, such that 0 < P(A) < 1 and P(B) > 0. Then

P(A | B) = \frac{P(B | A) P(A)}{P(B | A) P(A) + P(B | \overline{A}) P(\overline{A})}

This may at first look like an obscure equation, but as we shall see, it is useful...
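As a quick numerical companion to the formula, here is a minimal Python sketch (not part of the original slides; the helper name bayes_posterior is made up for illustration) that evaluates the right-hand side directly:

```python
# A minimal sketch of evaluating the right-hand side of Bayes' Theorem.
# The function and argument names are illustrative, not from the slides.

def bayes_posterior(p_b_given_a, p_b_given_not_a, p_a):
    """Return P(A | B) from P(B | A), P(B | complement of A), and the prior P(A)."""
    numerator = p_b_given_a * p_a
    denominator = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return numerator / denominator

# Plugging in the numbers from the box example on the later slides
# (P(E | F) = 2/10, P(E | not F) = 7/10, P(F) = 1/2) gives 2/9:
print(bayes_posterior(2 / 10, 7 / 10, 1 / 2))   # 0.2222...
```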

Proof of Bayes' Theorem: Let A and B be events such that 0 < P(A) < 1 and P(B) > 0.

By definition, P(A | B) = \frac{P(A ∩ B)}{P(B)}. So: P(A ∩ B) = P(A | B) P(B).

Likewise, P(B | A) = \frac{P(A ∩ B)}{P(A)}, so P(A ∩ B) = P(B | A) P(A). (Note that P(A) > 0.)

Note that P(A | B) P(B) = P(A ∩ B) = P(B | A) P(A). So,

P(A | B) = \frac{P(B | A) P(A)}{P(B)}

Furthermore, P(B) = P((B ∩ A) ∪ (B ∩ \overline{A})) = P(B ∩ A) + P(B ∩ \overline{A}) = P(B | A) P(A) + P(B | \overline{A}) P(\overline{A}).

So:

P(A | B) = \frac{P(B | A) P(A)}{P(B | A) P(A) + P(B | \overline{A}) P(\overline{A})}.
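The key step of the proof is the identity P(A | B) P(B) = P(A ∩ B) = P(B | A) P(A). The following sketch (a made-up example, not from the slides) checks that identity on an explicit finite sample space: two fair dice, with A = "the first die shows 6" and B = "the sum is at least 10".

```python
# Sanity check of P(A | B) P(B) = P(A ∩ B) = P(B | A) P(A) on a finite sample space.
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))      # 36 equally likely outcomes
p = {w: Fraction(1, 36) for w in omega}           # the probability distribution

A = {w for w in omega if w[0] == 6}               # first die shows 6
B = {w for w in omega if sum(w) >= 10}            # sum is at least 10

def prob(event):
    return sum(p[w] for w in event)

p_a_and_b = prob(A & B)
p_a_given_b = p_a_and_b / prob(B)
p_b_given_a = p_a_and_b / prob(A)

assert p_a_given_b * prob(B) == p_a_and_b == p_b_given_a * prob(A)
print(p_a_given_b, p_b_given_a)   # 1/2 1/2
```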

Using Bayes' Theorem

Problem: There are two boxes, Box B_1 and Box B_2. Box B_1 contains 2 red balls and 8 blue balls. Box B_2 contains 7 red balls and 3 blue balls. Suppose Jane first randomly chooses one of the two boxes B_1 and B_2, with equal probability, 1/2, of choosing each. Suppose Jane then randomly picks one ball out of the box she has chosen (without telling you which box she had chosen), and shows you the ball she picked. Suppose you only see that the ball Jane picked is red.

Question: Given this information, what is the probability that Jane chose box B_1?

Using Bayes' Theorem, continued

Answer: The underlying sample space, Ω, is:

Ω = {(a, b) | a ∈ {1, 2}, b ∈ {red, blue}}

Let F = {(a, b) ∈ Ω | a = 1} be the event that box B_1 was chosen. Thus, \overline{F} = Ω \ F is the event that box B_2 was chosen.

Let E = {(a, b) ∈ Ω | b = red} be the event that a red ball was picked. Thus, \overline{E} is the event that a blue ball was picked.

We are interested in computing the probability P(F | E). We know that P(E | F) = 2/10 and P(E | \overline{F}) = 7/10. We also know that P(F) = 1/2 and P(\overline{F}) = 1/2.

Can we compute P(F | E) based on this? Yes, using Bayes' Theorem.
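Before applying the theorem, the answer can be cross-checked directly from the joint distribution on Ω (a minimal Python sketch, not part of the slides; the outcome weights follow from how Jane picks a box and then a ball):

```python
# Direct computation of P(F | E) from the joint distribution on Ω.
from fractions import Fraction

# P((a, b)) = P(box a chosen) * P(colour b | box a)
p = {
    (1, "red"):  Fraction(1, 2) * Fraction(2, 10),
    (1, "blue"): Fraction(1, 2) * Fraction(8, 10),
    (2, "red"):  Fraction(1, 2) * Fraction(7, 10),
    (2, "blue"): Fraction(1, 2) * Fraction(3, 10),
}

F = {w for w in p if w[0] == 1}        # box B_1 was chosen
E = {w for w in p if w[1] == "red"}    # a red ball was picked

def prob(event):
    return sum(p[w] for w in event)

print(prob(F & E) / prob(E))   # 2/9
```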

Using Bayes' Theorem, continued

Note that 0 < P(F) < 1 and P(E) > 0. By Bayes' Theorem:

P(F | E) = \frac{P(E | F) P(F)}{P(E | F) P(F) + P(E | \overline{F}) P(\overline{F})}
         = \frac{(2/10)(1/2)}{(2/10)(1/2) + (7/10)(1/2)}
         = \frac{2/20}{2/20 + 7/20}
         = \frac{2}{9}.

Note that, without the information that a red ball was picked, the probability that Jane chose Box B_1 is P(F) = 1/2. But given the information, E, that a red ball was picked, the probability becomes much less, changing to P(F | E) = 2/9.
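The same answer can also be estimated by simulation (a quick sketch, not part of the slides): repeat Jane's experiment many times and, among the trials in which a red ball was picked, count how often box B_1 was the chosen box.

```python
# Monte Carlo estimate of P(F | E); should come out close to 2/9 ≈ 0.222.
import random

def trial():
    box = random.choice([1, 2])                  # Jane picks a box uniformly
    p_red = 2 / 10 if box == 1 else 7 / 10       # red-ball probability for that box
    colour = "red" if random.random() < p_red else "blue"
    return box, colour

trials = [trial() for _ in range(100_000)]
red_trials = [box for box, colour in trials if colour == "red"]
print(sum(1 for box in red_trials if box == 1) / len(red_trials))   # ≈ 0.222
```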

More on using Bayes' Theorem: Bayesian Spam Filters

Problem: Suppose it has been observed empirically that the word "Congratulations" occurs in 1 out of 10 spam emails, but that "Congratulations" occurs in only 1 out of 1000 non-spam emails. Suppose it has also been observed empirically that about 4 out of 10 emails are spam.

In Bayesian spam filtering, these empirical probabilities are interpreted as genuine probabilities in order to help estimate the probability that an incoming email is spam.

Suppose we get a new email that contains "Congratulations". Let C be the event that a new email contains "Congratulations". Let S be the event that a new email is spam. We have observed C. We want to know P(S | C).

Bayesian spam filtering example, continued

Bayesian solution: By Bayes' Theorem:

P(S | C) = \frac{P(C | S) P(S)}{P(C | S) P(S) + P(C | \overline{S}) P(\overline{S})}

From the empirical probabilities, we get the estimates: P(C | S) ≈ 1/10; P(C | \overline{S}) ≈ 1/1000; P(S) ≈ 4/10; P(\overline{S}) ≈ 6/10. So, we estimate that:

P(S | C) ≈ \frac{(1/10)(4/10)}{(1/10)(4/10) + (1/1000)(6/10)} = \frac{0.04}{0.0406} ≈ 0.985

So, with high probability, such an email is spam. (However, much caution is needed when interpreting such probabilities.)
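The arithmetic is easy to verify (a quick sketch, not part of the slides):

```python
# Numerical check of the spam estimate.
p_c_given_s = 1 / 10        # P(C | S): "Congratulations" appears in a spam email
p_c_given_not_s = 1 / 1000  # P(C | not S): it appears in a non-spam email
p_s = 4 / 10                # P(S): prior probability that an email is spam

posterior = (p_c_given_s * p_s) / (p_c_given_s * p_s + p_c_given_not_s * (1 - p_s))
print(round(posterior, 3))   # 0.985
```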

Generalized Bayes' Theorem

Suppose that E, F_1, ..., F_n are events from sample space Ω, and that P : Ω → [0, 1] is a probability distribution on Ω. Suppose that ⋃_{i=1}^{n} F_i = Ω, and that F_i ∩ F_j = ∅ for all i ≠ j. Suppose P(E) > 0, and P(F_j) > 0 for all j. Then for all j:

P(F_j | E) = \frac{P(E | F_j) P(F_j)}{\sum_{i=1}^{n} P(E | F_i) P(F_i)}

Suppose Jane first randomly chooses a box from among n different boxes, B_1, ..., B_n, and then randomly picks a coloured ball out of the box she chose. (Each box may have different numbers of balls of each colour.) We can use the Generalized Bayes' Theorem to calculate the probability that Jane chose box B_j (event F_j), given that the colour of the ball that Jane picked is red (event E).
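A minimal sketch of the generalized formula (not part of the slides; the three-box numbers below are made up for illustration): given the priors P(F_1), ..., P(F_n) and the likelihoods P(E | F_1), ..., P(E | F_n), compute all the posteriors P(F_j | E) at once.

```python
# Generalized Bayes' Theorem: posteriors from priors and likelihoods.
from fractions import Fraction

def posteriors(priors, likelihoods):
    """priors[i] = P(F_i), likelihoods[i] = P(E | F_i); returns the list of P(F_i | E)."""
    p_e = sum(l * q for l, q in zip(likelihoods, priors))   # total probability P(E)
    return [l * q / p_e for l, q in zip(likelihoods, priors)]

# Hypothetical three-box version of Jane's experiment: each box chosen with
# probability 1/3; the boxes contain 2, 7 and 5 red balls out of 10.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(2, 10), Fraction(7, 10), Fraction(5, 10)]
print(posteriors(priors, likelihoods))   # P(F_1|E)=1/7, P(F_2|E)=1/2, P(F_3|E)=5/14
```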

Proof of Generalized Bayes' Theorem: Very similar to the proof of Bayes' Theorem. Observe that:

P(F_j | E) = \frac{P(F_j ∩ E)}{P(E)} = \frac{P(E | F_j) P(F_j)}{P(E)}

So, we only need to show that P(E) = \sum_{i=1}^{n} P(E | F_i) P(F_i). But since ⋃_i F_i = Ω, and since F_i ∩ F_j = ∅ for all i ≠ j:

P(E) = P(⋃_i (E ∩ F_i))
     = \sum_{i=1}^{n} P(E ∩ F_i)          (because the F_i's are disjoint)
     = \sum_{i=1}^{n} P(E | F_i) P(F_i).