3.2 Conditional Probability and Independent Events




Ismor Fischer, 5/29/2012

Application: using population-based health studies to estimate probabilities relating potential risk factors to a particular disease, to evaluate the efficacy of medical diagnostic and screening tests, etc.

Example: Events A = lung cancer and B = smoker, in a study population S.

                         Disease Status
Smoker        Lung cancer (A)    No lung cancer (A^c)    Total
Yes (B)            0.12                 0.04              0.16
No (B^c)           0.03                 0.81              0.84
Total              0.15                 0.85              1.00

Probabilities: P(A) = 0.15, P(B) = 0.16, P(A ∩ B) = 0.12

Definition: Conditional Probability of Event A, given Event B (where P(B) ≠ 0)

    P(A | B) = P(A ∩ B) / P(B)

In this example, P(A | B) = 0.12 / 0.16 = 0.75 >> 0.15 = P(A).

Comments:

P(B | A) = P(B ∩ A) / P(A) = 0.12 / 0.15 = 0.80, so P(A | B) ≠ P(B | A) in general.

The general formula can be rewritten as P(A ∩ B) = P(A | B) P(B). IMPORTANT

Example: P(Angel barks) = 0.1, P(Brutus barks) = 0.2, P(Angel barks | Brutus barks) = 0.3. Therefore P(Angel and Brutus bark) = 0.3 × 0.2 = 0.06.
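The definition of conditional probability can be checked numerically against the lung cancer and smoking table. The following is a minimal sketch, not part of the original notes: the dictionary layout and the helper name `marginal` are assumptions made for illustration.

```python
# Joint probabilities from the lung cancer / smoking table.
# Keys are (disease status, smoking status); this layout is an assumption.
joint = {
    ("A", "B"): 0.12,    # lung cancer, smoker
    ("Ac", "B"): 0.04,   # no lung cancer, smoker
    ("A", "Bc"): 0.03,   # lung cancer, nonsmoker
    ("Ac", "Bc"): 0.81,  # no lung cancer, nonsmoker
}


def marginal(table, index, label):
    """Sum the joint probabilities whose key has `label` at position `index`."""
    return sum(p for key, p in table.items() if key[index] == label)


p_a = marginal(joint, 0, "A")   # P(A): row and column totals are recomputed
p_b = marginal(joint, 1, "B")   # P(B)
p_a_and_b = joint[("A", "B")]   # P(A ∩ B)

p_a_given_b = p_a_and_b / p_b   # P(A | B) = P(A ∩ B) / P(B)
p_b_given_a = p_a_and_b / p_a   # P(B | A) = P(B ∩ A) / P(A)

print("P(A | B) =", round(p_a_given_b, 2))
print("P(B | A) =", round(p_b_given_a, 2))
```

Swapping the roles of the two events changes only the denominator, which is why the two conditional probabilities differ.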

Example: Suppose that two balls are to be randomly drawn, one after another, from a container holding four red balls and two green balls. Under the scenario of sampling without replacement, calculate the probabilities of the events A = "First ball is red," B = "Second ball is red," and A ∩ B = "First ball is red AND second ball is red." (As an exercise, list the 6 × 5 = 30 outcomes in the sample space of this experiment, and use "brute force" to solve this problem.)

[Figure: the six balls R1, R2, R3, R4, G1, G2.]

This type of problem, known as an "urn model," can be solved with the use of a tree diagram, where each branch of the tree represents a specific event, conditioned on a preceding event. The product of the probabilities of all such events along a particular sequence of branches is equal to the corresponding intersection probability, via the previous formula. In this example, we obtain the following values:

1st draw          2nd draw                Intersection
P(A) = 4/6        P(B | A) = 3/5          P(A ∩ B) = 12/30
                  P(B^c | A) = 2/5        P(A ∩ B^c) = 8/30
P(A^c) = 2/6      P(B | A^c) = 4/5        P(A^c ∩ B) = 8/30
                  P(B^c | A^c) = 1/5      P(A^c ∩ B^c) = 2/30

We can calculate the probability P(B) by adding the two intersection values that involve B, i.e., P(B) = P(A ∩ B) + P(A^c ∩ B) = 12/30 + 8/30 = 20/30, or P(B) = 2/3. This last formula, which can be written as

    P(B) = P(B | A) P(A) + P(B | A^c) P(A^c),

can be extended to more general situations, where it is known as the Law of Total Probability, and is a useful tool in Bayes' Theorem (next section).
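The "brute force" approach suggested in the exercise can be sketched by enumerating all 30 ordered draws and counting. This is an illustrative sketch under the assumption that every ordered pair of distinct balls is equally likely; the helper names (`prob`, `first_red`, `second_red`) are not from the notes.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "R3", "R4", "G1", "G2"]

# Sampling without replacement: all ordered pairs of distinct balls (6 * 5 = 30).
outcomes = list(permutations(balls, 2))


def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


def first_red(outcome):
    return outcome[0].startswith("R")


def second_red(outcome):
    return outcome[1].startswith("R")


p_a = prob(first_red)                                     # P(A)
p_b = prob(second_red)                                    # P(B)
p_a_and_b = prob(lambda o: first_red(o) and second_red(o))  # P(A ∩ B)

# Law of Total Probability, using the branch values from the tree diagram.
p_b_total = Fraction(3, 5) * Fraction(4, 6) + Fraction(4, 5) * Fraction(2, 6)

print(len(outcomes), p_a, p_b, p_a_and_b, p_b_total)
```

Using `Fraction` keeps every value exact, so the enumerated P(B) and the tree-diagram P(B) can be compared for equality rather than approximately.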

Suppose event C = coffee drinker.

                           Disease Status
Coffee Drinker    Lung cancer (A)    No lung cancer (A^c)    Total
Yes (C)                0.06                 0.34              0.40
No (C^c)               0.09                 0.51              0.60
Total                  0.15                 0.85              1.00

Probabilities: P(A) = 0.15, P(C) = 0.40, P(A ∩ C) = 0.06

Therefore, P(A | C) = P(A ∩ C) / P(C) = 0.06 / 0.40 = 0.15 = P(A), i.e., the occurrence of event C gives no information about the probability of event A.

Definition: Two events A and B are said to be statistically independent if either:
(1) P(A | B) = P(A), i.e., P(B | A) = P(B), or equivalently,
(2) P(A ∩ B) = P(A) P(B).

Exercise: Prove that if events B and C are statistically independent, then so are each of the following:
B and "Not C"
"Not B" and C
"Not B" and "Not C"
Hint: Let P(B) = b, P(C) = c, and construct a 2 × 2 probability table.

Summary
A, B disjoint: If either event occurs, then the other cannot occur: P(A ∩ B) = 0.
A, B independent: If either event occurs, this gives no information about the other: P(A ∩ B) = P(A) P(B).

Example: From a standard 52-card deck, A = "Select a 2" and B = "Select a card of one particular suit" are not disjoint events, because A ∩ B consists of the 2 of that suit. However, P(A ∩ B) = 1/52 = 1/13 × 1/4 = P(A) P(B); hence they are independent events.

Can two disjoint events ever be independent? Why?
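Criterion (2) of the definition is easy to test numerically. The sketch below applies it to the coffee table and to the card example; the choice of "spades" as the suit is an arbitrary assumption (any single suit gives the same result), and the variable names are illustrative.

```python
import math
from fractions import Fraction
from itertools import product

# Coffee example: values read from the table.
p_a, p_c, p_a_and_c = 0.15, 0.40, 0.06
# Compare with a tolerance because these are floating-point values.
coffee_independent = math.isclose(p_a_and_c, p_a * p_c)

# Card example: build a 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))


def prob(event):
    """Probability of drawing a card that satisfies `event` from the deck."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


p_two = prob(lambda card: card[0] == "2")
p_suit = prob(lambda card: card[1] == "spades")  # assumed suit, chosen arbitrarily
p_both = prob(lambda card: card == ("2", "spades"))

cards_independent = (p_both == p_two * p_suit)   # exact comparison with Fractions

print(coffee_independent, cards_independent, p_both)
```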

A VERY IMPORTANT AND USEFUL FACT: It can be shown that for any event A, all of the elementary properties of probability P(A) covered in the notes extend to conditional probability P(A | B), for any other event B.

For example, since we know that

    P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2)

for any two events A1 and A2, it is also true that

    P(A1 ∪ A2 | B) = P(A1 | B) + P(A2 | B) − P(A1 ∩ A2 | B)

for any other event B.

As another example, since we know that P(A^c) = 1 − P(A), it therefore also follows that P(A^c | B) = 1 − P(A | B).

Exercise: Prove these two statements. (Hint: Sketch a Venn diagram.)

HOWEVER, there is one important exception! We know that if A and B are two independent events, then P(A ∩ B) = P(A) P(B). But this does not extend to conditional probabilities! In particular, if C is any other event, then P(A ∩ B | C) ≠ P(A | C) P(B | C) in general. The following example illustrates this, for three events A, B, and C:

[Venn diagram of events A, B, and C; the eight region probabilities shown are .20, .20, .20, .05, .05, .05, .10, and .15.]

Exercise: Confirm that P(A ∩ B) = P(A) P(B), but P(A ∩ B | C) ≠ P(A | C) P(B | C).

In other words, two events that may be independent in a general population may not necessarily be independent in a particular subgroup of that population.
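The exception can be illustrated with a small example that is not the one in the Venn diagram: two fair coin flips, with the hypothetical events A = "first flip is heads," B = "second flip is heads," and C = "at least one head." The sketch below assumes all four outcomes are equally likely; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin flips: HH, HT, TH, TT (equally likely).
outcomes = list(product("HT", repeat=2))


def prob(event, given=lambda outcome: True):
    """Conditional probability P(event | given) by counting outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))


def a(outcome):        # hypothetical event A: first flip is heads
    return outcome[0] == "H"


def b(outcome):        # hypothetical event B: second flip is heads
    return outcome[1] == "H"


def c(outcome):        # hypothetical event C: at least one head
    return "H" in outcome


def a_and_b(outcome):
    return a(outcome) and b(outcome)


# Unconditionally, A and B are independent.
unconditional_ok = prob(a_and_b) == prob(a) * prob(b)

# Within the subgroup C, the product rule fails.
lhs = prob(a_and_b, given=c)               # P(A ∩ B | C)
rhs = prob(a, given=c) * prob(b, given=c)  # P(A | C) P(B | C)

print(unconditional_ok, lhs, rhs)
```

Restricting attention to C removes the outcome TT, which is enough to break the product rule even though the two flips themselves do not influence each other.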

More on Conditional Probability and Independent Events

Another example from epidemiology

[Figure: two stylized Venn diagrams. First: S = POPULATION, A = lung cancer, B = obese. Second: S = POPULATION, A = lung cancer, C = smoker.]

Suppose that, in a certain study population, we wish to investigate the prevalence of lung cancer (A), and its associations with obesity (B) and cigarette smoking (C), respectively. From the first of the two stylized Venn diagrams above, by comparing the scales drawn, observe that the proportion of the size of the intersection A ∩ B (green) relative to event B (blue + green) is about equal to the proportion of the size of event A (yellow + green) relative to the entire population S. That is,

    P(A ∩ B) / P(B) = P(A) / P(S).

(As an exercise, verify this equality for the following probabilities: yellow = .09, green = .07, blue = .37, white = .47, to two decimals, before reading on.) In other words, the probability that a randomly chosen person from the obese subpopulation has lung cancer is equal to the probability that a randomly chosen person from the general population has lung cancer (.16). This equation can be equivalently expressed as P(A | B) = P(A), since the left side is conditional probability by definition, and P(S) = 1 in the denominator of the right side. In this form, the equation clearly conveys the interpretation that knowledge of event B (obesity) yields no information about event A (lung cancer). In this example, lung cancer is equally probable (.16) among the obese as it is among the general population, so knowing that a person is obese is completely unrevealing with respect to having lung cancer. Events A and B that are related in this way are said to be independent. Note that they are not disjoint!

In the second diagram, however, the relative size of A ∩ C (orange) to C (red + orange) is larger than the relative size of A (yellow + orange) to the whole population S, so P(A | C) ≠ P(A), i.e., events A and C are dependent.
Here, as is true in general, the probability of lung cancer is indeed influenced by whether a person is randomly selected from among the general population or the smoking subset, where it is much higher. Statistically, lung cancer would be a rare disease in the U.S., if not for cigarettes (although it is on the rise among nonsmokers).
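The equality suggested as an exercise for the first Venn diagram can be verified in a few lines. This is a minimal sketch; the mapping of colors to regions follows the description in the text (yellow = A only, green = A ∩ B, blue = B only, white = neither), and the variable names are assumptions.

```python
# Region probabilities for the lung cancer (A) / obesity (B) diagram.
yellow = 0.09  # A only
green = 0.07   # A ∩ B
blue = 0.37    # B only
white = 0.47   # neither A nor B

p_s = yellow + green + blue + white  # P(S), the whole population
p_a = yellow + green                 # P(A)
p_b = blue + green                   # P(B)
p_a_given_b = green / p_b            # P(A | B) = P(A ∩ B) / P(B)

# Compare to two decimals, as the exercise asks.
equal_to_two_decimals = round(p_a_given_b, 2) == round(p_a / p_s, 2)

print(round(p_a_given_b, 4), round(p_a / p_s, 4), equal_to_two_decimals)
```

The two values agree only after rounding, which matches the text's wording that the proportions are "about equal."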

Application: Are Blood Antibodies Independent?

An example of conditional probability in human genetics
(Adapted from Rick Chappell, Ph.D., UW Dept. of Biostatistics & Medical Informatics)

Background: The surfaces of human red blood cells ("erythrocytes") are coated with antigens that are classified into four disjoint blood types: O, A, B, and AB. Each type is associated with blood serum antibodies for the other types, that is,

Type O blood contains both A and B antibodies. (This makes Type O the "universal donor," but capable of receiving only Type O.)
Type A blood contains only B antibodies.
Type B blood contains only A antibodies.
Type AB blood contains neither A nor B antibodies. (This makes Type AB the "universal recipient," but capable of donating only to Type AB.)

In addition, blood is also classified according to the presence (+) or absence (−) of Rh factor (found predominantly in rhesus monkeys, and to varying degree in human populations; it is important in obstetrics). Hence there are eight distinct blood groups corresponding to this joint classification system: O+, O−, A+, A−, B+, B−, AB+, AB−. According to the American Red Cross, the U.S. population has the following blood group relative frequencies:

                  Rh factor
Blood Type      +        −        Totals
O              .384     .077      .461
A              .323     .065      .388
B              .094     .017      .111
AB             .032     .007      .039
Totals         .833     .166      .999

From these values (and from the background information above), we can calculate the following probabilities:

P(A antibodies) = P(Type O or B) = P(O) + P(B) = .461 + .111 = .572

P(B antibodies) = P(Type O or A) = P(O) + P(A) = .461 + .388 = .849

P(B antibodies and Rh+) = P(Type O+ or A+) = P(O+) + P(A+) = .384 + .323 = .707
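The three probabilities above can be recomputed directly from the eight blood group frequencies. The following is a sketch under stated assumptions: the dictionary layout, the `antibodies` mapping (which encodes the background rules listed above), and the helper `prob` are illustrative choices rather than anything given in the notes.

```python
# Relative frequencies of the eight blood groups, keyed by (type, Rh factor).
freq = {
    ("O", "+"): 0.384, ("O", "-"): 0.077,
    ("A", "+"): 0.323, ("A", "-"): 0.065,
    ("B", "+"): 0.094, ("B", "-"): 0.017,
    ("AB", "+"): 0.032, ("AB", "-"): 0.007,
}

# Which serum antibodies each blood type carries (from the background section).
antibodies = {"O": {"A", "B"}, "A": {"B"}, "B": {"A"}, "AB": set()}


def prob(condition):
    """Sum the frequencies of all blood groups that satisfy `condition`."""
    return sum(p for (blood_type, rh), p in freq.items() if condition(blood_type, rh))


p_a_antibodies = prob(lambda t, rh: "A" in antibodies[t])
p_b_antibodies = prob(lambda t, rh: "B" in antibodies[t])
p_b_antibodies_and_rh_pos = prob(lambda t, rh: "B" in antibodies[t] and rh == "+")
p_rh_pos = prob(lambda t, rh: rh == "+")
total = sum(freq.values())  # .999 rather than 1, because the table is rounded

print(round(p_a_antibodies, 3), round(p_b_antibodies, 3),
      round(p_b_antibodies_and_rh_pos, 3), round(p_rh_pos, 3), round(total, 3))
```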

Using these calculations, we can answer the following.

Question: Is having A antibodies independent of having B antibodies?

Solution: We must check whether or not

    P(A and B antibodies) = P(A antibodies) × P(B antibodies),

i.e., whether P(Type O) = .572 × .849, or .461 = .486. The two values are close but not equal. This indicates near independence of the two events; there does exist a slight dependence. The dependence would be much stronger if America were composed of two disjoint (i.e., non-interbreeding) groups: Type A (with B antibodies only) and Type B (with A antibodies only), and no Type O (with both A and B antibodies). Since this is evidently not the case, the implication is that either these traits evolved before humans spread out geographically, or they evolved later but the populations became mixed in America.

Question: Is having B antibodies independent of Rh+?

Solution: We must check whether or not

    P(B antibodies and Rh+) = P(B antibodies) × P(Rh+),

that is, .707 = .849 × .833, which is true to three decimal places, so we have exact independence of these events (up to the rounding in the table). These traits probably predate diversification in humans (and were not differentially selected for since).

Exercises:
Is having A antibodies independent of Rh+?
Find P(A antibodies | B antibodies) and P(B antibodies | A antibodies). Conclusions?
Is Blood Type independent of Rh factor? (Do a separate calculation for each blood type: O, A, B, AB, and each Rh factor: +, −.)
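The two independence checks above amount to comparing a joint probability with a product of marginals. A minimal sketch follows, taking the rounded probabilities from the text as inputs; the helper name `independence_gap` is an assumption, and because the table is rounded to three decimals, a gap on the order of .0005 or less is treated here as indistinguishable from zero.

```python
def independence_gap(p_joint, p_x, p_y):
    """Return P(X and Y) - P(X) P(Y); zero means exact independence."""
    return p_joint - p_x * p_y


# Values taken from the text (rounded to three decimals).
p_a_antibodies = 0.572
p_b_antibodies = 0.849
p_type_o = 0.461                   # P(A and B antibodies) = P(Type O)
p_rh_pos = 0.833
p_b_antibodies_and_rh_pos = 0.707

# A antibodies vs. B antibodies: a small but noticeable gap.
gap_antibodies = independence_gap(p_type_o, p_a_antibodies, p_b_antibodies)

# B antibodies vs. Rh+: gap smaller than the rounding error of the table.
gap_rh = independence_gap(p_b_antibodies_and_rh_pos, p_b_antibodies, p_rh_pos)

print(round(gap_antibodies, 4), round(gap_rh, 4))
```

The same helper can be reused for the exercises by substituting the relevant joint and marginal probabilities.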