Bayes Theorem

\[ P(A \mid C) = \frac{P(C \mid A)\,P(A)}{P(C \mid A)\,P(A) + P(C \mid B)\,P(B)} \]
\[ P(B \mid E) = \frac{P(E \mid B)\,P(B)}{P(E \mid B)\,P(B) + P(E \mid A)\,P(A)} \]
\[ P(A \mid D) = \frac{P(D \mid A)\,P(A)}{P(D \mid A)\,P(A) + P(D \mid B)\,P(B)} \]

Cost of procedure is $1,000,000

Data regarding accuracy of the procedure:
Prob (+ test result | patient has diabetes) = .90
Prob (+ test result | patient has leukemia) = .95
Prob (+ test result | patient has neither) = .07

We also know that in the general population:
Prob (Patient has diabetes) = .02
Prob (Patient has leukemia) = .01
Prob (Patient has neither) = .97

Should we purchase the procedure?
What probabilities do we really need to know?
Prob (patient has diabetes | + test result) = ???
Prob (patient has leukemia | + test result) = ???
Prob (patient has neither | + test result) = ???

How can we find these probabilities?

\[ P(\text{diabetes} \mid +) = \frac{P(+ \mid \text{diabetes})\,P(\text{diabetes})}{P(+ \mid \text{diabetes})\,P(\text{diabetes}) + P(+ \mid \text{leukemia})\,P(\text{leukemia}) + P(+ \mid \text{neither})\,P(\text{neither})} \]

\[ P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid L)\,P(L) + P(+ \mid N)\,P(N)} = \frac{(.90)(.02)}{(.90)(.02) + (.95)(.01) + (.07)(.97)} = .19 \]
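\[ P(L \mid +) = \frac{P(+ \mid L)\,P(L)}{P(+ \mid L)\,P(L) + P(+ \mid D)\,P(D) + P(+ \mid N)\,P(N)} = \frac{(.95)(.01)}{(.95)(.01) + (.90)(.02) + (.07)(.97)} = .10 \]

\[ P(N \mid +) = \frac{P(+ \mid N)\,P(N)}{P(+ \mid N)\,P(N) + P(+ \mid L)\,P(L) + P(+ \mid D)\,P(D)} = \frac{(.07)(.97)}{(.07)(.97) + (.95)(.01) + (.90)(.02)} = .71 \]

To demonstrate that these probabilities are correct, look at a random sample of 100,000 people:

Group    Total popul.   P(+ | group)   + reading   - reading
D             2,000         90%           1,800         200
L             1,000         95%             950          50
N            97,000          7%           6,790      90,210
Total       100,000                       9,540      90,460

Of the 9,540 positive readings, 1,800/9,540 = .19 have diabetes, 950/9,540 = .10 have leukemia, and 6,790/9,540 = .71 have neither.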
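The posterior calculation and the 100,000-person check can be reproduced with a short script. This is a minimal sketch: the probabilities are the ones given on the slides, while the dictionary keys and variable names are illustrative choices.

```python
# Priors and test accuracies copied from the slides.
# The keys "D", "L", "N" (diabetes, leukemia, neither) are illustrative names.
prior = {"D": 0.02, "L": 0.01, "N": 0.97}
p_pos_given = {"D": 0.90, "L": 0.95, "N": 0.07}

# Bayes' theorem: posterior = likelihood * prior / total probability of a + reading.
p_pos = sum(p_pos_given[g] * prior[g] for g in prior)
posterior = {g: p_pos_given[g] * prior[g] / p_pos for g in prior}

# Frequency check on a sample of 100,000 people, as in the table.
sample = 100_000
pos_counts = {g: sample * prior[g] * p_pos_given[g] for g in prior}
total_pos = sum(pos_counts.values())
by_counts = {g: pos_counts[g] / total_pos for g in prior}

for g in prior:
    print(g, round(posterior[g], 2), round(by_counts[g], 2))
```

Both routes give the same answer because the counts are just the probabilities scaled by the sample size.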
Remember that:
Prob (+ test result | patient has diabetes) = .90
Prob (+ test result | patient has leukemia) = .95
Prob (+ test result | patient has neither) = .07
and
Prob (Patient has diabetes) = .02
Prob (Patient has leukemia) = .01
Prob (Patient has neither) = .97

May also want to be concerned about false negatives:
Prob (Patient has diabetes | - ) = 200/90,460 = .0022
Prob (Patient has leukemia | - ) = 50/90,460 = .0006
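The false-negative probabilities follow from the same Bayes calculation, applied to a negative reading. The sketch below assumes a negative reading is simply the complement of a positive one, so P(- | group) = 1 - P(+ | group); the variable names are illustrative.

```python
# Priors and positive-reading rates from the slides.
prior = {"D": 0.02, "L": 0.01, "N": 0.97}
p_pos_given = {"D": 0.90, "L": 0.95, "N": 0.07}

# Assumption: every test is either + or -, so P(- | group) = 1 - P(+ | group).
p_neg_given = {g: 1 - p_pos_given[g] for g in prior}

# Total probability of a negative reading, then Bayes' theorem.
p_neg = sum(p_neg_given[g] * prior[g] for g in prior)
post_neg = {g: p_neg_given[g] * prior[g] / p_neg for g in prior}

print("P(diabetes | -) =", round(post_neg["D"], 4))
print("P(leukemia | -) =", round(post_neg["L"], 4))
```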
Bayes Theorem - Practice problem

Product Manager Problem (p. 21)

There are two types of probabilities:
Actual: P(superior product)
Survey results: P(survey says superior product)

Part a) First survey is negative
Find P(superior product | survey says not superior)

Part b) Second survey is negative
Find P(superior product | survey says not superior)
Use the results from (a) as the starting point in (b): the conditional probabilities found in (a) become the new marginal probabilities.

Quick Review of Discrete and Continuous Distributions

Random Variable
  Discrete: Discrete Probability Distribution (Binomial)
  Continuous: Continuous Probability Distribution (Normal, and Normal Approximation to the Binomial)
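The idea of reusing the posterior from part (a) as the prior in part (b) can be sketched as follows. The numbers here are hypothetical placeholders, not the values from the textbook problem on p. 21, and the sketch assumes the two surveys are independent given the true product quality.

```python
def update(prior_superior, p_neg_if_superior, p_neg_if_not):
    """Return P(superior | survey says not superior) by Bayes' theorem."""
    num = p_neg_if_superior * prior_superior
    den = num + p_neg_if_not * (1 - prior_superior)
    return num / den

# HYPOTHETICAL inputs, chosen only for illustration.
prior = 0.50              # P(superior product) before any survey
p_neg_if_superior = 0.20  # P(survey says not superior | superior)
p_neg_if_not = 0.70       # P(survey says not superior | not superior)

# Part (a): first negative survey.
after_a = update(prior, p_neg_if_superior, p_neg_if_not)
# Part (b): the posterior from (a) becomes the new prior.
after_b = update(after_a, p_neg_if_superior, p_neg_if_not)

print(round(after_a, 4), round(after_b, 4))
```

Under the independence assumption, updating twice gives the same result as a single update on both negative surveys at once.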
Random Variable
A variable that takes on a set of unique numerical values, which represent every possible outcome of a random process.
Discrete -- values are usually integers (a finite or countable number of values)
Continuous -- values are real numbers (an infinite number of values within an interval)

Probability Distribution: Discrete
A collection of probabilities that describes the frequency with which the random variable takes on each of its possible values. Called the probability mass function.
1. 0 <= P(x) <= 1
2. \( \sum_i P(x_i) = 1 \)

Cumulative distribution function
The probability that x takes on values <= X:
\[ F(X) = P(x \le X) = \sum_{x_i \le X} P(x_i) \]
1. 0 <= F(x) <= 1
2. F(a) <= F(b) if a < b
Discrete Prob. Distribution example

There are 20 students in a class, where the ages are distributed as follows:

Age      Frequency   P(x)    F(x)
18           5        .25     .25
19           6        .30     .55
20           4        .20     .75
21           4        .20     .95
24           1        .05    1.00
Total       20       1.00

Find
1. P(x = 19) = .30
2. P(x <= 21) = .95
3. P(19 < x <= 22) = P(20) + P(21) = .40
4. The mean age of students in the class:
\[ \mu = \sum_i x_i P(x_i) = 19.6 \]
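The class-age table can be rebuilt from the raw frequencies. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Ages and frequencies from the table.
ages = [18, 19, 20, 21, 24]
freq = [5, 6, 4, 4, 1]

n = sum(freq)                   # 20 students
pmf = [f / n for f in freq]     # P(x): relative frequencies
cdf = list(accumulate(pmf))     # F(x): running total of P(x)

# The four questions from the slide.
p_eq_19 = pmf[ages.index(19)]
p_le_21 = sum(p for a, p in zip(ages, pmf) if a <= 21)
p_19_to_22 = sum(p for a, p in zip(ages, pmf) if 19 < a <= 22)
mean_age = sum(a * p for a, p in zip(ages, pmf))

print(p_eq_19, round(p_le_21, 2), round(p_19_to_22, 2), round(mean_age, 1))
```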
Discrete Prob. Distribution example

To find variance:
\[ \sigma^2 = \sum_i (x_i - \mu)^2 P(x_i) = E(x_i^2) - \mu^2 \]
where
\[ E(x_i^2) = \sum_i x_i^2 P(x_i) \]
(the second form is easier to compute).

1. From the definition:
\[ \sigma^2 = .25(18-19.6)^2 + .3(19-19.6)^2 + .2(20-19.6)^2 + .2(21-19.6)^2 + .05(24-19.6)^2 = 2.14 \]

2. From the shortcut:
\[ E(x_i^2) = 18^2(.25) + 19^2(.3) + 20^2(.2) + 21^2(.2) + 24^2(.05) = 386.3 \]
\[ \sigma^2 = E(x_i^2) - \mu^2 = 386.3 - 384.16 = 2.14 \]

To find standard deviation:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{2.14} = 1.46 \]

Summary:
\[ \mu = \sum_i x_i P(x_i) = 19.6, \qquad \sigma^2 = E(x_i^2) - \mu^2 = 2.14, \qquad \sigma = \sqrt{\sigma^2} = 1.46 \]
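Both variance formulas can be checked numerically against each other. The sketch below uses the ages and probabilities from the class example; the variable names are illustrative.

```python
import math

# Ages and probabilities from the class example.
ages = [18, 19, 20, 21, 24]
pmf = [0.25, 0.30, 0.20, 0.20, 0.05]

mu = sum(x * p for x, p in zip(ages, pmf))

# Definition: probability-weighted squared deviations from the mean.
var_definition = sum((x - mu) ** 2 * p for x, p in zip(ages, pmf))

# Shortcut: E(x^2) - mu^2.
e_x2 = sum(x ** 2 * p for x, p in zip(ages, pmf))
var_shortcut = e_x2 - mu ** 2

sigma = math.sqrt(var_definition)

print(round(var_definition, 2), round(var_shortcut, 2), round(sigma, 2))
```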
Binomial distribution
1) Only 2 possible outcomes, success (S) or failure (F)
2) There are n independent trials
3) Prob. of S or F remains constant from trial to trial
4) Random variable represents # of successes in n trials

r = # of successes
n = # of trials
p = probability of success
\[ P(x = r) = C^n_r \, p^r (1-p)^{n-r}, \qquad \text{where } C^n_r = \frac{n!}{r!\,(n-r)!} \]
\[ \mu = np, \qquad \sigma^2 = np(1-p) \]

Binomial distribution: Quiz Problem example
r = # of successes
n = # of trials = 9
p = probability of success = 0.4
To find the prob. of getting 3 correct:
\[ P(x = 3) = C^9_3 \, (0.4)^3 (1-0.4)^{9-3} \]
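The probability formula for the number of successes translates directly into code. This is a minimal sketch using `math.comb` from the standard library for the combinations term; the function name `binom_pmf` is an illustrative choice.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(x = r): probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Quiz example: n = 9 questions, p = 0.4 chance of a correct answer.
n, p = 9, 0.4
p_three_correct = binom_pmf(3, n, p)

# Mean and variance from the closed-form expressions.
mu = n * p
variance = n * p * (1 - p)

print(round(p_three_correct, 4), mu, round(variance, 2))
```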
Binomial distribution: Quiz Problem example

Use Table C from textbook: look under n = 9 and prob. = 0.4. Each entry is the prob. that x >= r.

r    p = 0.40
1     .9899     prob. of x >= 1
2     .9295     prob. of x >= 2
3     .7682     prob. of x >= 3
4     .5174     prob. of x >= 4
5     .2666     prob. of x >= 5
6     .0994     prob. of x >= 6
7     .0250     prob. of x >= 7
8     .0038     prob. of x >= 8

1. P(x = 3) = P(x >= 3) - P(x >= 4) = .7682 - .5174 = .2508
2. P(x < 5) = 1 - P(x >= 5) = 1 - .2666 = .7334
3. P(3 < x < 8) = P(x >= 4) - P(x >= 8) = .5174 - .0038 = .5136
4. µ = np = 9(.4) = 3.6
5. Most likely = value with highest probability: P(x = 3) = .2508 and P(x = 4) = .5174 - .2666 = .2508, so 3 and 4 are equally likely.
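A cumulative table of this kind can be generated rather than looked up. The sketch below assumes each table entry is the upper-tail probability P(x >= r) for n = 9 and p = 0.4, and it answers the quiz questions by differencing those tails; the function and variable names are illustrative.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(x = r) for a binomial random variable."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def upper_tail(r, n, p):
    """P(x >= r): sum the probabilities of r, r+1, ..., n successes."""
    return sum(binom_pmf(k, n, p) for k in range(r, n + 1))

n, p = 9, 0.4
table = {r: upper_tail(r, n, p) for r in range(1, n + 1)}

p_eq_3 = table[3] - table[4]      # P(x = 3)
p_lt_5 = 1 - table[5]             # P(x < 5)
p_4_to_7 = table[4] - table[8]    # P(3 < x < 8)

for r, prob in table.items():
    print(r, f"{prob:.4f}")
print(f"{p_eq_3:.4f} {p_lt_5:.4f} {p_4_to_7:.4f}")
```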
Normal Distribution (Continuous)

Infinite number of values for x within an interval; probabilities are assigned to ranges of values:
P(a <= x <= b) = F(b) - F(a)   for b > a

Normal Distribution: N(µ, σ)
\[ Z = \frac{x - \mu}{\sigma} \]
[Figure: standard normal curve, Z axis from -4 to 4]

Production process fills 46-oz. fruit juice cans
Juice filling cans is N(46.5 oz., 0.2 oz.)

a) State law requires that no more than 1 in 20 cans have less than 46 oz. Does this process meet the state law?
Is P(x < 46) <= 1/20 = .05 ?
\[ Z = \frac{x - \mu}{\sigma} = \frac{46 - 46.5}{.2} = -2.5 \]
Look up Z = -2.5 in the Normal Table:
P(x < 46) = P(Z < -2.5) = 1 - .9938 = .0062 < .05
Yes, process meets law.
Normal Distribution (Continuous)

Production process fills 46-oz. fruit juice cans
Juice filling cans is N(46.5 oz., 0.2 oz.)
b) P(46.3 <= x <= 46.6) = ???
c) P(x >= 46.8) = ???

b) Convert both endpoints to Z values:
\[ Z = \frac{x - \mu}{\sigma} = \frac{46.3 - 46.5}{.2} = -1.0 \]
\[ Z = \frac{x - \mu}{\sigma} = \frac{46.6 - 46.5}{.2} = .5 \]
P(46.3 <= x <= 46.6) = P(-1.0 <= Z <= .5) = .6915 - (1 - .8413) = .5328
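The table lookups for the juice-filling process can be replaced with a direct calculation. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (3.8+); the variable names are illustrative, and the results differ from the table values only by rounding.

```python
from statistics import NormalDist

# Fill volume is N(46.5 oz., 0.2 oz.), as stated on the slide.
fill = NormalDist(mu=46.5, sigma=0.2)

# a) Share of cans below 46 oz., compared against the 1-in-20 limit.
p_under_46 = fill.cdf(46.0)
meets_law = p_under_46 <= 1 / 20

# b) P(46.3 <= x <= 46.6) = F(46.6) - F(46.3).
p_between = fill.cdf(46.6) - fill.cdf(46.3)

print(round(p_under_46, 4), meets_law, round(p_between, 4))
```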
Normal Distribution (Continuous)

Production process fills 46-oz. fruit juice cans
Juice filling cans is N(46.5 oz., 0.2 oz.)
c) P(x >= 46.8) = ???
\[ Z = \frac{x - \mu}{\sigma} = \frac{46.8 - 46.5}{.2} = 1.5 \]
P(Z >= 1.5) = 1 - .9332 = .0668

Normal Approximation to the Binomial

This applies when you have a binomial setting but with a very large n.
Binomial Distribution: B(n, p)
Normal Distribution: N(µ, σ)
Normal Approximation:
\[ \mu = np, \qquad \sigma^2 = np(1-p), \qquad \sigma = \sqrt{np(1-p)} \]
[Figure: standard normal curve, Z axis from -4 to 4]

Out of graduating seniors, 80% have jobs by graduation. Take a sample of 200.
a) What is the likelihood that at least 125 have jobs?
b) That between 170 and 185 have jobs?
This is a binomial setting (S vs. F) but with a very large n, so use the Normal Approx. to the Binomial.
Normal Approximation to the Binomial

n = 200, p = .80
µ = np = 200(.8) = 160
σ² = np(1-p) = 200(.8)(.2) = 32
\[ \sigma = \sqrt{np(1-p)} = \sqrt{32} = 5.66 \]

a) P(x >= 125) = P(Z >= -6.27) = 1.0
\[ Z = \frac{x - \mu}{\sigma} = \frac{124.5 - 160}{5.66} = -6.27 \]
The value 124.5 is used instead of 125 as a continuity correction: the discrete count 125 covers the interval from 124.5 to 125.5 on the continuous scale.
Normal Approximation to the Binomial

n = 200, p = .80
µ = np = 200(.8) = 160
σ² = np(1-p) = 200(.8)(.2) = 32
\[ \sigma = \sqrt{np(1-p)} = \sqrt{32} = 5.66 \]

b) P(170 <= x <= 185) = P(1.68 <= Z <= 4.5)
\[ Z = \frac{169.5 - 160}{5.66} = 1.68, \qquad Z = \frac{185.5 - 160}{5.66} = 4.5 \]
P(170 <= x <= 185) = P(1.68 <= Z <= 4.5) = 1.0 - .95352 = .04648
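The normal approximation for the jobs example can be compared with the exact binomial sum. The sketch below assumes the half-unit continuity correction on each endpoint (124.5, 169.5, 185.5) and uses only the standard library; the names are illustrative, and the approximate values differ slightly from the table-based ones because Z is not rounded.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.80
mu = n * p                       # 160
sigma = sqrt(n * p * (1 - p))    # sqrt(32), about 5.66
approx = NormalDist(mu, sigma)

# a) At least 125 have jobs: area to the right of 124.5.
p_at_least_125 = 1 - approx.cdf(124.5)

# b) Between 170 and 185 have jobs: area from 169.5 to 185.5.
p_170_to_185 = approx.cdf(185.5) - approx.cdf(169.5)

# Exact binomial probability for part (b), for comparison.
def binom_pmf(r):
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

exact_170_to_185 = sum(binom_pmf(r) for r in range(170, 186))

print(round(p_at_least_125, 4), round(p_170_to_185, 4), round(exact_170_to_185, 4))
```

The exact sum is cheap to compute here, so the approximation mainly serves as a check on the table-based method.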