Chapter 5. Discrete Probability Distributions

Chapter Problem: Do Mendel's results from plant hybridization experiments contradict his theory?

1. Mendel's theory says that when there are two inheritable traits, one of them will be dominant and the other will be recessive.
2. Experiment using pea plants: green is dominant, yellow is recessive.
3. P(green pod) = 3/4 = 0.75.
4. Of 580 offspring, 428 had green pods: 428/580 = 0.738.

Gene from Parent 1     Gene from Parent 2     Offspring Genes     Color of Offspring Pod
green               +  green                  green/green         green
green               +  yellow                 green/yellow        green
yellow              +  green                  yellow/green        green
yellow              +  yellow                 yellow/yellow       yellow
5.1 Review and Preview

This chapter combines the descriptive statistics presented in Chapters 2 and 3 with the probability methods presented in Chapter 4.

Chapters 2 and 3: Collect sample data, then get statistics and graphs.

x     f
1     8
2     10
3     11
4     12
5     13
6     14

$\bar{x} = 3.6$, $s = 1.7$

Chapter 4: Find the probability for each outcome.

P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6

Chapter 5: Create a theoretical model describing how the experiment is expected to behave, then get its parameters.

x     P(x)
1     1/6
2     1/6
3     1/6
4     1/6
5     1/6
6     1/6

$\mu = 3.5$, $\sigma = 1.7$

Figure 5-1 Combining Descriptive Methods and Probability to Form a Theoretical Model of Behavior
5.2 Random Variables

1. Related Concepts
2. Graphs
3. Mean, Variance, and Standard Deviation
4. Rationale for Formulas 5-1 through 5-4
5. Identifying Unusual Results with the Range Rule of Thumb
6. Identifying Unusual Results with Probabilities
7. Expected Value
1. Related Concepts

Definition. A random variable is a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure.

A probability distribution is a description that gives the probability for each value of the random variable. It is often expressed in the format of a graph, table, or formula.
e.g.1 Genetics. Consider the offspring of peas from parents both having the green/yellow combination of pod genes. Under these conditions, the probability that an offspring has a green pod is 3/4, or 0.75. If 5 such offspring are obtained and we let

x = number of peas with green pods among the 5 offspring peas,

then x is a random variable. Table 5-1 is a probability distribution. (Section 5-3 explains how the values of P(x) are obtained.)

x (Number of Peas with Green Pods)     P(x)
0                                      0.001
1                                      0.015
2                                      0.088
3                                      0.264
4                                      0.396
5                                      0.237

Table 5-1 Probability Distribution: Probabilities of Numbers of Peas with Green Pods Among 5 Offspring Peas
Definition. A discrete random variable has either a finite number of values or a countable number of values.

A continuous random variable has infinitely many values, and those values can be associated with measurements on a continuous scale without gaps or interruptions (for example, a voltmeter reading).
e.g.2 Classify each variable as discrete or continuous.

1) x = the number of eggs that a hen lays in a day: Discrete
2) x = the number of statistics students present in class on a given day: Discrete
3) x = the amount of milk a cow produces: Continuous
4) x = the measure of voltage for a particular smoke detector: Continuous
2. Graphs

Probability Histogram.

[Figure 5-3 Probability Histogram. Vertical axis: Probability (0 to 0.4). Horizontal axis: Number of Peas with Green Pods Among 5 (0 to 5).]
Requirements for a Probability Distribution

1. $\sum P(x) = 1$, where x assumes all possible values (the sum of all probabilities must be 1).
2. $0 \le P(x) \le 1$ for every individual value of x (each probability value must be between 0 and 1 inclusive).

Example 3. Is Table 5-2 a probability distribution?

x        0       1       2       3
P(x)     0.19    0.26    0.33    0.13

Table 5-2 Cell Phones per Household

No. The probabilities sum to 0.19 + 0.26 + 0.33 + 0.13 = 0.91, not 1.
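The two requirements can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name `is_probability_distribution` and the tolerance `tol` are illustrative assumptions. It applies both requirements to Table 5-2 and to the fair-die model from Figure 5-1.

```python
import math

def is_probability_distribution(probs, tol=1e-9):
    """Return True if every value lies in [0, 1] and the values sum to 1.

    `tol` is an assumed tolerance that absorbs floating-point error only.
    """
    in_range = all(0.0 <= p <= 1.0 for p in probs)           # requirement 2
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)  # requirement 1
    return in_range and sums_to_one

# Table 5-2 (cell phones per household): the probabilities sum to 0.91
table_5_2 = [0.19, 0.26, 0.33, 0.13]
print(is_probability_distribution(table_5_2))

# Fair die from Figure 5-1: six outcomes, each with probability 1/6
fair_die = [1 / 6] * 6
print(is_probability_distribution(fair_die))
```

Note that a table whose entries were rounded for display (such as Table 5-1, whose printed values sum to 1.001) would fail a strict check like this one, so a looser tolerance may be appropriate for rounded tables.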
Example 4. P(x) = x/10, where x = 0, 1, 2, 3, 4. Is P a probability distribution?

Yes. Each value lies between 0 and 1, and 0/10 + 1/10 + 2/10 + 3/10 + 4/10 = 1.

3. Mean, Variance, and Standard Deviation

Formula 5-1   $\mu = \sum [x \cdot P(x)]$   Mean for a probability distribution
Formula 5-2   $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)]$   Variance for a probability distribution
Formula 5-3   $\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$   Variance for a probability distribution
Formula 5-4   $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$   Standard deviation for a probability distribution
4. Rationale for Formulas 5-1 through 5-4

Consider the mean, for example. From Formula 3-2 (Section 3-2), we have

$\mu = \dfrac{\sum (f \cdot x)}{N} = \sum \left[ \dfrac{f \cdot x}{N} \right] = \sum \left[ x \cdot \dfrac{f}{N} \right] = \sum [x \cdot P(x)]$

Similarly, one can derive Formula 5-2 (using the formula on page 103 for standard deviation). Formula 5-3 is derived from Formula 5-2, and Formula 5-4 follows directly from Formula 5-3.
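The step from Formula 5-2 to Formula 5-3 is not spelled out on the slide. One way to see it, offered here as a sketch using only Formula 5-1 and the requirement that the probabilities sum to 1, is to expand the square:

```latex
\begin{aligned}
\sigma^2 &= \sum \left[ (x - \mu)^2 \cdot P(x) \right] \\
         &= \sum \left[ x^2 \cdot P(x) \right] - 2\mu \sum \left[ x \cdot P(x) \right] + \mu^2 \sum P(x) \\
         &= \sum \left[ x^2 \cdot P(x) \right] - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
         &= \sum \left[ x^2 \cdot P(x) \right] - \mu^2
\end{aligned}
```

Taking the square root of the last line gives Formula 5-4.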
Round-off Rule for $\mu$, $\sigma$, and $\sigma^2$

Round results by carrying one more decimal place than the number of decimal places used for the random variable x. If the values of x are integers, round $\mu$, $\sigma$, and $\sigma^2$ to one decimal place.
5. Identifying Unusual Results with the Range Rule of Thumb

e.g.5 Continuing with the chapter example.

x       P(x)      x · P(x)     x^2     x^2 · P(x)
0       0.001     0.000        0       0.000
1       0.015     0.015        1       0.015
2       0.088     0.176        4       0.352
3       0.264     0.792        9       2.376
4       0.396     1.584        16      6.336
5       0.237     1.185        25      5.925
Sum               3.752                15.004

Table 5-3 Calculating $\mu$, $\sigma$, and $\sigma^2$ for a Probability Distribution

$\mu = 3.752 \approx 3.8$
$\sigma^2 = 15.004 - 3.752^2 = 0.926496 \approx 0.9$
$\sigma = \sqrt{0.926496} = 0.9625 \approx 1.0$
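The column sums in Table 5-3 can be reproduced with a few lines of code. This is a minimal sketch that applies Formulas 5-1, 5-3, and 5-4 to the probabilities of Table 5-1; the helper name `summarize` is an illustrative assumption.

```python
import math

# Table 5-1: x = number of peas with green pods among 5 offspring
xs = [0, 1, 2, 3, 4, 5]
ps = [0.001, 0.015, 0.088, 0.264, 0.396, 0.237]

def summarize(xs, ps):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in zip(xs, ps))                      # Formula 5-1
    variance = sum(x * x * p for x, p in zip(xs, ps)) - mean ** 2  # Formula 5-3
    return mean, variance, math.sqrt(variance)                     # Formula 5-4

mu, var, sigma = summarize(xs, ps)
print(round(mu, 3), round(var, 6), round(sigma, 4))
```

The printed values should agree with the unrounded results in the table (3.752, 0.926496, and about 0.9625) before the round-off rule is applied.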
e.g.6 Identifying unusual results with the range rule of thumb.

By the range rule of thumb:

Maximum usual value = $\mu + 2\sigma$ = 3.8 + 2(1.0) = 5.8
Minimum usual value = $\mu - 2\sigma$ = 3.8 - 2(1.0) = 1.8

Interpretation. Based on this result, we conclude that for groups of 5 offspring peas, the number of offspring peas with green pods should usually fall between 1.8 and 5.8. If 5 offspring peas are generated as described, it would be unusual to get only 1 with a green pod, because 1 is below the minimum usual value of 1.8.
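The range rule of thumb is simple enough to express directly. The sketch below is illustrative only; the function names `usual_range` and `is_unusual` are assumptions, and the inputs are the rounded values 3.8 and 1.0 used on the slide.

```python
def usual_range(mu, sigma):
    """Return (minimum usual value, maximum usual value) = mu -/+ 2*sigma."""
    return mu - 2 * sigma, mu + 2 * sigma

def is_unusual(x, mu, sigma):
    """A value is unusual under the rule if it falls outside the usual range."""
    low, high = usual_range(mu, sigma)
    return x < low or x > high

low, high = usual_range(3.8, 1.0)
print(low, high)                 # approximately 1.8 and 5.8
print(is_unusual(1, 3.8, 1.0))   # 1 green pod among 5 lies below the minimum
print(is_unusual(4, 3.8, 1.0))   # 4 green pods among 5 lies inside the range
```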
6. Identifying Unusual Results with Probabilities

Rare Event Rule. If, under a given assumption (such as the assumption that a coin is fair), the probability of a particular observed event (such as 992 heads in 1000 tosses of a coin) is extremely small, we conclude that the assumption is probably not correct.
Probabilities can be used to apply the rare event rule as follows.

Using Probabilities to Determine When Results Are Unusual

Unusually high number of successes: x successes among n trials is an unusually high number of successes if P(x or more) ≤ 0.05.*
e.g. 9 successes in 10 trials is unusually high if P(9 or more successes) ≤ 0.05.

Unusually low number of successes: x successes among n trials is an unusually low number of successes if P(x or fewer) ≤ 0.05.*
e.g. 2 successes in 10 trials is unusually low if P(2 or fewer successes) ≤ 0.05.

* The value of 0.05 is commonly used, but it is not absolutely rigid.
Suppose we flip a coin to see whether it favors heads, and 1000 tosses result in 501 heads. This is not evidence that the coin favors heads.

P(exactly 501 heads in 1000 flips) = 0.0252, a very low probability. However, almost any specific count has a low probability when there are 1000 tosses, so this value is not the relevant one. The relevant probability is P(at least 501 heads in 1000 flips) = 0.487, which is high, so 501 heads is not unusually high.
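Both probabilities quoted for the coin example can be checked by counting outcomes, since a fair coin makes all 2^1000 head/tail sequences equally likely. This is a minimal sketch using Python's exact integer arithmetic; the variable names are illustrative.

```python
from math import comb

n = 1000
total_sequences = 2 ** n  # equally likely sequences for a fair coin

# Exactly 501 heads: count the sequences that contain 501 heads
p_exactly_501 = comb(n, 501) / total_sequences

# At least 501 heads: add the counts for 501, 502, ..., 1000 heads
p_at_least_501 = sum(comb(n, k) for k in range(501, n + 1)) / total_sequences

print(round(p_exactly_501, 4), round(p_at_least_501, 3))
```

The contrast between the two values is the point of the slide: "x or more" is compared with 0.05, not "exactly x".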
e.g.7 Identifying Unusual Results with Probabilities. Use probabilities to determine whether 1 is an unusually low number of peas with green pods when 5 offspring are generated from parents both having the green/yellow pair of genes.

Solution. P(1 or fewer) = P(0 or 1) = P(0) + P(1) = 0.001 + 0.015 = 0.016 ≤ 0.05

Interpretation. The result of 1 pea with a green pod is unusually low. There is a very small likelihood (0.016) of generating 1 or fewer peas with green pods.
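The same test can be written as a small routine over a probability table. The sketch below assumes the 0.05 cutoff from the slides and uses illustrative function names (`prob_at_most`, `prob_at_least`) that do not appear in the original.

```python
# Table 5-1: P(x) for x = 0..5 peas with green pods among 5 offspring
table_5_1 = {0: 0.001, 1: 0.015, 2: 0.088, 3: 0.264, 4: 0.396, 5: 0.237}

CUTOFF = 0.05  # commonly used value; the slides note it is not rigid

def prob_at_most(dist, x):
    """P(x or fewer): sum the probabilities of all values up to x."""
    return sum(p for value, p in dist.items() if value <= x)

def prob_at_least(dist, x):
    """P(x or more): sum the probabilities of all values from x upward."""
    return sum(p for value, p in dist.items() if value >= x)

p_low = prob_at_most(table_5_1, 1)
print(round(p_low, 3), p_low <= CUTOFF)   # is 1 green pod unusually low?

p_high = prob_at_least(table_5_1, 5)
print(round(p_high, 3), p_high <= CUTOFF)  # is 5 green pods unusually high?
```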
7. Expected Value

Definition. The expected value of a discrete random variable is denoted by E, and it represents the average value of the outcomes. It is obtained by finding the value of $\sum [x \cdot P(x)]$:

$E = \sum [x \cdot P(x)]$

From Formula 5-1, we see that $E = \mu$. Thus the expected value of the number of peas with green pods is also 3.8.
e.g.8 How to Be a Better Bettor. You are considering placing a bet either on the number 7 in roulette or on the pass line in the dice game of craps at the Venetian casino in Las Vegas.

a. If you bet $5 on the number 7 in roulette, the probability of losing $5 is 37/38 and the probability of making a net gain of $175 is 1/38. (The prize is $180, including your $5 bet, so the net gain is $175.) Find your expected value if you bet $5 on the number 7 in roulette.

Table 5-4 Roulette

Event          x        P(x)      x · P(x)
Lose           -$5      37/38     -$4.87
Gain (net)     $175     1/38      $4.61
Total                             E = -$0.26 (or -26¢)
e.g.8 How to Be a Better Bettor (continued).

b. If you bet $5 on the pass line in the dice game of craps, the probability of losing $5 is 251/495 and the probability of making a net gain of $5 is 244/495. (If you bet $5 on the pass line and win, you are given $10 that includes your bet, so the net gain is $5.) Find your expected value if you bet on the pass line. Which bet is better: $5 on the number 7 in roulette or $5 on the pass line in craps? Why?

Table 5-5 Dice

Event          x        P(x)        x · P(x)
Lose           -$5      251/495     -$2.54
Gain (net)     $5       244/495     $2.46
Total                               E = -$0.08 (or -8¢)

(The total is the sum of the products after rounding to the nearest cent; the unrounded expected value is -35/495, about -$0.07.)
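Both expected values can be computed exactly with rational arithmetic, which avoids the cent-level rounding in the tables. This is a minimal sketch; the helper `expected_value` and the list-of-pairs layout are illustrative choices, not part of the original example.

```python
from fractions import Fraction

def expected_value(outcomes):
    """E = sum of x * P(x) over (x, P(x)) pairs."""
    return sum(x * p for x, p in outcomes)

# (net amount in dollars, probability) for a $5 bet
roulette = [(-5, Fraction(37, 38)), (175, Fraction(1, 38))]
craps = [(-5, Fraction(251, 495)), (5, Fraction(244, 495))]

e_roulette = expected_value(roulette)  # -10/38, which reduces to -5/19
e_craps = expected_value(craps)        # -35/495, which reduces to -7/99

print(e_roulette, round(float(e_roulette), 3))
print(e_craps, round(float(e_craps), 3))
print("craps is the better bet:", e_craps > e_roulette)
```

The exact roulette value agrees with the table's -26¢. The exact craps value is about -7¢ rather than -8¢ because the table sums products that were already rounded; the comparison between the two bets is unchanged.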
Interpretation. The $5 bet in roulette results in an expected value of -26¢, and the $5 bet in the dice game results in an expected value of -8¢. The bet in the dice game is better because it has the larger expected value. That is, you are better off losing 8¢ than losing 26¢. Even though the roulette game provides an opportunity for a larger payoff, the craps game is better in the long run.

So far, we have introduced discrete probability distributions in general.
Next: the binomial probability distribution, a special case of a discrete distribution.
Other special cases: Poisson, geometric, hypergeometric.
5.3 Binomial Probability Distribution

1. Related Concepts
2. Rationale for the Binomial Probability Formula
1. Related Concepts

Definition. A binomial probability distribution results from a procedure that meets all of the following requirements:

1) The procedure has a fixed number of trials.
2) The trials must be independent (the outcome of any individual trial doesn't affect the probabilities in the other trials).
3) Each trial must have all outcomes classified into two categories (commonly referred to as success and failure).
4) The probability of a success remains the same in all trials.
Notation for Binomial Probability Distributions

S and F (success and failure) denote the two possible categories of all outcomes.

P(S) = p     (p = probability of a success)
P(F) = q     (q = probability of a failure, q = 1 - p)

n      denotes the fixed number of trials
x      denotes a specific number of successes in n trials, 0 ≤ x ≤ n
p      denotes the probability of success in one of the n trials
q      denotes the probability of failure in one of the n trials
P(x)   denotes the probability of getting exactly x successes among the n trials
e.g.1 Genetics. Consider an experiment in which 5 offspring peas are generated from 2 parents each having the green/yellow combination of genes for pod color. P(green pod) = 0.75. We want to find the probability that exactly 3 of the 5 offspring peas have a green pod.

a. Does this procedure result in a binomial distribution?

i. The number of trials is fixed (5).
ii. The 5 trials are independent, because the probability of any offspring pea having a green pod is not affected by the outcome of any other offspring pea.
iii. Each of the 5 trials has two categories of outcomes: the pea has a green pod or it does not.
iv. For each offspring pea, the probability that it has a green pod is 3/4, and that probability remains the same for each of the 5 peas.
b. If yes, identify n, x, p, and q.

a) n = 5
b) x = 3
c) p = 0.75
d) q = 0.25

Method 1: Using the Binomial Probability Formula

Formula 5-5   $P(x) = \dfrac{n!}{(n - x)!\, x!} \cdot p^x \cdot q^{n - x}$   for x = 0, 1, 2, ..., n
Genetics example, continued. Method 1: Using Formula 5-5

$P(3) = \dfrac{5!}{(5 - 3)!\, 3!} \cdot 0.75^3 \cdot 0.25^{5 - 3} = \dfrac{5!}{2!\, 3!} \cdot 0.75^3 \cdot 0.25^2 = (10)(0.421875)(0.0625) = 0.263671875$

The probability of getting exactly 3 peas with green pods among 5 offspring peas is 0.264 (rounded to 3 significant digits).
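Formula 5-5 translates almost directly into code. The sketch below is illustrative (the function name `binomial_pmf` is an assumption); it evaluates P(3) for n = 5 and p = 0.75, and then all six probabilities, which should match Table 5-1 after rounding to three decimal places.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Formula 5-5: P(x) = n! / ((n - x)! x!) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

p3 = binomial_pmf(3, 5, 0.75)
print(p3)  # the worked example gives 0.263671875

# Every value of x from 0 to 5, rounded as in Table 5-1
table = [round(binomial_pmf(x, 5, 0.75), 3) for x in range(6)]
print(table)
```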
Method 2: Using Technology

Excel: BINOM.DIST(x, n, p, FALSE) returns P(x); setting the last argument to TRUE returns the cumulative probability P(x or fewer).
TI-84: binompdf(n, p, x)
Method 3: Using Table A-1 in Appendix A

Table A-1 cannot be used for the genetics example because p = 0.75 is not included.

e.g.3 McDonald's Brand Recognition. The fast food chain has a brand name recognition rate of 95% around the world. Randomly select 5 people and find:

a. The probability that exactly 3 of the 5 recognize McDonald's.
b. The probability that the number of people who recognize McDonald's is 3 or fewer.

Solution.

a. P(3) = 0.021
b. P(3 or fewer) = P(0) + P(1) + P(2) + P(3) = 0 + 0 + 0.001 + 0.021 = 0.022 (using table values rounded to three decimal places)
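The McDonald's probabilities can also be computed from Formula 5-5 instead of a table lookup. This is a minimal sketch with illustrative names (`binomial_pmf`, `binomial_cdf`); it assumes n = 5 and p = 0.95 as stated in the example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials), Formula 5-5."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(x or fewer successes): add P(0) through P(x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 5, 0.95
p_exactly_3 = binomial_pmf(3, n, p)
p_3_or_fewer = binomial_cdf(3, n, p)

print(round(p_exactly_3, 3))   # part (a)
print(round(p_3_or_fewer, 4))  # part (b), before any table rounding
```

Part (a) agrees with the table value 0.021. For part (b) the unrounded sum is about 0.0226, slightly different from the 0.022 obtained by adding table entries that were each rounded first.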
2. Rationale for the Binomial Probability Formula

$P(x) = \underbrace{\dfrac{n!}{(n - x)!\, x!}}_{\text{(1)}} \cdot \underbrace{p^x \cdot q^{n - x}}_{\text{(2)}}$

(1) The number of outcomes with exactly x successes among n trials.
(2) The probability of x successes among n trials for any one particular order.
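The rationale can be checked by brute force for a small case: list every possible success/failure sequence, keep those with exactly x successes, and add up their probabilities. The sketch below is illustrative only (the function name and the "S"/"F" labels are assumptions) and uses the genetics values n = 5, x = 3, p = 0.75.

```python
from itertools import product
from math import comb, isclose

def prob_exactly_by_enumeration(x, n, p):
    """Sum the probabilities of all n-trial sequences with exactly x successes."""
    q = 1 - p
    matching_orders = 0
    total = 0.0
    for sequence in product("SF", repeat=n):  # all 2^n orderings of S and F
        if sequence.count("S") == x:
            matching_orders += 1
            total += p ** x * q ** (n - x)    # same probability for every order
    return matching_orders, total

orders, prob = prob_exactly_by_enumeration(3, 5, 0.75)
formula = comb(5, 3) * 0.75 ** 3 * 0.25 ** 2  # Formula 5-5

print(orders)                  # number of orders with exactly 3 successes
print(isclose(prob, formula))  # enumeration agrees with the formula
```

Each matching sequence contributes the same probability, which is why the formula is a count multiplied by a single-order probability.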
5.4 Mean, Variance, and SD for Binomial Distribution

For any discrete probability distribution:

Formula 5-1   $\mu = \sum [x \cdot P(x)]$
Formula 5-3   $\sigma^2 = \sum [x^2 \cdot P(x)] - \mu^2$
Formula 5-4   $\sigma = \sqrt{\sum [x^2 \cdot P(x)] - \mu^2}$

For a binomial distribution:

Formula 5-6   $\mu = np$
Formula 5-7   $\sigma^2 = npq$
Formula 5-8   $\sigma = \sqrt{npq}$

Using the range rule of thumb, we can consider values to be unusual if they fall outside of the limits obtained from the following:

maximum usual value: $\mu + 2\sigma$
minimum usual value: $\mu - 2\sigma$
Example 1. Genetics. Use Formulas 5-6 and 5-8 to find the mean and standard deviation for the number of peas with green pods when groups of 5 offspring peas are generated. Assume that there is a 0.75 probability that an offspring pea has a green pod (as described in the chapter problem).

Solution. Using n = 5, p = 0.75, and q = 0.25:

$\mu = np = (5)(0.75) = 3.75 \approx 3.8$ (rounded)
$\sigma = \sqrt{npq} = \sqrt{(5)(0.75)(0.25)} = 0.968246 \approx 1.0$ (rounded)

These results are consistent with the results of e.g.5 in Section 5.2, but the computation is much easier.
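To see that the binomial shortcuts agree with the general formulas, both can be computed side by side. This is a minimal sketch with illustrative variable names; it builds the exact binomial probabilities for n = 5 and p = 0.75 rather than the rounded values of Table 5-1, so the two routes should match to floating-point precision.

```python
from math import comb, sqrt

n, p = 5, 0.75
q = 1 - p

# Shortcut formulas for a binomial distribution
mu_shortcut = n * p               # Formula 5-6
sigma_shortcut = sqrt(n * p * q)  # Formula 5-8

# General formulas applied to the exact binomial probabilities
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
mu_general = sum(x * px for x, px in enumerate(pmf))                          # Formula 5-1
sigma_general = sqrt(sum(x * x * px for x, px in enumerate(pmf)) - mu_general ** 2)  # Formula 5-4

print(mu_shortcut, round(sigma_shortcut, 6))
print(mu_general, round(sigma_general, 6))
```

The small differences from the Section 5.2 values (3.752 and 0.9625) arise only because Table 5-1 lists probabilities rounded to three decimal places.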
e.g.2 Genetics. In an actual experiment, Mendel generated 580 offspring peas. He claimed that 75%, or 435, of them would have green pods. The actual experiment resulted in 428 peas with green pods.

a. Assuming that groups of 580 offspring peas are generated, find the mean and standard deviation for the numbers of peas with green pods.
b. Use the range rule of thumb to find the minimum usual number and the maximum usual number of peas with green pods. Based on those numbers, can we conclude that Mendel's actual result of 428 peas with green pods is unusual? Does this suggest that Mendel's value of 75% is wrong?
Solution.

a. Using n = 580, p = 0.75, and q = 0.25:

$\mu = np = (580)(0.75) = 435.0$
$\sigma = \sqrt{npq} = \sqrt{(580)(0.75)(0.25)} \approx 10.4$

b. Maximum usual value: $\mu + 2\sigma$ = 435.0 + 2(10.4) = 455.8
Minimum usual value: $\mu - 2\sigma$ = 435.0 - 2(10.4) = 414.2

Interpretation. If Mendel generated 580 offspring peas and if his 75% rate is correct, the number of peas with green pods should usually fall between 414.2 and 455.8. Mendel actually got 428 peas with green pods, and that value does fall within the range of usual values. So the experimental result is consistent with the 75% rate.
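The whole check for Mendel's experiment fits in a short routine. The sketch below is illustrative (the function name `binomial_usual_range` is an assumption) and keeps the standard deviation unrounded, so its limits differ slightly from the slide's 414.2 and 455.8, which use the rounded value 10.4.

```python
from math import sqrt

def binomial_usual_range(n, p):
    """Range rule of thumb for a binomial count: mu -/+ 2*sigma."""
    mu = n * p                     # Formula 5-6
    sigma = sqrt(n * p * (1 - p))  # Formula 5-8
    return mu - 2 * sigma, mu + 2 * sigma

low, high = binomial_usual_range(580, 0.75)
observed = 428  # Mendel's count of peas with green pods

print(round(low, 1), round(high, 1))
print("within usual range:", low <= observed <= high)
```

Either way the conclusion is the same: 428 lies inside the usual range, so the result does not contradict the 75% rate.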