Chapter 13. Binomial Distributions

Topics covered in this chapter:
Binomial Probabilities

Binomial Probabilities

Example (Apply Your Knowledge 13.5): Proofreading

The Problem: Typing errors in a text are either nonword errors (as when "the" is typed as "teh") or word errors that result in a real but incorrect word. Spellchecking software will catch nonword errors but not word errors. Human proofreaders catch 70% of word errors. You ask a fellow student to proofread an essay in which you have deliberately made 10 word errors.

(a) If the student matches the usual 70% rate, what is the distribution of the number of errors caught?

Let X represent the number of errors caught out of 10 word errors. Assuming each error is caught independently of the others, X has a binomial distribution with n = 10 and p = 0.7.

(b) Missing 3 or more out of 10 errors seems a poor performance. What is the probability that a proofreader who catches 70% of word errors misses exactly 3 out of 10? Also, what is the probability that a proofreader who catches 70% of word errors misses 3 or more out of 10?

Missing exactly 3 of 10 errors is the same as catching exactly 7, and missing 3 or more is the same as catching 7 or fewer. We are therefore looking for P(X = 7), the binomial probability function of X evaluated at 7 for a binomial with n = 10 and p = 0.7, and for P(X ≤ 7), the binomial cumulative distribution function evaluated at 7.

1. Find the probability that the proofreader catches exactly 7 of the 10 word errors.
a. Open SPSS, go to Variable View, and type the following in the Name column: quant, n, and p.
b. Select the Data View tab.
c. Under quant, type 7 for the number of errors caught. Under n, type 10 for the sample of 10 word errors. Under p, type 0.7 for the probability of catching a word error.

d. Go to the Transform menu and scroll to the Compute Variable option.
e. The Compute Variable window should open.
f. Under Target Variable, type answer (this is the final answer for the question).
g. Under Function group, scroll down to the PDF & Noncentral PDF option.
h. Under Functions and Special Variables, scroll down to the Pdf.Binom option and double-click.
i. Replace the question marks under Numeric Expression with the variables quant, n, and p, respectively.

j. Click OK. The answer should now appear next to the three variables in your SPSS Data Editor, in a column entitled answer.

2. Find the probability that the proofreader catches 7 or fewer errors out of 10. To find P(X ≤ 7) using SPSS, follow these steps:
a. Open SPSS, go to Variable View, and type the following in the Name column: quant, n, and p.
b. Select the Data View tab.
c. Under quant, type 7 for the number of errors caught. Under n, type 10 for the sample of 10 word errors. Under p, type 0.7 for the probability of catching a word error.

d. Go to the Transform menu and scroll to the Compute Variable option.
e. Under Function group, scroll down to the CDF & Noncentral CDF option.
f. Under Functions and Special Variables, scroll down to the Cdf.Binom option and double-click.
g. Replace the question marks under Numeric Expression with the variables quant, n, and p, respectively.
h. Under Target Variable, type answer (this is the final answer for the question).
i. Click OK. The answer should now appear next to the three variables in your SPSS Data Editor, in a column entitled answer.

Chapter 13 Exercises
13.5 Proofreading.
13.11 College admissions.
13.25 Random stock prices.
13.27 The pill.
13.29 The pill, continued.
13.31 Genetics.
13.33 High school equivalency.
13.35 Multiple-choice tests.
13.39 A whooping cough outbreak.
13.41 A mixed group: probabilities.
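The two Compute Variable calculations in the Proofreading example can also be checked outside SPSS. What follows is a minimal Python sketch using only the standard library; it is not part of the SPSS procedure, and the function names binom_pdf and binom_cdf are hypothetical stand-ins for Pdf.Binom and Cdf.Binom, taking the same three arguments (quant, n, p).

```python
from math import comb

def binom_pdf(quant, n, p):
    """P(X = quant) for X ~ Binomial(n, p)."""
    return comb(n, quant) * p**quant * (1 - p)**(n - quant)

def binom_cdf(quant, n, p):
    """P(X <= quant): add up the probabilities of 0, 1, ..., quant."""
    return sum(binom_pdf(k, n, p) for k in range(quant + 1))

# Proofreading example: X = errors caught, n = 10, p = 0.7
exactly_7 = binom_pdf(7, 10, 0.7)   # catches exactly 7 (misses exactly 3)
at_most_7 = binom_cdf(7, 10, 0.7)   # catches 7 or fewer (misses 3 or more)
print(f"P(X = 7)  = {exactly_7:.4f}")
print(f"P(X <= 7) = {at_most_7:.4f}")
```

The two values should come out at about 0.2668 and 0.6172, which is what the answer column in SPSS should show for parts 1 and 2.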

Chapter 13 SPSS Solutions

13.5 If the student catches 70% of errors, 30% will be missed, so if X = the number of errors missed, X is binomial (assuming independence) with n = 10 and p = 0.30. To find the probability of missing exactly 3 of the 10 errors, use Transform, Compute Variable and locate PDF.Binom in the PDF and Noncentral PDF function group. The parameters for the command are x (the number we're interested in; 3, here), n (the number of trials, 10), and p (the probability of a success, which is 0.3). The probability is shown to be 0.2668 (about 26.7%). If you don't see enough decimals in your answer, click the Variable View tab in the worksheet and increase them using the arrows at the right side of the box (or type in the number of places you want). To find the probability of 3 or more, we subtract the probability of 2 or fewer from 1. The probability of 2 or fewer is found using CDF.Binom from the CDF and Noncentral CDF function group. The chance of missing at least 3 errors is 0.6172 (61.7%).

13.11 The mean is μ = np = 1535 × 0.27 = 414.45. The standard deviation is σ = √(np(1 - p)) = √(1535 × 0.27 × 0.73) = 17.394. We're interested in the chance that more than 415 students enroll; this is the complement of 415 or fewer. Use Transform, Compute Variable and CDF.Normal to find this probability. Since we want the probability of more than 415, we subtract the Normal probability of less than or equal to 415 from 1. The Normal approximation for this probability is 1 - 0.5126 = 0.4874. To find the exact probability, use CDF.Binom and subtract the probability of 415 or fewer from 1. We have 0.4742. The two calculations differ by about 0.013 (1.3 percentage points).

13.25 Since we are interested in five years, n = 5 and p = 0.65. X can take the values 0, 1, 2, 3, 4, and 5. We can find the probability of each value by entering 0 through 5 in a worksheet column (we called it YearsUp), then using Transform, Compute Variable and PDF.Binom to find the probability of each possibility. Note that the variable YearsUp is used as the first argument instead of a particular number. To graph the probability histogram, we'll make a bar chart of the data and then use the Chart Editor to connect the bars. Click Graphs, Legacy Dialogs, Bar. Select Values of individual cases under Data in Chart Are, and proceed to Define the plot.

Click to enter that the bars represent the value of Prob (the probability) and that the Category Labels are YearsUp. Use Titles to give your graph an appropriate title, then click OK. Now we'll connect the bars. Double-click in the graph to bring up the Chart Editor, then double-click in any bar. Click the Bar Options tab, and move the slider to change Bar Width to 100%. Apply the change, and close the Properties box and the Chart Editor.

[Figure: finished probability histogram of Prob by YearsUp]

The mean is μ = np = 5 × 0.65 = 3.25. The standard deviation is σ = √(np(1 - p)) = √(5 × 0.65 × 0.35) = 1.067.

13.27 We can assume the women are independent of each other, there is an (assumed) constant probability of becoming pregnant, the women either become pregnant or not, and we have a fixed number (20) of women of interest, so this setting fulfills all the requirements for a binomial. We find the probability of at least 1 (X ≥ 1) as the complement of none (X = 0).
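The complement calculation for 13.27 can be sketched in a few lines of Python. This is only an illustration: the helper name prob_at_least_one is hypothetical, the typical-use rate of 0.05 is the one quoted in the solution, and the ideal-use rate of 0.01 is an assumption inferred from the remark in 13.29 that np = 5 when n = 500.

```python
def prob_at_least_one(n, p):
    """P(X >= 1) = 1 - P(X = 0) for X ~ Binomial(n, p)."""
    return 1 - (1 - p) ** n

# 20 women; p = 0.01 under ideal use is an assumed value (see lead-in),
# p = 0.05 under typical use is the value given in the solution.
ideal = prob_at_least_one(20, 0.01)
typical = prob_at_least_one(20, 0.05)
print(f"ideal use:   {ideal:.3f}")
print(f"typical use: {typical:.3f}")
```

Because P(X = 0) is just (1 - p) raised to the power n, no binomial coefficient is needed for this particular complement.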

Under ideal conditions, there is an 18.2% chance of at least one pregnancy. To find the probability under typical use, repeat the above calculation, but change the success probability to 0.05. Under typical use, there is a 1 - 0.358 = 0.642 (64.2%) chance of at least one pregnancy in 20 women.

13.29 The Normal approximation can be used: we have np = 500 × 0.05 = 25 and n(1 - p) = 475, and both are at least 10. The mean is μ = np = 25, and the standard deviation is σ = √(np(1 - p)) = √(500 × 0.05 × 0.95) = 4.873. The Normal approximation yields a probability of 0.50 that at least 25 will become pregnant (remember, 25 is the mean for this Normal distribution). The binomial probability is (remember, 25 or more is the complement of 24 or fewer) 1 - 0.4714 = 0.5286. These differ by 0.0286 (2.86 percentage points). We can't use the Normal approximation for the ideal case because np = 5, which is less than 10.

13.31 We use PDF.Binom to find the probability that 6 of the 8 have red blossoms. The probability is 31.15%. The mean for 80 plants is μ = np = 80 × 0.75 = 60, and the standard deviation is σ = √(80 × 0.75 × 0.25) = 3.873. Using the Normal approximation, the probability of at least 60 red-blossomed plants is 0.5 (60 is the mean of this distribution). To find the exact probability, we use the fact that at least 60 is the complement of 59 or fewer. The exact probability is 1 - 0.4403 = 0.5597 (55.97%); the two differ by almost 6 percentage points.
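The comparison between the Normal approximation and the exact binomial probability in 13.29 and 13.31 can be reproduced with a short Python sketch. The function names here are hypothetical, and the Normal calculation is done as in the solutions (no continuity correction), so this illustrates the comparison rather than any SPSS feature.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_at_least(k, n, p):
    """Exact P(X >= k) = 1 - P(X <= k - 1) for X ~ Binomial(n, p)."""
    below = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k))
    return 1 - below

def normal_at_least(k, n, p):
    """Normal approximation to P(X >= k) using mu = np, sigma = sqrt(np(1-p))."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(k)

# (exercise, k, n, p) taken from solutions 13.29 and 13.31
for label, k, n, p in [("13.29", 25, 500, 0.05), ("13.31", 60, 80, 0.75)]:
    approx = normal_at_least(k, n, p)
    exact = exact_at_least(k, n, p)
    print(f"{label}: Normal {approx:.4f}, exact {exact:.4f}, gap {exact - approx:.4f}")
```

In both exercises k equals the mean, so the Normal approximation is 0.5, and the gap is roughly the probability of the single value at the mean that the uncorrected approximation splits in half.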

13.33 The mean is μ = np = 25000 × 0.21 = 5250, and the standard deviation is σ = √(25000 × 0.21 × 0.79) = 64.401. Using CDF.Normal, the approximate probability that at least 5000 dropouts receive the flyer is 99.99%.

13.35 Jodi's mean is μ = np = 100 × 0.75 = 75, and her standard deviation is σ = √(100 × 0.75 × 0.25) = 4.330. We use CDF.Normal to find the probability of at most 80 correct, and subtract the probability of at most 70 correct. The result is 0.7518; she has about a 75% chance of scoring between 70 and 80. For the 250-question test, the mean is 250 × 0.75 = 187.5, and the standard deviation is √(250 × 0.75 × 0.25) = 6.847. Her chance of scoring between 70% (175 correct) and 80% (200 correct) on that test is 0.966 - 0.034 = 0.932 (93.2%).

13.39 We were given the information that 80% of unvaccinated people exposed to the virus will develop the infection. We want the probability that at least 75% of their 1400 students (that's at least 1050 students) would develop the infection if not treated. The mean is μ = np = 1400 × 0.80 = 1120, and the standard deviation is σ = √(1400 × 0.80 × 0.20) = 14.967. The Normal approximation gives us a probability of (to four decimal places) 1.0000; it is virtually guaranteed that at least 75% would get sick if not treated.
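The CDF.Normal subtraction used in 13.35 can be sketched in Python with the standard library's NormalDist class. The helper name normal_prob_between is hypothetical; it simply applies μ = np and σ = √(np(1 - p)) and subtracts two cumulative probabilities, as the solution describes.

```python
from math import sqrt
from statistics import NormalDist

def normal_prob_between(low, high, n, p):
    """Normal approximation to P(low <= X <= high) for X ~ Binomial(n, p)."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return dist.cdf(high) - dist.cdf(low)

# Jodi's 100-question test: between 70 and 80 correct
short_test = normal_prob_between(70, 80, 100, 0.75)
# 250-question test: between 175 (70%) and 200 (80%) correct
long_test = normal_prob_between(175, 200, 250, 0.75)
print(f"100 questions: {short_test:.4f}")
print(f"250 questions: {long_test:.4f}")
```

The longer test gives the higher probability (roughly 0.93 against 0.75) because the standard deviation of the proportion correct shrinks as n grows.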

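13.41 There are three ways to have exactly two infections among the 20 children: one in each group, two in the unvaccinated group and none in the vaccinated group, or none in the unvaccinated group and two in the vaccinated group. Because the two groups are independent, the probability of each case is the product of the two group probabilities. The probability of exactly one infection in the three unvaccinated children is 9.6%, and the probability of exactly one infection in the 17 vaccinated children is 37.4%, so the probability of exactly one in each group is 0.096 × 0.374 = 0.0359 (3.6%). The probability of two infections in the unvaccinated group is 0.384 and the probability of none in the vaccinated group is 0.4181, giving 0.384 × 0.4181 = 0.1606. The probability of no infections in the unvaccinated group is 0.008 and the probability of two in the vaccinated group is 0.1575, giving 0.008 × 0.1575 = 0.0013. Adding these results together, there is a 0.0359 + 0.1606 + 0.0013 = 0.1977 (about 19.8%) chance of exactly two whooping cough infections in this group of 20 children.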
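The probability of exactly two infections across two independent groups can be checked with a short Python sketch. Each way of splitting the two infections between the groups pairs a count in one group with the remaining count in the other, and the paired probabilities are multiplied before the cases are added. The function names are hypothetical, the 0.80 infection rate for unvaccinated children comes from 13.39, and the 0.05 rate for vaccinated children is an assumption inferred from the 37.4% figure for exactly one infection among 17.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_total(total, n1, p1, n2, p2):
    """P(X1 + X2 = total) for independent binomial counts X1 and X2."""
    return sum(
        binom_pmf(k, n1, p1) * binom_pmf(total - k, n2, p2)
        for k in range(total + 1)
        if k <= n1 and total - k <= n2
    )

# 3 unvaccinated children (p = 0.80) and 17 vaccinated (p = 0.05, assumed)
one_each = binom_pmf(1, 3, 0.80) * binom_pmf(1, 17, 0.05)
exactly_two = prob_total(2, 3, 0.80, 17, 0.05)
print(f"one in each group: {one_each:.4f}")
print(f"exactly two total: {exactly_two:.4f}")
```

Under these assumed rates the total comes to about 0.198, with most of it contributed by the case of two infections among the unvaccinated children and none among the vaccinated.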