Chapter 8 Section 1. Homework A

8.7 Can we use the large-sample confidence interval? In each of the following circumstances state whether you would use the large-sample confidence interval. The variable X denotes the number of observed successes out of n attempts.
(a) n = 50, X = 30
(b) n = 90, X = 15
(c) n = 10, X = 2
(d) n = 60, X = 50
(e) n = 25, X = 15

8.9 What's wrong? Explain what is wrong with each of the following:
(a) An approximate 99% confidence interval for an unknown proportion p is $\hat{p}$ plus or minus its standard error.
(c) A significance test is used to evaluate $H_0: \hat{p} = 0.2$ versus the two-sided alternative.

8.11 Gambling and college athletics. Gambling is an issue of great concern to those involved in intercollegiate athletics. Because of this, the National Collegiate Athletic Association (NCAA) surveyed student-athletes concerning their gambling-related behaviors. There were 5594 Division I male athletes in the survey. Of these, 3547 reported participation in some gambling behavior. This included playing cards, betting on games of skill, buying lottery tickets, and betting on sports.
(a) Find the sample proportion and the large-sample margin of error for 95% confidence. Explain in simple terms the meaning of the 95%. Make sure you confirm that you can use the formula that you have used.
(b) Because of the way that the study was designed to protect the anonymity of the student-athletes who responded, it was not possible to calculate the number of students who were asked to respond but did not. Does this fact affect the way you interpret the results? Write a short paragraph explaining your answer.
(c) In order to use the formula that you used in (a), what did you have to verify is true about the distribution of $\hat{p}$?

8.12 Gambling and female athletes. In the study described in the previous exercise, 1447 of a total of 3469 female student-athletes reported participation in some gambling activity.
(a) Use the large-sample methods to find an estimate of the true proportion with a 95% confidence interval.

$\hat{p} = \frac{1447}{3469} \approx 0.4171$

$0.4171 \pm 1.96\sqrt{\frac{0.4171(1 - 0.4171)}{3469}} \quad\Rightarrow\quad (0.4007,\ 0.4335)$
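The interval in 8.12(a) can be checked numerically. The following is a minimal sketch, assuming z* = 1.96 for 95% confidence; the function name `large_sample_ci` is only an illustrative choice and not part of the exercise.

```python
import math

def large_sample_ci(successes, n, z_star=1.96):
    """Large-sample confidence interval for a proportion.

    Returns the sample proportion, the margin of error, and the interval.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Counts from Exercise 8.12: 1447 of 3469 female student-athletes.
p_hat, margin, (low, high) = large_sample_ci(1447, 3469)
print(f"p_hat = {p_hat:.4f}, margin = {margin:.4f}, CI = ({low:.4f}, {high:.4f})")
```

The same function can be applied to the counts in 8.11 (3547 of 5594) to compare the two margins of error.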

(b) The margin of error for this sample is not the same as the margin of error calculated for the previous exercise. Explain why.
The margin of error is determined by the value of $\hat{p}$ and the sample size. In the previous exercise these values were different, which is the reason for a different margin of error.

1. Do you enjoy driving your car? In 1991, a Gallup Poll of the U.S. population reported this percent to be 79%.
(a) The Pew Research Center recently (2008) conducted the same poll in the U.S., with n = 50, and 36 reported that they enjoy driving. Does this sample provide evidence that the percent of drivers who enjoy driving their cars has declined since 1991? Make sure you state the null and alternative hypotheses. Report the large-sample z statistic and its P-value. Verify that you can use the method you used to calculate the P-value.
While the 79% is really a statistic, let us assume for the moment that it is a parameter.
Since 50(0.79) = 39.5 > 10 and 50(1 - 0.79) = 10.5 > 10, we can use a normal approximation.

$H_0: p = 0.79 \qquad H_a: p < 0.79$

$\hat{p} = \frac{36}{50} = 0.72$

Test statistic: $Z = \frac{0.72 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{50}}} = -1.215$

$P(\hat{p} < 0.72) = P(Z < -1.215) = 0.1121$

(b) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate area that corresponds to the P-value.
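The test in part (a) of Exercise 1 can be reproduced with a few lines of code. This is a sketch under the same assumptions as the worked solution (36 successes out of 50, null value 0.79, lower-tail alternative); the helper name `lower_tail_z_test` is hypothetical.

```python
import math
from statistics import NormalDist

def lower_tail_z_test(successes, n, p0):
    """Large-sample z test of H0: p = p0 against Ha: p < p0."""
    # Check the Normal-approximation condition used in the solution.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("expected successes and failures should both be at least 10")
    p_hat = successes / n
    # The standard error uses p0 because it is computed assuming H0 is true.
    se_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se_null
    p_value = NormalDist().cdf(z)  # area to the left of z
    return p_hat, z, p_value

p_hat, z, p_value = lower_tail_z_test(36, 50, 0.79)
print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, P-value = {p_value:.4f}")
```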

(c) The researchers will reject the null hypothesis if $\hat{p} < 0.66$. What is the value 0.66 called?
The critical value.

(d) What is the probability of a Type I error?

$P(\hat{p} < 0.66) = P\left(Z < \frac{0.66 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{50}}}\right) = P(Z < -2.2569) = 0.012$

(e) Gallup conducts the same survey (2008) in the U.S., with n = 500, and 360 reported that they enjoy driving. Does this sample provide evidence that the percent of drivers who enjoy driving their cars has declined since 1991? Make sure you state the null and alternative hypotheses. Report the large-sample z statistic and its P-value. Verify that you can use the method you used to calculate the P-value.
Since 500(0.79) = 395 > 10 and 500(1 - 0.79) = 105 > 10, we can use a normal approximation.

$H_0: p = 0.79 \qquad H_a: p < 0.79$

$\hat{p} = \frac{360}{500} = 0.72$

Test statistic: $Z = \frac{0.72 - 0.79}{\sqrt{\frac{0.79(1 - 0.79)}{500}}} = -3.843$

$P(\hat{p} < 0.72) = P(Z < -3.843) < 0.0001$

(f) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate area that corresponds to the P-value.

(g) Notice that problem (e) and problem (a) both produced the same $\hat{p}$ value. What caused the difference in P-values?
While the sample proportion is the same in both problems, the sample sizes are very different. The increase in sample size makes the standard error of $\hat{p}$ smaller. Thus, if p is really not 0.79, then whatever the real value of p is, $\hat{p}$ will tend to be closer to it than to 0.79. Measured in standard errors, the same $\hat{p}$ = 0.72 is therefore much further away from 0.79 when n = 500 (z = -3.843) than when n = 50 (z = -1.215), and the P-value is much smaller.
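The point made in part (g) can be seen by running the same calculation for both sample sizes. The sketch below assumes the counts from parts (a) and (e), 36 of 50 and 360 of 500, and the null value 0.79; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

P0 = 0.79  # null value of the proportion

def z_summary(successes, n, p0=P0):
    """Return the null standard error, z statistic and lower-tail P-value."""
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se_null
    return se_null, z, NormalDist().cdf(z)

results = {}
for successes, n in [(36, 50), (360, 500)]:
    results[n] = z_summary(successes, n)
    se_null, z, p_value = results[n]
    print(f"n = {n:3d}: SE = {se_null:.4f}, z = {z:.3f}, P-value = {p_value:.5f}")
```

Multiplying the sample size by 10 divides the standard error by the square root of 10, so the same gap of 0.07 between the sample proportion and 0.79 becomes about 3.16 times as many standard errors.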

(h) Create a 95% confidence interval with the data from (e).

$0.72 \pm 1.96\sqrt{\frac{0.72(1 - 0.72)}{500}} \quad\Rightarrow\quad (0.6806,\ 0.7594)$

(i) We want to estimate p to within ± 0.01. How large a sample is needed to construct a 95% confidence interval for the proportion of U.S. drivers who enjoy driving their automobiles? Use the estimate found in problem (e) as the value for p*.

$n = \left(\frac{1.96}{0.01}\right)^2 (0.72)(1 - 0.72) = 7744.67$, which rounds up to n = 7745.

8.15 Getting angry at other drivers. Refer to Exercise 8.14. The same Pew Poll found that 38% of the respondents "shouted, cursed or made gestures to other drivers" in the last year.
(b) Does the fact that the respondent is self-reporting these actions affect the way that you interpret the results? Write a short paragraph explaining your answer.

2. Long sermons. The National Congregations Study collected data in a one-hour interview with a key informant, that is, a minister, priest, rabbi, or other staff person or leader. One question concerned the length of the typical sermon. For this question 9 out of 119 congregations reported that the typical sermon lasted more than 30 minutes.
(a) Estimate the true proportion for this question with a 95% confidence interval.
Notice that I do not have enough successes (need at least 15 successes and 15 failures) to use the large-sample confidence interval formula. But I have a sample size large enough (need at least a sample size of 5) to use the plus-four formula.

$\tilde{p} = \frac{9 + 2}{119 + 4} \approx 0.0894$

$0.0894 \pm 1.96\sqrt{\frac{0.0894(1 - 0.0894)}{119 + 4}} \quad\Rightarrow\quad (0.0390,\ 0.1398)$
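The plus-four interval for the sermon data can be sketched in code as follows. It assumes the usual plus-four adjustment (add 2 successes and 2 failures before applying the large-sample formula) and z* = 1.96; the function name `plus_four_ci` is an illustrative choice.

```python
import math

def plus_four_ci(successes, n, z_star=1.96):
    """Plus-four confidence interval: add 2 successes and 2 failures."""
    n_tilde = n + 4
    p_tilde = (successes + 2) / n_tilde
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde, (p_tilde - margin, p_tilde + margin)

# Long sermons: 9 of 119 congregations.
p_tilde, (low, high) = plus_four_ci(9, 119)
print(f"p_tilde = {p_tilde:.4f}, CI = ({low:.4f}, {high:.4f})")
```

Because the code carries the unrounded value 11/123 through the calculation, its endpoints can differ from the hand calculation by about one unit in the fourth decimal place.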

(b) The respondents to this question were not asked to use a stopwatch to record the lengths of a random sample of sermons at their congregations. They responded based on their impressions of the sermons. Do you think that ministers, priests, rabbis, or other staff persons or leaders might perceive sermon lengths differently from the people listening to the sermons? Discuss how your ideas would influence your interpretation of the results of this study.

3. Confidence level and interval width. Refer to Exercise 1(h). Would a 90% confidence interval be wider or narrower than the one that you found in that exercise? Verify your results by computing the interval.
The interval should be narrower, since I am not as confident.

$0.72 \pm 1.645\sqrt{\frac{0.72(1 - 0.72)}{500}} \quad\Rightarrow\quad (0.6870,\ 0.7530)$

8.21 Can we use the z test? In each of the following cases state whether or not the Normal approximation to the binomial should be used for a significance test on the population proportion p.
(a) n = 30 and $H_0: p = 0.2$.
(b) n = 30 and $H_0: p = 0.6$.
(c) n = 100 and $H_0: p = 0.5$.
(d) n = 200 and $H_0: p = 0.01$.
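Returning to Exercise 3, the effect of the confidence level on the interval width can be checked directly. This sketch assumes the data from Exercise 1(e), 360 successes out of 500, and obtains z* from the standard Normal quantile function rather than from a table, so the endpoints may differ from the hand calculation in the last decimal place.

```python
import math
from statistics import NormalDist

def ci_for_level(successes, n, level):
    """Large-sample confidence interval at the given confidence level."""
    # z* leaves (1 - level) / 2 in each tail of the standard Normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

ci90 = ci_for_level(360, 500, 0.90)
ci95 = ci_for_level(360, 500, 0.95)
print(f"90% CI: ({ci90[0]:.4f}, {ci90[1]:.4f}), width = {ci90[1] - ci90[0]:.4f}")
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f}), width = {ci95[1] - ci95[0]:.4f}")
```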

8.22 Instant versus fresh-brewed coffee. A matched pairs experiment compares the taste of instant versus fresh-brewed coffee. Each subject tastes two unmarked cups of coffee, one of each type, in random order and states which he or she prefers. Of the 40 subjects who participate in the study, 12 prefer the instant coffee. Let p be the probability that a randomly chosen subject prefers fresh-brewed coffee to instant coffee. (In practical terms, p is the proportion of the population who prefer fresh-brewed coffee.)
(a) Test the claim that a majority of people prefer the taste of fresh-brewed coffee. What are the null hypothesis and the alternative hypothesis? Report the large-sample z statistic and its P-value. Verify that the method you used to calculate the P-value is valid (why do we need to do this?).
Let X count the number of people out of 40 who select the fresh-brewed coffee. I want to show that most people (over half) prefer fresh-brewed coffee, so we cannot start by assuming that they do. We need a neutral assumption. A neutral assumption is that the preference is the same for instant and fresh-brewed. Let us assume that p = 0.5, that is, half of the population prefers fresh-brewed coffee over instant, an even split. The measurement will be the people who said that they preferred fresh-brewed. We will see if there is any evidence that this proportion is more than 0.5, which would indicate that most people prefer fresh-brewed coffee.

$H_0: p = 0.5 \qquad H_a: p > 0.5$

$\hat{p} = \frac{28}{40} = 0.7$

We have 28 successes and 12 failures, and under the null hypothesis we expect 40(0.5) = 20 of each; all of these counts are greater than 10, so we can use a normal approximation.

Test statistic: $Z = \frac{0.7 - 0.5}{\sqrt{\frac{0.5(0.5)}{40}}} = 2.53$

$P(\hat{p} > 0.7) = P(Z > 2.53) = 0.0057$

The P-value is 0.0057, so the result is statistically significant. It means that, if 0.5 is the correct proportion, we would see a sample proportion of 0.7 or higher (further away from 0.5) only around 57 times out of 10,000 attempts.
So our result is rare if 0.5 is true, and we wonder why we would see such a value. What we are more likely to believe is that p is not 0.5 but some value larger than 0.5, and that is why we saw a sample proportion of 0.7. We could say that our result is statistically significant at 1%.
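The upper-tail test for the coffee data can be verified numerically. The sketch below takes the counts as 28 of 40 subjects preferring fresh-brewed coffee, which is the reading that matches the sample proportion of 0.7, and uses the null value 0.5; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 40             # subjects in the study
prefer_fresh = 28  # subjects preferring fresh-brewed coffee
p0 = 0.5           # neutral (null) value of p

p_hat = prefer_fresh / n
se_null = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
z = (p_hat - p0) / se_null
p_value = 1 - NormalDist().cdf(z)       # area to the right of z, since Ha: p > 0.5

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, P-value = {p_value:.4f}")
print("significant at 1%:", p_value < 0.01)
```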

(b) Draw a sketch of a standard Normal curve and mark the location of your z statistic. Shade the appropriate area that corresponds to the P-value.
I have drawn the actual binomial distribution for the null situation so you can compare it to our approximation procedure.

(c) Is your result significant at the 5% level? What is your practical conclusion? What is the critical value? If your result is significant at the 5% level, calculate a 95% confidence interval to estimate the value of the parameter p using the same data set.
Yes, the result we found is statistically significant at 5%, since 0.0057 < 0.05. The critical value is tied to the 5%. The z-score associated with the upper 5% is 1.645.

$1.645\sqrt{\frac{0.5(0.5)}{40}} + 0.5 = 0.6300$

The critical value is 0.6300.

8.32 Sample size needed for an evaluation. You are planning an evaluation of a semester-long alcohol awareness campaign at your college. Previous evaluations indicate that about 25% of the students surveyed will respond "Yes" to the question "Did the campaign alter your behavior toward alcohol consumption?" How large a sample of students should you take if you want the margin of error for 95% confidence to be about 0.1?
The formula is

$n = \left(\frac{z^*}{m}\right)^2 p^*(1 - p^*)$

but the issue is what to make p* equal to. Since a previous study was conducted and the $\hat{p}$ value was 25%, we could use 0.25 as the value of p*:

$\left(\frac{1.96}{0.1}\right)^2 (0.25)(0.75) = 72.03$, so n = 73.

Another option is to go to the other extreme, which would be using 0.5 as p*:

$\left(\frac{1.96}{0.1}\right)^2 (0.5)(0.5) = 96.04$, so n = 97.

A last option is to choose a p* value between 0.25 and 0.5 that we would guess is closer to the actual population proportion (parameter) p.

8.33 Sample size needed for an evaluation, continued. The evaluation in the previous exercise will also have questions that have not been asked before, so you do not have previous information about the possible value of p. Repeat the calculation above for the following values of p*: 0.1, 0.3, 0.5, 0.7, and 0.9. Summarize the results in a table and graphically. What sample size will you use?

8.34 Are the customers dissatisfied? An automobile manufacturer would like to know what proportion of its customers are dissatisfied with the service received from their local dealer. The customer relations department will survey a random sample of customers and compute a 95% confidence interval for the proportion that are dissatisfied. From past studies, they believe that this proportion will be about 0.15. Find the sample size needed if the margin of error of the confidence interval is to be no more than 0.02.
Using the guessed value:

$\left(\frac{1.96}{0.02}\right)^2 (0.15)(0.85) = 1224.51$, so n = 1225.

Using the conservative (extreme) value:

$\left(\frac{1.96}{0.02}\right)^2 (0.5)(0.5) = 2401$, so n = 2401.
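The sample-size calculations in 8.34, and the table requested in 8.33, can be sketched in code. The sketch assumes z* = 1.96, a margin of error of 0.02 for 8.34 (the value that reproduces the conservative answer of 2401) and the margin of 0.1 from the alcohol-awareness exercise for 8.33. Exact fractions are used instead of floating point so that a result such as 2401 is not pushed up to 2402 by rounding error; the function name `sample_size` is hypothetical.

```python
import math
from fractions import Fraction

def sample_size(p_star, margin, z_star="1.96"):
    """Smallest n whose margin of error z* sqrt(p*(1 - p*) / n) is at most `margin`.

    Arguments are decimal strings so they convert to exact fractions.
    """
    p = Fraction(p_star)
    ratio = Fraction(z_star) / Fraction(margin)
    return math.ceil(ratio ** 2 * p * (1 - p))

# Exercise 8.34: guessed value 0.15 and conservative value 0.5, margin 0.02.
n_guess = sample_size("0.15", "0.02")
n_conservative = sample_size("0.5", "0.02")
print("8.34 guessed:", n_guess, " conservative:", n_conservative)

# Exercise 8.33: table of sample sizes for several p* values, margin 0.1.
table = {p: sample_size(p, "0.1") for p in ("0.1", "0.3", "0.5", "0.7", "0.9")}
for p, n in table.items():
    print(f"p* = {p}: n = {n}")
```

The table is symmetric around p* = 0.5, where the required sample size is largest, which is why 0.5 is the conservative choice when nothing is known about p.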