CONTENTS. Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter 8...

Size: px
Start display at page:

Download "CONTENTS. Chapter 1...1. Chapter 2...9. Chapter 3... 29. Chapter 4... 45. Chapter 5... 59. Chapter 6... 73. Chapter 7... 101. Chapter 8..."

Transcription

1 CONTENTS Chapter Chapter...9 Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter Chapter

2

3 Chapter 1: Introduction to Statistics Section 1- Chapter 1: Introduction to Statistics 1 1. Statistical significance is indicated when methods of statistics are used to reach a conclusion that some treatment or finding is effective, but common sense might suggest that the treatment or finding does not make enough of a difference to justify its use or to be practical. Yes, it is possible for a study to have statistical significance but not a practical significance.. If the source of the data can benefit from the results of the study, it is possible that an element of bias is introduced so that the results are favorable to the source. 3. A voluntary response sample is a sample in which the subjects themselves decide whether to be included in the study. A voluntary response sample is generally not suitable for a statistical study because the sample may have a bias resulting from participation by those with a special interest in the topic being studied. 4. Even if we conduct a study and find that there is a correlation, or association, between two variables, we cannot conclude that one of the variables is the cause of the other. 5. There does appear to be a potential to create a bias. 6. There does not appear to be a potential to create a bias. 7. There does not appear to be a potential to create a bias. 8. There does appear a potential to create a bias. 9. The sample is a voluntary response sample and is therefore flawed. 10. The sample is a voluntary response sample and is therefore flawed. 11. The sampling method appears to be sound. 1. The sampling method appears to be sound. 13. Because there is a 30% chance of getting such results with a diet that has no effect, it does not appear to have statistical significance, but the average loss of 45 pounds does appear to have practical significance. 14. Because there is only a 1% chance of getting the results by chance, the method appears to have a statistical significance. The result of 540 boys in 1000 births is above the approximately 50% rate expected by chance, but it does not appear to be high enough to have practical significance. Not many couples would bother with a procedure that raises the likelihood of a boy from 50% to 54%. 15. Because there is a 3% chance of getting such results with a program that has no effect, the program does not appear to have statistical significance. Because the success rate of 3% is not much better than the 0% rate that is typically expected with random guessing, the program does not appear to have practical significance. 16. Because there is a 5% chance of getting such results with a program that has no effect, the program does not appear to have statistical significance. Because the average increase is only 3 IQ point, the program does not appear to have practical significance. 17. The male and female pulse rates in the same column are not matched in any meaningful way. It does not make sense to use the difference between any of the pulse rates that are in the same column. 18. Yes, the source of the data is likely to be unbiased. 19. The data can be used to address the issue of whether males and females have pulse rates with the same average (mean) value. 0. The results do not prove that the populations of males and females have the same average (mean) pulse rate. The results are based on a particular sample of five males and five females, and analyzing other samples might lead to a different conclusion. Better results would be obtained with larger samples. Copyright 014 Pearson Education, Inc.

4 Chapter 1: Introduction to Statistics 1. Yes, each IQ score is matched with the brain volume in the same column, because they are measurements obtained from the same person. It does not make sense to use the difference between each IQ score and the brain volume in the same column, because IQ scores and brain volumes use different units of measurement. For example, it would make no sense to find the difference between an IQ score of 87 and a brain volume of 1035 cm 3.. The issue that can be addressed is whether there is a correlation, or association, between IQ score and brain volume. 3. Given that the researchers do not appear to benefit from the results, they are professionals at prestigious institutions, and funding is from a U.S. government agency, the source of the data appears to be unbiased. 4. No. Correlation does not imply causation, so a statistical correlation between IQ score and brain volume should not be used to conclude that larger brain volumes cause higher IQ scores. 5. It is questionable that the sponsor is the Idaho Potato Commission and the favorite vegetable is potatoes. 6. The sample is a voluntary response sample, so there is a good chance that the results are not valid. 7. The correlation, or association, between two variables does not mean that one of the variables is the cause of the other. Correlation does not imply causation. 8. The correlation, or association, between two variables does not mean that one of the variables is the cause of the other. Correlation does not imply causation. 9. a. The number of people is (0.39)(1018) = b. No. Because the result is a count of people among 1018 who were surveyed, the result must be a whole number. c. The actual number is 397 people d. The percentage is % 1018 = = 30. a. The number of women is (0.38)(47) = 16.6 b. No. Because the result is a count of women among 47 who were surveyed, the result must be a whole number. b. The actual number is 16 women. d. 30 The percentage is % 47 = = 31. a. The number of adults is (0.14)(30) = 3.8 b. No. Because the result is a count of adults among 30 who were surveyed, the result must be a whole number. c. The actual number is 3 adults. d. 46 The percentage is % 30 = = 3. a. The number of adults is (0.76)(513) = b. No. Because the result is a count of adults among 513 who were surveyed, the result must be a whole number. b. The actual number is 1910 adults. d. The percentage is % 513 = = 33. Because a reduction of 100% would eliminate all of the size, it is not possible to reduce the size by 100% or more. Copyright 014 Pearson Education, Inc.

5 Chapter 1: Introduction to Statistics If the Club eliminated all car thefts, it would reduce the odds of car theft by 100%, so the 400% figure is impossible. 35. If foreign investment fell by 100% it would be totally eliminated, so it is not possible for it to fall by more than 100%. 36. Because a reduction of 100% would eliminate all plague, it is not possible to reduce it by more than 100%. 37. Without our knowing anything about the number of ATVs in use, or the number of ATV drivers, or the amount of ATV usage, the number of 740 fatal accidents has no context. Some information should be given so that the reader can understand the rate of ATV fatalities. 38. All percentages of success should be multiples of 5. The given percentage cannot be correct. 39. The wording of the question is biased and tends to encourage negative response. The sample size of 0 is too small. Survey respondents are self-selected instead of being selected by the newspaper. If 0 readers respond, the percentages should be multiples of 5, so 87% and 13% are not possible results. Section A parameter is a numerical measurement describing some characteristic of a population, whereas a statistic is a numerical measurement describing some characteristic of a sample.. Quantitative data consist of numbers representing counts or measurements, whereas categorical data can be separated into different categories that are distinguished by some characteristic that is not numerical. 3. Parts (a) and (c) describe discrete data. 4. The values of 1010 and 55% are both statistics because they are based on the sample. The population consists of all adults in the United States. 5. Statistic 17. Discrete 6. Parameter 18. Discrete 7. Parameter 19. Continuous 8. Statistic 0. Continuous 9. Parameter 1. Nominal 10. Parameter 11. Statistic 1. Statistic 13. Continuous 14. Discrete 15. Discrete 16. Continuous. Ratio 3. Interval 4. Ordinal 5. Ratio 6. Nominal 7. Ordinal 8. Interval 9. The numbers are not counts or measures of anything, so they are at the nominal level of measurement, and it makes no sense to compute the average (mean) of them. 30. The flight numbers do not count or measure anything. They are at the nominal level of measurement, and it does not make sense to compute the average (mean) of them. 31. The numbers are used as substitutes for the categories of low, medium, and high, so the numbers are at the ordinal level of measurement. It does not make sense to compute the average (mean) of such numbers. 3. The numbers are substitutes for names and are not counts or measures of anything. They are at the nominal level of measurement, and it makes no sense to compute the average (mean) of them. Copyright 014 Pearson Education, Inc.

6 4 Chapter 1: Introduction to Statistics 33. a. Continuous, because the number of possible values is infinite and not countable. b. Discrete, because the number of possible values is finite. c. Discrete, because the number of possible values is finite. d. Discrete, because the number of possible values is infinite and countable. 34. Either ordinal or interval is a reasonable answer, but ordinal makes more sense because differences between values are not likely to be meaningful. For example, the difference between a food rated 1 and a food rated is not necessarily the same as a difference between a food rated 9 and a food rated With no natural starting point, temperatures are at the interval level of measurement, so ratios such as twice are meaningless. Section No. Not every sample of the same size has the same chance of being selected. For example, the sample with the first two names has no chance of being selected. A simple random sample of (n) items is selected in such a way that every sample of same size has the same chance of being selected.. In an observational study, you would examine subjects who consume fruit and those who do not. In the observational study, you run a greater risk of having a lurking variable that affects weight. For example, people who consume more fruit might be more likely to maintain generally better eating habits, and they might be more likely to exercise, so their lower weights might be due to these better eating and exercise habits, and perhaps fruit consumption does not explain lower weights. An experiment would be better, because you can randomly assign subjects to the fruit treatment group and the group that does not get the fruit treatment, so lurking variables are less likely to affect the results. 3. The population consists of the adult friends on the list. The simple random sample is selected from the population of adult friends on the list, so the results are not likely to be representative of the much larger general population of adults in the United States. 4. Because there is nothing about left-handedness or right-handedness that would affect being in the author s classes, the results are likely to be typical of the population. The results are likely to be good, but convenience samples in general are not likely to be so good. 5. Because the subjects are subjected to anger and confrontation, they are given a form or treatment, so this is an experiment, not an observational study. 6. Because the subjects were given a treatment consisting of Lipitor, this is an experiment. 7. This is an observational study because the therapists were not given any treatment. Their responses were observed. 8. This is an observational study because the survey subjects were not given any treatment. Their responses were observed. 9. Cluster 10. Convenience 11. Random 1. Systematic 13. Convenience 15. Systematic 16. Cluster 17. Random 18. Cluster 19. Convenience 14. Random 0. Systematic 1. The sample is not a simple random sample. Because every 1000 th pill is selected, some samples have no chance of being selected. For example, a sample consisting of two consecutive pills has no chance of being selected, and this violates the requirement of a simple random sample.. The sample is not a simple random sample. Not every sample of 1500 adults has the same chance of being selected. For example, a sample of 1500 women has no chance of being selected. 3. The sample is a simple random sample. Every sample of size 500 has the same chance of being selected. Copyright 014 Pearson Education, Inc.

7 Chapter 1: Introduction to Statistics 5 4. The sample is a simple random sample. Every sample of the same size has the same chance of being selected. 5. The sample is not a simple random sample. Not every sample has the same chance of being selected. For example, a sample that includes people who do not appear to be approachable has no chance of being selected. 6. The sample is not a simple random sample. Not all samples of the same size have the same chance of being selected. For example, a sample would not be selected which included people who do not appear to be approachable. 7. Prospective study 8. Retrospective study 9. Cross-sectional study 31. Matched pairs design 3. Randomized block design 33. Completely randomized design 30. Prospective study 34. Matched pairs design 35. Blinding is a method whereby a subject (or a person who evaluates results) in an experiment does not know whether the subject is treated with the DNA vaccine or the adenoviral vector vaccine. It is important to use blinding so that results are not somehow distorted by knowledge of the particular treatment used. 36. Prospective: The experiment was begun and results were followed forward in time. Randomized: Subjects were assigned to the different groups through the process of random selection, and whereby they had the same chance of belonging to each group. Double-blind: The subjects did not know which of the three groups they were in, and the people who evaluated results did not know either. Placebo-controlled: There was a group of subjects who were given a placebo, by comparing the placebo group to the two treatment groups, the effect of the treatments might be better understood. Chapter Quick Quiz 1. No. The numbers do not measure or count anything.. Nominal 3. Continuous 4. Quantitative data 5. Ratio 7. No 8. Statistic 9. Observational study 10 False 6. False Review Exercises 1. a. Discrete b. Ratio c. Stratified d. Cluster e. The mailed responses would be a voluntary response sample, so those with strong opinions are more likely to respond. It is very possible that the results do not reflect the true opinions of the population of all costumers.. The survey was sponsored by the American Laser Centers, and 4% said that the favorite body part is the face, which happens to be a body part often chosen for some type of laser treatment. The source is therefore questionable. 3. The sample is a voluntary response sample, so the results are questionable. Copyright 014 Pearson Education, Inc.

8 6 Chapter 1: Introduction to Statistics 4. a. It uses a voluntary response sample, and those with special interests are more likely to respond, so it is very possible that the sample is not representative of the population. b. Because the statement refers to 7% of all Americans, it is a parameter (but it is probably based on a 7% rate from the sample, and the sample percentage is a statistic). c. Observational study. 5. a. If they have no fat at all, they have 100% less than any other amount with fat, so the 15% figure cannot be correct. b. The exact number is (0.58)(118) = The actual number is 686. c % 118 = = 6. The Gallop poll used randomly selected respondents, but the AOL poll used a voluntary response sample. Respondents in the AOL poll are more likely to participate if they have strong feelings about the candidates, and this group is not necessarily representative of the population. The results from the Gallop poll were more likely to reflect the true opinions of American voters. 7. Because there is only a 4% chance of getting the results by chance, the method appears to have statistical significance. The results of 11 girls in 00 births is above the approximately 50% rate expected by chance, but it does not appear to be high enough to have practical significance. Not many couples would bother with a procedure that raises the likelihood of a girl from 50% to 56%. 8. a. Random b. Stratified c. Nominal d. Statistic, because it is based on a sample. e. The mailed responses would be a voluntary response sample. Those with strong opinions about the topic would be more likely to respond, so it is very possible that the results would not reflect the true opinions of the population of all adults. 9. a. Systematic 10. a. 0.5( 1500) = 780 adults b. Random 345 c. Cluster b % 1500 = = d. Stratified e. Convenience c. Men: % 1500 = = ; f. No, although this is a subjective judgment. Cumulative Review Exercises Women: % 1500 = = 1. The mean is 11. Because the flight numbers are not measures or counts of anything, the result does not have meaning.. The mean is 101, and it is reasonably close to the population mean of ( ) = is an unusually high value (175 17) = ( ) 0.03 = ( ) = Copyright 014 Pearson Education, Inc.

9 Chapter 1: Introduction to Statistics (( ) + ( ) + ( ) ) = 8.0 ( 3 1) (( ) + ( ) + ( ) ) = 8 = 5.3 ( 3 1) = = = = Copyright 014 Pearson Education, Inc.

10

11 Chapter : Summarizing and Graphing Data 9 Chapter : Summarizing and Graphing Data Section - 1. No. For each class, the frequency tells us how many values fall within the given range of values, but there is no way to determine the exact IQ scores represented in the class.. If percentages are used, the sum should be 100%. If proportions are used, the sum should be No. The sum of the percentages is 199% not 100%, so each respondent could answer yes to more than one category. The table does not show the distribution of a data set among all of several different categories. Instead, it shows responses to five separate questions. 4. The gap in the frequencies suggests that the table includes heights of two different populations: students and faculty/staff. 5. Class width: 10. Class midpoints: 4.5, 34.5, 44.5, 54.5, 64.5, 74.5, Class boundaries: 19.5, 9.5, 39.5, 49.5, 59.5, 69.5, 79.5, Class width: 10. Class midpoints: 4.5, 34.5, 44.5, 54.5, 64.5, Class boundaries: 19.5, 9.5, 39.5, 49.5, 59.5, 69.5, Class width: 10. Class midpoints: 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5, Class boundaries: 49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5, 119.5, Class width: 5. Class midpoints:, 7, 1, 17,, 7, 3, 37. Class boundaries: 0.5, 4.5, 9.5, 14.5, 19.5, 4.5, 9.5, 34.5, Class width:. Class midpoints: 3.95, 5.95, 7.95, 9.95, Class boundaries:.95, 4.95, 6.95, 8.95, 10.95, Class width:. Class midpoints: 3.95, 5.95, 7.95, 9.95, Class boundaries:.95, 4.95, 6.95, 8.95, 10.95, 1.95, No. The frequencies do not satisfy the requirement of being roughly symmetric about the maximum frequency of Yes. The frequencies start low, increase to the maximum frequency of 43, and then decrease. Also, the frequencies are approximately symmetric about the maximum frequency of , 7, , 1, 6, Copyright 014 Pearson Education, Inc.

12 10 Chapter : Summarizing and Graphing Data 15. On average, the actresses appear to be younger than the actors. Age When Oscar Was Won Relative Frequency (Actresses) Relative Frequency (Actors) % 1.% % 31.7% % 4.7% % 15.9% % 7.3% % 1.% % 0.0% 16. The differences are not substantial. Based on the given data, males and females appear to have about the same distribution of white blood cell counts. White Blood Cell Counts Relative Frequency (Males) Relative Frequency (Females) % 15.0% % 40.0% %.5% % 17.5% % 0.0% % 5.0% 17. The cumulative frequency table is Age (years) of Best Actress When Oscar Was Won 18. The cumulative frequency table is Cumulative Frequency Less than 30 7 Less than Less than Less than Less than Less than Less than 90 8 Age (years) of Best Actor When Oscar Was Won Cumulative Frequency Less than 30 1 Less than 40 7 Less than 50 6 Less than Less than Less than 80 8 Copyright 014 Pearson Education, Inc.

13 Chapter : Summarizing and Graphing Data Because there are disproportionately more 0s and 5s, it appears that the heights were reported instead of measured. Consequently, it is likely that the results are not very accurate. x Frequency Because there are disproportionately more 0s and 5s, it appears that the heights were reported instead of measured. Consequently, it is likely that the results are not very accurate. x Frequency Yes, the distribution appears to be a normal distribution. Pulse Rate (Male) Frequency Copyright 014 Pearson Education, Inc.

14 1 Chapter : Summarizing and Graphing Data. Yes. The pulse rates of males appear to be generally lower than the pulse rates of females. Pulse Rate (Females) Frequency No, the distribution does not appear to be a normal distribution. Magnitude Frequency No, the distribution does not appear to be a normal distribution. Depth (km) Frequency Yes, the distribution appears to be roughly a normal distribution. Red Blood Cell Count Frequency Yes, the distribution appears to be roughly a normal distribution. Red Blood Cell Count Frequency Copyright 014 Pearson Education, Inc.

15 Chapter : Summarizing and Graphing Data Yes. Among the 48 flights, 36 arrived on time or early, and 45 of the flights arrived no more than 30 minutes late. Arrival Delay (min) Frequency ( 60) ( 31) 11 ( 30) ( 1) No. The times vary from a low of 1 minutes to a high of 49 minutes. It appears that many flights taxi out quickly, but many other flights require much longer times, so it would be difficult to predict the taxi-out time with reasonable accuracy Taxi-Out Time (min) Frequency Category Relative Frequency Male Survivors 16.% Males Who Died 6.8% Female Survivors 15.5% Females Who Died 5.5% Cause Relative Frequency Bad Track 46% Faulty Equipment 18% Human Error 4% Other 1% 31. Pilot error is the most serious threat to aviation safety. Better training and stricter pilot requirements can improve aviation safety. Cause Relative Frequency Pilot Error 50.5% Other Human Error 6.1% Weather 1.1% Mechanical.% Sabotage 9.1% Copyright 014 Pearson Education, Inc.

16 14 Chapter : Summarizing and Graphing Data 3. The digit 0 appears to have occurred with a higher frequency than expected, but in general the differences are not very substantial, so the selection process appears to be functioning correctly. The digits are qualitative data because they do not represent measures or counts of anything. The digits could be replaced by the first 10 letters of the alphabet, and the lottery would be essentially the same. Digit Relative Frequency % 1 8.3% 10.0% % 4 6.7% 5 9.% 6 7.5% 7 8.3% 8 7.5% % 33. An outlier can dramatically affect the frequency table. Weight (lb) With Outlier Without Outlier Number of Data Values Ideal Number of Classes Copyright 014 Pearson Education, Inc.

17 Section -3 Chapter : Summarizing and Graphing Data It is easier to see the distribution of the data by examining the graph of the histogram than by the numbers in the frequency distribution.. Not necessarily. Because those with special interests are more likely to respond, and the voluntary response sample is likely to consist of a group having characteristics that are fundamentally different than those of the population. 3. With a data set that is so small, the true nature of the distribution cannot be seen with a histogram. The data set has an outlier of 1 minute. That duration time corresponds to the last flight, which ended in an explosion that killed seven crew members. 4. When referring to a normal distribution, the term normal has a meaning that is different from its meaning in ordinary language. A normal distribution is characterized by a histogram that is approximately bell-shaped. Determination of whether a histogram is approximately bell-shaped does require subjective judgment. 5. Identifying the exact value is not easy, but answers not too far from 00 are good answers. 6. Class width of inches. Approximate lower limit of first class of 43 inches. Approximate upper limit of first class of 45 inches. 7. The tallest person is about 108 inches, or about 9 feet tall. That tallest height is depicted in the bar that is farthest to the right in the histogram. That height is an outlier because it is very far from all of the other heights. The height of 9 feet must be an error, because the height of the tallest human ever recorded was 8 feet 11 inches. 8. The first group appears to be adults. Knowing that the people entered a museum on a Friday morning, we can reasonably assume that there were many school children on a field trip and that they were accompanied by a smaller group of teachers and adult chaperones and other adults visiting the museum by themselves. 9. The digits 0 and 5 seem to occur much more than the other digits, so it appears that the heights were reported and not actually measured. This suggests that the results might not be very accurate. 10. The digits 0 and 5 seem to occur much more often than the other digits, so it appears that the heights were reported and not measured. This suggests that the results might not be very accurate. 11. The histogram does appear to depict a normal distribution. The frequencies increase to a maximum and then tend to decrease, and the histogram is symmetric with the left half being roughly a mirror image of the right half. Copyright 014 Pearson Education, Inc.

18 16 Chapter : Summarizing and Graphing Data 11. (continued) 1. The histogram appears to roughly approximate a normal distribution. The frequencies generally increase to a maximum and then tend to decrease, and the histogram is symmetric with the left half being roughly a mirror image of the right half. 13. The histogram appears to roughly approximate a normal distribution. The frequencies increase to a maximum and then tend to decrease, and the histogram is symmetric with the left half being roughly a mirror image of the right half. 14. No, the histogram does not appear to approximate a normal distribution. The frequencies do not increase to a maximum and then decrease, and the histogram is not symmetric with the left half being a mirror image of the right half. Copyright 014 Pearson Education, Inc.

19 Chapter : Summarizing and Graphing Data (continued) 15. The histogram appears to roughly approximate a normal distribution. The frequencies increase to a maximum and then tend to decrease, and the histogram is symmetric with the left half being roughly a mirror image of the right half. 16. The histogram appears to roughly approximate a normal distribution. The frequencies increase to a maximum and then tend to decrease, and the histogram is symmetric with the left half being roughly a mirror image of the right half. Copyright 014 Pearson Education, Inc.

20 18 Chapter : Summarizing and Graphing Data 17. The two leftmost bars depict flights that arrived early, and the other bars to the right depict flights that arrived late. 18. Yes, the entire distribution would be more concentrated with less spread. 19. The ages of actresses are lower than those of actors. 0. a. 107 inches to 109 inches; 8 feet 11 inches to 9 feet 1 inch. b. The heights of the bars represent numbers of people, not heights. Because there are many more people between 43 inches tall and 55 inches tall, they have the tallest bars in the histogram, but they have the lowest actual heights. They have the tallest bars because there are more of them. Section In a Pareto chart, the bars are arranged in descending order according to frequencies. The Pareto chart helps us understand data by drawing attention to the more important categories, which have the highest frequencies. Copyright 014 Pearson Education, Inc.

21 Chapter : Summarizing and Graphing Data 19. A scatter plot is a plot of paired quantitative data, and each pair of data is plotted as a single point. The scatterplot requires paired quantitative data. The configuration of the plotted points can help us determine whether there is some relationship between two variables. 3. The data set is too small for a graph to reveal important characteristics of the data. With such a small data set, it would be better to simply list the data or place them in a table. 4. The sample is a voluntary response sample since the students report their scores to the website. Because the sample is a voluntary response sample, it is very possible that it is not representative of the population, even if the sample is very large. Any graph based on the voluntary response sample would have a high chance of showing characteristics that are not actual characteristics of the population. 5. Because the points are scattered throughout with no obvious pattern, there does not appear to be a correlation. 6. The configuration of the points does not support the hypothesis that people with larger brains have larger IQ scores. Copyright 014 Pearson Education, Inc.

22 0 Chapter : Summarizing and Graphing Data 7. Yes. There is a very distinct pattern showing that bears with larger chest sizes tend to weigh more. 8. Yes. There is a very distinct pattern showing that cans of Coke with larger volumes tend to weigh more. Another notable feature of the scatterplot is that there are five groups of points that are stacked above each other. This is due to the fact that the measured volumes were rounded to one decimal place, so the different volume amounts are often duplicated, with the result that points are stacked vertically. 9. The first amount is highest for the opening day, when many Harry Potter fans are most eager to see the movie; the third and fourth values are from the first Friday and the first Saturday, which are the popular weekend days when movie attendance tends to spike. Copyright 014 Pearson Education, Inc.

23 Chapter : Summarizing and Graphing Data The numbers of home runs rose from 1990 to 000, but after 000 there was a very gradual decline. 11. Yes, because the configuration of the points is roughly a bell shape, the volumes appear to be from a normally distributed population. The volume of 11.8 oz. appears to be an outlier. 1. No, because the configuration of points is not at all a bell shape, the amounts do not appear to be from a normally distributed population. 13. No. The distribution is not dramatically far from being a normal distribution with a bell shape, so there is not strong evidence against a normal distribution There are no outliers. The distribution is not dramatically far from being a normally distribution with a bell shape, so there is not strong evidence against a normal distribution Copyright 014 Pearson Education, Inc.

24 Chapter : Summarizing and Graphing Data To remain competitive in the world, the United States should require more weekly instruction time Because there is not a single total number of hours of instruction time that is partitioned among the five countries, it does not make sense to use a pie chart for the given data. Copyright 014 Pearson Education, Inc.

25 Chapter : Summarizing and Graphing Data The frequency polygon appears to roughly approximate a normal distribution. The frequencies increase to a maximum and then tend to decease, and the graph is symmetric with the left half being roughly a mirror image of the right half. 0. No, the frequency polygon does not appear to approximate a normal distribution. The frequencies do not increase to a maximum and then decrease, and the graph is not symmetric with the left half being a mirror image of the right half. 1. The vertical scale does not start at 0, so the difference is exaggerated. The graphs make it appear that Obama got about twice as many votes as McCain, but Obama actually got about 69 million votes compared to 60 million to McCain.. The fare doubled from $1 to $, but when the $ bill is shown with twice the width and twice the height of the $1 bill, the $ bill has an area that is four times that of the $1 bill, so the illustration greatly exaggerates the increase in fare. 3. China s oil consumption is.7 times (or roughly 3 times) that of the United States, but by using a larger barrel that is three times as wide and three times as tall (and also three times as deep) as the smaller barrel, the illustration has made it appear that the larger barrel has a volume that is 7 times that of the smaller barrel. The actual ratio of US consumption to China s consumption is roughly 3 to 1, but the illustration makes it appear to be 7 to The actual braking distances are 133 ft., 136 ft., and 143 ft., so the differences are relatively small, but the illustration has a scale that begins at 130 ft., so the differences are grossly exaggerated. Copyright 014 Pearson Education, Inc.

26 4 Chapter : Summarizing and Graphing Data 5. The ages of actresses are lower than those of actors. 6. a b. The condensed stemplot reduces the number of rows so that the stemplot is not too large to be understandable * * * * * * * * 3 Chapter Quick Quiz 1. The class width is The class boundaries are and No min., 6 min., 6 min., 6 min., 6 min., 67 min., and 69 min. 6. Bar graph 7. Scatterplot 8. Pareto Chart 9. The distribution of the data set 5. No 10. The bars of the histogram start relatively low, increase to a maximum value and then decrease. Also, the histogram is symmetric with the left half being roughly a mirror image of the right half. Review Exercises 1. Volume (cm 3 ) Frequency Copyright 014 Pearson Education, Inc.

27 Chapter : Summarizing and Graphing Data 5. No, the distribution does not appear to be normal because the graph is not symmetric. 3. Although there are differences among the frequencies of the digits, the differences are not too extreme given the relatively small sample size, so the lottery appears to be fair. 4. The sample size is not large enough to reveal the true nature of the distribution of IQ scores for the population from which the sample is obtained A time-series graph is best. It suggests that the amounts of carbon monoxide emissions in the United States are increasing. Copyright 014 Pearson Education, Inc.

28 6 Chapter : Summarizing and Graphing Data 6. A scatterplot is best. The scatterplot does not suggest that there is a relationship. 7. A Pareto chart is best. Cumulative Review Exercises 1. Pareto chart.. Nominal, because the responses consist of names only. The responses do not measure or count anything, and they cannot be arranged in order according to some quantitative scale. 3. Voluntary response sample. The voluntary response sample is not likely to be representative of the population, because those with special interests or strong feelings about the topic are more likely than others to respond and their views might be very different from those of the general population. 4. By using a vertical scale that does not begin at 0, the graph exaggerates the differences in the numbers of responses. The graph could be modified by starting the vertical scale at 0 instead of The percentage is 41 = = 37.6%. Because the percentage is based on a sample and not a population 641 that percentage is a statistic. 6. Grooming Time (min.) Frequency Copyright 014 Pearson Education, Inc.

29 Chapter : Summarizing and Graphing Data 7 7. Because the frequencies increase to a maximum and then decrease and the left half of the histogram is roughly a mirror image of the right half, the data appear to be from a population with a normal distribution. 8. Stemplot Copyright 014 Pearson Education, Inc.

30

31 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 9 Chapter 3: Statistics for Describing, Exploring, and Comparing Data Section 3-1. No. The numbers do not measure or count anything, so the mean would be a meaningless statistic.. The term average is not used in statistic. The term mean should be used for the value obtained when data values are added, then the sum is divided by the number of data values. 3. No. The price exactly in between the highest and lowest is the midrange, not the median. 4. They use different approaches for providing a value (or values) of the center or middle of a set of data values. 5. The mean is = million. 10 The median is = $95 million. There is no mode. The midrange is = $199.5 million. Apart from the obvious and trivial fact that the mean annual earnings of all celebrities is less than $33 million, nothing meaningful can be known about the mean of the population The mean is 10 = $51, The median is = $51,193. There is no mode. The midrange is = $5,64.5. Apart from the obvious and trivial fact that all other colleges have tuition amounts less than those listed, nothing meaningful can be known about the mean of the population. 7. The mean is = hic. 7 The median is 393 hic. There is no mode. The midrange is = 435 hic. The safest of these cars appears to be the Hyundai Elantra. Because the measurements appear to vary substantially from a low of 36 hic to a high of 544 hic, it appears that some small cars are considerably safer than others. 8. The mean is = hic. 6 The median is = hic. There is no mode. The midrange is = 80.5 hic. Copyright 014 Pearson Education, Inc.

32 30 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 8. (continued) All of the measures of center are less than 1000 hic, but that does not indicate that all of the individual booster seats satisfy the requirement. One of the booster seats has a measurement of 110 hic, which does not satisfy the specified requirement of being less than 1000 hic. 9. The mean is = $16.4 million. 14 The median is = 10 million. The modes are $4 million, $9 million, and $10 million. The midrange is = $31 million. The measures of center do not reveal anything about the pattern of the data over time, and that pattern is a key component of a movie s success. The first amount is highest for the opening day when many Harry Potter fans are most eager to see the movie, the third and fourth values are from the first Friday and the first Saturday, which are the popular weekend days when movie attendance tends to spike. 10. The mean is = 8.7 manatees. 10 The median is = 80 manatees. The mode is 73 manatees. The midrange is = 83 manatees. The measures of center do not reveal anything about the pattern of the data over time, and it is important to monitor the number of manatee deaths caused by collisions with watercraft, so that corrective action might be taken. 11. The mean is = $ The median is = $ There is no mode. The midrange is = $ None of the measures of center are most important here. The most relevant statistic in this case is the minimum value of $48.9, because that is the lowest price for the software. Here, we generally care about the lowest price not the mean price or median price. 1. The mean is 17,688, ,68, , 407, ,765, 410 = $1,898, The median is $14,765,410. There is no mode. The midrange is = $9,814, 93. The compensation amount of $1 for Jobs is an outlier because it is very far from all the other values. 13. The mean is = 11.05μg/g. 10 The median is = 9.5 μg/g. The mode is 0.5 μ g/g. Copyright 014 Pearson Education, Inc.

33 Chapter 3: Statistics for Describing, Exploring, and Comparing Data (continued) The midrange is = μg/g. There is not enough information given here to assess the true danger of these drugs, but ingestion of any lead is generally detrimental to good health. All of the decimal values are either 0 or 5, so it appears that the lead concentrations were rounded to the nearest one-half unit of measurement. 14. The mean is = ppm. 7 The median is 0.75 ppm. There is no mode. The midrange is = ppm. Fairway has the tuna with the lowest level of mercury, so it has the healthiest tuna. Because of the large range of values, it does not appear that the different stores are getting their tuna from the same supplier The mean is = years. The median is = 4.5 years. The modes are 4 years and 4.5 years. The midrange is = 9.5 years. It is common to earn a bachelor s degree in four years, but the typical college student requires more than four years. 16. The mean is = W/kg. 11 The median is 0.9 W/kg. There is no mode. The midrange is = W/kg. If purchasing a cell phone with concern about radiation emissions, you might be more interested in the fact that the maximum emission is 1.55 W/kg, which is less than the FCC standard of 1.6 W/kg. You might also be interested in the radiation emission for the particular cell phone you are considering. 17. The mean is ( 15) + ( 18) + ( 3) + ( 1) + ( 9) + ( 3) = 14.3 min. 8 The median is ( 15) + ( 18) = The mode is 3 min. The midrange is ( 3) + 11 = Because the measures of center are all negative values, it appears that the flights tend to arrive early before the scheduled arrival times, so the on-time performance appears to be very good. Copyright 014 Pearson Education, Inc.

34 3 Chapter 3: Statistics for Describing, Exploring, and Comparing Data ( ) + 3+ ( ) + ( ) + 5+ ( ) ( 5) The mean is. 18 = 1.9 kg. The median is 1 + = 1.5 kg. The mode is kg. The midrange is ( 5) + 11 = 3 kg. No, because the mean weight gain is only 1.9 kg, which is below the 6.8 kg weight gain given in the legend. 19. The mean is = The median is 73. There is no mode. The midrange is = The numbers do not measure or count anything; they are simply replacements for names. The data are at the nominal level of measurement, and it makes no sense to compute the measures of center for these data. 0. The mean is = 1.9. The median is. The mode is 1. The midrange is =.5. The mode of 1 correctly indicates that the smooth-yellow peas occur more than any other phenotype, but the other measures of center do not make sense with these data at the nominal level of measurement. 1. White drivers mean is 73 mi/h. White drivers median is 73 mi/h. African American drivers mean is 74 mi/h. African American drivers median is 74 mi/h. Although the African American drivers have a mean speed greater than the white drivers, the difference is very small, so it appears that drivers of both races appear to speed about the same amount.. Collection contractor was Brinks had a mean of $1.55 million, and a median of $1.55 million. Collection contractor was not Brinks had a mean of $1.73 million and a median of $1.65 million. The data do suggest that collections were considerably lower when Brinks was the collection contractor. 3. Obama had a mean of $653.9 and a median of $45. McCain had a mean of $458.5 and a median of $350. The contributions appear to favor Obama because his mean and median are substantially higher. With 66 contributions to Obama and 0 to McCain, Obama collected substantially more in total contributions. 4. Jefferson Valley had a mean of 7.15 min. and a median of 7. min. Providence had the same results as Jefferson Valley. Although the measures of center are the same, the Providence times are much more varied than the Jefferson Valley times. 5. The mean is the median is Yes, it is an outlier because it is a value that is very far away from all the other sample values. 6. The mean is 1 min. and the median is 18.5 min. The mean taxi-out time is important for calculating and scheduling the arrival times. Copyright 014 Pearson Education, Inc.

35 Chapter 3: Statistics for Describing, Exploring, and Comparing Data The mean is 15 years and the median is 16 years. Presidents receive Secret Service protection after they leave office, so the mean is helpful in planning for the cost and resources used for that protection. 8. The mean is 101 and the median is The mean of 101 does not differ from the population mean of 100 by an amount that is substantial, so it appears that the sample is consistent with the population. 7( 4.5) + 34( 34.5) + 13( 44.5) + ( 54.5) + 4( 64.5) + 1( 74.5) + 1( 84.5) 9. = This result is quite close to the mean of 35.9 years found by using the original list of data values. 4.5() ( 6) ( 35) ( 13) ( 6) () = 44.5 years. This result is not substantially different from the mean of 44.1 found by using the original list of data values. 4( 54.5) + 10( 64.5) + 5( 74.5) + 43( 84.5) + 6( 94.5) + 8( 104.5) + 3( 114.5) + ( 14.5) 31. = This result is close to the mean of 84.4 found using the original list of data values. ( 8) + 7( ) + 1( 5) + 17( 7) + ( 4) + 7( 6) + 3( 0) + 37( 1) 3. = 15 years. When rounded, this result is the same mean of 15 years found using the original list of data values. 33. a. x = 5( 0.6) = 0.6 parts per million b. n The mean ignoring the presidents who are still alive is 15 years. The mean including the presidents who are still alive is at least 15. years. The results do not differ by much. 35. The mean is 39.07, the 10% trimmed mean is 7.677, and the 0% trimmed mean is By deleting the outlier of 47., the trimmed means are substantially different from the untrimmed mean. 36. The mean of 47 mi/h is not the actual average speed, because more time was spent at the lower speed. The harmonic mean is 45.3 mi/h, and it does represent the true average value. 37. The geometric mean is = , or when rounded. Single percentage growth rate is 3.67%. The result is not exactly the same as the mean which is 3.68%. 38. The root mean square (RMS) is volts, which is very different from the mean of 0 volts ( 7 + 1) 39. The median is 30 ( 10) + = years, which is rounded to 34 years. The value of 33 years is better because it is based on the original data and does not involve interpolation. Section The IQ scores of a class of statistics students should have less variation, because those students are a much more homogeneous group with IQ scores that are likely to be closer together.. Parts (a), (b), and (d) are true. 3. Variation is a general descriptive term that refers to the amount of dispersion or spread among the data values, but the variance refers specifically to the square of the standard deviation. 4. s, σ, s, σ Copyright 014 Pearson Education, Inc.

36 34 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 5. The range is $33 $67 = $65 million. 10( 350, 9) (,553,604) The variance is s = = square of million dollars. 10( 9) The standard deviation is s = 10,548 = $ million. Because the data values are 10 highest from the population, nothing meaningful can be known about the standard deviation of the population. 6. The range is $54,410 $50,875 = $ ( 6,631,884,700) ( 515,966) The variance is s = = 1,088,153.8 square dollars. 10( 9) The standard deviation is s = 1,088,153.8 = $ Because the data values are the 10 highest from the population, nothing meaningful can be known about the standard deviation of the population. 7. The range is = 18 hic. 7( 1,34, 439) ( 9,066,11) The variance is = hic squared. 76 ( ) The standard deviation is = 88.8 hic. Although all of the cars are small, the range from 36 hic to 544hic appears to be relatively large, so the head injury measurements are not about the same. 8. The range is = 779 hic. 6( 3,34,798) ( 4) The variance is s = = 74,383.5 hic squared. 65 () The standard deviation is s = 74,383.5 = 7.7 hic. Because the data values are the 10 highest from the population, nothing meaningful can be known about the standard deviation of the population. 9. The range is 58 4 = $54 million. 14( 6487) 5441 The variance is = 10.9 square of million dollars. 14( 13) The standard deviation is 10.9 = $14.5. An investor would care about the gross from opening day and the rate of decline after that, but the measures of center and variation are less important. 10. The range is = 8 manatees. 10( 69,303) ( 87) The variance is s = = manatees squared. 10( 9) The standard deviation is s = = 10.1 manatees. The measures of variation reveal nothing about the pattern over time. 11. The range is $71.77 $48.9 = $.85. 6( 1, ) 16,38 The variance is = dollars squared. 65 () The standard deviation is $9.957 =. The measures of variation are not very helpful in trying to find the best deal. Copyright 014 Pearson Education, Inc.

37 Chapter 3: Statistics for Describing, Exploring, and Comparing Data The range is $19,68,584 $1 = $19,68, ( 1,070,16,05,084,410) ( 64, 490,037) The variance is s = = 59,583,69,405,35.10 dollars 10( 9) squared. The standard deviation is 59,583,69,405,35.1 = $7,719,00. The amount of $1 for Jobs is an outlier, and it has a great effect on the measures of deviation. 13. The range is = 17.5μg/g. 10( ) 1,10.5 The variance is = 41.75( μg/g). 10( 9) The standard deviation is = 6.46μg/g. If the medicines contained no lead, all of the measures would be 0 μg/ g, and the measures of variation would all be 0 as well. 14. The range is = 1.15 ppm. 7( 4.4) ( 5.03) The variance is s = = ppm squared. 76 ( ) The standard deviation is = ppm. If the tuna sushi contained no mercury, all of the measures would be 0 ppm, and the measures of variation would all be 0 as well. 15. The range is 15 4 = 11 years. 0( ) 16,900 The variance is = 1.3 years. 0( 19) The standard deviation is 1.3 = 3.5 years. No, because 1 years is within standard deviations of the mean. 16. The range is = 1.17 W/kg. 11( 11.41) ( 10.3) The variance is s = = (W/kg). 11( 10) The standard deviation is = 0.43 W/kg. No. Same models of cell phones are sold much more than others, so the measures from the different models should be weighted according to their size in the population. 17. The range is 11 ( 3) = 43min ( ) 1,996 The variance is = 31.4 min. squared. 87 ( ) The standard deviation is 31.4 = 15. min. The standard deviation can never be negative. 18. The range is 11 ( 5) = 16 kg. 18( 340) ( 3) The variance is s = = 16.5 kg. 18( 17) The standard deviation is = kg. The weight gain of 6.8 kg is not unusual because it is within standard deviations of the mean. Although a gain of 6.8 kg is not unusual, the mean weight gain of 1.9 kg is not close to the legendary 6.8 kg, so an individual weight gain of 6.8 kg does not support the legend. Copyright 014 Pearson Education, Inc.

38 36 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 19. The range is 88 9 = ( 38,078) 306,916 The variance is = ( 10) The standard deviation is = The data are at the nominal level of measurement and it makes no sense to compute the measures of variation for these data. 0. The range is 4 1= 3. 5( 7) ( 47) The variance is s = = ( 4) The standard deviation is 0.9 = Because the data are at the nominal level of measurement, these results make no sense. 1. The mean of the White drivers is 73 and the standard deviation is.906 the coefficient of variation for the White drivers is % = 4%. The mean for the African American 74 and the standard deviation is the coefficient of variation for the African American drivers is % = 3.7%. The variation is 74 about the same.. The mean of the collection contractor was Brinks is 1.55 and the standard deviation is the coefficient of variation is % = 11.5%. The mean of the collection contractor was not Brinks is 1.73 and the 1.55 standard deviation is 0.14 the coefficient of variation is % = 1.8%. The variation is about 1.73 the same. 3. The mean of Obama contributors is $654 and the standard deviation is $53 the coefficient of variation is $53 100% = 80%. The mean of McCain contributors is $459 and the standard deviation is $418 the $654 coefficient of variation is $ % = 90%. The variation among Obama contributors is a little less than $459 the variation among the McCain contributors. 4. The mean of Jefferson Valley is 7.15 and the standard deviation is the coefficient of variation is % = 6.7%. The mean of Providence is 7.15 and the standard deviation is 1.8 the coefficient 7.16 of variation is % = 5.5%. The variation among Jefferson Valley waiting times is much less 7.15 than among the Providence waiting times. 5. The range is.95, the variance is 0.345, and the standard deviation is The range is 37 min., the variance is 85.5 min. squared, and the standard deviation is 9. min. 7. The range is 36 years, the variance is 94.5 years squared, and the standard deviation is 9.7 years. 8. The range is 4, the variance is 174.5, and the standard deviation is The standard deviation.95 = 0.738, which is not substantially different from The standard deviation 37 = 9.3 min., which is very close to 9. min. 4 Copyright 014 Pearson Education, Inc.

39 Chapter 3: Statistics for Describing, Exploring, and Comparing Data The standard deviation 36 = 9 years, this is reasonably close to 9.7 years The standard deviation 4 = 10.5, which is not substantially different from No. The pulse rate of 99 beats per minute is between the minimum usual value of 54.3 beats per minute and the maximum usual value of beats per minute. 34. Yes. The pulse rate of 45 beats per minute is not between the minimum usual value of 46.7 beats per minutes and the maximum usual value of 87.9 beats per minute. 35. Yes. The volume of 11.9 oz. is not between the minimum usual value of oz. and the maximum usual value of 1.41 oz. 36. No. The weight of lb. is between the minimum usual value of and the maximum usual value of lb ( 84, 408.5) 8,637,71 s = = 1.3 years. This result is not substantially different from the standard 8( 81) deviation of 11.1 years found from the original list of data values. 8( 169,980.5) 13,315,01 s = = 9.7 years. The result is not substantially different from the standard 8( 81) deviation of 9 years found from the original list of sample values. 11( 889,106.69) 104,941, s = = The result is very close to the standard deviation of ( 10) found from the original list of sample values. 33( 10,55) 46, s = = 9.8 years. The result is very close to the standard deviation of 9.7 years 33( 3) found from the original list of sample values. 41. a. 95% 4. a. 68% b. 68% b. 99.7% 43. At least 75% of women have platelet counts within standard deviations of the mean. The minimum is 150 and the maximum is At least 89% of healthy adults have body temperatures within 3 standard deviations of the mean. The minimum is F and the maximum is F. ( ) ( ) ( ) a. σ = = 6.9 min 3 b. The nine possible samples of two values are the following: [( min, min), ( min, 3 min), ( min, 8 min), (3 min, min), (3 min, 3 min), (3 min, 8 min), (8 min, min), (8 min, 3 min), (8 min, 8 min)] and they have the following corresponding variances: [0, 0.707, 18, 0.707, 0, 1.5, 18, 1.5, 0] which have the mean of c. The population variances of the nine samples above are [0, , 9, , 0, 6.5, 9, 6.5, 0] d. Part (b), because repeated samples result in variances that target the same value (6.9 min. ) as the population variance. Use division by n 1. e. No. The mean of the sample variances (6.9 min. ) equals the population variance, but the mean of the sample standard deviations (1.9 min.) does not equal the mean of the population standard deviation (.6 min.) Copyright 014 Pearson Education, Inc.

40 38 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 46. The mean absolute deviation of the population is.4 minutes. With repeated samplings of size, the nine different possible samples have mean absolute deviations of 0, 0, 0, 0.5, 0.5,.5,.5, 3, and 3. With many such samples, the mean of those nine results is 1.3 minutes, showing that the sample mean absolute deviations tend to center about the value of 1.3 minutes instead if the mean absolute deviation of the population, which is.4 minutes. The sample mean deviations do not target the mean deviation of the population. This is not good. This indicates that a sample mean absolute deviation is not a good estimator of the mean absolute deviation of a population. Section Madison s height is below the mean. It is.8 standard deviations below the mean...00 should be preferred, because it is.00 standard deviations above the mean and would correspond to the highest of the five different possible scores. 3. The lowest amount is $5 million, the first quartile Q 1 is $47 million, the second quartile Q (or median) is $104 million, the third quartile Q 3 is $11 million, and the highest gross amount is $380 million. 4. All three values are the same. 5. a. The difference is $3,670,505 $4,939, 455 = $1, 68,950 b. $1, 68,950 = 0.16 standard deviations $7,775,948 c. z = 0.16 d. Usual 6. a. The difference is b = 3.01standard deviations c. z = 3.01 d. Unusual 7. a. The difference is $1 $1, 449,779 = $1, 449,778 b. $1,449,778 =.75 standard deviation $57,651 c. z =.75 d. Unusual 8. a. The difference is 15.3 beats per minute b = 1.49 standard deviations 10.3 c. z = 1.49 d. Usual 9. Z scores of and. A z score of means a score of x = = 70. A z score of means a score of x = = Z scores of and. A z score of means a hip breadth of x = = 31.6 cm. A z score of means a hip breadth of x = = 41.6 cm 11. Two standard deviations from the mean: = and = Two standard deviations from the mean: = 1613 words and = words Copyright 014 Pearson Education, Inc.

41 Chapter 3: Statistics for Describing, Exploring, and Comparing Data The tallest man z score is z = = 10.9 the tallest women z score is z = = De- 7 6 Fen Yao is relatively taller, because her z score of 1.33, which is greater than the z score of 10.9 for Sultan Kosen. De-Fen Yao is more standard deviations above the mean than Sultan Kosen With a z score of z = = 0.8, Sandra Bullock was relatively younger than Jeff Bridges, who has a z score of z = = The SAT score of 1490 has a z score of z = = 0.09, and the ACT score of 17 has a z score of z = = The z score of 0.09 is a larger number than the z score of 0.85, so the SAT 4.8 score of 1490 is relatively better The male has a higher count because his z score is z = = 0.41, which is a higher number than the z score of z = = 0.67 for the female The percentile for 13 sec. is =, so the 13th percentile 18. The percentile for 40 sec. is =, so the 33rd percentile 19. The percentile for 50 sec. is =, so the 50th percentile 0. The percentile for 60 sec. is =, so the 83rd percentile P60 = = 14.4, pick 15 th entry which is 51 sec Q1 = = 34.5 sec Q3 = = 55 sec P40 = = 9.6, pick 10 th entry which is 43 sec P50 = = 47.5 sec P75 = = 18, which is entry 55 sec P5 = Q1 = 34.5 sec P85 = = 0.4, pick the 1 st entry which is 60 sec. 100 Copyright 014 Pearson Education, Inc.

42 40 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 9. The five number summary: 1 sec, 8709 sec, 10,074.5 sec, 11,445 sec, 11,844 sec 30. The five number summary: 81 min, 88 min, 94.5 min, 98 min, 106 min 31. The five number summary : 4 min, 14 min, 18 min, 3 min, 63 min 3. The five number summary: 70 mi/h, 7 mi/h, 74 mi/h, 78 mi/h, 79 mi/h 33. It appears that males have lower pulse rates than females Male Pulse Female Pulse 34. Although actresses include the oldest age of 80 years, the boxplot for actresses shows that they have ages that are generally lower than those of actors. Actresses Actors 35. The weights of regular Coke appear to be generally greater than those of diet Coke, probably due to the sugar in cans of regular Coke. CKREGWT CKDIETWT Copyright 014 Pearson Education, Inc.

43 Chapter 3: Statistics for Describing, Exploring, and Comparing Data The low lead level group has much more variation and the IQ scores tend to be higher than the IQ scores from the high lead level group. Low Lead High Lead 37. Outliers for actresses 60 years, 61 years, 63 years, 70 years, and 80 years. Outliers for actors: 76 years. The modified boxplots show that only one actress has an age that is greater than any actor. 38. Using interpolation, P 17 = 1.6. Using figure 3-5, P 17 =. In this case, the results are close, but in some other cases the results might be quite different. Chapter Quick Quiz 1. The mean is 14 minutes. The median is 1 minutes 3. The mode is 1 minutes 4. The variance is (5 min) = 5 min z = = Standard deviation, variance, range, mean absolute deviation 7. Sample mean x, population mean μ 8. s, σ, s, σ 9. 75% 10. Minimum, first quartile Q 1, second quartile Q (or median), third quartile Q 3, maximum Review Exercises 1. a x = = mm 5 b. The median is 1550 mm c. There is no mode Copyright 014 Pearson Education, Inc.

44 4 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 1. (continued) d. The midrange is = mm e. The range is = 145 mm f. s = ( ) + ( ) + ( ) + ( ) + ( ) 5 1 = mm g. s = = mm h. 5 5 Q1 = = 1.5, pick second entry (in ordered list) which is 1538 mm i. Q3 = = 3.75, pick the fourth entry (in the ordered list) which is 1571 mm z = = The eye height is not unusual because its z score is between and, so it is within two standard deviations of the mean. 3. The five number summary: 1497, 1538, 1550, 1571, 164 Because the boxplot shows a distribution of data that is roughly symmetric, the data could be from a population with a normal distribution, but the data are not necessarily from a population with a normal distribution, because there is no way to determine whether a histogram is roughly a bell shape. 4. The mean is The ZIP codes do not measure or count anything. They are at the nominal level of measurement, so the mean is a meaningless statistic The male z score is z = = 0.6. The female z score is z = = The male has a larger relative BMI because the male has the larger z score. 6. a. The answers may vary but a mean around $8 or $9 is reasonable. b. A reasonable standard deviation would be around $1 or $. 7. Based on a minimum age of 3 years and a maximum age of 70 years an estimate of the age standard deviation would be 70 3 = years A minimum usual sitting height of = 84 mm and a maximum sitting height of = 986 mm. The maximum usual height of 986 mm is more relevant for designing overhead bin storage. 9. The minimum value is 963 cm 3, the first quartile is cm 3, the second quartile (or median) is 1079 cm 3, the third quartile is cm 3, and the maximum value is 1439 cm The median would be better because it is not affected much by the one very large income. Cumulative Review Exercises 1. a. Continuous b. Ratio Copyright 014 Pearson Education, Inc.

45 Chapter 3: Statistics for Describing, Exploring, and Comparing Data 43. Hand Length (mm) Frequency Hand length histogram a x = = 190.1mm 8 b. The median is mm c. ( ) + ( ) + ( ) + ( ) + ( ) + ( ) + ( ) + ( ) = s = = 18.7 mm mm 7 d. s = 18.7 = mm e. The range is = 56 mm 6. Yes. The frequencies increase to a maximum, and then they decrease. Also, the frequencies preceding the maximum are roughly a mirror image of those that follow the maximum. 7. No. Even though the sample is large, it is a voluntary response sample, so the responses cannot be considered to be representative of the population of the United States. 8. The vertical scale does not begin at 0, so the differences among different outcomes are exaggerated. Copyright 014 Pearson Education, Inc.

46

47 Chapter 4: Probability 45 Chapter 4: Probability Section PA= ( ) = , 10, PA= ( ) 1 = = , , 000. The probability of a baby being born a boy is 1 or Part (c). 4. The answers vary, but an answer in the neighborhood of 0.99 is reasonable. 5. 5:, 7 456, 0.9, or or Unlikely, neither unusually low nor unusually high 10. Unlikely, unusually high 11. Unlikely, unusually low 1. Unlikely, neither unusually low nor unusually high or or or 0.5 or 0. or or The employer would suffer because it would be at a risk by hiring someone who uses 1000 drugs. 90. or The person tested would suffer because he or she would be suspected of using drugs when 1000 in reality he or she does not use drugs or This result is not close to the probability of for a positive test result or This result is not very close to the probability of for a negative test result. or Yes, the technique appears to be effective. 39 or Yes, the technique appears to be effective or No, the probability of being struck is much greater on an open golf course 300,000,000 during a thunder storm. The golfer should seek shelter. Copyright 014 Pearson Education, Inc.

48 46 Chapter 4: Probability = 0.738; yes 1 9. a. 365 b. Yes c. He already knew d = No, it is not unlikely, because the responses are from a voluntary response survey; the results are not likely to be very good. 10,47, or No, a crash is not unlikely. Given that car crashes are so common, we should take 135,933,000 precautions such as not driving after drinking and not using a cell phone or texting or Yes, it is unlikely. The air travel fatality rate is much higher than that of 1,000,000,000 cars. The comparison isn t fair because car trips involve much shorter distances than trips by air = It is unlikely = It is unlikely = Yes, it is unlikely. The middle seat lacks an outside view, easy access to the aisle, and a passenger in the middle seat has passengers on both sides instead of on one side only = Yes, it is unlikely or or {bb, bg, gb, gg}; 1 or {bbbb, bbbg, bbgb, bbgg, bgbb, bgbg, bggb, bggg, gbbg, gbbb, gbgb, gbgg, ggbb, ggbg, gggb, gggg}; 4 16 or a. brown /brown, brown/blue, blue/brown, blue/blue 43. a. b. 999 : : 1 b. 1 4 c a. 0 b. 0 c. 0.5 d. 0 c. The description is not accurate. The odds against winning are 999:1 and the odds in favor are 1:999, not 1: a. 18 or b. 10 : 9 c. $18 d. $ 0 Copyright 014 Pearson Education, Inc.

49 45. a. $16 b. 8 : 1 c. About 9.75 : 1, which becomes 39 : 4 d. $ Relative risk: 103 = Odds ratio: or = Chapter 4: Probability The probability of a headache with Nasonex (0.014) is slightly less than the probability of a headache with the placebo (0.013), so Nasonex does not appear to pose a risk of headache. Section Based on the rule of the complements, the sum of P(A) and its complement must always be 1, so the sum cannot be A is the event of betting on the pass line and not winning (or losing). PA= ( ) = Because it is possible to select someone who is male and a Republican, events M and R are not disjoint. Both events can occur at the same time when someone is randomly selected. 4. It is certain that an event occurs or does not occur. 5. Disjoint 10. Disjoint 6. Not disjoint 11. Not disjoint 7. Not disjoint 1. Disjoint 8. Not disjoint = Disjoint = PD ( ) = 0.45, where PDis ( ) the probability of randomly selecting someone who does not choose a direct in-person encounter as the most fun way to flirt. 16. PI ( ) denotes the probability of screening a driver and finding that he or she is not intoxicated, and PI ( ) = = = = or That probability is not as high as it should be. Copyright 014 Pearson Education, Inc.

50 48 Chapter 4: Probability or That probability is not as low as it should be. or or a. 11 = or 78.6% 14 b. = or 14.3% 14 c. The physicians given the labels with concentrations appear to have done much better. The results suggest that labels described as concentrations are much better than labels described as ratios. 6. a. 3 = 0.14 or 1.4% 14 b. 1 = or 85.7% 14 c. The physicians given the labels with ratios appear to have done much worse. The results suggest that label described as ratios are much worse than labels described as concentrations Use the following table for Exercises 7 3 Age and Total over Responded Refused Total = Yes. A high refusal rate results in a sample that is not necessarily representative of the 105 population, because those who refuse may well constitute a particular group with opinions different from others = = = = = = = = = Subject Used Marijuana Subject Did not Use Marijuana Positive Test Result Negative Test Result Total Total Copyright 014 Pearson Education, Inc.

51 = = Chapter 4: Probability 49 = No, the general population probably has a marijuana usage rate less than 0.407, or 40.7% = With an error rate of 0.09 or 9%, the test does not appear to be highly accurate = Exercise 37 results in the probability of a wrong result and this exercise result in the 300 probability of a correct result, so these exercises deal with events that are complements or No. Here is one example: A = event of selecting a male under 30 years of age, B = selecting a female, C = selecting a male over 18 years of age. 41. PA ( or B) = PA ( ) + PB ( ) PA ( and B) 4. PA ( or Bor C) = PA ( ) + PB ( ) + PC ( ) PA ( and B) PA ( and C) PB ( and C) + PA ( and Band C) 43. a. 1 PA ( ) PB ( ) + PA ( and B) b. 1 PA ( and B) c. No Section The probability that the second selected senator is a Democrat given that the first selected senator was a Republican.. R and D are dependent events, because the probability of a Democrat on the second selection is affected by the outcome of the first selection. Because it was stipulated that the second selection must be a different senator, the sampling is done without replacement, so only 99 senators are available for the second selection. 3. False. The events are dependent because the radio and air conditioner are both powered by the same electrical system. If you find that your car s radio does not work, there is a greater probability that the air conditioner will also not work. 4. Because the selections are based on different numbers, the sampling is done without replacement and the events are dependent. Because the sample size of 1068 is less than 5% of the population size of 8,741,346, the events can be treated as being independent (based on the 5% guideline for cumbersome calculations). 5. a. The events are dependent 7. a. Independent b. 1 or a. Independent b. 1 or b. 1 or a. Dependent b. 1 or Copyright 014 Pearson Education, Inc.

52 50 Chapter 4: Probability 9. a. Independent b. 5 5 = a. Independent b. 1 or a = Yes, it is unlikely b = Yes, it is unlikely a. Dependent 58 1 b. = a. Dependent 8 7 b. = a. b = Yes, it is unlikely = Yes, it is unlikely a = No, it is not unlikely b = No, it is not unlikely a = No, it is not unlikely b = No, it is not unlikely = No, the entire batch consists of malfunctioning pacemakers = The scheme is not likely to detect the large number if defects. With a probability of 0.583, it is more likely that the entire batch will be accepted. 19. a = b. = c. = d. By using one backup drive, the probability of failure is 0.0, and with three independent disk drives, the probability drops to By changing from one drive to three, the likelihood of failure drops from 1 chance in 50 to only 1 chance in 15,000, and that is a very substantial improvement in reliability. BACK UP YOUR DATA =. With one radio there is a probability of a serious problem, but with two independent radios, the probability of a serious problem drops to , which is substantially lower. The flight becomes much safer with two independent radios. Copyright 014 Pearson Education, Inc.

53 1. a. b or = a = b c. c or or 0. Chapter 4: Probability = Yes, it is unlikely, but perhaps there was a strong need to staff the department so that the hirings were more likely to occur on the same day. Subject used marijuana Subject did not use marijuana Positive Test Result Negative Test Result Total True Positive False Negative False Positive 4 True Negative 154 Total = No, it is not unlikely = Yes, it is unlikely = Yes, it is unlikely = No it is not unlikely a = b = Using the 5% guideline for cumbersome calculations 8. a = b = Using the 5% guideline for cumbersome calculations 9. a = b = Using the 5% guideline for cumbersome calculations Copyright 014 Pearson Education, Inc.

54 5 Chapter 4: Probability 30. a. b. c = = = a = b = c. The series arrangement provides better protection = Section a. Answers vary, but 0.98 is a reasonable estimate. b. Answers vary, but is a reasonable estimate.. A conditional probability is a probability of an event calculated with the knowledge that some other event has occurred. 3. The probability that the polygraph indicates lying given that the subject is actually telling the truth. 4. Confusion of the inverse is to think that the following two probabilities are the same: (1) the probability of a polygraph indication of lying when the subject is telling the truth; () the probability of a subject telling the truth when the polygraph indicates lying. Confusion of the inverse is to think that PA ( B) = PB ( A) or to use one of those probabilities in place of the other. 5. At least one of the five children is a boy or At least one of the five children is a girl None of the digits is 0. = At least one of the digits is 7. 1 = or = The chance of passing is reasonably good = The probability of having to complete the exam without a working calculator drops from 0.08 to (or 64 chances in 10000), so she does gain a substantial increase in reliability or 50% 1 1. or ( 0.51) 5 = ( 0.545) 5 = The system cannot continue indefinitely because eventually there would be no women to give birth. Copyright 014 Pearson Education, Inc.

55 Chapter 4: Probability ( ) 3 = 0.1. Given that the three cars are in the same family, they are not randomly selected and there is a good chance that the family members have similar driving habits, so the probability might not be accurate. 16. a. 1 ( ) 5 = b. ( 0.14) 5 = c. The detective is much better than average, or the detective was given five easy cases ( ) 4 = It is very possible that the result is not valid because it is based on data from a voluntary response survey ( ) 1 = It is very possible that the result is not valid because it is based on data from a voluntary response survey or This is the probability of the test making it appear that the subject uses drugs when the 950 subject is not a drug user or 0.1. The employer would suffer by hiring a job applicant who appears to not use drugs, but the 50 applicant actually does use drugs or This result is substantially different from the result found in Exercise 0. The 866 probabilities P(subject uses drugs negative test result) and P(negative test result subject uses drugs) are not equal.. a. 860 or a. or b. 860 or c. The results are different 44 or a. 1 3 or or b or 1 or 0.5 or a. 1 ( 0.0) = b. 1 ( 0.0) 3 = b or ( ) = Rounding the result to three significant digits would yield a probability of 1.00, but that would be misleading because it would suggest that it is certain that both radios will work. Yes, the probability is high enough to ensure flight safety ( ) 8 = The probability is not low, so further testing of the individual samples will be necessary for about 68% of the combined samples. Copyright 014 Pearson Education, Inc.

56 54 Chapter 4: Probability 3. 1 ( ) 5 = The probability is quite low, indicating that further testing of the individual samples will be necessary for about % of the combined samples. 33. a = b = or a. = b. 0.8 c. The estimate of 75% is dramatically greater than the actual rate of 7.48%. They exhibited confusion of the inverse. A consequence is that they would unnecessarily alarm patients who are benign, and they might start treatments that are not necessary. Section The symbol! is the factorial symbol that represents the product of decreasing whole numbers, as in 4! = 431 = 4. Four people can stand in line 4 different ways.. Combinations, because order does not count and five numbers are selected (from 1 to 39) without replacement. 3. Because repetition is allowed, numbers are selected with replacement, so neither of the two permutation rules applies. The fundamental counting rule can be used to show that the number of possible outcomes is = 10,000, so the probability of winning is 4. Only the fundamental counting rule applies = , , = ,000,000, = = ! 36, = 7! ! 9. The number of combinations is (7 1)!1! = 17,383,860. Because that number is so large, it is not practical to make a different CD for each possible combination = =. No 5,57,00 is too many possibilities to list ,57, ! (8 3)! = ! 50,400 3!3!! = ! 34, 650 4!4!! = Copyright 014 Pearson Education, Inc.

57 Chapter 4: Probability ! (44 6)!6! = 7,059,05. The probability is 53! (53 6)!6! =,957,480. The probability is 1 7,059,05 1,957, = 18. 4! = 7! a. 41! (41 5)!5! = 749,398. The probability is b = 10,000 c. $10,000 38! 0. a. ( 39 5 )!5! = 575,757. The probability is 1 749, ,757 b = 1000 c. $1000 1! 1. a. ( 1 4 )! = 11,880 16!. a. ( )! =10,461,394,944,000 b. c. 1! ( 1 4 )!4! = b. c. 16! ( )!14! = The number of possible combinations is = 15,000. The fundamental counting rule can be used. The different possible codes are ordered sequences of numbers, not combinations, so the name of combination lock is not appropriate. Given that fundamental counting rule lock is a bit awkward, a better name would be something like number lock =. No, there are too many different possibilities ,000, ! = 10 ; AMITY; ! 5! 5! 5! = 6 5!0! 4!1! 3!!!3! 6. 6! 360 = ; HARROW; a = 10,000,000,000,000,000 b = 1,000,000,000, c. =. The number of possibilities (100,000,000) is still quite large, so there is no ,000, reason to worry = 100,000,000 Copyright 014 Pearson Education, Inc.

58 56 Chapter 4: Probability 5! 9. ( 5 )!! = a. 1 4 or 0.5 b. 3 or c. Trick question. There is no finite number of attempts, because you could continue to get the wrong position every time = a b. = = ,147,483, C C = 195, 49, C C = 175,711, C =. Yes, if everyone treated is of one gender while everyone in the placebo group is of the 5 5 opposite gender, you would not know if different reactions are due to the treatment or gender = =,095,681,645, a. 5 C = 10 nn ( 1) b. nc = c. 4! = 4 d. ( n 1)! ways: {5p, 1n 0p, n 15p, 3n 10p, 4n 5p, 5n, 1d 15p, 1d 1n 10p, 1d n 5p, 1d 3n, d 5p,d 1n} (Note: 5p represents 5 pennies, etc.) 40. The probability is 0. If 9 of the letters are in the correct envelopes, the 10 th letter must also be in the correct envelope, so it is impossible for the 10 th letter to go into the wrong envelope. Chapter Quick Quiz 1. 0 (not an option) = (all days contain the letter y) = Answers vary, but an answer such as 0.01 or lower is reasonable = = Review Exercises = = = = = = = = = 0.41 Copyright 014 Pearson Education, Inc.

59 Chapter 4: Probability It appears that you have a substantially better chance of avoiding prison if you enter a guilty plea = = = = = = Answers vary, but DuPont data show that about 8% of cars are red, so any estimate between 0.01 and 0. would be reasonable. 1. a =.65 b. ( 0.35) 4 = c. Yes, because the probability is so small 13. a b = , 000. No C = 5, 45, C = 575, Cumulative Review Exercises 17. c. Answers vary, but it is probably small, such as 0.0 d. Yes = P 3 = The probability is a. The mean of 8.9 years is not close to the value of 0 years that would be expected with no gender discrepancy. b. The median of 13.5 years is not close to the value of 0 years that would be expected with no gender discrepancy. c. s = ( 0 ( 8.9) ) + ( 15 ( 8.9) ) + + ( 15 ( 8.9) ) = 10.6 years. 11 d. s = ( 10.6) = 113. years e. Q 1 = 15 years f. Q 3 = 5 years g. The boxplot suggests that the data have a distribution that is skewed. Copyright 014 Pearson Education, Inc.

60 58 Chapter 4: Probability. a z = = No, the pulse rate of 100 beats per minute is within standard deviations away 11.6 from the mean, so it is not unusual. b z = =.37. Yes, the pulse rate of 50 beats per minutes is more than standard deviations 11.6 away from the mean so it is unusual. c. 1 Yes, because the probability of (or ) is so small. 56 d. No, because the probability of 1 8 (or 0.15) is not very small. 3. a % 5100 = = b = 46% c. Stratified sample 4. The graph is misleading because the vertical scale does not start at 0. The vertical scale starts at the frequency of 500 instead of 0, so the difference between the two response rates is exaggerated. The graph incorrectly makes it appear that no responses occurred 60 times more often than the number of yes responses, but comparisons of the actual frequencies shows that the no responses occurred about four times more often than the number of yes responses. 5. a. A convenience sample b. If the students at the college are mostly from a surrounding region that includes a large proportion of one ethnic group, the results will not reflect the general population of the United States. c = 0.75 d. 1 ( 0.6) = The straight-line pattern of the points suggests that there is a correlation between chest size and weight. 7. a. b. c. 1 1 C = 575, C C = 10,939, Copyright 014 Pearson Education, Inc.

61 Chapter 5: Discrete Probability Distributions Section 5- Chapter 5: Discrete Probability Distributions The random variable is x, which is the number of girls in three births. The possible values of x are 0, 1,, and 3. The values of the random variable x are numerical.. The random variable is discrete because the number of possible values is 4, and 4 is a finite number. The random variable is discrete if it has a finite number of values or a countable number of values. 3. Table 5-7 does describe a probability distribution because the three requirements are satisfied. First, the variable x is a numerical random variable and its values are associated with probabilities. Second, Σ Px= ( ) = 1 as required. Third, each of the probabilities is between 0 and 1 inclusive, as required. 4. a. Yes, because b. No, because > a. Continuous random variable 6. a. Not a random variable b. Discrete random variable b. Continuous random variable c. Not a random variable c. Discrete random variable d. Discrete random variable d. Discrete random variable e. Continuous random variable e. Not a random variable f. Discrete random variable f. Discrete random variable 7. Probability distribution with μ = ( ) + (1 0.5) + ( 0.375) + (3 0.5) + ( ) = σ = (0 ) (1 ) ( ) (3 ) (4 ) = 1 8. Probability distribution with μ = ( ) + (1 0.87) + ( 0.05) + ( ) + ( ) + (5 0) = 0.4 σ = = 0.6 (0 0.4) (1 0.4) 0.87 ( 0.4) (4 0.4) (5 0.4) 0 9. Not a probability distribution because the sum of the probabilities is 0.601, which is not 1 as required. Also, Ted clearly needs a new approach. 10. Not a probability distribution because the responses are not values of a numerical random variable. 11. Probability distribution with μ = ( ) + (1 0.) + ( 0.367) + (3 0.99) + (4 0.09) =. σ = (0.) (1.) 0. + (.) (3.) (4.) 0.09 = 1 1. Probability distribution with μ = (0 0.0) + ( ) + ( 0.05) + ( ) + (4 0.79) + ( ) + (6 0.08) = 4.6 σ = = 1 (0 4.6) 0.0 (1 4.6) ( 4.6) 0.05 (3 4.6) (6 4.6) Not a probability distribution because the responses are not values of a numerical random variable. Also, sum of the probabilities is 1.18 instead of 1 as required. 14. Not a probability distribution because the sum of the probabilities is instead of 1 as required. The discrepancy between and 1 is too large to attribute to rounding errors. 15. μ = ( ) + (1 0.01) + ( 0.044) + + (9 0.01) + ( ) = 5 σ = (0 5) (1 5) ( 5) (9 5) (10 5) = 1.6 Copyright 014 Pearson Education, Inc.

62 60 Chapter 5: Discrete Probability Distributions 16. Lower limit: μ σ = 5 (1.6) = 1.8 girls; Upper limit: μ+ σ = 5 + (1.6) = 8. girls Yes, 1 girl is an unusually low number of girls, because 1 girl is outside the range of usual values. 17. a. PX= ( 8) = b. PX ( 8) = = c. The result from part (b) d. No, because the probability of 8 or more girls is 0.055, which is not very low (less than or equal to 0.05) 18. a. PX= ( 1) = 0.01 b. PX ( 1) = = c. The result from part (b) d. Yes, because the probability of is very low (less than or equal to 0.05) 19. μ = ( ) + ( ) + ( 0.176) + ( ) + ( ) + (5 0) + (6 0) = 0.9 σ = (0 0.9) (1 0.9) (4 0.9) (5 0.9) 0 + (6 0.9) 0 = Lower limit: μ σ = 0.9 (0.9) = 0.9 ;Upper limit: μ+ σ = (0.9) =.7 Yes; 3 is above the range of usual values, so 3 is an unusually high number of failures among 6 cars tested. 1. a. PX= ( 3) = b. PX ( 3) = = c. The probability from part (b) d. Yes, because the probability of three or more failures is which is very low (less than or equal to 0.05). a. PX= ( 1) = b. PX ( 1) = = c. The result from part (b) d. No, because the probability of is not very low (less than or equal to 0.05) 3. a = 1000 b c. $500 $1 = $499 d. 1 $1 1 + $500 = $0.50 = 50 cents 1000 e. The $1 bet on the pass line in craps is better because its expected value of 1.4 cents is much greater than the expected value of 50 cents for the Texas Pick 3 lottery. 4. a = 10,000 b. 1 10, 000 c. $5000 $1 = $4999 d. 1 $1 1 + $5000 = $0.50 = 50 cents 10,000 e. Because both bets have the same expected value of 50 cents, neither bet is better than the other. Copyright 014 Pearson Education, Inc.

63 Chapter 5: Discrete Probability Distributions a $0.6 + $30 $5 = b. The bet on the number 7 is better because its expected value of 6 cents is greater than the expected value of 39 cents for the other bet. 6. a , ,000,000 = $131, b ( , ) + (1 131, ) + + (1,000, , ) = $53, c. Lower limit: $131, $53, = $375, Upper limit: $131, $53, = $638, d. Yes, because the values are above the range of usual values given in part (c) Section The given calculation assumes that the first two adults include Wal-Mart and the last three adults do not include Wal-Mart, but there are other arrangements consisting of two adults who include Wal-Mart and three who do not. The probabilities corresponding to those other arrangements should also be included in the result.. The format of Formula 5-5 requires that the probability p and the variable x refer to the same outcome. If p is the probability of an adult including Wal-Mart, then x should count the number of people who include Wal-Mart. 3. Because the 30 selections are made without replacement, they are dependent, not independent. Based on the 5% guideline for cumbersome calculations, the 30 selections can be treated as being independent. (The 30 selections constitute 3% of the population of 1000 responses, and 3% is not more than 5% of the population.) The probability can be found by using the binomial probability formula. 4. The 0+ indicates that the probability is a very small positive value. (The actual value is ) The notation of 0+ does not indicate that the event is impossible; it indicates that the event is possible, but very unlikely. 5. Not binomial. Each of the weights has more than two possible outcomes. 6. Binomial 7. Binomial 8. Not binomial. Each of the responses has more than two possible outcomes. 9. Not binomial. Because the senators are selected without replacement, the selections are not independent. (The 5% guideline for cumbersome calculations cannot be applied because the 40 selected senators constitute 40% of the population of 100 senators, and that exceeds 5%.) 10. Not binomial. Because the senators are selected without replacement, they are not independent.. (The 5% guideline for cumbersome calculations cannot be applied because the 10 selected senators constitute 10% of the population of 100 senators, and that exceeds 5%.). Also, the numbers of terms have more than two possible outcomes. 11. Binomial. Although the events are not independent, they can be treated as being independent by applying the 5% guideline. The sample size of 380 is no more than 5% of the population of all smartphone users. 1. Binomial. Although the events are not independent, they can be treated as being independent by applying the 5% guideline. The sample size of 47 is not more than 5% of the population of all women. Copyright 014 Pearson Education, Inc.

64 6 Chapter 5: Discrete Probability Distributions 13. a = b. {WWC, WCW, CWW}; 0.18 for each c = a = b. {MMXX, MXMX, MXXM, XXMM, XMXM, XMMX}; each has a probability of c = C = C C C = C C C = C C C = C = C = C = = C = C = PX ( ) = = ; yes 6. PX ( 5) = = C = PX ( ) = = ; yes, because the probability of or fewer peas with green pods is small (less than or equal to 0.05). 8. PX ( 5) = = ; no, because the probability of is not small (less than or equal to 0.05) 9. a C = 0.00 (Tech: ) 6 0 b. 6C = (Tech: ) c = 0.00 (Tech: ) d. Yes, the small probability from part (c) suggests that 5 is an unusually high number. 30. a. b. 5 7C = (Tech: ) 1 6 7C = (Tech: ) c = (Tech: ) d. Yes, the small probability from part (c) suggests that is unusually low Copyright 014 Pearson Education, Inc.

65 Chapter 5: Discrete Probability Distributions a. b. C = 0.38 C = c = (Tech: 0.737) d. No, the probability from part (c) is not small, so 1 is not an unusually low number 3. a. C = b. 8C = c = d. No, the probability from part (c) is not small, so 7 is not unusually high C =. No, because the probability of exactly 1 is 0.101, the probability of 1 or more is greater than 0.101, so the probability of getting 1 or more is not very small, so 1 us not unusually high C = The probability is not very small, so it is not unlikely 10 1C = No, because the flights all originate from New York, they are not randomly selected flights, so the 80.5% on-time rate might not apply 36. C =. No, because the probability of exactly 0 is 0.15, the probability of 0 or fewer having those concerns is greater than 0.15, so the probability of getting 0 or fewer is not very small, so 0 is not unusually low a. C = b = c. 1C C = d. Yes, the very low probability of would suggest that the 45 share value is wrong a. C ( ) C ( ) = With 1 booked passengers, there is a probability of that more than 19 passengers will show up, and that the flight will be overbooked. It does not seem wise to schedule in such a way that the flights will be overbooked about 37% of the time. b. c. C = C = C14 14C = d. Yes. The probability of getting 13 girls or a result of 14 girls is , so chance does not appear to be a reasonable explanation for the 13 girls. Because 13 is an unusually high number of girls, it appears that the probability of a girl is higher with the XSORT method, and it appears that the XSORT method is effective a. C = 0.1 b. No. If the success rate is equal to 50%, it is likely (with probability 0.5) that we get 8 successes or a result that is more extreme(fewer than 8 successes). This indicates that with a 50% success rate, the occurrence of 8 successes in 0 challenges could be reasonably explained by chance C =. It is not unlikely for such a combined sample to test positive C =. It is unlikely for such a combined sample to test positive. Copyright 014 Pearson Education, Inc.

66 64 Chapter 5: Discrete Probability Distributions C C = The probability shows that about /3 of all shipments will be accepted. With about 1/3 of the shipments rejected, the supplier would be wise to improve quality. C C C = About 98% of all shipments will be accepted. Almost all shipments will be accepted, and only % of the shipments will be rejected. 45. PX= ( 5) = 0.06( ) 4 = ! = !4!3! a. b. c. 6! 43! (6+ 43)! P(4) = = (6 4)!4! ( )!(6 4)! ( )!6! 6! 43! (6+ 43)! P(6) = = (6 6)!6! ( )!(6 6)! ( )!6! 6! 43! (6+ 43)! P(0) = = (6 0)!0! ( )!(6 0)! ( )!6! Section n = 70, p = 0.07, q = = 4. people. Yes, both expressions will yield the same result because they are equivalent. They are equivalent because q= 1 p 3. Variance is = 9.4 executives 4. The mean of executives is expressed with μ. The mean is calculated for the population of all groups of 150 executives, not just one sample group. Because the mean is calculated for a population, it is a parameter. 5. μ = np = = 1 correct guesses and σ = np(1 p) = = 3.1 correct guesses. Minimum: 1 (3.1) = 5.8 correct guesses, maximum: 1+ (3.1) = 18. correct guesses. 6. μ = np = = 7 girls and σ = np(1 p) = = 1.9 girls. Minimum: 7 (1.9) = 3., maximum 7 + (1.9) = 10.8 girls 7. μ = np = = worriers and σ = np(1 p) = = 15.1 worriers. Minimum: (15.1) = worriers, maximum: (15.1) = worriers 8. μ = np = = 6 subjects with headaches and σ = np(1 p) = =.4 subjects with headaches. Minimum: 6 (.4) = 1. subjects with headaches, maximum: 6 + (.4) = 10.8 subjects with headaches. 9. a. μ = np = = boys and σ = np(1 p) = = 8.5 boys b. Yes. Using the range rule of thumb, the minimum value is (8.5) = 18.5 boys and the maximum value is (8.5) = 16.5 boys. Because 39 boys is above the range of usual values, it is unusually high. Because 39 boys is unusually high, it does appear that the YSORT method of gender selection is effective. Copyright 014 Pearson Education, Inc.

67 Chapter 5: Discrete Probability Distributions a. μ = np = = 145 and σ = np(1 p) = = 10.4 b. No, it is within the range of usual values of 145 (10.4) = 14. and145+ (10.4) = It does not provide strong evidence against Mendel s theory. 11. a. μ = np = = 0 and σ = np(1 p) = = 4 b. No, because 5 orange M&Ms is within the range of usual values of 0 (4) = 1 and 0 + (4) = 8. The claimed rate of 0% does not necessarily appear to be wrong, because that rate will usually result in 1 to 8 orange M&Ms (among 100), and the observed number of orange M&Ms is within that range. 1. a. μ = np = = 14 and σ = np(1 p) = = 3.5 b. No, because 8 yellow M&Ms is within the range of usual values of 14 (3.5) = 7 and 14 + (3.5) = 1. The claimed rate of 14% does not necessarily appear to be wrong, because that rate will usually result in 7 to 1 yellow M&Ms (among 100), and the observed number of yellow M&Ms is within that range. 13. a. μ = np = 40, = 14.8 and σ = np(1 p) = 40, = 11.9 b. No, 135 is not unusually low or high because it is within the range of usual values 14.8 (11.9) = 119 and (11.9) = c. Based on the given results, cell phones do not pose a health hazard that increases the likelihood of cancer of the brain or nervous system. 14. a. μ = np = = 140 and σ = np(1 p) = = 8.4 b. The result of 13 correct identifications is just outside the range of usual values of 140 (8.4) = 13. and140+ (8.4) = 156.8, but this indicates that 13 is unusually low. If the touch therapists really had an ability to select the correct hand, they would have made more than correct identifications. Therefore, they do not appear to have that ability. 15. a. μ = np = = 156 and σ = np(1 p) = = 1.1 b. The minimum usual frequency is 156 (1.1) = and the maximum is 156+ (1.1) = The occurrence of r 178 times is not unusually low or high because it is within the range of usual values. 16. a. μ = np = = 330. and σ = np(1 p) = = 17 b. The minimum usual frequency is 330. (17) = 96. and the maximum is (17) = The occurrence of e 90 times is unusually because it is below the range of usual values. 17. a. μ = np = = 74 and σ = np(1 p) = = 7.7 b. The minimum usual number is 74 (7.7) = 58.6 and the maximum is 74+ (7.7) = The value of 90 is unusually high because it is above the range of usual values. 18. a μ = np = 50 = 1.3and σ = np(1 p) = 50 = b. The minimum usual value is 1.3 (1.1) = 0.9 and the maximum is 1.3+ (1.1) = 3.5 The result of 0 wins is not unusually low because 0 wins is within the range of usual values. Copyright 014 Pearson Education, Inc.

68 66 Chapter 5: Discrete Probability Distributions a. μ = np = 30 = and σ = np(1 p) = 30 = b. The minimum usual value is ( ) = and the maximum is ( ) = The result of students born on the 4 th of July would be unusually high, because is above the range of usual values a. μ = np = 600 = and 195, 49, , 49, 053 σ = np(1 p) = 600 = ,49, ,49,054 b. The minimum usual values is ( ) = and the maximum is ( ) = It is unusual to buy a ticket each week for 50 years and win at least once, because 1 win (or more) is outside the range of usual values. 1. From the range of usual values we get μ = 60 and σ = 6. Using the formulas for the mean and the σ μ standard deviation we get p = 1 = 0.4which leads to n = = 150 and q = 0.6 μ p Cn 30C1 n 3. The probability of selecting a girl out of 40 is given by PX ( = n) = the following table list 40C1 the probabilities of selecting the number of girls from 0 to 10 Number of girls (X = n) Probability P(X = n) The mean is μ = [ x P( x)] μ = = 3 The standard deviation is σ = [ x P( x)] μ σ = = 1.3 Copyright 014 Pearson Education, Inc.

69 Chapter 5: Discrete Probability Distributions 67 Section μ = = 0.99, which is the mean number of hits per region. x =, because we want the probability 576 that a randomly selected region had exactly hits, and e =.7188 which is a constant used in all applications of Formula The mean is μ = = 4., the standard deviation is σ = 4. =.1 and the variance is σ = With n = 50, the first requirement of n 100 is not satisfied. With n = 50 and p = the second requirement of np 10 is satisfied. Because both requirements are not satisfied, we should not use the Poisson distribution as an approximation to the binomial. 4. With n = 100 and p = the two requirements are satisfied. For 101 wins, the Poisson approximation gives a small positive probability, but the actual probability of 101 wins is 0 since it is impossible to get 101 wins in 100 tries e P(0) = = ; Yes it is unlikely. 0! e P(6) = = ; No it is not unlikely 6! e P(10) = = 0.11 ; No it is not unlikely 10! e P(1) = = ; No it is not unlikely 1! 9. a. 68 μ = = e b. 1 = ! c. Yes. Based on the result in part (b), we are quite sure (with probability 0.998) that there is at least one earthquake measuring 6.0 or higher on the Richter scale, so there is a very low probability (0.00) that there will be no such earthquake in a year a. μ = = e b. P(133) = = ! c. No. Although the probability of exactly 133 earthquakes measured at 6.0 or higher on the Richter scale is quite small (0.0346), the number 133 is so close to the mean of that this year would be quite ordinary, and it would not be unusual a. μ = = b e P(50) = = ! Copyright 014 Pearson Education, Inc.

70 68 Chapter 5: Discrete Probability Distributions μ = = e 9.8 e a. P(0) = = c. P() = = 0.1 0!! e 9.8 e b. P(1) = = d. P(3) = = ! 3! e e. P(4) = = The expected frequencies of 139, 97, 34, 8, and 1.4 compare 4! reasonably well to the actual frequencies, so the Poisson distribution does provide good results. 13. a e P() = = 0.17! b. The expected number of regions with exactly hits is 98. c. The expected number of regions with hits is close to 93, which is the actual number of regions with hits. 14. a. μ = = b e e P(0) = = 0.87 and P(1) = = So the probability of 0 or 1 is 0! 1! = c = d. No, the probability of more than one case is extremely small, so the probability of getting as many as four cases is even smaller. 15. a. b. 16. a. b e P(6) = = The expected value is = 1.9 cookies. The expected 6! number of cookies is very close to the actual number of cookies with 6 chocolate chips which is e P(30) = = The expected value is =.5 cookies. The expected 30! number of cookies is very different from the actual number of cookies with 6 chocolate chips which is e P(18) = = The expected value is = 3.5 cookies. The expected 18! number of cookies is not very close to the actual number of cookies with 18 chocolate chips which is e P(1) = = The expected value is = 3.3 cookies. The expected 1! number of cookies is very close to the actual number of cookies with 1 chocolate chips which is a. No. With n = 1 and p = 1 the requirement of n 100 is not satisfied, so the Poisson distribution 6 is not a good approximation to the binomial distribution. b. No. The Poisson distribution approximation to the binomial distribution yields e 1 5 P(3) = = 0.18 and the binomial distribution yields P(3) = 1C3 = ! 6 6. The Poisson approximation of 0.18 is too far from the correct result of Copyright 014 Pearson Education, Inc.

71 Chapter Quick Quiz 1. Yes = 0 5 Chapter 5: Discrete Probability Distributions σ = = 4 4. The range of usual values has a minimum value of = 180 and a maximum value of = 0. Therefore, 3 girls in 400 is an unusually high number of girls since it is outside the range of usual values. 5. The range of usual values has a minimum value of = 180 and a maximum value of = 0. Therefore, 185 girls in 400 is not an unusually high number of girls since it is inside the range of usual values. 6. Yes. The sum of the probabilities is and it can be considered to be indicates that the probability is a very small positive number. It does not indicate that it is impossible for none of the five flights to arrive on time. 8. Px ( 3) = = μ = = 4.0 and σ = = The range of usual values is from.36 to Since zero is outside the range of usual values it is an unusually low number. 10. μ = = 4.0 and σ = = The range of usual values is from.36 to Since 5 is inside the range of usual values it is not an unusually high number. Review Exercise 1. PX= = =. 0 6 ( 0) 6C ( = 4) = 6C = μ = = 40 and σ = = 1. The range of usual values has a minimum of 40 1 = 16 and a maximum value of = 64. The result of 00 with brown eyes is unusually low. 4. The probability of 39 or fewer ( 0.484) is relevant for determining whether 39 is an unusually low number. Because that probability is not very small, it appears that 39 is not an unusually low number of people with brown eyes. 5. Yes. The three requirements are satisfied. There is a numerical random variable x and its values are associated with corresponding probabilities. The sum of the probabilities is 1.001, so the sum is 1 when we allow for a small discrepancy due to rounding. Also, each of the probability values is between 0 and 1 inclusive. 6. μ = = 0.4 and σ = = 0.6 The range of usual values has a minimum of = 0.8 and a maximum value of = 1.6. Yes, 3 is an unusually high number of males with tinnitus among four randomly selected males. 7. The sum of the probabilities is 0.90 which is not 1as required. Because the three requirements are not satisfied, the given information does not describe a probability distribution. PX Copyright 014 Pearson Education, Inc.

72 70 Chapter 5: Discrete Probability Distributions $75 + $300 + $75, $500, $1, 000, 000 = $315, 075. Because the offer is well below her expected value, she should continue the game (although the guaranteed prize of $ had considerable appeal). 9. a $1,000,000 + $100,000 + $5, , 000, , 000, , 000, $ $500 = $ ,667,000 7,500,000 b. $0.01 minus the cost of the postage stamp. Since the expected value of winning is much smaller than the cost of a postage stamp, it is not worth entering the contest a. μ = = b e P(0) = = ! c = 16.5 days d. The expected number of days is 16.5, and that is reasonably close to the actual number of days which is 18. Cumulative Review Exercises 1. a The mean is x = = 4.4 hours 5 b. The median is 4. hours c. The range is 6.9. = 4.7 hours d. The standard deviation is s = (. 4.4) + ( ) + (4. 4.4) + ( ) + ( ) = e. The variance is.9 hours f. The minimum is = 1 hours and the maximum is = 7.8 hours. g. No, because none of the times are outside the range of the usual values h. Ratio i. Continuous j. The given times come from countries with very different population sizes, so it does not make sense to treat the given times equally. Calculations of statistics should take the different population sizes into account. Also, the sample is very small, and there is no indication that the sample is random a. = = c = , e b. d. P(1) = = ! x P(x) $ $ e. $ $ = $0.50 or 50 cents. Copyright 014 Pearson Education, Inc.

73 3. a. b. c. d = = = = 0.97 Chapter 5: Discrete Probability Distributions 71 e. f. g = = = Because the vertical scale begins at 60 instead of 0, the difference between the two amounts is exaggerated. The graph makes it appear that men s earnings are roughly twice those of women, but men earn roughly 1. times the earnings of women. 5. a. Frequency distribution or frequency table b. Probability distribution c x = = This value is a statistic d. μ = = 4.5. This value is a parameter e. The random generation of 1000 digits should have a mean close to 4.5 from part (d). The mean of 4.5 is the mean for the population of all random digits; so samples will have means that tend to center about a. b C = C = c. This is a voluntary response sample. This suggests that the results might not be valid, because those with a strong interest in the topic are more likely to respond. Copyright 014 Pearson Education, Inc.

74

75 Chapter 6: Normal Probability Distributions Section 6- Chapter 6: Normal Probability Distributions The word normal has a special meaning in statistics. It refers to a specific bell-shaped distribution that can be described by Formula The mean and standard deviation have values of μ = 0 and σ = 1 4. The notation zα represents the z score that has an area of α to its right ( 5 1.5) = ( ) = ( ) = P( z< 0.44) = ( 3 1) = P( z ) P( z ) P( z ) 10. P( z> 1.04) = < < 1.8 = < 1.8 < 0.84 = = (Tech: ) 1. P( 1.07 < z< 0.67) = P( z< 0.67) P( z< 1.07) = = z = z = z = z = P( z<.04) = P( z< 0.19) = P( z< 1.96) = P( z> 0.8) = = P( z> 1.8) = = P( z> 1.50) = = P( z> 0.84) = = P( z<.33) = P( 0.5 < z< 1.5) = P( z< 1.5) P( z< 0.5) = = (Tech: 0.956) 6. P( 1.3 < z<.37) = P( z<.37) P( z< 1.3) = = (Tech: ) 7. P(.75 < z<.00) = P( z<.00) P( z<.75) = = P( 1.93 < z< 0.45) = P( z< 0.45) P( z< 1.93) = = P(.0 < z<.50) = P( z<.50) P( z<.0) = = P( 0.6 < z< 1.78) = P( z< 1.78) P( z< 0.6) = = (Tech: ) Copyright 014 Pearson Education, Inc.

76 74 Chapter 6: Normal Probability Distributions 31. P(.11< z< 4.00) = P( z< 4.00) P( z<.11) = = (Tech: 0.987) 3. P( 3.90 < z<.00) = P( z<.00) P( z< 3.90) = = (Tech: 0.077) 33. P( z< 3.65) = P( z> 3.80) = P( z< 0) = P( z> 0) = P 90 = P 5 = P.5 = 1.96 and P 97.5 = P 0.5 =.575 and P 99.5 = z 0.05 = z 0.01 = z 0.05 = z 0.03 = P( 1< z< 1) = P( z< 1) P( z< 1) = = = 68.6% (Tech: 68.7%) 46. P( < z< ) = P( z< ) P( z< ) = = = 95.44% (Tech: 95.45%) 47. P( 3 < z< 3) = P( z< 3) P( z< 3) = = = 99.74% (Tech: 99.73%) 48. P( 3.5 < z< 3.5) = P( z< 3.5) P( z< 3.5) = = = 99.98% (Tech: 99.95%) 49. a. P( 1< z< 1) = P( z< 1) P( z< 1) = = = 68.6% (Tech: 68.7%) b. P( z< or z> ) = P( z< ) + P( z> ) = = = 4.56% c. P( 1.96 < z< 1.96) = P( z< 1.96) P( z< 1.96) = = = 95% d. P( < z< ) = P( z< ) P( z< ) = = = 95.44% (Tech: 95.45%) e. P( z> 3) = 1 P( z< 3) = = = 0.13% a. μ =.5 min. and σ = = 1.4 min. 1 b. The probability is 1 or , and it is very different from the probability of that would be 3 obtained by incorrectly using the standard normal distribution. The distribution does affect the results very much. Section a. μ = 0 and σ = 1 b. The z scores are numbers without units of measurements. a. The area equals the maximum probability value of 1. b. The median is the middle value and for normally distributed scores that is also the mean, which is 100. c. The mode is also 100. d. The variance is the square of the standard deviation which is The standard normal distribution has a mean of 0 and a standard deviation of 1, but a nonstandard normal distribution has a different value for one or both of those parameters. 4. No. Randomly generated digits have a uniform distribution, but not a normal distribution. The probability of a digit less than 3 is = Copyright 014 Pearson Education, Inc.

77 Chapter 6: Normal Probability Distributions z x = z x = z x = = = 1. which has an area of to the left of it = = 0.6 which has an area of to the right of it = =. which has an area of to the left of it. z x = 79 = = 1.4 which has an area of to the left of it. The area between the two scores is = z x = = = 1.6 which has an area of to the left of it. z x = 11 = = 0.8 which has an area of to the left of it. The area between the two scores is = z =.44, which means x = = z = 1, which mean x = = z =.07, which means x = = z = 1.33, which means x = = z x = 85 = = 1, which has an area of to the left of it z x = 70 = =, which has an area of to the right of it z x = 90 = = 0.67 which has an area of to the left of it. z x = 110 = = 0.67 which has an area of to the left of it. The area between the two scores is = (Tech: ) 16. z x = 10 = 1.33 which has an area of to the left of it z x = 110 = = 0.67 which has an area of to the left of it. The area between the two scores is = (Tech: ) 17. z = 1.7 which means the score is x = = z = 0.67 which means the score is x = = z = 0.67 which means the score is x = = z =.07 which means the minimum score is x = = a. z x = 78 = = 5.46 which has an area of to the left of it z x = 6 = = 0.69 which has an area of to the left of it. Therefore, the percentage of.6 qualified women is = or 75.48%. (Tech 95.56%.) Yes, about 5% of women are not qualified because of their heights b. z x = 78 = = 3.54 which has an area of to the left of it. z x = 6 = = which has an area of to the left of it. Therefore, the percentage of men is = or 99.90%. (Tech: 99.89%.) No, only about 0.1% of men are not qualified because of their heights. Copyright 014 Pearson Education, Inc.

78 76 Chapter 6: Normal Probability Distributions 1. (continued) c. The z score with % to the left of it is.04 which corresponds to a height of x = = 58.5 in. The z score with % to the right of it is.04 which corresponds to a height of x = = 69.1 in. d. The z score with 1% to the left of it is.33 which corresponds to a height of x = = 63.9 in. The z score with 1% to the right of it is.33 which corresponds to a height of x = = 75.1 in a. z x = 64 = = 0.08 and z x = 77 = = The area between the two z scores is = or 46.80%. (Tech: 46.93%.) b. z x = 64 = =.9 and z x = 77 = = The area between the two z scores is = or 98.81%. c. The z score with 3% to the left of it for women is 1.88 which corresponds to a height of = 58.9 in. The z score with 3% to the right of it for men is or 1.88 which corresponds to a height of = 74 in 3. a. The height minimum is = 56 in. and the height maximum is = 75 in. The z score for women for the minimum is = 3, and the z score for women for the maximum is = The area between the z scores is = or 99.86%.6 b. The z score for men for the minimum is = 5.63, and the z score for men for the maximum.4 is =.9. The area between the z scores is = or 98.98%. (Tech: %.) c. The z score with 5% for women to the left of it is 1.65 which corresponds to a height of = 59.5 in. the z score with 5% of men to the right of it is 1.65 which corresponds to a height of = 73.4 in. 4. a. The z score for the minimum height is = 7.46 which has an area of or 0.01% to.4 the left of it meaning that practically no man can fit without bending. (Tech: 0.00%.) b. The z score for the minimum height is = 4.69 which has an area of or 0.01% to.6 the left of it meaning that practically no women can fit without bending. (Tech: 0.00%.) c. The door design is very inadequate, but the jet is relatively small and seats only six people. A much higher door would require such major changes in the design and cost of the jet, that the greater height is not practical. d. The z score for 60% is 0.5 which corresponds to a height = 70.1 in for men. 5. a. The z score for 174 lb. is = 0. which has an area of to the left of it. (Tech: ) 3500 b = people c = 19.14, so 19 people 18.9 Copyright 014 Pearson Education, Inc.

79 Chapter 6: Normal Probability Distributions (continued) d. The mean weight is increasing over time, so safety limits must be periodically updated to avoid an unsafe condition 6. a. The z score that has 95% of the area to the left of it is 1.67 which corresponds to a height of = 3.4 in. If there is clearance for 95% of males, there will certainly be clearance for all women in the bottom 5% b. Men s z score is = 1.75 and that has an area of or 99.95%. Women s z score is = 3.55 and that has an area of or 99.99%. (Tech 99.98%.) The table will fit almost 1.1 everyone except about 4% of the men with the largest sitting knee heights 7. a. The z score for a 308 day pregnancy is =.67 which corresponds to a probability of or 0.38%. Either a very rare event occurred or the husband is not the father. b. The z score corresponding to 3% is 1.87 which corresponds to a pregnancy of = 40 days 8. a. The z score for a temperature of is = 3.87 which corresponds to a an area of = = 0.01% to the right of it.; yes b. The z score for a probability of 5% is 1.65 which corresponds to a temperature of = 99. degrees. 9. a. The z score for an earthquake of magnitude is = 1.39 which is or 91.77% of earthquakes. (Tech: 99.78%.) b. The z score is = 4.80 which is or 0.01% of earthquakes. (Tech: 0.00%.) c. The z score for 95% of earthquakes is or 1.65 which corresponds to an earthquake magnitude of =.15, so not all earthquakes about the 95th percentile will cause items to shake. 30. The z score for a probability of 99% is.33 which corresponds to a hip breadth of = in. 31. The z score for P 1 is.33 which corresponds to a count of = 17.9 chocolate chips. (Tech: 18 chocolate chips.) The z score for P 99 is.33 which corresponds to a count of = 30.1 chocolate chips. (Tech: 10.0 chocolate chips.) The values can be used to identify cookies with an unusually low number of chocolate chips or an unusually high number of chocolate chips, so those numbers can be used to monitor the production process to ensure that the numbers of chocolate chips stay within reasonable limits. 3. a. The minimum weight has a z score of = 0.5 which has a corresponding probability of and the maximum weight has a z score of = 0.5 which has a corresponding 0.06 probability of Therefore, the percentage of quarters rejected is 1 ( ) = (Tech: 61.71%.) That percentage is too high because most quarters will be rejected. b. The z score for a probability of the top.5% and the bottom.5% is and respectively. Therefore the weight minimum is = 5.5 g and the weight maximum is = 5.79 g Copyright 014 Pearson Education, Inc.

80 78 Chapter 6: Normal Probability Distributions 33. a. The mean is 67.5 beats per minute and the standard deviation is beats per minute. The histogram for the data confirms that the distribution is roughly normal Frequency PULSE b. The z score for the bottom.5% is 1.95 which corresponds to a pulse of = 47 beats per minute, and the z score for the top.5% is 1.95 which corresponds to a pulse of = 87.5 beats per minute. 34. a. The mean is lb. and the standard deviation is lb. The histogram confirms that the distribution of weights is roughly normal Frequency Diet Pepsi Weights b. The z score for the bottom 0.5% is.59 which has a corresponding weight of, = lb., the z score for the top 0.5% is.59 which has a corresponding weight of = lb. 35. a. The new mean is equal to the old one plus the new points which is 75. The standard deviation is unchanged at 10 (since we added the same amount to each student.) b. No, the conversion should also account for variation. c. The z score for the bottom 70% is 0.5 which has a corresponding score of = 45., and the z score for the top 10% is 1.8 which has a corresponding score of = 5.8 d. Using a scheme like the one in part (c), because variation is included in the curving process a. z x = 30 = =.31 which has a percentage of , z x = 0 = = 1.54 which has a.6.6 percentage of Therefore, the percentage between 0 and 30 chocolate chips is = or 9.78%. (Tech: 9.75%) b. z x = 30.5 = =.5 which has a percentage of and z x = 19.5 = = 1.73 which has.6.6 a percentage of Therefore, the percentage between 19.5 and 30.5 is = or 95.0%. c. The use of the continuity correction changes the result by a relatively small but not insignificant amount. Copyright 014 Pearson Education, Inc.

81 Chapter 6: Normal Probability Distributions The z score for Q 1 is 0.67, and the z score for Q 3 is The IQR is 0.67 ( 0.67) = IQR =.01, so Q1 1.5 IQR= =.68 and Q IQR= =.68 The percentage to the left of.68 is and the percentage to the right of.68 is Therefore, the percentage of an outlier is (Tech: ) 38. a. The z score for the 95 th percentile is This corresponds to an SAT score of = 04.4 or 04. The z score of corresponds to an ACT score of = or 9.5. b. The score of 100 corresponds to a z score of = 1.89 which corresponds to an ACT score 31 of = 30.73or Section a. The sample mean will tend to center about the population parameter of 5.67 g. b. The sample mean will tend to have a distribution that is approximately normal. c. The sample proportions will tend to have a distribution that is approximately normal.. a. Without replacement b. (1) When selecting a relatively small sample from a large population, it makes no significant difference whether we sample with replacement or without replacement. () Sampling with replacement results in independent events that are unaffected by previous outcomes, and independent events are easier to analyze and they result in simpler calculations and formulas. 3. Sample mean, sample variance, sample proportion 4. No. The data set is only one sample, but the sampling distribution of the mean is the distribution of the means from all samples, not the one sample mean obtained from this single sample. 5. No. The sample is not a simple random sample from the population of all college Statistics students. It is very possible that the students at Broward College do not accurately reflect the behavior of all college Statistics students. 6. a. Normal b = c. 5 = 0.5 which is the proportion of the five odd numbers {1, 3, 5, 7, 9} to the ten digits {0, 1,, 3, 4, 10 5, 6, 7, 8, 9} 7. a The mean of the population is μ = = 6, and the variance is 3 (4 6) + (5 6) + (9 6) σ = = b. The possible sample of size are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)} which have the following variances {0, 0.5, 1.5, 0.5, 0, 8, 1.5, 8, 0} respectively. Sample Variance Probability 0 3/9 0.5 /9 8 /9 1.5 /9 Copyright 014 Pearson Education, Inc.

82 80 Chapter 6: Normal Probability Distributions 7. (continued) c. The sample variances mean is = d. Yes. The mean of the sampling distribution of the sample variances (4.7) is equal to the value of the population variance (4.7) so the sample variances target the value of the population variance. 8. a. The population standard deviation (using the result from the previous problem) is σ = 4.7 =.160 b. By taking the square root of the sample variances from the previous problem we get Sample Standard Deviation Probability / /9.88 / /9 c. The mean of the sample standard deviations is = d. No. The mean of the sampling distribution of the sample standard deviations is 1.571, and it is not equal to the value of the population standard deviation (.160), so the sample standard deviations do not target the value of the population standard deviation. 9. a. The population median is 5 b. The possible sample of size are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)} which have the following medians {4, 4.5, 6.5, 4.5, 5, 7, 6.5, 7, 9} Sample Median Probability 4 1/9 4.5 /9 5 1/9 6.5 /9 7 /9 9 1/9 c. The mean of the sampling distribution of the sampling median is = 6 9 d. No. The mean of the sampling distribution of the sample medians is 6, and it is not equal to the value of the population median of 5, so the sample medians do not target the value of the population median. 10. a. The proportion of odd numbers is /3 (there are two odd numbers from the population of 4, 5, and 9) b. The possible sample of size are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)} which have the following proportion of odd numbers {0, 0.5, 0.5, 0.5, 1, 1, 0.5, 1, 1} Sample Proportion Probability 0 1/ /9 1 4/9 c. The mean of the sampling distribution of sample proportions is = = Copyright 014 Pearson Education, Inc.

83 Chapter 6: Normal Probability Distributions (continued) d. Yes. The mean of the sampling distribution of the sample proportion of odd numbers is /3, and it is equal to the value of the population proportion of odd numbers of /3, so the sample proportions target the value of the population proportion 11. a. The possible samples of size are {(56, 56), (56, 49), (56, 58), (56, 46), (49, 56), (49, 49), (49, 58), (49, 46), (58, 56), (58, 49), (58, 58), (58, 46), (46, 56), (46, 49), (46, 58), (46, 46)} Sample Mean Age Probability 46 1/ / /16 51 /16 5 / / / /16 57 / /16 b. The mean of the population is = 5.5 and the mean of the sample means is = c. The sample means target the population mean. Sample means make good estimators of population means because they target the value of the population mean instead of systematically underestimating or overestimating it. 1. a. The possible samples of size are {(56, 56), (56, 49), (56, 58), (56, 46), (49, 56), (49, 49), (49, 58), (49, 46), (58, 56), (58, 49), (58, 58), (58, 46), (46, 56), (46, 49), (46, 58), (46, 46)} Sample Median Age Probability 46 1/ / /16 51 /16 5 / / / /16 57 / /16 b. The median of the population is = 5.5 and the median of the sample medians is = 5.5. The two values are not equal. c. The sample medians do not target the population median of 5.5, so the sample medians do not make good estimators of the population medians Copyright 014 Pearson Education, Inc.

84 8 Chapter 6: Normal Probability Distributions 13. a. The possible samples of size are {(56, 56), (56, 49), (56, 58), (56, 46), (49, 56), (49, 49), (49, 58), (49, 46), (58, 56), (58, 49), (58, 58), (58, 46), (46, 56), (46, 49), (46, 58), (46, 46)} which have the following ranges and associated probabilities Sample Range Probability 0 4/16 /16 3 /16 7 /16 9 /16 10 /16 1 /16 b. The range of the population is = 1, the mean of the sample ranges is = The values are not equal. 16 c. The sample ranges do not target the population range of 1, so sample ranges do not make good estimators of the population range. 14. a. The possible samples of size are {(56, 56), (56, 49), (56, 58), (56, 46), (49, 56), (49, 49), (49, 58), (49, 46), (58, 56), (58, 49), (58, 58), (58, 46), (46, 56), (46, 49), (46, 58), (46, 46)} which have the following variances and associated probabilities Sample Variance Probability 0 4/16 / / / /16 50 /16 7 /16 b. The variance of the population is (56 5.5) + (49 5.5) + (58 5.5) + (46 5.5) = The mean of the sample variances is = The two values are equal c. The sample variances do target the population variance, so sample variances do make good estimators of the population variance. 15. The possible birth samples are {(b, b), (b, g), (g, b), (g, g)} Proportion of Girls Probability / 0.5 / 0.5 Yes. The proportion of girls in births is 0.5, and the mean of the sample proportions is 0.5. The result suggests that a sample proportion is an unbiased estimator of the population proportion. Copyright 014 Pearson Education, Inc.

85 16. The possible birth samples are {bbb, bbg, bgb, gbb, ggg, ggb, gbg, bgg} Proportion of Girls Probability 0 1/3 1/3 3/8 /3 3/8 3/3 1/8 Chapter 6: Normal Probability Distributions 83 Yes. The proportion of girls in 3 births is 0.5 and the mean of the sample proportions is 0.5. The result suggests that a sample proportion is an unbiased estimator of the population proportion. 17. The possibilities are: both questions incorrect, one question correct (two choices), both questions correct. a. Proportion Correct 0 1 Probability = = = b. The mean is = 0. 5 c. Yes. The sampling distribution of the sample proportions has a mean of 0. and the population proportion is also 0. (because there is 1 correct answer among 5 choices.) Yes, the mean of the sampling distribution of the sample proportions is always equal to the population proportion. 18. a. The proportions of 0, 0.5, and 1 have the following probabilities Proportion of Defective Probability 0 9 / / / 5 b. The mean is = c. Yes. The population proportion is 0.4 ( out of 5) and the mean of the sampling proportions is also 0.4. Yes, the mean of the sampling distribution of proportions is always equal to the population proportion P (0) = 0.5 ( 0)!( 0)! = 4 = 1, P (0.5) = 0.5 ( 0.5)!( 0.5)! =, P 1 (1) = 0.5 ( 1)!( 1)! =. The formula yields values which describes the sampling distribution of the sample proportions. The formula is just a different way of presenting the same information in the table that describes the sampling distribution. Copyright 014 Pearson Education, Inc.

86 84 Chapter 6: Normal Probability Distributions 0. Sample values of the mean absolute deviation (MAD) do not usually target the value of the population MAD, so a MAD statistic is not good for estimating a population MAD. If the population of {4, 5, 9} from Example 5 is used, the sample MAD values of 0, 0.5,, and.5 have corresponding probabilities of 3/9, /9, /9, and /9. For these values, the population MAD is, but the sample MAD values have a mean of 1.1, so the mean of the sample MAD values is not equal to the population MAD. Section Because the sample size is greater than 30, the sampling distribution of the mean ages can be approximated σ by a normal distribution with mean μ and standard deviation. 40. No. Because the original population is normally distributed, the sample means will be normally distributed for any sample size, not just those greater than μ x = 60.5 cm and it represents the mean of the population consisting of all sample means. 6.6 σ x = = 1.1cm, and it represents the standard deviation of the population consisting of all sample 36 means. 4. Because the digits are equally likely to occur, they have a uniform distribution. Because the sample means are based on samples of size 3 drawn from a population that does not have a normal distribution, we should not treat the sample means as having a normal distribution. 5. a z x =.7 = =, which has a probability of b z x = 07 = = 1., which has a probability of (Tech: ) a z x = = = 1, which has a probability of b z x = 05 = = 0.35, which has a probability of (Tech: ) a z x = 18.4 = = 1.5, which has a probability of = to the right of it 8.6 b z x = 04 = = 0.5 which has a probability of = to the right of it. (Tech: ) c. Because the original population has a normal distribution, the distribution of sample means is normal for any sample size. 8. a z x = 195 = = 1. which has an area of = to the right of it. (Tech: ) b z = = 1.45 which has an area of = to the right of it. (Tech: ) c. Because the original population has a normal distribution, the distribution of sample means is normal for any sample size Copyright 014 Pearson Education, Inc.

87 Chapter 6: Normal Probability Distributions a z x = 31.5 = = 3.0 and z x = = = 3 which has a probability of = between them. (Tech: ) b z x = 06 = = 0.37 and z x = = = which has a probability of = between them. (Tech: ) 10. a z x = 00 = = 0.64 and z x = 180 = =.97 which have a probability of = between them. (Tech: ) b z x = 06 = = 0.41 and z x = = = which has a probability of = between them. (Tech: ) = = 1. which has a probability of = to the right of it. The elevator appears to be relatively safe because there is a very small chance that it will be overloaded with 16 male passengers. (Tech: ) 11. z x = = =.09 which has a probability of = to the right of it. The elevator appears to be relatively safe because there is a very small chance of overloading. Using the outdated mean that is too low has the effect of making the elevator appear to be much safer than it actually is. (Tech: ) 1. z x = a. z x = 5 = =.94 and z x = 1 = =.06, so = = 97.87% of women can fit into the hats. (Tech: ) b. The z scores for the smallest.5% and the largest.5% head circumferences are 1.96 and 1.96 respectively. This corresponds to head circumferences of 0.8 ( 1.96) +.65 = 1.08 and = c. z x = 3 = = 3.5 and z x = = = which have a probability of = = 99.98% between them. No, the hats must fit individual women, not the mean from 64 women. If all hats are made to fit head circumferences between in. and 3 in., the hats will not fit about half those women a. z x = = = 3.8 which has a probability of So the percentage is 99.99% b. z x = 18.5 = = 1.8 which has a probability of No, when considering the diameters of 1 36 manholes, we should use a design based on individual men, not samples of 36 men. Copyright 014 Pearson Education, Inc.

88 86 Chapter 6: Normal Probability Distributions 15. a. The mean weight of passengers is = lb b. z x = 140 = = 5.6 which has a probability of (or ) to the right of it. (Tech: ) c. z x = 175 = = 0.87 which has a probability of to the right of it. (Tech: ) d. Given that there is a probability of exceeding the 3500 lb. limit when the water taxi is loaded with 0 random men, the new capacity of 0 passengers does not appear to be safe enough because the probability of overloading is too high a. z x = = = 0.06 which has a probability of = to the right of it (Tech: ) b. z x = = = 1.5 which has a probability of = to the right of it (Tech: ) Instead of filling each bag with exactly 465 M&Ms, the company probably fills the bags so that the weight is as stated. In any event, the company appears to be doing a good job of filling the bags a. z x = 167 = = 0.39 which has a probability of = to the right of it. (Tech: ) b. z x = 167 = = 1.35 which has a probability of = to the right of it c. There is a high probability that the gondola will be overloaded if it is occupied by 1 more people, so it appears that the number of allowed passengers should be reduced. 18. a. The z score for 1% is.33 which corresponds to a pulse rate of = 50.5 beats per minute. The z score for 99% is.33 which corresponds to a pulse rate of = beats per minute b. z x = 85 = = 3.3 and z x = = = which have a probability of = between them c. Instead of the mean pulse rate from the patients in a day, the cutoff values should be based on individual patients, so it would be better to use the pulse rates of 50.5 beats per minute and beats per minute. Copyright 014 Pearson Education, Inc.

89 Chapter 6: Normal Probability Distributions a. z x = 11 = = 1.01 and z x = 140 = = 0.55 which have a probability of = between them. (Tech: ) b. z x = 11 = = 6.05 and z x = = = which have a probability of = between them. (Tech: ) c. Part (a) because the ejection seats will be occupied by individual women, not groups of women a. z x = 140 = = 7.44 which has a probability of = to the right of it. (Tech: when rounded to four decimal places.) b. z x = 140 = = 0.8 which has a probability of = to the right of it. (Tech: ) a. z x = 7 = = 1.04 which has a probability of (Tech: ) b. z x = 7 = = which has a probability of (Tech: when rounded to four decimal places.) c. The probability of Part (a) is more relevant because it shows that 85.08% of male passengers will not need to bend. The result from part (b) gives us useful information about the comfort and safety of individual male passengers. d. Because men are generally taller than women, a design that accommodates a suitable proportion of men will necessarily accommodate a greater proportion of women z x = = =.8 which has a probability of = to the right of it. There is a probability that the aircraft is overloaded. Because that probability is so high, the pilot should take action, such as removing excess fuel and/or requiring that some passengers disembark and take a later flight. 3. a. Yes. The sampling is without replacement and the sample size of 50 is greater than 5% of the finite population size of 75. σ x = = = = 4.63 and z x = 95.5 = = 0.4 which have a probability of = (Tech: ) b. z x = a. Yes. The sampling is without replacement (because each sample of 16 elevator passengers consists of 16 different people) and the sample size of 16 is greater than 5% of the finite population size of σ x = = b = lb. Copyright 014 Pearson Education, Inc.

90 88 Chapter 6: Normal Probability Distributions 4. (continued) c. z x = = = 1.08 which has a probability of = The probability is not as low as it should be, since it would be overloaded 14% of the time. (Tech: ) d. The z score for is 3.1 which means that we need to solve for n in the equation n 3.1= which has a solution of 14 n 16 1 n = passengers. 5. a (4 6) + (5 6) + (9 6) μ = = 6, σ = = b. The possible samples of size are:{ (4, 5), (4, 9), (5, 4), (5, 9), (9, 4), (9, 5)} which have the following means {4.5, 6.5, 4.5, 7, 6.5, 7} respectively. c μ x = = 6 and 6 σ x (4.5 6) + (6.5 6) + (4.5 6) + (7 6) + (6.5 6) + (7 6) = = d. It is clear that μ= μ x = 6. Section 6-6 σx = = = σ The histogram should be approximately bell-shaped, and the normal quantile plot should have points that approximate a straight line pattern.. Either the points are not reasonably close to a straight line pattern, or there is some systematic pattern that is not a straight line pattern. 3. We must verify that the sample is from a population having a normal distribution. We can check for normality using a histogram, identifying the number of outliers, and constructing a normal quantile plot. 4. Because the histogram is roughly bell-shaped, conclude that the data are from a population having a normal distribution. 5. Not normal. The points show a systematic pattern that is not a straight line pattern. 6. Normal. The points are reasonably close to a straight line pattern, and there is no other pattern that is not a straight line pattern. 7. Normal. The points are reasonably close to a straight line pattern, and there is no other pattern that is not a straight line pattern. 8. Not normal. The points are not reasonably close to a straight line pattern, and there appears to be a pattern that is not a straight line pattern. Copyright 014 Pearson Education, Inc.

91 Chapter 6: Normal Probability Distributions Not normal Frequency Arrival Delay Normal 1 10 Frequency Height Normal 7 6 Frequency Systolic Not normal Frequency No Exposure Copyright 014 Pearson Education, Inc.

92 90 Chapter 6: Normal Probability Distributions 13. Not normal 14. Normal 15. Normal 16. Not normal Copyright 014 Pearson Education, Inc.

93 Chapter 6: Normal Probability Distributions Normal. The points have coordinates ( 131, 1.8), (134, 0.5), (139, 0), (143, 0.5), (145, 1.8) 18. Not normal. The points have coordinates (13, 1.38), (14, 0.67), (15, 0.1), (15, 0.1), (31, 0.67), (37, 1.38) 19. Not normal. The points have coordinates (1034, 1.53), (1051, 0.89), (1067, 0.49), (1070, 0.16), (1079, 0.16), (1079, 0.49), (1173, 0.89), (17, 1.53) Copyright 014 Pearson Education, Inc.

94 9 Chapter 6: Normal Probability Distributions 0. Normal. The points have coordinates (0.85, 1.59), (0.85, 0.97), (0.855, 0.59), (0.864, 0.8), (0.869, 0), (0.886, 0.8), (0.887, 0.59), (0.91, 0.97), (0.94, 1.59) 1. a. Yes. b. Yes. c. No.. a. The magnitudes are from a normally distributed b. The original measurements have a lognormal distribution c. We can reverse the process of taking values be an exponent of 10. The normal quantile plot indicates that the original values are not from a population with a normal distribution. 3. The original values are not from a normally distributed population Probability Net Worth Copyright 014 Pearson Education, Inc.

95 Chapter 6: Normal Probability Distributions (continued) After taking the logarithm of each value, the values appear to be from a normally distributed population Probability Log(Net Worth) 4 5 The original values are from a population with a lognormal distribution. Section The Minitab display shows that the region representing 35 wins is a rectangle. The result of is an approximation, but the result of is better because it is based on an exact calculation. The approximation differs from the exact result by a very small amount.. The continuity correction is used to compensate for the fact that a continuous distribution (normal) is used to approximate a discrete distribution (binomial). The discrete number of 13 is represented by the interval from 1.5 to p = = 0., q = = 0.8, μ = 5 0. = 5, σ = =. The value of 5 for the mean shows 5 5 that for many people who make random guesses for the 5 questions, the mean number of correct answers is 5. For many people who make random guesses, the standard deviation of is a measure of how much the numbers of correct responses vary. 4. Yes. The circumstances correspond to 5 independent trials of a binomial experiment in which the probability of success is 0.. Also, with n = 5, p = 0., q = 0.8, the requirements of np 5 and nq 5 are both satisfied. 5. The requirements are satisfied with a mean of = 5. and the standard deviation of = Therefore, z x =.5 = = 1.53 which has a probability of (Tech: ) 6. The requirement of nq 5 is not satisfied. Normal approximation should not be used. 7. The requirement of nq 5 is not satisfied. Normal approximation should not be used. 8. The requirements are satisfied with a mean of 10 and a standard deviation of = Therefore, z x = 9.5 = = 0.0 which has a probability of = to the right of it (Tech: ) 9. μ= =, σ = = z x = = = 0.60 which has a probability of (Tech: ) Copyright 014 Pearson Education, Inc.

96 94 Chapter 6: Normal Probability Distributions 4.5 = = 0.60 which has a probability of = to the right of it. (Tech: ) z x = = = 0.1 and z x = 3.5 = = 0.36 which have a probability of = between them. (Tech: ) z x = = = 0.84 and z x = 19.5 = = 0.60 which have a probability of = (Tech: ) 10. z x = μ= = 183.3, σ = = a. z x = 17.5 = = 0.95 and z x = = = 1.04 which have a probability of = between them. (Tech using normal approximation: 0.014; Tech using binomial: 0.017) b. z x = 17.5 = = 0.95 which has a probability of The result of 17 overturned calls is not unusually low. (Tech using normal approximation: 0.170; Tech using binomial: ) c. The result from part (b) is useful. We want the probability of getting a result that is at least as extreme as the one obtained. d. If the 30% rate is correct, there is a good chance (17.11%) of getting 17 or fewer calls overturned, so there is not strong evidence against the 30% rate. 14. μ= = 01.63, σ = = a. z x = 17.5 = =.51and z x = = =.59 which have a probability of = between them. (Tech using normal approximation: ; Tech using binomial: ) b. z x = 17.5 = =.51which has a probability of The result of 17 overturned calls is unusually low. (Tech using normal approximation: ; Tech using binomial: ) c. The result from part (b) is useful. We want the probability of getting a result that is at least as extreme as the one obtained. d. If the 33% rate is correct, there is a very small chance (0.6%) of getting 17 or fewer calls overturned, so there is not strong evidence against the 33% rate. 15. μ= = 435, σ = = a. z x = 48.5 = = 0.6 and z x = 47.5 = = 0.7 which have a probability of = between them. (Tech using normal approximation: ; Tech using binomial: ) b. z x = 48.5 = = 0.6 which has a probability of The result of 48 peas with green pods is not unusually low. (Tech using normal approximation: 0.665; Tech using binomial: ) c. The result from part (b) is useful. We want the probability of getting a result that is at least as extreme as the one obtained. Copyright 014 Pearson Education, Inc.

97 Chapter 6: Normal Probability Distributions (continued) d. No. Assuming that Mendel s probability of 3/4 is correct, there is a good chance (6.76%) of getting the results that were obtained. The obtained results do not provide strong evidence against the claim that the probability of a pea having a green pod is 3/4 16. μ= = 51, σ = = a. z x = 90.5 = =.88 which has a probability of = to the right of it. (Tech using normal approximation: 0.000; Tech using binomial: ) b. Because the probability of getting 91 or more with the value of 5% is so small, the result of 91 is unusually high. c. The results do suggest that the rate is greater than 5%. 17. μ= = 47.5, σ = = a. z x = = = 6.48 and z x = = = 6.41 which have a probability of or 0+ (a very small positive probability that is extremely close to 0) between them b. z x = = = 6.41 which has a probability of (Tech: or 0+, which is a very small positive probability that is extremely close to 0). If boys and girls are equally likely, 879 girls in 945 births is unusually high. c. The result from part (b) is more relevant, because we want the probability of a result that is at least as extreme as the one obtained. d. Yes. It is very highly unlikely that we would get 879 girls in 945 births by chance. Given that the 945 couples were treated with the XSORT method, it appears that this method is effective in increasing the likelihood that a baby will be a girl. 18. μ= = , σ = = z x = = = 8.93 which has a probability of to the right of it. (Tech using normal approximation: or 0+; Tech using binomial: or 0+.) It appears that many adult males say that they wash their hands in a public restroom when they actually do not. 19. μ= = 611., σ = = z x = = = 5.78 which has a probability of to the right of it. (Tech ) The result suggests that the surveyed people did not respond accurately. 0. μ= 40, = 14.83, σ = 40, = z x = = = 0.61 which has a probability of (Tech using normal 40, approximation: 0.697; Tech using binomial: 0.76.) Media reports appear to be wrong. 1. The probability of six or fewer should be computed. μ= = 10, σ = = z x = 6.5 = = 1.4 which has a probability of (Tech using normal approximation: ; Tech using binomial: ) Because that probability is not very small, the evidence against the rate of 0% is not very strong. Copyright 014 Pearson Education, Inc.

98 96 Chapter 6: Normal Probability Distributions. The probability of three or fewer should be computed. μ= = 10, σ = = z x = 3.5 = =.30 which has a probability of (Tech using normal approximation: ;.884 Tech using binomial: ) Because that probability is very small, the evidence against the rate of 0% is very strong. It appears that the rate of smoking among statistics students is lower than the 0% rate for the general population. 3. The probability of 170 or fewer should be computed. μ= = 00, σ = = z x = = =.33 which has a probability of (Tech using normal approximation: ; Tech using binomial: ) Because the probability of 170 or fewer is so small with the assumed 0% rate, it appears that the rate is actually less than 0%. 4. The probability of 175 or more should be computed. μ= = 167.5, σ = = z x = = = 0.94 which has a probability of = to the right of it (Tech using normal approximation: 0.173; Tech using binomial: ) If the internet access rate is 67%, there is a relatively high probability of 17.3% of getting 175 or more households with internet access when 50 households are surveyed. It does not appear that the 67% rate is too low. 5. a. In order to make a profit Marc will need to win over $1000. With 35:1 odds a $5 bet wins $175. Therefore, Marc needs 6 winning bets in order to make a profit μ= 00 = 5.63, σ = 00 = = = 0.10 which has a probability of = to the right of it. z x = (Tech using normal approximation: ; tech using binomial: ) b. Since the odds of winning are 1:1 Marc would need 101 wins or more to make a profit. μ= 00 = , σ = 00 = z x = = = 0.7 which has a probability of = (Tech using normal approximation: ; tech using binomial: 0.393) c. The roulette game provides a better likelihood of making a profit. 6. The z score that corresponds to a 0.95 probability is This means that we have to solve the equation n n = 13 for n. This has a solution of 9 reservations. (Tech: 30.) Chapter Quick Quiz 1. μ = 0 and σ = 1 Copyright 014 Pearson Education, Inc.

99 Chapter 6: Normal Probability Distributions P 98 =.05 (Tech:.05375) 4. P( z> 1) = 1 P( z< 1) = = P( 1.37 < z<.4) = P( z<.4) P( z< 1.37) = = (Tech: ) 6. z x = z x = = = 0.99 which have a probability of (Tech: ) = =.15 which has a probability of = (Tech: ) The z score for P 80 = 0.84 which corresponds to a red blood count of = z = = 1.74 which has a probability of = = 0.99 and z x = 5.4 = =.15 which have a probability of = 0.831or 8.31%. (Tech: 8.6%) 10. z x = 4. Review Exercises 1. a. The probability to the left of a z score of.93 is b. The probability to the right of a z score of 1.53 is = c. The probability between z scores 1.07 and.07 is = d. The z score for P 30 = 0.5 e z x = 0.7 = = 1.08 which has a probability of = a z x = 1605 = = 1.41 which has a probability of = or 7.93%. (Tech: 7.89%.) 63 b. The z score for the lowest 1% is.33 which corresponds to a standing eye height of = mm. (Tech: ) 3. a z x = 1500 = =.03 which has a probability of = or 97.88% 66 b. The z score for the lowest 95% is which corresponds to a standing eye height of = mm Copyright 014 Pearson Education, Inc.

100 98 Chapter 6: Normal Probability Distributions 4. a. Normal distribution b. μ = 1.1 x c. 5.1 σ = = 0.57 x a. An unbiased estimator is a statistic that targets the value of the population parameter in the sense that the sampling distribution of the statistic has a mean that is equal to the mean of the corresponding parameter. b. Mean, variance and proportion c. True 6. a z x = 7 = = 1.04 which has a probability of or 85.08%. (Tech: 85.1.) With about.4 15% of all men needing to bend, the design does not appear to be adequate, but the Mark VI monorail appears to be working quite well in practice. b. The z score for 99% is.33 which corresponds to a doorway height of = a z x = 175 = = 0.19 which has a probability of = (Tech: ) 40.9 b z x = 175 = =.8 which has a probability of = Yes, if the plane is full of male passengers, it is highly likely that it is overweight. 8. a. No. A histogram is far from bell shaped Frequency Salary b. No. The sample has a size of 6 which does not satisfy the condition at least 30, and the values do not appear to be from a population having a normal distribution μ= 1064 = 798, σ = 1064 = z z = = = 0.74 which has a probability of (Tech using normal approximation: ; Tech using binomial: 0.78.) The occurrence of 787 offspring plants with long stem is not unusually low because its probability is not small. The results are consistent with Mendel s claimed proportion of 3/4 Copyright 014 Pearson Education, Inc.

101 Chapter 6: Normal Probability Distributions μ= = 51., σ = = a. z x = 49.5 = = 0.53 which has a probability of = to the right of it. (Tech 3. using normal approximation: 0.704; Tech using binomial: ) b. z x = 49.5 = 0.53 and z x = 50.5 = = 0. which have a probability of = between them. (Tech using normal approximation: ; Tech using binomial: ) Cumulative Review Exercises 1. a. 14,500, ,000, ,000, ,000,000+ 3,500,000 x = = $10,300,000 5 b. The median is $14,000,000 c. s = (14,500 10,300) + (14,500 10,300) ( ,300) + ( ,300) 5 1 = $ (in thousands of dollars) which is $5,55,07. d. s = e. z x = 14,500,000 30,85, 003,810, 000 square dollars 14,500,000 10,300,000 = = ,55,07 f. Ratio g. Discrete h. No, the starting players are likely to be the best players who receive the highest salaries.. a. A is the event of selecting someone who does not have the belief that college is not a good investment. NOTE: This is not the same as selecting someone who believes that college is a good investment. b. PA= ( ) 1 0.1= 0.9 c. P = = d. The sample is a voluntary response sample. This suggests that the 10% rate might not be very accurate, because people with strong feelings or interest about the topic are more likely to respond. 3. a z x = 500 = = 1.53 which has a probability of (Tech: 0.067) 567 b. The z score for the bottom 10% is 1.8, which correspond to the weight = 64.4 g. (Tech: 64 g.) c z x = 1500 = = 3.3 which has a probability of d z x = 3400 = = 0.7 which has a probability of = (Tech: 0.393) to the right of it. Copyright 014 Pearson Education, Inc.

102 100 Chapter 6: Normal Probability Distributions 4. a. The vertical scale does not start at 0, so differences are somewhat distorted. By using a scale ranging from 1 to 9 for frequencies that range to 14, the graph is flattened, so differences are not shown as they should be. b. The graph depicts a distribution that is not exactly normal, but it is approximately normal because it is roughly bell shaped. c. Minimum: 4 years; maximum: 70 years. Using the range rule of thumb, the standard deviation is estimated to be 70 4 = 7 years. The estimate of 7 years is very close to the actual standard 4 deviation of 6.6 years, so the range rule of thumb works quite well here. 5. a. PX= ( 3) = = b. PX ( 1) = 1 PX ( = 0) = 1 ( ) = 0.71 c. The requirement that np 5 is not satisfied, indicating that the normal approximation would result in errors that are too large. d. μ = = 5 e. σ = =.113 f. No, 8 is within two standard deviations of the mean and is within the range of values that could easily occur by chance. Copyright 014 Pearson Education, Inc.

103 Chapter 7: Estimates and Sample Sizes Section 7- Chapter 7: Estimates and Sample Sizes The confidence level (such as 95%) was not provided.. When using 6% to estimate the value of the population percentage, the maximum likely difference between 6% and the true population percentage is three percentage points, so the interval from 3% to 9% is likely to contain the true population percentage. 3. ˆp = 0.6 is the sample proportion; ˆq = 0.74 (found from evaluating 1 ˆp ); n = 1910 is the sample size; E = 0.03 is the margin of error; p is the population proportion, which is unknown. The value of α is The 95% confidence interval will be wider than the 80% confidence interval. A confidence interval must be wider in order for us to be more confident that it captures the true value of the population proportion. (Think of estimating the age of a classmate. You might be 90% confident that she is between 0 and 30, but you might be 99.9% confident that she is between 10 and 40.) (Tech:.576) E = = 0.061, so 0.15 ± E = = 0.085, so 0.50 ± < p < < p < a. p ˆ = = ( )( ) b. pq ˆˆ E = zα / = 1.96 n 100 = c. pˆ E< p< pˆ E < p< < p< d. We have 95% confidence that the interval from to actually does contain the true value of the population proportion. 14. a. 490 p ˆ = = b. c ( )( ) pq ˆˆ E = zα / = = n pˆ E< p< pˆ E < p < < p < d. We have 99% confidence that the interval from to actually does contain the true value of the population proportion. Copyright 014 Pearson Education, Inc.

104 10 Chapter 7: Estimates and Sample Sizes 15. a p ˆ = = ( )( ) b. pq ˆˆ E = zα / = 1.65 n 518 = pˆ E< p< pˆ E c < p < < p < d. We have 90% confidence that the interval from to actually does contain the true value of the population proportion. 16. a. 543 p ˆ = = ( )( ) b. pq ˆˆ E = zα / = 1.8 n 1005 = pˆ E< p< pˆ E c < p < < p < d. We have 80% confidence that the interval from 0.50 to actually does contain the true value of the population proportion. 17. a. 879 p ˆ = = ( 945)( 945) pq ˆˆ 879 pˆ ± z b. α / = ± 1.96 n < p < c. Yes. The true proportion of girls with the XSORT method is substantially greater than the proportion of (about) 0.5 that is expected when no method of gender selection is used a. p ˆ = = ( 91)( 91) pq ˆˆ 39 pˆ ± z b. α / = ±.56 n < p < c. Yes. The true proportion of boys with the YSORT method is substantially greater than the proportion of (about) 0.5 that is expected when no method of gender selection is used. 19. a b. p ˆ = = c ( 80)( 80) pq ˆˆ 13 pˆ ± zα / = ±.56 n < p< or 36.3% < p< 51.5% d. If the touch therapists really had an ability to select the correct hand by sensing an energy field, their success rate would be significantly greater than 0.5, but the sample success rate of and the confidence interval suggest that they do not have the ability to select the correct hand by sensing an energy field. Copyright 014 Pearson Education, Inc.

105 Chapter 7: Estimates and Sample Sizes a ( 580)( 580) pq ˆˆ 15 pˆ ± zα / = ± z0.05 n < p< 0.98 or.6% < p< 9.8% b. No, the confidence interval includes 0.5, so the true percentage could easily equal 5%. 1. a. 47( 0.9) = 14 b. pq ˆˆ ( 0.9)( 0.71) pˆ ± zα / = 0.9 ± z0.05 n < p< or 4.7% < p< 33.3% c. Yes. Because all values of the confidence interval are less than 0.5, the confidence interval shows that the percentage of women who purchase books online is very likely less than 50%. d. No. The confidence interval shows that it is possible that the percentage of women who purchase books online could be less than 5%. e. Nothing.. If the subjects chose to respond to the posted question, the sample is a voluntary response sample, so the confidence interval could be very misleading. pq ˆˆ ( 0.08)( 0.79) pˆ ± zα / = 0.08± z0.05 n < p< 0.74 or 14.% < p< 7.4% (using x = 30: 14.% < p < 7.5%). 3. a. 514( 0.459) = 36 b. c. pq ˆˆ ( 0.459)( 0.541) pˆ ± zα / = 0.459± z0.05 n 514 (using x = 36: < p < 0.516) < p < pq ˆˆ ( 0.459)( 0.541) pˆ ± zα / = 0.459± z0.10 n < p < d. The 95% confidence interval is wider than the 80% confidence interval. A confidence interval must be wider in order to be more confident that it captures the true value of the population proportion. (See Exercise 4.) 4. a. 514( 0.90) = 463 b. c. pq ˆˆ ( 0.90)( 0.10) pˆ ± zα / = 0.90 ± z0.005 n 514 (using x = 463: < p < 0.935) < p < pq ˆˆ ( 0.90)( 0.10) pˆ ± zα / = 0.90 ± z0.10 n 514 (using x = 463: < p < 0.918) < p < d. The 95% confidence interval is wider than the 80% confidence interval. A confidence interval must be wider in order to be more confident that it captures the true value of the population proportion. (See Exercise 4.) Copyright 014 Pearson Education, Inc.

106 104 Chapter 7: Estimates and Sample Sizes 5. No, the confidence interval limits contain the value of 0.13, so the claimed rate of 13% could be the true percentage for the population of brown M&Ms. 6. a. 7. a. pq ˆˆ ( 0.08)( 0.9) pˆ ± zα / = 0.08± z0.01 n < p < (Tech: < p < 0.143) pq ˆˆ ( 0.70)( 0.30) pˆ ± zα / = 0.70 ± z0.01 n 100 (Tech: < p < 0.733) < p < b. No. Because 0.61 is not included in the confidence interval, it does not appear that the responses are consistent with the actual voter turnout. pq ˆˆ ( )( ) pˆ ± zα / = ± z0.05 n 40,095 (using x = 135: 0.076% < p < %) % < p < % b. No, because % is included in the confidence interval. 8. a. 3005( 0.817) = pq ˆˆ ( 0.817)( 0.183) pˆ ± z b. α / = ± z0.005 n < p< 0.89 or 80.5% < p< 8.9% c. Nothing. [ z ] ˆˆ [ ] α / pq ( 0.5) n = = = 75 E 0.03 [ z ] ˆˆ [ ] α / pq 1.8 ( 0.5) n = = = 56 (Tech: 57) E 0.04 [ z ] ˆˆ [ ] α / pq.575 ( 0.15)( 0.85) n = = = 339 E 0.05 [ z ] ˆˆ [ ] α / pq.33 ( 0.15)( 0.85) n = = = 770 (Tech: 767) E a. b. [ z ] ˆˆ [ ] α / pq 1.96 ( 0.5) n = = = 1537 E 0.05 [ z ] ˆˆ [ ] α / pq 1.96 ( 0.38)( 0.6) n = = = 1449 E a. [ z ] ˆˆ [ ] α / pq.575 ( 0.5) n = = = 16,577 (Tech: 16,588) E 0.01 b. [ z ] ˆˆ [ ] α / pq.575 ( 0.90)( 0.10) n = = = 5968 (Tech: 597) E 0.01 c. Yes. Using the additional survey information from part (b) dramatically reduces the sample size. Copyright 014 Pearson Education, Inc.

107 Chapter 7: Estimates and Sample Sizes a. [ z ] ˆˆ [ ] α / pq ( 0.5) n = = = 71 E 0.05 b. [ z ] ˆˆ [ ] α / pq ( 0.85)( 0.15) n = = = 139 (Tech: 138) E 0.05 c. No. A sample of students at the nearest college is a convenience sample, not a simple random sample, so it is very possible that the results would not be representative of the population of adults. 36. a. [ z ] ˆˆ [ ] α / pq 1.8 ( 0.5) n = = = 456 (Tech: 457) E 0.03 [ z ] ˆˆ [ ] α / pq 1.8 ( 0.84)( 0.16) b. n = = = 45 (Tech: 46) E 0.03 c. No. Flights between New York and San Francisco might not be representative of the population of all Southwest flights. 37. Greater height does not appear to be an advantage for presidential candidates. If greater height is an advantage, then taller candidates should win substantially more than 50% of the elections, but the confidence interval shows that the percentage of elections won by taller candidates is likely to be anywhere between 36.% and 69.7% ( 34)( 34) pq ˆˆ p ˆ pˆ ± z = = α / = ± 1.96 n < p< or 36.% < p< 69.7%. 38. No, the confidence interval is based on sample data consisting of flights from New York (JFK) to Los Angeles, and arrival delays for that route might be very different from arrival delays for the population that includes all routes. 44 p ˆ = = ( 48)( 48) pq ˆˆ 44 pˆ ± zα / = ± n < p< or 85.1% < p< 98.0%. 39. a. b. Npq ˆˆ[ zα /] 00( 0.5)( 0.5)[ 1.96] n = = = 178 pq ˆˆ[ z ] + ( N 1) E ( 0.5)( 0.5)[ 1.96] + ( 00 1) 0.05 α / Npq ˆˆ[ zα /] 00( 0.38)( 0.6)[ 1.96] n = = = 176 pq ˆˆ[ z ] + ( N 1) E ( 0.38)( 0.6)[ 1.96] + ( 00 1) 0.05 α / ()() 8 8 pq ˆˆ 3 pˆ ± zα / = ± 1.96 n < p < 0.710; no 41. The upper confidence interval limit is greater than 100%. Given that the percentage cannot exceed 100%, change the upper limit to 100% ( 48)( 48) pq ˆˆ 44 pˆ ± zα / = ±.575 n < p< or 81.4% < p< 101.9%. Copyright 014 Pearson Education, Inc.

108 106 Chapter 7: Estimates and Sample Sizes 4. a. The requirement of at least 5 successes and at least 5 failures is not satisfied, so the normal distribution cannot be used. 3 b = 43. Because we have 95% confidence that p is greater than 0.831, we can safely conclude that more than 75% of adults know what Twitter is. pq ˆˆ 44 ( 0.85)( 0.15) pˆ + zα = n p > Section a. sec 33.4 sec < μ < sec (Tech: p > 0.83). b. The best point estimate of μ is x = = sec. The margin of error is E = = sec.. a. df = 39 b..03 c. In general, the number of degrees of freedom for a collection of sample data is the number of sample values that can vary after certain restrictions have been imposed on all data values. 3. We have 95% confidence that the limits of 33.4 sec and sec contain the true value of the mean of the population of all duration times. 4. When we say that the confidence interval methods of this section are robust against departures from normality, we mean that these methods work reasonably well with distributions that are not normal, provided that departures from normality are not too extreme. The given dotplot does appear to satisfy the loose normality requirement. Also, there are 40 dots, so the sample size of 40 satisfies the condition of n > Neither the normal nor the Student t distribution applies. 6. t α / = t α / = z α / =.575 (Tech:.576) 9. Because the sample size is greater than 30, the confidence interval yields a reasonable estimate of μ, even though the data appear to be from a population that is not normally distributed. s x± tα / = 9.808±.403 n km < μ < km (Tech: km < μ < 513 km) 10. s x± tα / = ± n 7 (If the original values are used, the upper limit is ppm.) ppm < μ < ppm Copyright 014 Pearson Education, Inc.

109 Chapter 7: Estimates and Sample Sizes The $1 salary of Jobs is an outlier that is very far away from the other values, and that outlier has a dramatic effect on the confidence interval. s x± tα / = 1898 ±.776 n thousand dollars < μ < thousand dollars (Tech: thousand dollars < μ < 6,48.5 thousand dollars) 1. The confidence interval is an estimate of the population mean and it does not apply to individual sample values. s.55 x± tα / = 3.95±.71 n chocolate chips < μ < 5.04 chocolate chips 13. Because the confidence interval does not contain 98.6 F, it appears that the mean body temperature is not 98.6 F, as is commonly believed. s 0.6 x± tα / = 98. ± 1.98 n F < μ < 98.3 F 14. Because the confidence interval does not include 0 or negative values, it does appear that the weight loss program is effective with a positive loss of weight. Because the amount of weight lost is relatively small, the weight loss program does not appear to be very practical. s x± tα / =.1± 1.68 n 0.8 lb < μ < 3.4 lb 15. Because the confidence interval includes the value of 0, it is very possible that the mean of the changes in LDL cholesterol is equal to 0, suggesting that the garlic treatment did not affect LDL cholesterol levels. It does not appear that garlic is effective in reducing LDL cholesterol s 1 x± tα / = 0.4±.4 n mg/dl < μ < 7.6 mg/dl 16. The confidence interval includes the mean of 10.8 min that was measured before the treatment, so the mean could be the same after the treatment. This result suggests that the zoplicone treatment has no effect. s 4.3 x± tα / = 98.9±.6 n min < μ < 16.4 min 17. The data appear to have a distribution that is far from normal, so the confidence interval might not be a good estimate of the population mean. The population is likely to be the list of box office receipts for each day of the movie s release. Because the values are from the first 14 days of release, the sample values are not a simple random sample, and they are likely to be the largest of all such values, so the confidence interval is not a good estimate of the population mean. s 14.5 x± tα / = 16.4 ± 3.01 n million dollars < μ < 8.1 million dollars Copyright 014 Pearson Education, Inc.

110 108 Chapter 7: Estimates and Sample Sizes 18. The confidence interval does not contain the value of 4 years. The data appear to have a distribution that is far from normal, so the confidence interval might not be a good estimate of the population mean. s 3.51 x± tα / 6.5± 1.73 n years < μ < 7.9 years 19. The sample data meet the loose requirement of having a normal distribution. Because the confidence interval is entirely below the standard of 1.6 W/kg, it appears that the mean amount of cell phone radiation is less than the FCC standard, but there could be individual cell phones that exceed the standard. s 0.43 x± tα / = ± 1.81 n W/kg < μ < W/kg 0. The sample data meet the loose requirement of having a normal distribution s x± tα / = 33.6 ±.6 n years < μ < 38.8 years 1. The sample data meet the loose requirement of having a normal distribution. We cannot conclude that the population mean is less than 7 μ g/g, because the confidence interval shows that the mean might be greater than that level. s 6.46 x± tα / = 11.05±.6 n μg/g < μ< μg/g. The sample data meet the loose requirement of having a normal distribution. The values are typical because they are between 950 cm 3 and 1800 cm 3. s x± tα / = ± 3.5 n cm < μ < cm 3. Although final conclusions about means of populations should not be based on the overlapping of confidence intervals, the confidence intervals do overlap, so it appears that both populations could have the same mean, and there is not clear evidence of discrimination based on age. CI for ages of unsuccessful applicants CI for ages of successful applicants s 7. s 5.03 x± tα / = ±.07 x± tα / = 44.5±.05 n 3 n years < μ < 50.1 years 4.6 years < μ < 46.4 years 4. Although final conclusions about means of populations should not be based on the overlapping of confidence intervals, the confidence intervals do overlap, so it appears that both populations could have the same mean, and there is not clear evidence that skull breadths changed from 4000 b.c. to 150 a.d. CI for 4000 b.c. CI for 150 a.d. s 4.6 x± tα / = 18.7 ±.01 n mm < μ < mm ± mm < μ < mm Copyright 014 Pearson Education, Inc.

111 Chapter 7: Estimates and Sample Sizes The sample size is zα /σ n = = = 68, and it does appear to be very reasonable. E 3 zα /σ The required sample size is n = = = 104. Limiting the sample to students at your E 0. college would result in a convenience sample that might not be representative of the population of all college students, so it does not make sense to collect the entire sample at your college. zα /σ The required sample size is n = = = 405 E 50 find that many two-year-old used Corvettes in your region. (Tech: 403). It is not likely that you would zα /σ The required sample size is n = = = 753. A major obstacle to getting a good estimate E 15 of the population mean is that it would be very difficult to actually measure times spent on Facebook, so you must rely on reported times that can be very inaccurate zα /σ Use σ = = 450 to get a sample size of n = = = 110. The margin of error 4 E 100 of 100 points seems too high to provide a good estimate of the mean SAT score. 45,000 0 zα /σ Use σ = = 11,50 to get a sample size of n = = = 83, E 100 (Tech: 83,973). The sample size seems too large to be practical With the range rule of thumb, use σ = = 11 to get a required sample size of 4 zα /σ n = = = 117. E zα /σ With s = 10.3, the required sample size is n = = = 10. The better estimate of s is the E standard deviation of the sample, so the correct sample size is likely to be closer to 10 than With the range rule of thumb, use σ = = to get a required sample size of 4 zα /σ n = = = 53. With s = 0.587, the required sample size is 34. The better estimate of s E 0. is the standard deviation of the sample, so the sample size of 34 is the better result. s σ x± tα / = ± x± zα / = 9.808±.33 n 50 n < μ < km < μ < km (Tech: 0.96 < μ < 1.407) (Tech: km < μ < km) σ x± zα / = 17.5±.03 n ng/ml < μ < 10.7 ng/ml Copyright 014 Pearson Education, Inc.

112 110 Chapter 7: Estimates and Sample Sizes σ x± zα / = ± n ppm < μ < ppm (If the original values are used, the upper limit is ppm.) σ x± zα / = 1898± 1.96 n thousand dollars < μ < thousand dollars (Tech: thousand dollars < μ < 19,663.3 thousand dollars) σ x± zα / = 3.95 ±.576 n chocolate chips < μ < 4.99 chocolate chips 39. The sample data do not appear to meet the loose requirement of having a normal distribution. The effect of the outlier on the confidence interval is very substantial. Outliers should be discarded if they are known to be errors. If an outlier is a correct value, it might be very helpful to see its effects by constructing the confidence interval with and without the outlier included. s x± tα / = ±.6 n m < μ < m (Tech: 4.55 m < μ < m) 40. The second confidence interval is narrower, indicating that we have a more accurate estimate when the relatively large sample is from a relatively small finite population. s x± tα / = ±.6 Large population: n g < μ < g s N n x± tα / = ±.6 Finite population: n n g < μ < g 41. The confidence interval based on the first sample value is much wider than the confidence interval based on all 10 sample values. Section 7-4 x ± m < μ < 3.0 m ( mg/dl ) < σ < ( mg/dl) 30.3 mg/dl < σ < 47.5 mg/dl. We have 95% confidence that the limits of 30.3 mg/dl and 47.5 mg/dl contain the true value of the standard deviation of the LDL cholesterol levels of all women.. The format implies that s = 15.7, but s is given as In general, a confidence interval for σ does not have s at the center. 3. The original sample values can be identified, but the dotplot shows that the sample appears to be from a population having a uniform distribution, not a normal distribution as required. Because the normality requirement is not satisfied, the confidence interval estimate of s should not be constructed using the methods of this section. Copyright 014 Pearson Education, Inc.

113 Chapter 7: Estimates and Sample Sizes The normality requirement for a confidence interval estimate of σ has a much stricter normality requirement than the loose normality requirement for a confidence interval estimate of μ. Departures from normality have a much greater effect on confidence interval estimates of σ than on confidence interval estimates of μ. 5. df = 4. χ L = and χ R = ( n 1) s ( n 1) s < σ < χr χl ( 5 1) 0.4 ( 5 1) 0.4 < σ < mg < σ < 0.37 mg 6. df = 19. χ L = and χ R = ( n 1) s ( n 1) s < σ < χr χl ( 0 1) ( 0 1) < σ < g < σ < g 7. df = 39. χ L = (Tech: 3.654) and ( n 1) s ( n 1) s < σ < χr χl ( 40 1) 65. ( 40 1) 65. < σ < ;df = < σ < 8.4 (Tech: 53.4 < σ< 83.7) 8. df = 49. χ L = (Tech: ) and ( n 1) s ( n 1) s < σ < χr χl ( 50 1) ( 50 1) < σ < ;df = < σ < 0.7 (Tech: < σ< 0.731) 9. ( n 1) s ( n 1) s < σ < χr χl ( 106 1) 0.6 ( 106 1) 0.6 < σ < ;df = F < σ < 0.70 F (Tech: F 6 s F) χ R = (Tech: 58.10). χ R = (Tech: 70.). 10. ( n 1) s ( n 1) s < σ < χr χl ( 40 1).55 ( 40 1).55 < σ < ;df = chocolate chips < σ < 3.09 chocolate chips (Tech:.16 chocolate chips < σ < 3.14 chocolate chips) Copyright 014 Pearson Education, Inc.

114 11 Chapter 7: Estimates and Sample Sizes 11. The confidence interval shows that the standard deviation is not likely to be less than 30 ml, so the variation is too high instead of being at an acceptable level below 30 ml. (Such one-sided claims should be tested using the formal methods presented in Chapter 8.) ( n 1) s ( n 1) s < σ < χr χl ( 4 1) 4.8 ( 4 1) 4.8 < σ < ml < σ < ml 1. a. ( n 1) s ( n 1) s < σ < χr χl ( 40 1) 10.3 ( 40 1) 10.3 < σ < ;df = beats per minute < σ < 14.1 beats per minute (Tech: 7.9 beats per minute < σ < 14.4 beats per minute) b. ( n 1) s ( n 1) s < σ < χr χl ( 40 1) 11.6 ( 40 1) 11.6 < σ < beats per minute < σ < 15.9 beats per minute (Tech: 9.0 beats per minute < σ < 16. beats per minute) c. The confidence intervals are not dramatically different, so it appears that the populations of pulse rates of men and women have about the same standard deviation. 13. ( n 1) s ( n 1) s < σ < χ χ R ( 7 1) ( 7 1) < σ < ppm < σ < ppm L Because traffic conditions vary considerably at different times during the day, the confidence interval is an estimate of the standard deviation of the population of speeds at 3:30 on a weekday, not other times. ( n 1) s ( n 1) s < σ < χr χl ( 1 1) ( 1 1) < σ < mi/h < σ < 6.9 mi/h Copyright 014 Pearson Education, Inc.

115 Chapter 7: Estimates and Sample Sizes CI for ages of unsuccessful applicants: 16. a. ( n 1) s ( n 1) s < σ < χr χl ( 5 1) 7. ( 5 1) 7. < σ < years < σ < 11.5 years CI for ages of successful applicants: ( n 1) s ( n 1) s < σ < χr χl ( 9 1) 5.06 ( 9 1) 5.06 < σ < years < σ < 7.5 years Although final conclusions about means of populations should not be based on the overlapping of confidence intervals, the confidence intervals do overlap, so it appears that the two populations have standard deviations that are not dramatically different. b. ( n 1) s ( n 1) s < σ < χr χl ( 10 1) ( 10 1) < σ < min < σ < 0.87 min ( n 1) s ( n 1) s < σ < χr χl ( 10 1) ( 10 1) < σ < min < σ < 3.33 min c. The variation appears to be significantly lower with a single line. The single line appears to be better ( n 1) s ( n 1) s < σ < χ χ R ( 37 1) ( 37 1) < σ < ; df= g < σ < g (Tech: g < σ < g) ( n 1) s ( n 1) s < σ < χ χ R ( 37 1) ( 37 1) < σ < ; df= years < σ < 8.9 years L L Copyright 014 Pearson Education, Inc.

116 114 Chapter 7: Estimates and Sample Sizes ,18 is too large. There aren t 33,18 statistics professors in the population, and even if there were, that sample size is too large to be practical. 0. The sample size of 48 is very practical, although the sample should be selected from the population of all McDonald s restaurants with drive-up windows. 1. The sample size is 768. Because the population does not have a normal distribution, the computed minimum sample size is not likely to be correct.. The sample size is The population of incomes does not have a normal distribution, so the computed sample size is not likely to be correct χl = zα / k = + = and 1 1 χr = zα / k = + = (Tech using z α / = : to the actual critical values. Chapter Quick Quiz 1. 40% 3.1% < p < 40% + 3.1% 36.9% < p < 43.1% χ L = and χ R = 19.63). The approximate values are quite close pˆ = = We have 95% confidence that the limits of and contain the true value of the proportion of females in the population of medical school students. 4. z = [ z ] ˆˆ [ ] α / pq ( 0.5) n = = = 75 E zα /σ n = = = 373 (Tech: 374) E 7. The sample must be a simple random sample and there is a loose requirement that the sample values appear to be from a normally distributed population. 8. The degrees of freedom is the number of sample values that can vary after restrictions have been imposed on all of the values. For the sample data in Exercise 7, df = t = χ L = and Review Exercises 1. a. b. χ R = p ˆ = = = 51.0% ( 557)( 557) pq ˆˆ 84 pˆ ± zα / = ± 1.96 n % < p < 55.1% c. No, the confidence interval shows that the population percentage might be 50% or less, so we cannot safely conclude that the majority of adults say that they are underpaid. Copyright 014 Pearson Education, Inc.

117 Chapter 7: Estimates and Sample Sizes 115. [ z ] ˆˆ [ ] α / pq.575 ( 0.5) n = = = 4145 (Tech: 4147) E zα /σ n = = = 155 (Tech: 154) E 3 4. a. Student t distribution b. Normal distribution c. The distribution is not normal, Student t, or chi-square. d. χ (chi-square distribution) e. Normal distribution 5. a. [ z ] ˆˆ [ ] α / pq.33 ( 0.5) n = = = 543 (Tech: 54) E 0.05 zα /σ b. n = = = 47 (Tech: 46) E 50 c Because the entire confidence interval is above 50%, we can safely conclude that the majority of adults consume alcoholic beverages. pq ˆˆ ( 0.64)( 0.36) pˆ ± zα / = 0.64 ± 1.65 n % < p < 66.5% 7. x tα / s ± = 143±.01 n 1.1 sec < μ < sec 8. Because women and men have some notable physiological differences, the confidence interval does not necessarily serve as an estimate of the mean white blood cell count of men. s.8 x± tα / = 7.15± n < μ < There is 95% confidence that the limits of 37.5 g and 47.9 g contain the true mean deceleration measurement for all small cars. 10. ( n 1) s ( n 1) s < σ < χr χl ( 7 1) 5.6 ( 7 1) 5.6 < σ < g < σ < 1.3 g Cumulative Review Exercises 1. x = 5.5 ; median = 5.0; s = 3.8 s 5.6 x± tα / = 4.7 ±.447 n g < μ < 47.9 g Copyright 014 Pearson Education, Inc.

118 116 Chapter 7: Estimates and Sample Sizes. The range of usual values is from 5.5 ( 3.8) =.1 to ( 3.8) = 3.8 (or from 0 to 13.1). zα /σ Ratio level of measurement; discrete data. 4. n = = = 33 campuses E 5. The population should include only colleges of the same type as the sample, so the population consists of all large urban campuses with residence halls. s 5.8 x± tα / = 5.5±.0 n < μ < The graphs suggest that the population has a distribution that is skewed (to the right) instead of being normal. The histogram shows that some taxi-out times can be very long, and that can occur with heavy traffic, but little or no traffic cannot make the taxi-out time very low. There is a minimum time required, regardless of traffic conditions. Construction of a confidence interval estimate of a population standard deviation has a strict requirement that the sample data are from a normally distributed population, and the graphs show that this strict normality requirement is not satisfied. 7. a. pq ˆˆ ( 0.59)( 0.31) pˆ ± zα / = 0.59 ± 1.96 n 1003 (or < p < 0.61 if using x = 59) < p < 0.60 b. Because the survey was about shaking hands and because it was sponsored by a supplier of hand sanitizer products, the sponsor could potentially benefit from the results, so there might be some pressure to obtain results favorable to the sponsor. [ z ] ˆˆ [ ] α / pq 1.96 ( 0.5) c. n = = = 1083 E There does not appear to be a correlation between HDL and LDL cholesterol levels. 9. a z = = 1.11 and P( z> 1.11) = 13.35% (Tech: 13.3%). 9 Yes, losing about 13% of the market would be a big loss. b. 5th percentile: x = μ + z σ = = 160. mm 95th percentile: x = μ + z σ = = mm a. There are 10 3 possible tickets so the probability of winning by purchasing one ticket is b =. c = Copyright 014 Pearson Education, Inc.

119 Chapter 8: Hypothesis Testing Section 8- Chapter 8: Hypothesis Testing Rejection of the aspirin claim is more serious because the aspirin is a drug treatment. The wrong aspirin dosage can cause adverse reactions. M&Ms do not have those same adverse reactions. It would be wise to use a smaller significance level for testing the aspirin claim.. Estimates and hypothesis tests are both methods of inferential statistics, but they have different objectives. We could use the sample weights to construct a confidence interval estimate of the mean weight of all M&Ms, but hypothesis testing is used to test some claim made about the mean weight of all M&Ms. 3. a. H 0 : μ = 98.6 F b. H 1 : μ 98.6 F c. Reject the null hypothesis or fail to reject the null hypothesis. d. No. In this case, the original claim becomes the null hypothesis. For the claim that the mean body temperature is equal to 98.6 F, we can either reject that claim or fail to reject it, but we cannot state that there is sufficient evidence to support that claim. 4. The P-value of is preferred because it corresponds to the sample evidence that most strongly supports the alternative hypothesis that the XSORT method is effective. 5. a. p = 0.0 b. H 0 : p = 0.0 and H 1 : p a. p > 0.5 b. H 0 : p = 0.5 and H 1 : p > a. μ 76 b. H 0 : μ = 76 and H 1 : μ < a. σ 50 b. H 0 : σ = 50 and H 1 : σ > There is not sufficient evidence to warrant rejection of the claim that 0% of adults smoke. 10. There is sufficient evidence to support the claim that when parents use the XSORT method of gender selection, the proportion of baby girls is greater than There is not sufficient evidence to warrant rejection of the claim that the mean pulse rate of adult females is 76 or lower. 1. There is sufficient evidence to reject the claim that pulse rates of adult females have a standard deviation of at least pˆ p z = = = (or z = if using x = 909) pq n ( 0.75)( 0.5) 101 pˆ p z = = = 1.7 (or z = 1.6 if using x = 481) χ pq n ( 0.48)( 0.5) 100 ( n 1) s ( 40 1).8 = = = σ 5 x μ t = = =.358 s n.8 40 Copyright 014 Pearson Education, Inc.

120 118 Chapter 8: Hypothesis Testing 17. P-value = P( z> ) = Critical value: z = P-value = P( z< ) = Critical value: z = P-value = P( z< 1.75) = (Tech: ). Critical values: z= 1.96, z= P-value = P( z> 1.50) = Critical values: z= 1.96, z= P-value = P( z< 1.3) = (Tech: 0.187). Critical values: z= 1.96, z= P-value = P( z>.50) = Critical values: z= 1.96, z= P-value = P( z< 3.00) = Critical value: z = P-value = P( z>.88) = Critical value: z = a. Reject H 0. b. There is sufficient evidence to support the claim that the percentage of blue M&Ms is greater than 5%. 6. a. Fail to reject H 0. b. There is not sufficient evidence to support the claim that fewer than 0% of M&M candies are green. 7. a. Fail to reject H 0. b. There is not sufficient evidence to warrant rejection of the claim that women have heights with a mean equal to cm. 8. a. Reject H 0. b. There is sufficient evidence to warrant rejection of the claim that women have heights with a standard deviation equal to 5.00 cm. 9. a. H 0 : p = 0.5 and H 1 : p > 0.5 b. α = 0.01 c. Normal distribution. d. Right-tailed. e. z = 1.00 f. P-value = P( z> 1.00) = g. z =.33 h a. H 0 : p = 0.5 and H 1 : p 0.5 b. α = 0.05 c. Normal distribution. d. Two-tailed. e. z = 1.00 f. P-value = P( z> 1.00) = (Tech: ) g. z= 1.96, z= 1.96 h Type I error: In reality p = 0.1, but we reject the claim that p = 0.1. Type II error: In reality p 0.1, but we fail to reject the claim that p = Type I error: In reality p = 0.001, but we reject the claim that p = Type II error: In reality p 0.001, but we fail to reject the claim that p = Type I error: In reality p = 0.5, but we support the claim that p > 0.5. Type II error: In reality p > 0.5, but we fail to support that conclusion. 34. Type I error: In reality 0.9 p =, but we support the claim that 0.9 p <. Type II error: In reality 0.9 p <, but we fail to support that conclusion. Copyright 014 Pearson Education, Inc.

121 Chapter 8: Hypothesis Testing The power of 0.96 shows that there is a 96% chance of rejecting the null hypothesis of p = 0.08 when the true proportion is actually That is, if the proportion of Chantix users who experience abdominal pain is actually 0.18, then there is a 96% chance of supporting the claim that the proportion of Chantix users who experience abdominal pain is greater than a. From p = 0.5, From p = 0.65, ( 0.5)( 0.5) p ˆ = = z = = ; Power = P( z> 0.791) = (Tech: ) ( 0.65)( 0.35) 64 b. Assuming that p = 0.5, as in the null hypothesis, the critical value of z = corresponds to p ˆ = , so any sample proportion greater than causes us to reject the null hypothesis, as shown in the shaded critical region of the top graph. If p is actually 0.65, then the null hypothesis of p = 0.5 is false, and the actual probability of rejecting the null hypothesis is found by finding the area greater than p ˆ = in the bottom graph, which is the shaded area. That is, the shaded area in the bottom graph represents the probability of rejecting the false null hypothesis. ( 0.5)( 0.5) 37. From p = 0.5, pˆ = , from p = 0.55, n ( P( z> 0.84) = ), so: ( 0.55)( 0.45) pˆ = ; Since n Section 8-3 ( 0.5)( 0.5) ( 0.55)( 0.45) = n n 0.5 n = 0.55 n n = n = = The P-value method and the critical value method always yield the same conclusion. The confidence interval method might or might not yield the same conclusion obtained by using the other two methods p ˆ = = The symbol ˆp is used to represent a sample proportion P-value = Because the P-value is so low, we have sufficient evidence to support the claim that p < a. The symbol p represents the population proportion, but the P-value is a probability of getting sample results that are at least as extreme as those obtained (assuming that the null hypothesis is true). b. If the P-value is very low (such as less than or equal to 0.05), the null must go means that we should reject the null hypothesis. c. The statement that if the P is high, the null will fly suggests that with a high P-value, the null hypothesis has been proved or is supported, but we should never make such a conclusion. 5. a. Left-tailed. c. P-value = (rounded) b. z = 1.94 d. H 0 : p = 0.1. Reject the null hypothesis. e. There is sufficient evidence to support the claim that less than 10% of treated subjects experience headaches. Copyright 014 Pearson Education, Inc.

122 10 Chapter 8: Hypothesis Testing 6. a. Two-tailed. b. z = 1.45 c. P-value = d. H 0 : p = Fail to reject the null hypothesis. e. There is not sufficient evidence to warrant rejection of the claim that 35% of homes have guns in them. 7. a. Two-tailed. b. z = 0.8 c. P-value = d. H 0 : p = Fail to reject the null hypothesis. e. There is not sufficient evidence to warrant rejection of the claim that 35% of adults have heard of the Sony Reader. 8. a. Left-tailed. b. z =.53 c. P-value = d. H 0 : p = 0.5. Reject the null hypothesis. e. There is sufficient evidence to support the claim that fewer than half of adults say that public speaking is the activity that they dread most. 9. H 0 : p = 0.5. H 1 : p 0.5. Test statistic: z = = Critical values: z =±.575 ( 0.5)( 0.75) 580 (Tech: ±.576 ). P-value = P( z> 0.67) = (Tech: 0.501). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that 5% of offspring peas will be yellow. Test of p = 0.5 vs p not = 0.5 Sample X N Sample p 95% CI Z-Value P-Value (0.680, ) H 0 : p = H 1 : p Test statistic: z = = Critical values: z =± ( 0.13)( 0.87) 100 P-value = P( z> 1.49) = (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that 13% of M&Ms are brown. Test of p = 0.13 vs p not = 0.13 Sample X N Sample p 95% CI Z-Value P-Value (0.0688, ) H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = = Critical value: z = P-value ( 0.5)( 0.5) 100 = P( z> 1.90) = (Tech: 0.090). Reject H 0. There is sufficient evidence to support the claim that the majority of adults feel vulnerable to identify theft. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value Copyright 014 Pearson Education, Inc.

123 1. H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: ( 0.5)( 0.5) 806 Chapter 8: Hypothesis Testing z = = 6.7. Critical value: z =.33. P-value = P( z> 6.7) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the majority of adults prefer window seats when they fly. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = = Critical value: z =.33. P-value ( 0.5)( 0.5) 945 = P( z> 6.45) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the XSORT method is effective in increasing the likelihood that a baby will be a girl. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = = Critical value: z =.33. P-value ( 0.5)( 0.5) 91 = P( z> 10.96) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the YSORT method is effective in increasing the likelihood that a baby will be a boy. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p 0.5. Test statistic: z = =.03. Critical values: z =± P-value ( 0.5)( 0.5) 80 = P( z<.03) = (Tech: 0.04). Reject H 0. There is sufficient evidence to warrant rejection of the claim that touch therapists use a method equivalent to random guesses. However, their success rate of 13/80 (or 43.9%) indicates that they performed worse than random guesses, so they do not appear to be effective. Test of p = 0.5 vs p not = 0.5 Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = 0.5. H 1 : p 0.5. Test statistic: z = =.03. Critical values: z =±.575 (Tech: ( 0.5)( 0.5) 80 ±.576 ). P-value = P( z<.03) = (Tech: 0.04). Reject H 0. There is not sufficient evidence to warrant rejection of the claim that touch therapists use a method equivalent to random guesses. However, their success rate of 13/80 (or 43.9%) indicates that they performed worse than random guesses, so they do not appear to be effective. Test of p = 0.5 vs p not = 0.5 Sample X N Sample p 95% CI Z-Value P-Value ( , ) Copyright 014 Pearson Education, Inc.

124 1 Chapter 8: Hypothesis Testing H 0 : p =. H 3 1 : p < 3. Test statistic: z = =.7. Critical value: z =.33. P-value 1 ( )( ) = P( z<.7) = Reject H 0. There is sufficient evidence to support the claim that fewer than 1/3 of the challenges are successful. Players don t appear to be very good at recognizing referee errors. Test of p = vs p < Sample X N Sample p Z-Value P-Value H 0 : p = H 1 : p Test statistic: z = = Critical values: z =± ( 0.43)( 0.57) 601 P-value = P( z> 3.70) = Reject H 0. There is sufficient evidence to warrant rejection of the claim that the percentage who believe that they voted for the winning candidate is equal to 43%. There appears to be a substantial discrepancy between how people said that they voted and how they actually did vote. Test of p = 0.43 vs p not = 0.43 Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = p Test statistic: , z = = Critical values: ( )( ) 40,095 z =±.81. P-value = P( z< 0.66) = (Tech: 0.51). Fail to reject H 0. There is not sufficient evidence to support the claim that the rate is different from %. Cell phone users should not be concerned about cancer of the brain or nervous system. Test of p = vs p not = Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = H 1 : p > Test statistic: z = = Critical value: z =.33. P-value ( 0.75)( 0.5) 1007 = P( z> 7.33) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that more than 75% of adults know what Twitter is. Test of p = 0.75 vs p > 0.75 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p 0.5. Test statistic: z = =.75. Critical values: z =± P-value ( 0.5)( 0.5) 414 = P( z>.75) = (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the coin toss is fair in the sense that neither team has an advantage by winning it. The coin toss rule does not appear to be fair. Test of p = 0.5 vs p not = 0.5 Sample X N Sample p 95% CI Z-Value P-Value ( , ) Copyright 014 Pearson Education, Inc.

125 . H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: ( 0.5)( 0.5) 71 Chapter 8: Hypothesis Testing z = = Critical value: z = P-value = P( z> 0.83) = (Tech: 0.031). Fail to reject H 0. There is not sufficient evidence to support the claim that among smokers who try to quit with nicotine patch therapy, the majority are smoking a year after the treatment. The results show that about half of those who use nicotine patch therapy are successful in quitting smoking. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p < 0.5. Test statistic: z = = Critical value: z =.33. P-value ( 0.5)( 0.5) 380 = P( z< 3.90) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that fewer than half of smartphone users identify the smartphone as the only thing they could not live without. Because only smartphone users were surveyed, the results do not apply to the general population. Test of p = 0.5 vs p < 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p < 0.5. Test statistic: z = = Critical value: z = P-value ( 0.5)( 0.5) 1000 = P( z< 1.13) = (Tech: 0.871). Fail to reject H 0. There is not sufficient evidence to support the claim that less than 0.5 of the deaths occur the week before Thanksgiving. Based on these results, there is no indication that people can temporarily postpone their death to survive Thanksgiving. Test of p = 0.5 vs p < 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = = 1.91 or z = 1.93 (using x = 14). Critical ( 0.5)( 0.75) 47 value: z = (assuming a 0.05 significance level). P-value = P( z> 1.9) = (using p ˆ = 0.9 ) or (using x = 14) (Tech P-value = 0.069). Reject H 0. There is sufficient evidence to support the claim that more than 5% of women purchase books online. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p < 0.5. Test statistic: z = = 1.86 or z = 1.85 (using x = 36). Critical ( 0.5)( 0.5) 514 value: z =.33. P-value = P( z< 1.86) = (using p ˆ = ) or 0.03 (using x = 36) (Tech P-value = 0.030). Fail to reject H 0. There is not sufficient evidence to support the claim that less than half of all human resource professionals say that body piercings are big grooming red flags. Test of p = 0.5 vs p < 0.5 Sample X N Sample p Z-Value P-Value Copyright 014 Pearson Education, Inc.

126 14 Chapter 8: Hypothesis Testing 7. H 0 : p = H 1 : p > Test statistic: z = = 7.85 or z = 7.89 (using x = 463). Critical ( 0.75)( 0.5) 514 value: z =.33. P-value = P( z> 7.85) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that more than 3/4 of all human resource professionals say that the appearance of a job applicant is most important for a good first impression. Test of p = 0.75 vs p > 0.75 Sample X N Sample p Z-Value P-Value H 0 : p = H 1 : p Test statistic: z = = 5.84 or z = 5.81 (using x = 701). Critical ( 0.61)( 0.39) 100 values: z =± 1.96 (assuming a 0.05 significance level). P-value = P( z> 5.81) = (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the percentage of all voters who say that they voted is equal to 61%. The results suggest that either survey respondents are not being truthful or they have an incorrect perception of reality. Test of p = 0.61 vs p not = 0.61 Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = H 1 : p < Test statistic: z = = 9.09 or z = 9.11 (using x = 339). ( 0.791)( 0.09) 870 Critical value: z =.33. P-value = P( z< 9.09) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the percentage of selected Americans of Mexican ancestry is less than 79.1%, so the jury selection process appears to be unfair. Test of p = vs p < Sample X N Sample p Z-Value P-Value H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = = 5.83 or z = 5.85 (using x = 49). Critical value: ( 0.5)( 0.5) 703 z = P-value = P( z> 5.83) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that most workers get their jobs through networking. Test of p = 0.5 vs p > 0.5 Sample X N Sample p Z-Value P-Value H 0 : p = H 1 : p > Test statistic: z = = Critical value: z =.33. P-value ( 0.75)( 0.5) 5,000 = P( z> 7.30) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that more than 75% of television sets in use were tuned to the Super Bowl. Test of p = 0.75 vs p > 0.75 Sample X N Sample p Z-Value P-Value Copyright 014 Pearson Education, Inc.

127 3. H 0 : p = 0.5. H 1 : p < 0.5. Test statistic: ( 0.5)( 0.5) 1500 Chapter 8: Hypothesis Testing z = =.3. Critical value: z =.33. P-value = P( z<.3) = (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that fewer than half of all households have a high-definition television. Because the use of highdefinition televisions is growing rapidly, these results are not likely to be valid today. Test of p = 0.5 vs p < 0.5 Sample X N Sample p Z-Value P-Value Among 100 M&Ms, 19 are green. H 0 : p = H 1 : p Test statistic: z = = 0.8. ( 0.16)( 0.84) 100 Critical values: z =± P-value = P( z> 0.8) = 0.41 (Tech: 0.413). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that 16% of plain M&M candies are green. Test of p = 0.16 vs p not = 0.16 Sample X N Sample p 95% CI Z-Value P-Value ( , ) Among 48 flights, 44 are on time. H 0 : p = H 1 : p Test statistic: z = =.09. ( 0.795)( 0.05) 48 Critical values: z =± P-value = P( z>.09) = (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that 79.5% of flights are on time. With 91.7% of the 48 flights arriving on time, American Airlines appears to have a better on-time performance. Test of p = vs p not = Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = 0.5. H 1 : p > 0.5. Using the binomial probability distribution with an assumed proportion of p = 0.5, the probability of 7 or more heads is 0.035, so the P-value is Reject H 0. There is sufficient evidence to support the claim that the coin favors heads. 36. a. H 0 : p = H 1 : p 0.1. Test statistic: z = =.00. Critical values: z =± Reject ( 0.1)( 0.9) 1000 H 0. There is sufficient evidence to warrant rejection of the claim that the proportion of zeros is b. H 0 : p = H 1 : p 0.1. Test statistic: z = =.00. P-value ( 0.1)( 0.9) 1000 = P( z>.00) = (Tech: 0.045). There is sufficient evidence to warrant rejection of the claim that the proportion of zeros is 0.1. c < p < ; because 0.1 is contained within the confidence interval, fail to reject H 0 : p = There is not sufficient evidence to warrant rejection of the claim that the proportion of zeros is 0.1. Sample X N Sample p 95% CI ( , ) d. The traditional and P-value methods both lead to rejection of the claim, but the confidence interval method does not lead to rejection of the claim. Copyright 014 Pearson Education, Inc.

128 16 Chapter 8: Hypothesis Testing 37. a. From p = 0.40, From p = 0.5, ( 0.4)( 0.6) p ˆ = = z = = ; Power = P( z< 0.588) = (Tech: 0.719) ( 0.5)( 0.75) 50 b = (Tech: 0.781) c. The power of 0.74 shows that there is a reasonably good chance of making the correct decision of rejecting the false null hypothesis. It would be better if the power were even higher, such as greater than 0.8 or 0.9. Section The requirements are (1) the sample must be a simple random sample, and () either or both of these conditions must be satisfied: The population is normally distributed or n > 30. There is not enough information given to determine whether the sample is a simple random sample. Because the sample size is not greater than 30, we must check for normality, but the value of 583 sec appears to be an outlier, and a normal quantile plot or histogram suggests that the sample does not appear to be from a normally distributed population Probability Plot Frequency 3 Probability df denotes the number of degrees of freedom. For the sample of 1 times, df = 1 1= A t test is a hypothesis test that uses the Student t distribution, such as the method of testing a claim about a population mean as presented in this section. The t test methods are much more likely to be used than the z test methods because the t test does not require a known value of σ, and realistic hypothesis tests of claims about μ typically involve a population with an unknown value of σ. 4. Use a 90% confidence level. The given confidence interval does contain the value of 90 sec, so it is possible that the value of μ is equal to 90 sec or some lower value, so there is not sufficient evidence to support the claim that the mean is greater than 90 sec. 5. P-value < (Tech: ) < P-value < 0.05 (Tech: ) < P-value < 0.05 (Tech: ) < P-value < 0.0 (Tech: ) 9. H 0 : μ = 4. H 1 : μ < 4. Test statistic: t = Critical value: t = P-value < (The display shows that the P-value is ) Reject H 0. There is sufficient evidence to support the claim that Chips Ahoy reduced-fat cookies have a mean number of chocolate chips that is less than 4 (but this does not provide conclusive evidence of reduced fat). 10. H 0 : μ = 10 km. H 1 : μ 10 km. Test statistic: t = 0.7. Critical values: t =±.678 (approximately). P- value > 0.0. (Minitab shows a P-value of ) Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the earthquakes are from a population with a mean depth equal to 10 km. Copyright 014 Pearson Education, Inc.

129 Chapter 8: Hypothesis Testing H 0 : μ = 33 years. H 1 : μ 33 years. Test statistic: t = =.367. Critical values t =± (approximately). P-value > 0.0 (Tech: 0.004). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the mean age of actresses when they win Oscars is 33 years. Test of mu = 33 vs not = 33 N Mean StDev SE Mean 95% CI T P (33.46, 38.34) H 0 : μ = lb. H 1 : μ > lb. Test statistic: t = = Critical value: t = (approximately). P-value > 0.10 (Tech: 0.075). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean weight of discarded plastic from the population of households is greater than lb. Test of mu = 1.8 vs > 1.8 N Mean StDev SE Mean T P H 0 : μ = g. H 1 : μ g. Test statistic: t = = Critical values: t =±.101. P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the mean weight of all green M&Ms is equal to g. The green M&Ms do appear to have weights consistent with the package label. Test of mu = vs not = N Mean StDev SE Mean 95% CI T P (0.8360, ) H 0 : μ 98.6 F. H 1 : μ 98.6 F. Test statistic: t = = Critical values: t =± (approximately). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the mean body temperature of the population is equal to 98.6 F. There is sufficient evidence to conclude that the common belief is wrong. Test of mu = 98.6 vs not = 98.6 N Mean StDev SE Mean 95% CI T P ( , ) H 0 : μ = 0 lb. H 1 : μ > 0 lb. Test statistic: t = = Critical value: t = P-value < (Tech: 0.000). Reject H 0. There is sufficient evidence to support the claim that the mean weight loss is greater than 0. Although the diet appears to have statistical significance, it does not appear to have practical significance, because the mean weight loss of only 3.0 lb does not seem to be worth the effort and cost. Test of mu = 0 vs > 0 N Mean StDev SE Mean T P Copyright 014 Pearson Education, Inc.

130 18 Chapter 8: Hypothesis Testing H 0 : μ = 1.0 min. H 1 : μ < 1.0 min. Test statistic: t = = Critical value: t = (approximately). P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean departure delay time for all such flights is less than 1.0 min. A flight operations manager is not justified in reporting that the mean departure time is less than 1.0 min. Test of mu = 1 vs < 1 N Mean StDev SE Mean T P H 0 : μ = 0. H 1 : μ > 0. Test statistic: t = = Critical value: t = (approximately, assuming a 0.05 significance level). P-value > 0.10 (Tech: 0.447). Fail to reject H 0. There is not sufficient evidence to support the claim that with garlic treatment, the mean change in LDL cholesterol is greater than 0. The results suggest that the garlic treatment is not effective in reducing LDL cholesterol levels. Test of mu = 0 vs > 0 N Mean StDev SE Mean T P H 0 : μ = 10.8 min. H 1 : μ < 10.8 min. Test statistic: t = = Critical value: t = (assuming a 0.05 significance level). P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that after treatment with Zopiclone, subjects have a mean wake time less than 10.8 min. This result suggests that the Zoplicone treatment is not effective. Test of mu = 10.8 vs < 10.8 N Mean StDev SE Mean T P H 0 : μ = 4 years. H 1 : μ > 4 years. Test statistic: t = Critical value: t =.539. P-value < (Tech: 0.004). Reject H 0. There is sufficient evidence to support the claim that the mean time required to earn a bachelor s degree is greater than 4.0 years. Because n 30 and the data do not appear to be from a normally distributed population, the requirement that the population is normally distributed or n > 30 is not satisfied, so the conclusion from the hypothesis test might not be valid. However, some of the sample values are equal to 4 years and others are greater than 4 years, so the claim does appear to be justified Frequency Years of College 1 14 Test of mu = 4 vs > 4 Variable N Mean StDev SE Mean T P CollegeYears Copyright 014 Pearson Education, Inc.

131 Chapter 8: Hypothesis Testing The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 30 years. H 1 : μ > 30 years. Test statistic: t = Critical value: t = P-value < 0.05 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the mean age of all race car drivers is greater than 30 years. Test of mu = 30 vs > 30 Variable N Mean StDev SE Mean T P Ages The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 14 mg/g. H 1 : μ < 14 mg/g. Test statistic: t = Critical value: t = P-value > 0.05 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean lead concentration for all such medicines is less than 14 mg/g. Test of mu = 14 vs < 14 Variable N Mean StDev SE Mean T P Lead The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 1100 cm 3. H 1 : μ 1100 cm 3. Test statistic: t = Critical values: t =± P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the population of brain volumes has a mean equal to cm3. Test of mu = 1100 vs not = 1100 Variable N Mean StDev SE Mean 95% CI T P Volume (1046., 114.) The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 63.8 in. H 1 : μ > 63.8 in. Test statistic: t = Critical value: t =.81. P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that supermodels have heights with a mean that is greater than the mean height of 63.8 in. for women in the general population. We can conclude that supermodels are taller than typical women. Test of mu = 63.8 vs > 63.8 Variable N Mean StDev SE Mean T P Heights The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 65 mi/h. H 1 : μ < 65 mi/h. Test statistic: t = Critical value: t = P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the sample is from a population with a mean that is less than the speed limit of 65 mi/h. Test of mu = 65 vs < 65 Variable N Mean StDev SE Mean T P HWY The sample data meet the loose requirement of having a normal distribution. H 0 : μ = H 1 : μ > Test statistic: t =.18. Critical value: t = (approximately). P-value < 0.05 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the population of earthquakes has a mean magnitude greater than Test of mu = 1 vs > 1 Variable N Mean StDev SE Mean Bound T P MAG Copyright 014 Pearson Education, Inc.

132 130 Chapter 8: Hypothesis Testing 6. The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 10 mm Hg. H 1 : μ < 10 mm Hg. Test statistic: t = Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the female population has a mean systolic blood pressure level less than 10.0 mm Hg. Test of mu = 10 vs < 10 Variable N Mean StDev SE Mean T P SYS The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 83 kg. H 1 : μ < 83 kg. Test statistic: t = Critical value: t =.453. P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that male college students have a mean weight that is less than the 83 kg mean weight of males in the general population. Test of mu = 83 vs < 83 Variable N Mean StDev SE Mean T P WTSEP The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 10 V. H 1 : μ 10 V. Test statistic: t = Critical values: t =±.708. P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the mean voltage amount is 10 volts. Test of mu = 10 vs not = 10 Variable N Mean StDev SE Mean 95% CI T P Home (13.586, ) H 0 : μ = 4. H 1 : μ < 4. Test statistic: z = = 7.3. Critical value: z = P-value = P( z< 7.3) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that Chips Ahoy reduced-fat cookies have a mean number of chocolate chips that is less than 4 (but this does not provide conclusive evidence of reduced fat) H 0 : μ = 10 km. H 1 : μ 10 km. Test statistic: z = = 0.7. Critical values: z =±.575. P value = P( z< 0.7) = (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the earthquakes are from a population with a mean depth equal to 10 km H 0 : μ = 33 years. H 1 : μ 33 years. Test statistic: z = =.37. Critical values: z =±.575. P value = P( z>.37) = (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the mean age of actresses when they win Oscars is 33 years H 0 : μ = lb. H 1 : μ > lb. Test statistic: z = = 0.8. Critical value: z = P value = P( z> 0.8) = (Tech: 0.059). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean weight of discarded plastic from the population of households is greater than lb ( ) 33. A = = The approximation yields a critical value of ( e /149 ) t = = 1.655, which is the same as the result from STATDISK or a TI-83/84 Plus calculator. Copyright 014 Pearson Education, Inc.

133 Chapter 8: Hypothesis Testing Using the normal distribution makes you more likely to reject the null hypothesis because the critical z values are not as extreme as the corresponding critical t values. 35. a. The power of shows that there is a 4.74% chance of supporting the claim that μ < 1W/kg when the true mean is actually 0.80 W/kg. This value of power is not very high, and it shows that the hypothesis test is not very effective in recognizing that the mean is less than 1.00 W/kg when the actual mean is 0.80 W/kg. b. β = The probability of a type II error is That is, there is a probability of making the mistake of not supporting the claim that μ < 1W/kg when in reality the population mean is 0.80 W/kg. Section a. The mean waiting time remains the same. b. The variation among waiting times is lowered. c. Because customers all have waiting times that are roughly the same, they experience less stress and are generally more satisfied. Customer satisfaction is improved. d. The single line is better because it results in lower variation among waiting times, so a hypothesis test of a claim of a lower standard deviation is a good way to verify that the variation is lower with a single waiting line.. a. The normality requirement for a hypothesis test of a claim about a standard deviation is much more strict, meaning that the distribution of the population must be much closer to a normal distribution. b. With only 10 sample values, a histogram doesn t really give us a good picture of the distribution, so a normal quantile plot would be better. Also, we should determine that there are no outliers. 3. Use a 90% confidence interval. The conclusion based on the 90% confidence interval will be the same as the conclusion from a hypothesis test using the P-value method or the critical value method. 4. a. H 0 : σ = 1.8 min. H 1 : σ < 1.8 min. b. ( n 1) s ( 10 1) 0.5 χ = = = σ 1.8 c. Reject H 0, the null hypothesis. d. There is sufficient evidence to support the claim that the standard deviation of waiting times of all customers is less than 1.8 min. e. The change to a single waiting line is effective because the variation among waiting times is less than it was with multiple lines. 5. ( 36 1) 0.11 H 0 : σ = 0.15 oz. H 1 : σ < 0.15 oz. Test statistic: χ = = Critical value of χ is 0.15 between and 6.509, so it is estimated to be.501 (Tech:.465). P-value < 0.05 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the population of volumes has a standard deviation less than 0.15 oz. Method Chi-Square DF P-Value Standard ( 36 1) 0.09 H 0 : σ = 0.15 oz. H 1 : σ < 0.15 oz. Test statistic: χ = = Critical value of χ is 0.15 between and 6.509, so it is estimated to be.501 (Tech:.465). P-value < 0.05 (Tech: 0.000). Reject H 0. There is sufficient evidence to support the claim that the population of volumes has a standard deviation less than 0.15 oz. Method Chi-Square DF P-Value Standard Copyright 014 Pearson Education, Inc.

134 13 Chapter 8: Hypothesis Testing ( 37 1) H 0 : σ = g. H 1 : σ < g. Test statistic: χ = = Critical value of is between and 6.509, so it is estimated to be.501 (Tech: 3.69). P-value < 0.05 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the population of weights has a standard deviation less than the specification of g. Method Chi-Square DF P-Value Standard ( 35 1) H 0 : σ = g. H 1 : σ > g. Test statistic: χ = = Critical value of is between and P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that pre-1983 pennies have a standard deviation greater than g. Weights of pre pennies appear to vary more than those of post-1983 pennies. Method Chi-Square DF P-Value Standard The data appear to be from a normally distributed population. H 0 : σ = 10 bpm. H 1 : σ 10 bpm. Test ( 40 1 ) 10.3 statistic: χ = = Critical value of χ = and χ = (approximately). 10 P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that pulse rates of men have a standard deviation equal to 10 beats per minute. Method Chi-Square DF P-Value Standard The data appear to be from a normally distributed population. H 0 : σ = 10 bpm. H 1 : σ 10 bpm. Test ( 40 1 ) 11.6 statistic: χ = = Critical values of χ = and χ = (approximately). 10 P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that pulse rates of women have a standard deviation equal to 10 beats per minute. Method Chi-Square DF P-Value Standard ( 5 1) H 0 : σ = 3. mg. H 1 : σ 3. mg. Test statistic: χ = = Critical values: χ = and χ = P-value > 0.0 (Tech: 0.498). Fail to reject H 0. There is not sufficient evidence to support the claim that filtered 100-mm cigarettes have tar amounts with a standard deviation different from 3. mg. There is not enough evidence to conclude that filters have an effect. Method Chi-Square DF P-Value Standard H 0 : σ = cents. H 1 : σ cents. Test statistic: χ χ = and χ χ ( 100 1) 33.5 = = Critical values: χ = (approximately). P-value > 0.0 (Tech: 0.044). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the standard deviation is cents. Because the amounts from 0 cents to 99 cents are all equally likely, the requirement of a normal distribution is violated, so the results are highly questionable. Copyright 014 Pearson Education, Inc.

135 Chapter 8: Hypothesis Testing (continued) Method Chi-Square DF P-Value Standard The data appear to be from a normally distributed population. H 0 : σ =.5 years. H 1 : σ <.5 years. Test ( 15 1 ) 7.67 statistic: χ = = Critical value: χ = P-value < (Tech: ). Reject.5 H 0. There is sufficient evidence to support the claim that the standard deviation of ages of all race car drivers is less than.5 years. Variable Method Chi-Square DF P-Value Ages Standard The data appear to be from a normally distributed population. H 0 : σ = 5 mi/h. H 1 : σ 5 mi/h. Test ( 1 1 ) 4.08 statistic: χ = = Critical values of χ = and χ = P-value > (Tech: 0.455). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the standard deviation of speeds is equal to 5.0 mi/h. Variable Method Chi-Square DF P-Value HWY Standard The data appear to be from a normally distributed population. H 0 : σ = 3. ft. H 1 : σ > 3. ft. Test ( 1 1 ) 5.4 statistic: χ = = Critical value: χ = P-value = Reject H 0. There is 3. sufficient evidence to support the claim that the new production method has errors with a standard deviation greater than 3. ft. The variation appears to be greater than in the past, so the new method appears to be worse, because there will be more altimeters that have larger errors. The company should take immediate action to reduce the variation. Variable Method Chi-Square DF P-Value Errors Standard The data appear to be from a normally distributed population. H 0 : σ = 15. H 1 : σ < 15. Test statistic: ( 1 1 ) 9.50 χ = = Critical value: χ = P-value < 0.05 (Tech: ). Reject H 0. There 15 is sufficient evidence to support the claim that IQ scores of professional pilots have a standard deviation less than 15. Variable Method Chi-Square DF P-Value IQ Standard The data appear to be from a normally distributed population. H 0 : σ = 0.15 oz. H 1 : σ < 0.15 oz. Test ( 36 1 ) statistic: χ = = Critical value of χ is between and 6.509, so it is 0.15 estimated to be.501 (Tech:.465). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the population of volumes has a standard deviation less than 0.15 oz. Variable Method Chi-Square DF P-Value CKDTVOL Standard Copyright 014 Pearson Education, Inc.

136 134 Chapter 8: Hypothesis Testing 18. The data appear to be from a normally distributed population. H 0 : σ = g. H 1 : σ g. Test ( 35 1 ) statistic: χ = = Critical values of χ = and χ = (approximately). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the population of weights has a standard deviation equal to g. Variable Method Chi-Square DF P-Value Wheat Standard χ = =.189, which is reasonably close to the value of.465 obtained from STATDISK and Minitab. 19. Critical ( ) Critical χ = = , which is very close to the value of.465 obtained from STATDISK and Minitab. Chapter Quick Quiz 1. H 0 : μ = 0 sec. H 1 : μ 0 sec.. a. Two-tailed. b. Student t. 3. a. Fail to reject H 0. b. There is not sufficient evidence to warrant rejection of the claim that the sample is from a population with a mean equal to 0 sec. 4. There is a loose requirement of a normally distributed population in the sense that the test works reasonably well if the departure from normality is not too extreme. 5. a. H 0 : p = 0.5. H 1 : p > 0.5. b z = = 6.33 ( 0.5)( 0.5) 511 c. P-value = There is sufficient evidence to support the claim that the majority of adults are in favor of the death penalty for a person convicted of murder. 6. = P( z<.00) = (Tech: ) 7. The only true statement is the one given in part (a). 8. No. All critical values of x are greater than zero. 9. True. 10. False. Review Exercises 1. a. False. d. False. b. True. e. False. c. False. Copyright 014 Pearson Education, Inc.

137 . H 0 : p =. H 3 1 : p 3. Test statistic: Chapter 8: Hypothesis Testing z = = Critical values: z =±.575 (Tech: ±.576 ). 1 ( )( ) P-value = P( z< 1.09) = (Tech: 0.756). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that /3 of adults are satisfied with the amount of leisure time that they have. Test of p = vs p not = Sample X N Sample p 95% CI Z-Value P-Value ( , ) H 0 : p = H 1 : p > Test statistic: z = = or z = (if using p ˆ = 0.9 ). ( 0.75)( 0.5) 737 Critical value: z =±.33. P-value = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that more than 75% of us do not open unfamiliar and instant-message links. Given that the results are based on a voluntary response sample, the results are not necessarily valid. Test of p = 0.75 vs p > 0.75 Sample X N Sample p Z-Value P-Value H 0 : μ = 3369 g. H 1 : μ < 3369 g. Test statistic: t = = Critical value: t = (approximately). P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the mean birth weight of Chinese babies is less than the mean birth weight of 3369 g for Caucasian babies. 5. H 0 : σ = 567 g. H 1 : σ 567 g. Test statistic: χ ( 81 1) 466 = = Critical values of 567 χ = and χ = P-value is between 0.0 and 0.05 (Tech: 0.09). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the standard deviation of birth weights of Chinese babies is equal to 567 g. Method Chi-Square DF P-Value Standard H 0 : μ = 1.5 mg/m 3. H 1 : μ > 1.5 mg/m 3. Test statistic: t = Critical value: t =.015. P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the sample is from a population with a mean greater than the EPA standard of 1.5 mg/m 3. Because the sample value of 5.40 mg/m 3 appears to be an outlier and because a normal quantile plot suggests that the sample data are not from a normally distributed population, the requirements of the hypothesis test are not satisfied, and the results of the hypothesis test are therefore questionable Probability Plot of Air Lead Probability Air Lead 4 6 Copyright 014 Pearson Education, Inc.

138 136 Chapter 8: Hypothesis Testing 6. (continued) Test of mu = 1.5 vs > 1.5 Variable N Mean StDev SE Mean T P Air Lead H 0 : μ = 5. H 1 : μ 5. Test statistic: t = = Critical values: t =± (approximately). P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the sample is selected from a population with a mean equal to 5. Test of mu = 5 vs not = 5 N Mean StDev SE Mean 95% CI T P (1.40, 7.00) a. A type I error is the mistake of rejecting a null hypothesis when it is actually true. A type II error is the mistake of failing to reject a null hypothesis when in reality it is false. b. Type I error: Reject the null hypothesis that the mean of the population is equal to 5 when in reality, the mean is actually equal to 5. Type II error: Fail to reject the null hypothesis that the population mean is equal to 5 when in reality, the mean is actually different from The χ test has a reasonably strict requirement that the sample data must be randomly selected from a population with a normal distribution, but the numbers are selected in such a way that they are all equally likely, so the population has a uniform distribution instead of the required normal distribution. Because the requirements are not all satisfied, the χ test should not be used. 10. The sample data meet the loose requirement of having a normal distribution. H 0 : μ = 1000 HIC. H 1 : μ < 1000 HIC. Test statistic: t = Critical value: t = P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the population mean is less than 1000 HIC. The results suggest that the population mean is less than 1000 HIC, so they appear to satisfy the specified requirement. Test of mu = 1000 vs < 1000 Variable N Mean StDev SE Mean T P Booster Cumulative Review Exercises 1. a. x = 53.3 words b. Median = 5.0 words c. s = 15.7 words d. s = 45.1 words e. Range = 45 words. a. Ratio. b. Discrete. c. The sample is a simple random sample if it was selected in such a way that all possible samples of the same size have the same chance of being selected words < μ < 64.5 words Variable N Mean StDev SE Mean 95% CI X (4.10, 64.50) Copyright 014 Pearson Education, Inc.

139 Chapter 8: Hypothesis Testing H 0 : μ = 48.0 words. H 1 : μ > 48.0 words. Test statistic: t = = Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean number of words on a page is greater than There is not enough evidence to support the claim that there are more than 70,000 words in the dictionary. Test of mu = 48 vs > 48 N Mean StDev SE Mean T P a z = = ; P( z> ) =.8%. 1.4 b. 98th percentile: x = μ + z σ = = 38.9 in. c z = = 1.43; P( z< 1.43) = 9.36%. (Tech: 0.934) a. ( 0.15) 3 = It is unlikely because the probability of the event occurring is so small. b. ( 0.097)( 0.15) = c. 1 ( 0.875) 5 = No. The distribution is very skewed. A normal distribution would be approximately bell-shaped, but the displayed distribution is very far from being bell-shaped. 8. Because the vertical scale starts at 7000 and not at 0, the difference between the number of males and the number of females is exaggerated, so the graph is deceptive by creating the wrong impression that there are many more male graduates than female graduates. 9. a. 0.37( 1003) = 373 b. 34.% < p < 40.% Sample X N Sample p 95% CI ( , ) c. Yes. With test statistic z = 8.11 and with a P-value close to 0, there is sufficient evidence to support the claim that less than 50% of adults answer yes. Test of p = 0.5 vs p < 0.5 Sample X N Sample p Z-Value P-Value d. The required sample size depends on the confidence level and the sample proportion, not the population size. 10. H 0 : p = 0.5. H 1 : p < 0.5. Test statistic: z = = Critical value: z =.33. P-value ( 0.5)( 0.5) 1003 = P( z< 8.11) = (Tech: ). Reject H 0. There is sufficient evidence to support the claim that fewer than 50% of Americans say that they have a gun in their home. Copyright 014 Pearson Education, Inc.

140

141 Chapter 9: Inferences from Two Samples Section 9- Chapter 9: Inferences from Two Samples The samples are simple random samples that are independent. For each of the two groups, the number of successes is at least 5 and the number of failures is at least 5. (Depending on what we call a success, the four numbers are 33, 115, 01,9 and 00,745 and all of those numbers are at least 5.) The requirements are satisfied. 33. n 1 = 01,9, p ˆ1 = = , q ˆ1 = = , n = 00,745, 01, ˆ ,745 p = =, p = = ,99+ 00,745 q ˆ = = ,, and q = = a. H 0 : p1 = p. H 1 : p1 < p. b. If the P-value is less than we should reject the null hypothesis and conclude that there is sufficient evidence to support the claim that the rate of polio is less for children given the Salk vaccine than it is for children given a placebo. 4. a. 0.90, or 90% b. Because the confidence interval limits do not contain 0, there appears to be a significant difference between the two proportions. Because the confidence interval consists of negative values only, it appears that the first proportion is less than the second proportion. There is sufficient evidence to support the claim that the rate of polio is less for children given the Salk vaccine than it is for children given a placebo. c. The P-value method and the critical value method are equivalent in the sense that they will always lead to the same conclusion, but the confidence interval method is not equivalent to them. 5. Test statistic: z = 1.39 (rounded). The P-value of E 35 is when rounded to four decimal places. There is sufficient evidence to warrant rejection of the claim that the vaccine has no effect. 6. Test statistic: z =.17. P-value: Because the P-value is greater than the significance level of 0.01, conclude that there is not sufficient evidence to warrant rejection of the claim that for those saying that monitoring is seriously unethical, the proportion of workers is the same as the proportion of managers. For Exercises 7 18, assume that the data fit the requirements for the statistical methods for two proportions unless otherwise indicated. 7. a. H 0 : p1 = p. H 1 : p1> p. Test statistic: z = Critical value: z =.33. P-value: (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the proportion of people over 55 who dream in black and white is greater than the proportion for those under 5. Difference = p (1) - p () Test for difference = 0 (vs > 0): Z = 6.44 P-Value = b. 98% CI: < p1 p < Because the confidence interval limits do not include 0, it appears that the two proportions are not equal. Because the confidence interval limits include only positive values, it appears that the proportion of people over 55 who dream in black and white is greater than the proportion for those under 5. Difference = p (1) - p () 98% CI for difference: ( , ) Copyright 014 Pearson Education, Inc.

142 140 Chapter 9: Inferences from Two Samples 7. (continued) c. The results suggest that the proportion of people over 55 who dream in black and white is greater than the proportion for those under 5, but the results cannot be used to verify the cause of that difference. 8. a. H 0 : p1 = p. H 1 : p1 < p. Test statistic: z = Critical value: z =.33. P-value: (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the rate of dementia among those who use ginkgo is less than the rate of dementia among those who use a placebo. There is not sufficient evidence to support the claim that ginkgo is effective in preventing dementia. Difference = p (1) - p () Test for difference = 0 (vs < 0): Z = P-Value = b. 98% CI: < p1 p < (Tech: < p1 p < ). Because the confidence interval limits include 0, there does not appear to be a significant difference between dementia rates for those treated with ginkgo and those given a placebo. There is not sufficient evidence to support the claim that the rate of dementia among those who use ginkgo is less than the rate of dementia among those who use a placebo. There is not sufficient evidence to support the claim that ginkgo is effective in preventing dementia. Difference = p (1) - p () 98% CI for difference: ( , ) c. The sample results suggest that ginkgo is not effective in preventing dementia. 9. a. H 0 : p1 = p. H 1 : p1> p. Test statistic: z = Critical value: z = P-value: (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the fatality rate is higher for those not wearing seat belts. Test for difference = 0 (vs > 0): Z = 6.11 P-Value = b. 90% CI: < p1 p <0.01. Because the confidence interval limits do not include 0, it appears that the two fatality rates are not equal. Because the confidence interval limits include only positive values, it appears that the fatality rate is higher for those not wearing seat belts. Difference = p (1) - p () 90% CI for difference: ( , ) c. The results suggest that the use of seat belts is associated with lower fatality rates than not using seat belts. 10. a. H 0 : p1 = p. H 1 : p1 p. Test statistic: z = Critical values: z =±.575 (Tech: ±.576 ). P- value: (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the survival rates are the same for day and night. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z = 18.6 P-Value = b. 99% CI: < p1 p < Because the confidence interval limits do not contain 0, there appears to be a significant difference between the two proportions. There is sufficient evidence to warrant rejection of the claim that the survival rates are the same for day and night. Difference = p (1) - p () 99% CI for difference: ( , ) c. The data suggest that for in-hospital patients who suffer cardiac arrest, the survival rate is not the same for day and night. It appears that the survival rate is higher for in-hospital patients who suffer cardiac arrest during the day. Copyright 014 Pearson Education, Inc.

143 Chapter 9: Inferences from Two Samples a. H 0 : p1 = p. H 1 : p1 p. Test statistic: z = Critical values: z =± P-value: (Tech: 0.570). Fail to reject H 0. There is not sufficient evidence to support the claim that echinacea treatment has an effect. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z = 0.57 P-Value = 0.57 b. 95% CI: < p1 p < Because the confidence interval limits do contain 0, there is not a significant difference between the two proportions. There is not sufficient evidence to support the claim that echinacea treatment has an effect. Difference = p (1) - p () 95% CI for difference: ( , ) c. Echinacea does not appear to have a significant effect on the infection rate. Because it does not appear to have an effect, it should not be recommended. 1. a. H 0 : p1 = p. H 1 : p1 < p. Test statistic: z =.44. Critical value: z =.33. P-value: Reject H 0. There is sufficient evidence to support the claim that the incidence of malaria is lower for infants who use the bednets. Difference = p (1) - p () Test for difference = 0 (vs < 0): Z = -.44 P-Value = b. 98% CI: < p1 p < (Tech: < p1 p < ). Because the confidence interval does not include 0 and it includes only negative values, it appears that the rate of malaria is lower for infants who use the bednets. Difference = p (1) - p () 98% CI for difference: ( , ) c. The bednets appear to be effective. 13. a. H 0 : p1 = p. H 1 : p1 p. Test statistic: z = Critical values: z =± P-value: (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that men and women have equal success in challenging calls. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z = 0.40 P-Value = b. 95% CI: < p1 p < Because the confidence interval limits contain 0, there is not a significant difference between the two proportions. There is not sufficient evidence to warrant rejection of the claim that men and women have equal success in challenging calls. Difference = p (1) - p () 95% CI for difference: ( , ) c. It appears that men and women have equal success in challenging calls. 14. a. H 0 : p1 = p. H 1 : p1 p. Test statistic: z = Critical values: z =± P-value: (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that New York City police and Los Angeles police have the same proportion of hits. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z = 1.91 P-Value = b. 95% CI: < p1 p <0.130 (Tech: < p1 p <0.130). Because the confidence interval limits contain 0, there does not appear to be a significant difference between the two proportions. There is not sufficient evidence to warrant rejection of the claim that New York City police and Los Angeles police have the same proportion of hits. Copyright 014 Pearson Education, Inc.

144 14 Chapter 9: Inferences from Two Samples 14. (continued) Difference = p (1) - p () 95% CI for difference: ( , ) c. There does not appear to be a difference between the hit rates of New York City police and Los Angeles police. 15. a. H 0 : p1 = p. H 1 : p1> p. Test statistic: z = Critical value: z =.33. P-value: (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the cure rate with oxygen treatment is higher than the cure rate for those given a placebo. It appears that the oxygen treatment is effective. Difference = p (1) - p () Test for difference = 0 (vs > 0): Z = 9.97 P-Value = b. 98% CI: < p1 p < Because the confidence interval limits do not include 0, it appears that the two cure rates are not equal. Because the confidence interval limits include only positive values, it appears that the cure rate with oxygen treatment is higher than the cure rate for those given a placebo. It appears that the oxygen treatment is effective. Difference = p (1) - p () 98% CI for difference: ( , ) c. The results suggest that the oxygen treatment is effective in curing cluster headaches. 16. a. H 0 : p1 = p. H 1 : p1 < p. Test statistic: z = Critical value: z = P-value: 0.03 (Tech: 0.034). Reject H 0. There is sufficient evidence to support the claim that when given a single large bill, a smaller proportion of women in China spend some or all of the money when compared to the proportion of women in China given the same amount in smaller bills. Difference = p (1) - p () Test for difference = 0 (vs < 0): Z = P-Value = 0.03 b. 90% CI: 0.01 < p1 p < Because the confidence interval does not include 0 and it includes only negative values, it appears that the first proportion is less than the second proportion. There is sufficient evidence to support the claim that when given a single large bill, a smaller proportion of women in China spend some or all of the money when compared to the proportion of women in China given the same amount in smaller bills. Difference = p (1) - p () 90% CI for difference: ( , ) c. Because the P-value is 0.03 (Tech: 0.034), the difference is significant at the 0.05 significance level, but not at the 0.01 significance level. The conclusion does change. 17. a. H 0 : p1 = p. H 1 : p1 < p. Test statistic: z = Critical value: z =.33. P-value: (Tech: 0.114). Fail to reject H 0. There is not sufficient evidence to support the claim that the rate of left-handedness among males is less than that among females. Difference = p (1) - p () Test for difference = 0 (vs < 0): Z = P-Value = 0.11 b. 98% CI: < p1 p <0.065 (Tech: < p1 p < 0.064). Because the confidence interval limits include 0, there does not appear to be a significant difference between the rate of lefthandedness among males and the rate among females. There is not sufficient evidence to support the claim that the rate of left-handedness among males is less than that among females. Copyright 014 Pearson Education, Inc.

145 17. (continued) Chapter 9: Inferences from Two Samples 143 Difference = p (1) - p () 98% CI for difference: ( , ) c. The rate of left-handedness among males does not appear to be less than the rate of left-handedness among females. 18. a. H 0 : p1 = p. H 1 : p1 p. Test statistic: z =.30. Critical values: z =±.575 (Tech: ±.576 ). P- value: (Tech: 0.013). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the rate of those who finish is the same for men and women. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z =.30 P-Value = 0.01 b. 99% CI: < p1 p < (Tech: < p1 p < ). Because the confidence interval limits contain 0, there does not appear to be a significant difference between the two proportions. There is not sufficient evidence to warrant rejection of the claim that the rate of those who finish is the same for men and women. Difference = p (1) - p () 99% CI for difference: ( , ) c. It appears that men and women finish the New York City marathon at the same rate. 19. a < p1 p <0.17; because the confidence interval limits do not contain 0, it appears that p = p can be rejected. 1 Difference = p (1) - p () 95% CI for difference: ( , ) b < p1 <0.69; < p <0.509; because the confidence intervals do overlap, it appears that p = p cannot be rejected. 1 Sample X N Sample p 95% CI ( , ) ( , c. H 0 : p1 = p. H 1 : p1 p. Test statistic: z =.40. P-value: Critical values: z =± Reject H 0. There is sufficient evidence to reject p1 = p. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z =.40 P-Value = d. Reject p1 = p. Least effective: Using the overlap between the individual confidence intervals. 0. Hypothesis test: With a test statistic of z = , P-value = 0.05 (Tech: ), reject p1 = p. Confidence interval: 0.4 < p1 p <0.0180, which suggests that we should not reject p1 = p (because 0 is included). The hypothesis test and confidence interval lead to different conclusions about the equality of p1 = p. 1. n Difference = p (1) - p () 95% CI for difference: ( , ) Test for difference = 0 (vs not = 0): Z = P-Value = z / = α = = 3383 (Tech: 338) E 0.0 Copyright 014 Pearson Education, Inc.

146 144 Chapter 9: Inferences from Two Samples Section Independent: b, d, e cm < μ1 μ < cm 3. Because the confidence interval does not contain 0, it appears that there is a significant difference between the mean height of women and the mean height of men. Based on the confidence interval, it appears that the mean height of men is greater than the mean height of women. 4. a. Yes. b. 90% 5. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t =.979. Critical values: t =±.03 (Tech: ±.00 ). P- value < 0.01 (Tech: 0.004). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the samples are from populations with the same mean. Color does appear to have an effect on creativity scores. Blue appears to be associated with higher creativity scores. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = -.98 P-Value = DF = 58 b. 95% CI: 0.98 < μ1 μ < 0.18 (Tech: 0.97 < μ1 μ < 0.19) Difference = mu (1) - mu () 95% CI for difference: (-0.970, ) 6. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t =.647. Critical values: t =±.03 (Tech: ± ). P- value < 0.0 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the samples are from populations with the same mean. Color does appear to have an effect on word recall scores. Red appears to be associated with higher word memory recall scores. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value =.65 P-Value = DF = 68 b. 95% CI: 0.83 < μ1 μ < 6.33 (Tech: 0.88 < μ1 μ < 6.8) Difference = mu (1) - mu () 95% CI for difference: (0.88, 6.8) 7. a. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the magnets are effective in reducing pain. It is valid to argue that the magnets might appear to be effective if the sample sizes are larger. Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value = 0.13 P-Value = DF = 33 b. 90% CI: 0.61 < μ1 μ < 0.71 (Tech: 0.59 < μ1 μ < 0.69) Difference = mu (1) - mu () 90% CI for difference: (-0.59, 0.69) 8. a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value: t =.345 (Tech:.337). P- value > 0.10 (Tech: 0.499). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean number of words spoken in a day by men is less than that for women. Difference = mu (1) - mu () T-Test of difference = 0 (vs <): T-Value = P-Value = 0.50 DF = 364 Copyright 014 Pearson Education, Inc.

147 Chapter 9: Inferences from Two Samples (continued) b. 98% CI: words < μ1 μ < words (Tech: words < μ1 μ < words) Difference = mu (1) - mu () 98% CI for difference: (-437, 1344) 9. a. The sample data meet the loose requirement of having a normal distribution. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t =.46 (Tech:.676). P-value > 0.10 (Tech: 0.054). Fail to reject H 0. There is not sufficient evidence to support the claim that men have a higher mean body temperature than women. Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value = 0.85 P-Value = 0.06 DF = 1 b. 98% CI: 0.54 F < μ1 μ < 1.0 F (Tech: 0.51 F < μ1 μ < 0.99 F) Difference = mu (1) - mu () 98% CI for difference: (-0.515, 0.995) 10. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.977 (Tech: ±.789 ). P- value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that men and women have different mean body temperatures. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = 1.56 P-Value = 0.13 DF = 4 b. 99% CI: 0.19 F < μ1 μ < 0.61 F (Tech: 0.17 F < μ1 μ < 0.59 F) Difference = mu (1) - mu () 99% CI for difference: (-0.167, 0.587) 11. a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value: t =.46 (Tech:.39). P- value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that the mean maximal skull breadth in 4000 b.c. is less than the mean in a.d Difference = mu (1) - mu () T-Test of difference = 0 (vs <): T-Value = P-Value = DF = 57 b. 98% CI: 8.13 mm < μ1 μ < 1.47 mm (Tech: 8.04 mm < μ1 μ < 1.56 mm) Difference = mu (1) - mu () 98% CI for difference: (-8.04, -1.56) 1. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical value: t =±.01 (Tech:.080). P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that Flight 1 and Flight 3 have the same mean arrival delay time. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = P-Value = DF = 0 b. 95% CI: 18.1 min < μ1 μ < 7.3 min (Tech: 17.4 min < μ1 μ < 6.6 min) Difference = mu (1) - mu () 95% CI for difference: (-17.4, 6.59) Copyright 014 Pearson Education, Inc.

148 146 Chapter 9: Inferences from Two Samples 13. a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value: t =.46 (Tech:.403). P- value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that students taking the nonproctored test get a higher mean than those taking the proctored test. Difference = mu (1) - mu () T-Test of difference = 0 (vs <): T-Value = P-Value = DF = 49 b. 98% CI: 5.54 < μ1 μ < 3.10 (Tech: 5.7 < μ1 μ < 3.37) Difference = mu (1) - mu () 98% CI for difference: (-5.7, -3.37) 14. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.756 (Tech: ±.666 ). P- value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the samples are from populations with the same mean. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = P-Value = DF = 56 b. 99% CI: < μ1 μ < 10.3 (Tech: < μ1 μ < 9.77) Difference = mu (1) - mu () 99% CI for difference: (-17.71, 9.77) 15. a. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.03 (Tech: ± ). P- value > 0.0 (Tech: 0.066). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that males and females have the same mean BMI. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = 1.7 P-Value = 0.07 DF = 71 b. 95% CI: 1.08 < μ1 μ < 4.76 (Tech: 1.04 < μ1 μ < 4.7) Difference = mu (1) - mu () 95% CI for difference: (-1.04, 4.7) 16. a. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t =.8. Critical value: t = 1.75 (Tech:.004). P-value < 0.05 (Tech: 0.013). Reject H 0. There is sufficient evidence to support the claim that the mean IQ score of people with low lead levels is higher than the mean IQ score of people with high lead levels. Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value =.8 P-Value = DF = 54 b. 90% CI: 1.5 < μ1 μ < 10.5 (Tech: 1.6 < μ1 μ < 10.4) Difference = mu (1) - mu () 90% CI for difference: (1.59, 10.37) 17. a. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t = 1.75 (Tech:.09). P-value > 0.10 (Tech: ) Fail to reject H 0. There is not sufficient evidence to support the claim that the mean IQ score of people with medium lead levels is higher than the mean IQ score of people with high lead levels. Variable N Mean StDev LOW LEAD Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value = 0.09 P-Value = DF = 35 Copyright 014 Pearson Education, Inc.

149 Chapter 9: Inferences from Two Samples (continued) b. 90% CI: 5.9 < μ1 μ < 6.6 (Tech: 5.8 < μ1 μ < 6.4) Difference = mu (1) - mu () Estimate for difference: % CI for difference: (-5.80, 6.45) 18. a. The sample data meet the loose requirement of having a normal distribution. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t =.81 (Tech:.411). P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that supermodels have heights with a mean that is greater than the mean height of women in the general population. We can conclude that supermodels are taller than typical women. Variable N Mean StDev Model Height Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value = 1.53 P-Value = DF = 45 b. 98% CI: 4.7 in < μ1 μ < 7.4 in. (Tech: 4.9 in. < μ1 μ < 7. in.) Difference = mu (1) - mu () 98% CI for difference: (4.880, 7.07) 19. a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value: t =.650 (Tech:.574). P- value > 0.05 (Tech: 0.044). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean longevity for popes is less than the mean for British monarchs after coronation. Difference = mu (Popes) - mu (Kings and Queens) T-Test of difference = 0 (vs <): T-Value = P-Value = DF = 16 b. 98% CI: 3.6 years < μ1 μ < 4.4 years (Tech: 3. years < μ1 μ < 4.0 years) Difference = mu (Popes) - mu (Kings and Queens) 98% CI for difference: (-3.8, 4.10) 0. a. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t = (Tech: t = ). P- value < (Tech: 0.004). Reject H 0. There is sufficient evidence to support the claim that the mean amount of strontium-90 from Pennsylvania residents is greater than the mean from New York residents. Difference = mu (Pennsylvania) - mu (New York) T-Test of difference = 0 (vs >): T-Value = 3.7 P-Value = DF = 15 b. 90% CI: 5.0 mbq < μ1 μ < 17.3 mbq (Tech: 5. mbq < μ1 μ < 17.1 mbq) Difference = mu (Pennsylvania) - mu (New York) 90% CI for difference: (5.17, 17.16) 1. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.03 (Tech: ± ). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the two populations have equal means. The difference is highly significant, even though the samples are relatively small. Difference = mu (Pre-1964 Quarters) - mu (Post-1964 Quarters) T-Test of difference = 0 (vs not =): T-Value = 3.77 P-Value = DF = 70 Copyright 014 Pearson Education, Inc.

150 148 Chapter 9: Inferences from Two Samples. 9.1 years < μ1 μ < 5.4 years (Tech: 9.0 years < μ1 μ < 5.3 years). Because the confidence interval includes 0, there is not a significant difference between the two population means. It appears that the sample of men and the sample of women are from populations with the same mean. N Mean StDev MALE AGE FEMALE AGE Difference = mu (MALE AGE) - mu (FEMALE AGE) 95% CI for difference: (-9.00, 5.30) lb < μ1 μ < lb (Tech: lb < μ1 μ < lb). Because the confidence interval does not include 0, there appears to be a significant difference between the two population means. It appears that the cola in cans of regular Pepsi weighs more than the cola in cans of Diet Pepsi, and that is probably due to the sugar in regular Pepsi that is not in Diet Pepsi. N Mean StDev PPREGWT PPDIETWT Difference = mu (PPREGWT) - mu (PPDIETWT) 95% CI for difference: ( , ) 4. H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t =.095. Critical values: t =±.03 (Tech: ±.003 ). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the two populations have equal means. The difference is due to the sugar in regular Coke that is not in diet Coke. N Mean StDev CKREGWT CKDIETWT Difference = mu (CKREGWT) - mu (CKDIETWT) T-Test of difference = 0 (vs not =): T-Value =.10 P-Value = DF = a. The sample data meet the loose requirement of having a normal distribution. H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t = Critical value: t =.381 (Tech:.38). P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that men have a higher mean body temperature than women. Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value = 1.05 P-Value = DF = 68 Both use Pooled StDev = b. 98% CI: 0.31 F < μ1 μ < 0.79 F. The test statistic became larger, the P-value became smaller, and the confidence interval became narrower, so pooling had the effect of attributing more significance to the results. Difference = mu (1) - mu () 98% CI for difference: (-0.307, 0.787) Both use Pooled StDev = a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value: t =.336. P-value > 0.10 (Tech: 0.477). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean number of words spoken in a day by men is less than that for women. Difference = mu (1) - mu () T-Test of difference = 0 (vs <): T-Value = P-Value = 0.48 DF = 394 Both use Pooled StDev = Copyright 014 Pearson Education, Inc.

151 Chapter 9: Inferences from Two Samples (continued) b. 98% CI: words < μ1 μ < words (Tech: 417. words < μ1 μ < words). The test statistic became larger, the P-value became smaller, and the confidence interval became narrower, so pooling had the effect of attributin 98% CI for difference: (-417, 134) Both use Pooled StDev = H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.080. P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the two populations have the same mean. ( ) ( μ1 μ) t = ; sp sp + ( 1) ( 1) 0 = = ( 1) + ( 1) s p 8. df = Using df = smaller of n1 1 and n 1 is a more conservative estimate of the number of degrees of freedom (than the estimate obtained with Formula 9-1) in the sense that the confidence interval is wider, so the difference between the sample means needs to be more extreme to be considered a significant difference. 9. a. H 0 : μ1 = μ. H 1 : μ1 < μ. Test statistic: t = Critical value based on degrees of freedom: t =.381 (Tech:.38). P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that students taking the nonproctored test get a higher mean than those taking the proctored test. b < μ1 μ <.96 (Tech: 5.69 < μ1 μ <.95) Section Parts (c) and (e) are true.. d =.4 mi/gal and s d = 1.1mi/gal. μ d represents the mean of the differences from the population of paired data. 3. The test statistic will remain the same. The confidence interval limits will be expressed in the equivalent values of km/l. 4. The first confidence interval shows that we have 95% confidence that the limits of 1.0 mi/gal and 3.8 mi/gal contain the mean of the population of differences, but the second confidence interval shows that we have 95% confidence that the limits of 7.8 mi/gal and 1.6 mi/gal contain the difference between the two population means. Because the first confidence interval does not include 0 mi/gal and consists of positive values only, it appears that the old ratings are higher than the new ratings. Because the second confidence interval does include 0 mi/gal, there does not appear to be a significant different between the mean of the old ratings and the mean of the new ratings. 5. H 0 : μ d = 0 cm. H 1 : μ d > 0 cm. Test statistic: t = (rounded). Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that for the population of heights of presidents and their main opponents; the differences have a mean greater than 0 cm (with presidents tending to be taller than their opponents) cm d μ < <.8 cm. The confidence interval includes 0 cm, so it is very possible that the mean of the differences is equal to 0 cm, indicating that there is no significant difference between heights of presidents and heights of their opponents. Copyright 014 Pearson Education, Inc.

152 150 Chapter 9: Inferences from Two Samples 7. a. d = 11.6 years b. s d = 17. years d μd c. Test statistic t = = = sd n d. H 0 : μ d = 0. H 1 : μd 0. Critical values: t =± a. d = 0.35 F c. d μd Test statistic: t = = =.333 sd n d. H 0 : μ d = 0. H 1 : μd 0. Critical values: t =± 3.18 b. s = 0.30 F H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = = Critical values: t =± P-value > 0.0 (Tech: 0.063). Fail to reject H 0. There is not sufficient evidence to support the claim that there is a difference between the ages of actresses and actors when they win Oscars. Paired T for Actress - Actor T-Test of mean difference = 0 (vs not = 0): T-Value = P-Value = H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = =.333. Critical values: t =± P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that there is no difference between body temperatures measured at 8 A.M. and at 1 A.M. Paired T for 8A.M. - 1A.M. T-Test of mean difference = 0 (vs not = 0): T-Value = -.33 P-Value = min < μd <1.0 min. Because the confidence interval includes only positive values and does not include 0 min, it appears that the taxi-out times are greater than the corresponding taxi-in times, so there is sufficient evidence to support the claim of the flight operations manager that for flight delays, more of the blame is attributable to taxi-out times at JFK than taxi-in times at LAX. Paired T for Out - In 90% CI for mean difference: (0.99, 1.01) d d = 6.5 min ; df = 1 1= 11 sd E = tα / = = 5.5 min n cm 3 < μd < 49.8 cm 3 (Tech: 66.7 cm 3 < μd < 49.7 cm 3 ). Because the confidence interval includes 0 cm 3, the mean of the differences could be equal to 0 cm 3, so there does not appear to be a significant difference. Paired T for First Born - Second Born 99% CI for mean difference: (-66.7, 49.7) 13. H 0 : μ d = 0. H 1 : μ d > 0 d = 8.5 cm 3 ; df = 10 1= 9 sd 56.7 E = tα / = 3.50 = 58.3 cm 3 n Test statistic: t = =.579. Critical value: t = P-value < 0.05 (Tech: 0.047). Reject H 0. There is sufficient evidence to support the claim that among couples, males speak more words in a day than females. Paired T for Male - Female T-Test of mean difference = 0 (vs > 0): T-Value =.58 P-Value = 0.05 Copyright 014 Pearson Education, Inc.

153 Chapter 9: Inferences from Two Samples H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = = Critical values: t =± P-value > 0.01 (Tech: ). Reject H 0. There is sufficient evidence to support the claim of a difference in measurements between the two arms. The right and left arms should yield the same measurements, but the given data show that this is not happening. Paired T for Right arm - Left arm T-Test of mean difference = 0 (vs not = 0): T-Value = P-Value = < μd < 0.. Because the confidence interval does not include 0, it appears that there is sufficient evidence to warrant rejection of the claim that when the 13th day of a month falls on a Friday, the numbers of hospital admissions from motor vehicle crashes are not affected. Hospital admissions do appear to be affected. Paired T for Friday the 6th - Friday the 13th 95% CI for mean difference: (-6.49, -0.17) d = 3.33 cm 3 ; df = 6 1= 5 sd 3.01 E = tα / =.571 = 3. cm 3 n in. < μd <. in. Because the confidence interval limits contain 0, there is not sufficient evidence to support a claim that there is a difference between self-reported heights and measured heights. We might believe that males would tend to exaggerate their heights, but the given data do not provide enough evidence to support that belief. Paired T for Reported - Measured 99% CI for mean difference: (-4.16,.16) 17. H 0 : μ d = 0. H 1 : μ d < 0 d = 1.0 in. ; df = 1 1= 11 sd 3.5 E = tα / = = 3. in. n Test statistic: t = = Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that Harry Potter and the Half-Blood Prince did better at the box office. After a few years, the gross amounts from both movies can be identified, and the conclusion can then be judged objectively without using a hypothesis test. Paired T for Phoenix - Prince T-Test of mean difference = 0 (vs < 0): T-Value = P-Value = H 0 : μ d = 0. H 1 : μ d > Test statistic: t = = Critical value: t = P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that Captopril is effective in lowering systolic blood pressure. Paired T for Before - After T-Test of mean difference = 0 (vs < 0): T-Value = 6.37 P-Value = < μd < Because the confidence interval limits do not contain 0 and they consist of positive values only, it appears that the before measurements are greater than the after measurements, so hypnotism does appear to be effective in reducing pain. Paired T for Before - After 95% CI for mean difference: (0.69, 5.56) d = 3.13 ; df = 8 1= 7 sd.91 E = tα / =.365 =.43 n 8 Copyright 014 Pearson Education, Inc.

154 15 Chapter 9: Inferences from Two Samples F < μd < 6.3 F. Because the confidence interval limits do contain 0 F, there is not a significant difference between the actual high temperatures and those that were forecast five days earlier. This suggests that the forecast temperatures are reasonably accurate. Paired T for Actual High - Forecast High 99% CI for mean difference: (-7.8, 6.8) d = 0.5 F ; df = 8 1= 7 sd 5.48 E = tα / = = 6.8 F n 8 1. H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = Critical values: t =± P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to support the claim that there is a difference between the ages of actresses and actors when they win Oscars. Paired T for Actresses - Actors T-Test of mean difference = 0 (vs not = 0): T-Value = P-Value = H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = Critical values: t =±.08. P-value > 0.0 (Tech: 0.903). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that there is no difference between body temperatures measured at 8 a.m. and at 1 a.m. Paired T for 8 AM - 1 AM T-Test of mean difference = 0 (vs not = 0): T-Value = 0.1 P-Value = H 0 : μ d = 0. H 1 : μ d < 0. Test statistic: t = Critical value of t is between and (Tech: 1.673). P-value > 0.05 (Tech: 0.06). Fail to reject H 0. There is not sufficient evidence to support the claim that among couples, males speak fewer words in a day than females. Paired T for M1 - F1 95% upper bound for mean difference: 135 T-Test of mean difference = 0 (vs < 0): T-Value = P-Value = H 0 : μ d = 0 sec. H 1 : μ d > 0 sec. Test statistic: t = Critical value: t = P-value > 0.10 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean of the differences is greater than 0 sec. There is not sufficient evidence to support the claim that more time is devoted to showing tobacco than alcohol. For animated children s movies, no time should be spent showing the use of tobacco or alcohol. Paired T for Tobacco Use (sec) - Alcohol Use (sec) T-Test of mean difference = 0 (vs > 0): T-Value = 0.94 P-Value = H 0 : μ d = 6.8 kg. H 1 : μd 6.8 kg. Test statistic: t = Critical values: t =± (Tech: ± ). P-value < 0.01 (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that μ d = 6.8 kg. It appears that the Freshman 15 is a myth, and college freshman might gain some weight, but they do not gain as much as 15 pounds. Paired T for WTAPR - WTSEP T-Test of mean difference = 6.8 (vs not = 6.8): T-Value = P-Value = Section a. No. b. No. c. The two samples have the same standard deviation (or variance). Copyright 014 Pearson Education, Inc.

155 . a. s 1 = 6.60 = cm and b. H 0 : σ c. = σ 1 s1 s F = = = s = 6.0 = cm Chapter 9: Inferences from Two Samples 153 d. There is not sufficient evidence to support the claim that heights of men and heights of women have different variances. 3. The F test is very sensitive to departures from normality, which means that it works poorly by leading to wrong conclusions when either or both of the populations has a distribution that is not normal. The F test is not robust against sampling methods that do not produce simple random samples. For example, conclusions based on voluntary response samples could easily be wrong. 4. No. Unlike some other tests which have a requirement that samples must be from normally distributed populations or the samples must have more than 30 values, the F test has a requirement that the samples must be from normally distributed populations, regardless of how large the samples are. 5. H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = Upper critical F value is between and.0739 (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that weights of regular Coke and weights of regular Pepsi have different standard deviations. 6. H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = Critical F value is less than (Tech: 1.848). P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that ages of student cars vary more than the ages of faculty cars H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = = Upper critical F value is between and.0739 (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the samples are from populations with the same standard deviation. The background color does not appear to have an effect on the variation of word recall scores. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.16, p-value = H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value is between and (Tech: 1.64). P-value: Reject H 0. There is sufficient evidence to support the claim that the numbers of words spoken in a day by men vary more than the numbers of words spoken in a day by women. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.40, p-value = H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value is between and (Tech:.084). P-value: Reject H 0. There is sufficient evidence to support the claim that the treatment group has errors that vary more than the errors of the placebo group. F-Test (Normal Distribution) Test statistic = 9.34, p-value = Copyright 014 Pearson Education, Inc.

156 154 Chapter 9: Inferences from Two Samples H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value is between and (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that men have body temperatures that vary more than the body temperatures of women. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.8, p-value = H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = =.167. Critical F value is between.1555 and (Tech:.168). P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that those given a sham treatment (similar to a placebo) have pain reductions that vary more than the pain reductions for those treated with magnets. Test for Equal Variances F-Test (Normal Distribution) Test statistic =.13, p-value = H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = = Upper critical F value is between and.1540 (Tech:.1010). P-value: Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the variation of maximal skull breadths in 4000 b.c. is the same as the variation in a.d Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.09, p-value = H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value is between.7876 and (Tech:.8179). P-value: Reject H 0. There is sufficient evidence to support the claim that amounts of strontium-90 from Pennsylvania residents vary more than amounts from New York residents. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 4.16, p-value = H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = = Upper critical F value is between and.5699 (Tech:.5308). P-value: Reject H 0. There is sufficient evidence to warrant rejection of the claim that both populations of longevity times have the same variation. Test for Equal Variances: Kings/Queens, Popes F-Test (Normal Distribution) Test statistic = 4.31, p-value = H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = = Upper critical F value: P-value: Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that females and males have heights with the same amount of variation. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 0.99, p-value = 0.99 Copyright 014 Pearson Education, Inc.

157 Chapter 9: Inferences from Two Samples H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value: P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that males have weights with more variation than females. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.76, p-value = H 0 : σ1 = σ. H 1 : σ1 > σ. Test statistic: F = = Critical F value is between and (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to support the claim that males have weights with more variation than females. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.5, p-value = H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = = Upper critical F value:.5411 (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the two samples are from populations with the same amount of variation. Test for Equal Variances F-Test (Normal Distribution) Test statistic = 1.3, p-value = a. No solution provided. b. c 1 = 4, c = 0 c. Critical value = d. Fail to reject σ log( 0.05 / ) = log = σ Test statistic: t =.055. Critical values: t =±.03 (Tech: ± ). P-value > 0.05 (Tech: ). Using Table A-3 with df = the smaller of n1 1 and n 1, fail to reject σ1 = σ. Using technology with df found from Formula 9-1, reject σ = σ. 1 Difference = mu (pre) - mu (post) T-Test of difference = 0 (vs not =): T-Value =.05 P-Value = DF = F L = 0.77, F R =.8365 Chapter Quick Quiz 1. H 0 : p1 = p. H 1 : p1 p p = = P( z>.04) = < p1 p < Because the data consist of matched pairs, they are dependent. 6. H 0 : μ d = 0. H 1 : μ d > There is not sufficient evidence to support the claim that front repair costs are greater than the corresponding rear repair costs. 8. F distribution Copyright 014 Pearson Education, Inc.

158 156 Chapter 9: Inferences from Two Samples 9. False. 10. True. Review Exercises 1. H 0 : p1 = p. H 1 : p1> p. Test statistic: z = 3.1. Critical value: z =.33. P-value: Reject H 0. There is sufficient evidence to support a claim that the proportion of successes with surgery is greater than the proportion of successes with splinting. When treating carpal tunnel syndrome, surgery should generally be recommended instead of splinting. Difference = p (1) - p () Test for difference = 0 (vs > 0): Z = 3.1 P-Value = % CI: < p1 p <0.33 (Tech: < p1 p < 0.331). The confidence interval limits do not contain 0; the interval consists of positive values only. This suggests that the success rate with surgery is greater than the success rate with splints. Difference = p (1) - p () 98% CI for difference: ( , ) 3. H 0 : p1 = p. H 1 : p1 < p. Test statistic: z = Critical value: z = P-value: (Tech: 0.080). Reject H 0. There is sufficient evidence to support the claim that the fatality rate of occupants is lower for those in cars equipped with airbags. Difference = p (1) - p () Test for difference = 0 (vs < 0): Z = P-Value = H 0 : μ d = 0. H 1 : μ d > 0. Test statistic: t = Critical value: t = P-value < (Tech: ). Reject H 0. There is sufficient evidence to support the claim that flights scheduled 1 day in advance cost more than flights scheduled 30 days in advance. Save money by scheduling flights 30 days in advance. Paired T for Flight scheduled one day in adv - Flight scheduled 30 days in adv T-Test of mean difference = 0 (vs > 0): T-Value = 4.71 P-Value = H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = Critical values: t =±.365. P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that there is a difference between self-reported heights and measured heights of females aged Paired T for Reported Height - Measured Height T-Test of mean difference = 0 (vs not = 0): T-Value = P-Value = H 0 : μ1 = μ. H 1 : μ1 > μ. Test statistic: t =.879. Critical value: t =.49 (Tech:.376). P-value < (Tech: 0.006). Reject H 0. There is sufficient evidence to support the claim that stress decreases the amount recalled. Difference = mu (1) - mu () T-Test of difference = 0 (vs >): T-Value =.88 P-Value = DF = % CI: 1.3 < μ1 μ < 14.7 (Tech: 1.4 < μ1 μ < 14.6). The confidence interval limits do not contain 0; the interval consists of positive values only. This suggests that the numbers of details recalled are lower for those in the stress population. Difference = mu (1) - mu () 98% CI for difference: (1.40, 14.60) Copyright 014 Pearson Education, Inc.

159 Chapter 9: Inferences from Two Samples H 0 : p1 = p. H 1 : p1 p. Test statistic: z = 4.0. Critical values: z =±.575. P-value: (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that the acceptance rate is the same with or without blinding. Without blinding, reviewers know the names and institutions of the abstract authors, and they might be influenced by that knowledge. Difference = p (1) - p () Test for difference = 0 (vs not = 0): Z = -4.0 P-Value = H 0 : μ1 = μ. H 1 : μ1 μ. Test statistic: t = Critical values: t =±.014 approximately (Tech: ± ). P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim of no difference between the mean LDL cholesterol levels of subjects treated with raw garlic and subjects given placebos. Both groups appear to be about the same. Difference = mu (1) - mu () T-Test of difference = 0 (vs not =): T-Value = 0.68 P-Value = DF = H 0 : σ1 = σ. H 1 : σ1 σ. Test statistic: F = Upper critical F value is between and (Tech: ). P-value: Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim that the two populations have LDL levels with the same standard deviation. F-Test (Normal Distribution) Test statistic = 1.15, p-value = Cumulative Review Exercises 1. a. Because the sample data are matched with each column consisting of heights from the same family, the data are dependent. b. Mean: in.; median: in.; mode: 6. in.; range: 8.80 in.; standard deviation:.73 in.; variance: 7.43 in c. Ratio. There does not appear to be a correlation or association between the heights of mothers and the heights of their daughters. 69 Heights of Daughters (in.) Heights of Mothers (in.) in. < μ <65.76 in. We have 95% confidence that the limits of in. and in. actually contain the true value of the mean height of all adult daughters. Variable N Mean StDev SE Mean 95% CI Daughters (61.860, ) Copyright 014 Pearson Education, Inc.

160 158 Chapter 9: Inferences from Two Samples 4. H 0 : μ d = 0. H 1 : μd 0. Test statistic: t = Critical values: t =±.6. P-value > 0.0 (Tech: ). Fail to reject H 0. There is not sufficient evidence to warrant rejection of the claim of no significant difference between the heights of mothers and the heights of their daughters. Paired T for Heights of Mothers (in.) - Heights of Daughters (in.) T-Test of mean difference = 0 (vs not = 0): T-Value = 0.8 P-Value = Because the points lie reasonably close to a straight-line pattern and there is no other pattern that is not a straight-line pattern and there are no outliers, the sample data appear to be from a population with a normal distribution < p1 < Because the entire range of values in the confidence interval lies below 0.0, the results do justify the statement that fewer than 0% of Americans choose their computer and/or Internet access when identifying what they miss most when electrical power is lost. Sample X N Sample p 95% CI ( , ) 7. No. Because the Internet users chose to respond, we have a voluntary response sample, so the results are not necessarily valid. [ z ] ˆˆ [ ] α / pq.17 ( 0.5) 8. n = = = 944. The survey should not be conducted using only local phone E 0.0 numbers. Such a convenience sample could easily lead to results that are dramatically different from results that would be obtained by randomly selecting respondents from the entire population, not just those having local phone numbers a. z = = 1.5; P( z> 1.5) = b z = = 3; P( z> 1.5) = c. 80th percentile: x = μ + z σ = = cm. 10. No. Because the states have different population sizes, the mean cannot be found by adding the 50 state means and dividing the total by 50. The mean income for the U.S. population can be found by using a weighted mean that incorporates the population size of each state. Copyright 014 Pearson Education, Inc.

161 Chapter 10: Correlation and Regression Section 10- Chapter 10: Correlation and Regression r represents the value of the linear correlation computed by using the paired sample data. ρ represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population. The value of r is estimated to be 0 (because there is no correlation between sunspot numbers and the Dow Jones Industrial Average).. No. The value of r = 0 suggests that there is no linear relationship, but there might be some other relationship that is not linear in the sense that the pattern of points in the scatterplot is not a straight-line pattern. 3. The headline is not justified because it states that increased salt consumption is the cause of higher blood pressure levels, but the presence of a correlation between two variables does not necessarily imply that one is the cause of the other. Correlation does not imply causality. A correct headline would be this: Study Shows That Increased Salt Consumption Is Associated with Higher Blood Pressure. 4. Table A-6 shows that the critical values of r are ± 0.31 (assuming a 0.05 significance level), so there is sufficient evidence to support a claim of a linear correlation between the before and after weights. The value of r does not indicate that the diet is effective in reducing weight. While the diet might be effective in reducing weight, there could be a linear correlation if the diet has no effect so that the before and after weights are about the same, or there could be a linear correlation if the diet causes people to gain weight. 5. H 0 : ρ = 0. H 1 : ρ 0; Yes. With r = and critical values of ± 0.31, there is sufficient evidence to support the claim that there is a linear correlation between the durations of eruptions and the time intervals to the next eruptions. 6. H 0 : ρ = 0. H 1 : ρ 0; No. With r = and critical values of ± 0.31, there is not sufficient evidence to support the claim that there is a linear correlation between the durations of eruptions and the heights of eruptions. 7. H 0 : ρ = 0. H 1 : ρ 0; No. With r = and a P-value of (or critical values of ± 0.63 ), there is not sufficient evidence to support the claim that there is a linear correlation between the heights of fathers and the heights of their sons. 8. H 0 : ρ = 0. H 1 : ρ 0; Yes. With r = and critical values of ± 0.497, there is sufficient evidence to support the claim that there is a linear correlation between calories and sugar in a gram of cereal. 9. a. Copyright 014 Pearson Education, Inc.

162 160 Chapter 10: Correlation and Regression 9. (continued) b. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim of a linear correlation between the two variables. Pearson correlation of x and y = P-Value = 0.00 c. The scatterplot reveals a distinct pattern that is not a straight line pattern. 10. a y x b. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim of a linear correlation between the two variables. Pearson correlation of x and y = P-Value = 0.00 c. The scatterplot reveals a perfect straight-line pattern, except for the presence of one outlier. 11. a. There appears to be a linear correlation. b. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± 0.63 (for a 0.05 significance level). There is a linear correlation. Pearson correlation of x and y = P-Value = c. H 0 : ρ = 0. H 1 : ρ 0; r = 0. Critical values: r =± (for a 0.05 significance level). There does not appear to be a linear correlation. Pearson correlation of x and y = P-Value = d. The effect from a single pair of values can be very substantial, and it can change the conclusion. 1. a. There does not appear to be a linear correlation. b. There does not appear to be a linear correlation. c. H 0 : ρ = 0. H 1 : ρ 0; r = 0. Critical values: r =± (for a 0.05 significance level). There does not appear to be a linear correlation. The same results are obtained with the four points in the upper right corner. Pearson correlation of x and y = P-Value = Copyright 014 Pearson Education, Inc.

163 Chapter 10: Correlation and Regression (continued) d. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± (for a 0.05 significance level). There is a linear correlation. Pearson correlation of x and y = P-Value = e. There are two different populations that should be considered separately. It is misleading to use the combined data from women and men and conclude that there is a relationship between x and y. 13. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates. The results do not suggest any cause-effect relationship between the two variables. Pearson correlation of Lemon Imports and Scatterplot of Fatality Rate vs Lemon Imports Fatality Rate = P-Value = Fatality Rate Lemon Imports H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim that there is a linear correlation between PSAT scores and SAT scores. Because the data are from a voluntary response sample, the results are very questionable. Pearson correlation of PSAT and SAT Scatterplot of SAT vs PSAT = P-Value = SAT PSAT Copyright 014 Pearson Education, Inc.

164 16 Chapter 10: Correlation and Regression 15. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim that there is a linear correlation between enrollment and burglaries. The results do not change if the actual enrollments are listed as 3,000, 31,000, 53,000, etc. Pearson correlation of Enrollment and Burglaries = P-Value = Scatterplot of Burglaries vs Enrollment Burglaries Enrollment H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between altitude and outside air temperature. The results do not change if the altitudes are converted to meters and the temperatures are converted to the Celsius scale. Pearson correlation of Alt and Temp Scatterplot of Temp vs Alt = P-Value = Temp Alt H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between court incomes and justice salaries. The correlation does not imply that court incomes directly affect justice salaries, but it does appear that justices might profit by levying larger fines, or perhaps justices with higher salaries impose larger fines. Pearson correlation of Income and Salary = P-Value = Salary Scatterplot of Salary vs Income Income Copyright 014 Pearson Education, Inc.

165 Chapter 10: Correlation and Regression H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between the opening bids suggested by the auctioneer and the final winning bids. Pearson correlation of Open and Win Scatterplot of Win vs Open = P-Value = Win Open H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between amounts of redshift and distances to clusters of galaxies. Because the linear correlation coefficient is 1.000, it appears that the distances can be directly computed from the amounts of redshift. Pearson correlation of Red and Dist Scatterplot of Dist vs Red = P-Value = Dist Red H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between weights and prices. The results do not necessarily apply to other populations of diamonds, such as those with different color and clarity ratings. Pearson correlation of Weight and Price Scatterplot of Price vs Weight = P-Value = Price Weight Copyright 014 Pearson Education, Inc.

166 164 Chapter 10: Correlation and Regression 1. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim of a linear correlation between the overhead width of a seal in a photograph and the weight of a seal. Pearson correlation of Width and Weight Scatterplot of Weight vs Width = P-Value = Wght Width H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim of a linear correlation between the repair costs from full-front crashes and full-rear crashes. Pearson correlation of Front and Rear Scatterplot of Rear vs Front = P-Value = Rear Front H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim of a linear correlation between the systolic blood pressure measurements of the right and left arm. Pearson correlation of Right Arm and Left Scatterplot of Left Arm vs Right Arm Arm = P-Value = Left Arm Right Arm Copyright 014 Pearson Education, Inc.

167 Chapter 10: Correlation and Regression H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim of a linear correlation between number of cricket chirps and temperature. Pearson correlation of Chirps and Temp Scatterplot of Temp( F) vs Chirps = P-Value = Temp( F) Chirps H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim that there is a linear correlation between prices of regular gas and prices of premium gas. Because there does not appear to be a linear correlation between prices of regular and premium gas, knowing the price of regular gas is not very helpful in getting a good sense for the price of premium gas. Pearson correlation of Reg and Prem Scatterplot of Prem vs Reg = P-Value = Prem Reg H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim that there is a linear correlation between prices of regular gas and prices of mid-grade gas. Because there does not appear to be a linear correlation between prices of regular and midgrade gas, knowing the price of regular gas is not very helpful in getting a good sense for the price of midgrade gas. Pearson correlation of Reg and Mid Scatterplot of Mid vs Reg = P-Value = Mid Reg Copyright 014 Pearson Education, Inc.

168 166 Chapter 10: Correlation and Regression 7. H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between diameters and circumferences. A scatterplot confirms that there is a linear association between diameters and volumes. Pearson correlation of Diam and Circ = P-Value = Scatterplot of Circum vs Diam Circum Diam H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between diameters and volumes. Although the results suggest that there is a linear correlation between diameters and volumes, the scatterplot suggests that there is a very strong correlation that is not linear. Pearson correlation of Diam and Vol = Scatterplot of Vol vs Diam P-Value = Vol Diam H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim of a linear correlation between IQ and brain volume. Pearson correlation of IQ and VOL = Scatterplot of VOL vs IQ P-Value = VOL IQ Copyright 014 Pearson Education, Inc.

169 Chapter 10: Correlation and Regression H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± 0.79 (approximately) (Tech: ± 0.85 ). P-value = There is sufficient evidence to support the claim of a linear correlation between departure delay times and arrival delay times. Pearson correlation of Dep Delay and Arr Scatterplot of Arr Delay vs Dep Delay Delay = P-Value = Arr Delay Dep Delay H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± 0.54 (approximately) (Tech: ±0.63). P-value = There is sufficient evidence to support the claim of a linear correlation between the numbers of words spoken by men and women who are in couple relationships. Pearson correlation of M1 and F1 = Scatterplot of M1 vs F1 P-Value = M F H 0 : ρ = 0. H 1 : ρ 0; r = Critical values: r =± P-value = There is not sufficient evidence to support the claim of a linear correlation between magnitudes of earthquakes and their depths. Pearson correlation of MAG and DEPTH Scatterplot of MAG vs DEPTH = P-Value = MAG DEPTH 15 0 Copyright 014 Pearson Education, Inc.

170 168 Chapter 10: Correlation and Regression 33. a. r = Pearson correlation of y and x = P-Value = b. r = Pearson correlation of y and x^ = P-Value = c. r = (largest) Pearson correlation of y and LOG(x) = P-Value = r =± t =±.485 =± t + n Section 10-3 d. r = Pearson correlation of y and SQRT(x) = P-Value = e. r = Pearson correlation of y and 1/x = P-Value = The symbol ŷ represents the predicted pulse rate. The predictor variable represents height. The response variable represents pulse rate.. The regression line has the property that the sum of squares of the residuals is the lowest possible sum (where a residual is the difference between an observed value of y and a predicted value of y). 3. If r is positive, the regression line has a positive slope and rises from left to right. If r is negative, the slope of the regression line is negative and it falls from left to right. 4. The first equation represents the regression line that best fits sample data, whereas the second equation represents the regression line that best fits all paired data in a population. 5. The regression line fits the points well, so the best predicted time for an interval after the eruption is y ˆ = ( 10) = 69 min. 6. The regression line does not fit the points well, so the best predicted height is y = 17. ft. 7. The regression line does not fit the points well, so the best predicted height is y = 68.0 in. 8. The regression line fits the points well, so the best predicted value is y ˆ = ( 0.40) = 3.86 calories. 9. yˆ = x. The data have a pattern that is not a straight line. Predictor Coef SE Coef T P Constant x Scatterplot of y vs x y x Copyright 014 Pearson Education, Inc.

171 Chapter 10: Correlation and Regression yˆ = x. There is an outlier. Predictor Coef SE Coef T P Constant x Scatterplot of y vs x y x a. yˆ = x Predictor Coef SE Coef T P Constant x b. yˆ = + 0x (or y ˆ = ) Predictor Coef SE Coef T P Constant x c. The results are very different, indicating that one point can dramatically affect the regression equation. 1. a. yˆ = x Predictor Coef SE Coef T P Constant x b. yˆ = x (or y ˆ = 1.5 ) Predictor Coef SE Coef T P Constant x c. yˆ = x (or y ˆ = 9.5 ) Predictor Coef SE Coef T P Constant x d. The results are very different, indicating that combinations of clusters can produce results that differ dramatically from results within each cluster alone. Copyright 014 Pearson Education, Inc.

172 170 Chapter 10: Correlation and Regression 13. yˆ = x; The regression line fits the points well, so the best predicted value is y ˆ = ( 500) = 15.1 fatalities per 100,000 population. Predictor Coef SE Coef T P Constant Lemon Scatterplot of Fatality Rate vs Lemon Imports Fatality Rate Lemon Imports Scatterplot of Residuals vs Lemon Imports Probability Plot of Residuals Normal RESI Probability Lemon Imports Residuals yˆ = x; The regression line does not fit the points well, so the best predicted value is y = 153. The result is not close to the actual reported value of 400. Because the data are from a voluntary response sample, the results have questionable validity. Predictor Coef SE Coef T P Scatterplot of SAT vs PSAT Constant PSAT SAT PSAT Copyright 014 Pearson Education, Inc.

173 Chapter 10: Correlation and Regression (continued) 00 Scatterplot of Residuals vs PSAT Probability Plot of Residuals Normal Residuals Probability PSAT Residuals yˆ = x; The regression line does not fit the points well, so the best predicted value is y = 87.7 burglaries. The predicted value is not close to the actual value of 39 burglaries. Predictor Coef SE Coef T P Scatterplot of Burglaries vs Enrollment Constant Enrollment Burglaries Enrollment Scatterplot of Residuals vs Enrollment Probability Plot of Residuals Normal Residuals Probability Enrollment Residuals Copyright 014 Pearson Education, Inc.

174 17 Chapter 10: Correlation and Regression 16. yˆ = x; The best predicted value is y ˆ = ( 6.37) = 49. F. The predicted value is close to the actual value of 48 F. Predictor Coef SE Coef T P Scatterplot of Temp( F) vs Altitude Constant Altitude Temp( F) Altitude Scatterplot of Residuals vs Altitude 0.99 Probability Plot of Residuals Normal Residuals Probability Altitude Residuals yˆ = x; The best predicted value is y ˆ = ( ) = 30.8, which represents $30,800. The predicted value is not very close to the actual salary of $6,088. The possible outliers might explain the inaccuracy. Predictor Coef SE Coef T P Scatterplot of Justice Salary vs Court Income Constant Income Justice Salary Court Income Copyright 014 Pearson Education, Inc.

175 Chapter 10: Correlation and Regression (continued) 5 0 Scatterplot of Residuals vs Court Income 0.99 Probability Plot of Residuals Normal Residuals Probability Court Income Residuals yˆ = x ; The best predicted value is y ˆ = ( 300 ) = $14. The predicted value is not very close to the actual winning bid of $50. The one influential outlier would account for this inaccuracy. Predictor Coef SE Coef T P Scatterplot of Winning Bid vs Opening Bid Constant Opening Bid Winning Bid Opening Bid Scatterplot of Residuals vs Opening Bid 0.99 Probability Plot of Residuals Normal Residuals Probability Opening Bid Residuals Copyright 014 Pearson Education, Inc.

176 174 Chapter 10: Correlation and Regression 19. yˆ = x ; The best predicted value is y ˆ = ( 0.016) = 0.17 billion lightyears. The predicted value is very close to the actual distance of 0.18 light-years. Predictor Coef SE Coef T P Constant Redshift Distance Scatterplot of Distance vs Redshift Redshift Scatterplot of Residuals vs Redshift Probability Plot of Residuals Normal Residuals Probability Redshift Residuals yˆ = x ; best predicted value is y ˆ = ( 1.5 ) = $8760. (Tech: $8759). The predicted value is far from the actual price of $16,097. The weight of 1.50 carats is well beyond the scope of the available sample weights, so the extrapolation might be off by a considerable amount. Predictor Coef SE Coef T P Scatterplot of Price vs Weight Constant Weight Price Weight Copyright 014 Pearson Education, Inc.

177 Chapter 10: Correlation and Regression (continued) 500 Scatterplot of Residuals vs Weight Probability Plot of Residuals Normal Residuals Probability Weight Residuals yˆ = x; The best predicted weight is y ˆ = ( ) = 76.6 kg. (Tech: 76.5 kg). That prediction is a negative weight that cannot be correct. The overhead width of cm is well beyond the scope of the available sample widths, so the extrapolation might be off by a considerable amount. Predictor Coef SE Coef T P Scatterplot of Weight vs Width Constant Width Weight Width Scatterplot of Residuals vs Width Probability Plot of Residuals Normal Residuals Probability Width Residuals Copyright 014 Pearson Education, Inc.

178 176 Chapter 10: Correlation and Regression. yˆ = x; The regression line does not fit the data well, so the best predicted cost is y = $1615. The predicted cost of $1615 is very different from the actual cost of $98. Predictor Coef SE Coef T P Scatterplot of Rear vs Front Constant Front Rear Front Scatterplot of Residuals vs Front Probability Plot of Residuals Normal Residuals Probability Front Residuals yˆ = x ; The regression line does not fit the data well, so the best predicted value is y = 163. mm Hg. Predictor Coef SE Coef T P Constant Right Arm Scatterplot of Left Arm vs Right Arm 170 Left Arm Right Arm Copyright 014 Pearson Education, Inc.

179 Chapter 10: Correlation and Regression (continued) 15 Scatterplot of Residuals vs Right Arm Probability Plot of Residuals Normal Residuals Probability Right Arm Residuals yˆ = x; best predicted value is y ˆ = ( 3000) =185 F (Tech: 184 F). The value of 3000 chirps in 1 minute is well beyond the scope of the available sample data, so the extrapolation might be off by a considerable amount. Predictor Coef SE Coef T P Scatterplot of Temp( F) vs Chirps Constant Chirps Temp( F) Chirps Scatterplot of Residuals vs Chirps Probability Plot of Residuals Normal Residuals Probability Chirps Residuals 5 10 Copyright 014 Pearson Education, Inc.

180 178 Chapter 10: Correlation and Regression 5. yˆ = x; The regression line does not fit the data well, so the best predicted value is y = $3.05. The predicted price is not very close to the actual price of $.93. Predictor Coef SE Coef T P Scatterplot of Premium vs Regular Constant Regular Premium Regular Scatterplot of Residuals vs Regular Probability Plot of Residuals Normal 0.99 Residuals Probability Regular Residuals yˆ = x; The regression line does not fit the data well, so the best predicted value is y = $.91. The predicted price is not too far from the actual price. Predictor Coef SE Coef T P Constant Regular Scatterplot of Mid-Grade vs Regular Mid-Grade Regular Copyright 014 Pearson Education, Inc.

181 Chapter 10: Correlation and Regression (continued) 0.10 Scatterplot of Residuals vs Regular 0.99 Probability Plot of Residuals Normal Residuals Probability Regular Residuals yˆ = x; The best predicted value is y ˆ = ( 1.50) = 4.7 cm. Even though the diameter of 1.50 cm is beyond the scope of the sample diameters, the predicted value yields the actual circumference. Predictor Coef SE Coef T P Scatterplot of Circumference vs Diameter Constant Diameter Circumference Diameter Scatterplot of Residuals vs Diameter 0.99 Probability Plot of Residuals Normal Residuals Probability Diameter Residuals Copyright 014 Pearson Education, Inc.

182 180 Chapter 10: Correlation and Regression 3 8. yˆ = x ; The best predicted value is y ˆ = ( 1.50) = cm (Tech: cm 3 ). The predicted value is negative and is far from the actual volume of 1.8 cm 3. The diameter of 1.50 cm is beyond the scope of the sample diameters, and the predicted value is way wrong. The scatterplot and residual plot suggest that a nonlinear model would yield better results. Predictor Coef SE Coef T P Scatterplot of Volume vs Diameter Constant Diameter Volume Diameter Scatterplot of Residuals vs Diameter Probability Plot of Residuals Normal Residuals Probability Diameter Residuals yˆ = x; The regression line does not fit the data well, so the best predicted IQ score is y = 101. Predictor Coef SE Coef T P Constant VOLUME Scatterplot of IQ vs VOLUME 110 IQ VOLUME Copyright 014 Pearson Education, Inc.

183 Chapter 10: Correlation and Regression (continued) 30 Scatterplot of Residuals vs VOLUME Probability Plot of Residuals Normal Residuals Probability VOLUME Residuals yˆ = x; best predicted arrival delay time is y ˆ = ( 0) = 18.4 minutes. That is, if a flight has no departure delay, we can predict that the flight will arrive 18.4 minutes early. Predictor Coef SE Coef T P Scatterplot of Arr Delay vs Dep Delay Constant Dep Delay Arr Delay Dep Delay Scatterplot of Residuals vs Dep Delay Probability Plot of Residuals Normal Residuals Probability Dep Delay Residuals Copyright 014 Pearson Education, Inc.

184 18 Chapter 10: Correlation and Regression 31. yˆ = 13, x; The best predicted value is y ˆ = 13, ( 10,000) = 16,400 (Tech: 16,458). Predictor Coef SE Coef T P Constant M Scatterplot of F1 vs M1 F M Scatterplot of Residuals vs M1 Probability Plot of Residuals Normal Residuals Probability M Residuals yˆ = x; The regression line does not fit the data well, so the best predicted value is y = 9.81 km. Predictor Coef SE Coef T P Constant MAG Scatterplot of DEPTH vs MAG DEPTH MAG Copyright 014 Pearson Education, Inc.

185 Chapter 10: Correlation and Regression (continued) 10 Scatterplot of Residuals vs MAG Probability Plot of Residuals Normal Residuals 0-5 Probability MAG Residuals With β 1 = 0, the regression line is horizontal so that different values of x result in the same y value, and there is no correlation between x and y. 34. a b. The sum of squares of the residuals is 101.3, which is larger than yˆ = x ŷ y ( ŷ y) yˆ = x ŷ y ( ŷ y) Section Sum Sum The value of s e = cm is the standard error of estimate, which is a measure of the differences between the observed weights and the weights predicted from the regression equation. It is a measure of the variation of the sample points about the regression line.. We have 95% confidence that the limits of 50.7 kg and 13.0 kg contain the value of the weight for a male with a height of 180 cm. The major advantage of using a prediction interval is that it provides us with a range of likely weights, so we have a sense of how accurate the predicted weight is likely to be. The terminology of prediction interval is used for an interval estimate of a variable, whereas the terminology of confidence interval is used for an interval estimate of a population parameter. 3. The coefficient of determination is r = ( 0.356) = We know that 1.7% of the variation in weight is explained by the linear correlation between height and weight, and 87.3% of the variation in weight is explained by other factors and/or random variation. 4. For the paired weights, s e = 0 because there is an exact conversion formula. For a textbook that weighs lb, the predicted weight is =.04 kg, and there is no prediction interval because the conversion.05 yields an exact result. 5. r = ( 0.933) = % of the variation in waist size is explained by the linear correlation between weight and waist size, and 13.0% of the variation in waist size is explained by other factors and/or random variation. Copyright 014 Pearson Education, Inc.

186 184 Chapter 10: Correlation and Regression 6. r = ( 0.963) = % of the variation in weight is explained by the linear correlation between chest size and weight, and 7.3% of the variation in weight is explained by other factors and/or random variation. 7. r = ( 0.793) = % of the variation in highway fuel consumption is explained by the linear correlation between weight and highway fuel consumption, and 37.1% of the variation in highway fuel consumption is explained by other factors and/or random variation. 8. r = ( 0.751) = % of the variation in household size is explained by the linear correlation between weight of discarded plastic and household size, and 43.6% of the variation in household size is explained by other factors and/or random variation. 9. r = Critical values: r =± 0.31 (assuming a 0.05 significance level). P-value = There is sufficient evidence to support a claim of a linear correlation between foot length and height. 10. r = ( 0.84) = % of the variation in height is explained by the linear correlation between foot length and height. 11. y ˆ = ( 9.0 ) =189 cm cm < y < 00 cm. We have 95% confidence that the limits of 177 cm and 00 cm contain the height of someone with a foot length of 9.0 cm cm < y < 183 cm yˆ = ( 5) = n( x0 x) 1 40( ) E = tα /se 1+ + =.04( ) 1+ + = n n Σx Σx 40 40( ) ( 107.) ( ) ( ) cm < y < 186 cm (Tech: 156 cm < y < 187 cm ) yˆ = ( 5) = n( x0 x) 1 40( ) E = tα /se 1+ + =.71( ) 1+ + = n n Σx Σx 40 40( ) ( 107.) cm < y < 168 cm ( ) ( ) yˆ = ( ) = n( x0 x) 1 40( 5.68) E = tα /se 1+ + = 1.686( ) 1+ + = n n Σx Σx 40 40( ) ( 107.) cm < y < 187 cm ( ) ( ) yˆ = ( 6) = n( x0 x) 1 40( ) E = tα /se 1+ + =.04( ) 1+ + = 1.89 n n Σx Σx 40 40( ) ( 107.) ( ) ( ) Copyright 014 Pearson Education, Inc.

187 Chapter 10: Correlation and Regression a. 10,66.59 b c F < y < 60.4 F Analysis of Variance Source DF SS MS F P Regression Residual Error Total Predicted Values for New Observations Obs Fit SE Fit 95% CI 95% PI (43.6, 55.1) (37.96, 60.4) 18. a b c. $10,400 < y < $105,000 Analysis of Variance Source DF SS MS F P Regression Residual Error Total Predicted Values for New Observations Obs Fit SE Fit 99% CI 99% PI (39.75, 75.31) (10.43, ) 19. a b c billion light-years < y < billion light-years Analysis of Variance Source DF SS MS F P Regression Residual Error Total Predicted Values for New Observations Obs Fit SE Fit 90% CI 90% PI ( , ) ( , ) 0. a. 16,139,685 b. 1,097,655 c. $051 < y < $5419 Analysis of Variance Source DF SS MS F P Regression Residual Error Total Predicted Values for New Observations Obs Fit SE Fit 95% CI 95% PI (886, 4583) (051, 5419) Copyright 014 Pearson Education, Inc.

188 186 Chapter 10: Correlation and Regression < β0 < 103 ;.46 < β1 < 3.98 CI for β 0 b E< β < b + E x b = 80.93; E = t s + =.04( ) + =.06 0 α / e ( Σ ) n x Σx n cm < y < 176 cm Section 10-5 CI for β 1 b E< β < b + E se b1 = 3.186; E = tα / =.04 = ( Σx) Σx n 40 yˆ E< y < yˆ+ E yˆ = ( 9) = n( x0 x) 1 40( 9 9.0) E = tα /se + =.04 ( ) + = 1.90 n n Σx Σx 40 40( 33933) ( ) ( ) ( ) 1. The response variable is weight and the predictor variables are length and chest size.. No, it is not better to use the regression equation with the three predictor variables of length, chest size, and neck size. The adjusted R value of 0.95 is just a little less than 0.933, so in this case it is better to use two predictor variables instead of three. 3. The unadjusted R increases (or remains the same) as more variables are included, but the adjusted R is adjusted for the number of variables and sample size. The unadjusted R incorrectly suggests that the best multiple regression equation is obtained by including all of the available variables, but by taking into account the sample size and number of predictor variables, the adjusted R is much more helpful in weeding out variables that should not be included % of the variation in weights of bears can be explained by the variables of length and chest size, so 7.% of the variation in weights can be explained by other factors and/or random variation. 5. LDL = WT SYS. 6. a b. 9.8%, or c. 4.9%, or No. The P-value of is not very low, and the values of R (0.098) and adjusted R (0.049) are not high. Although the multiple regression equation fits the sample data best, it is not a good fit. 8. Predicted LDL = ( 59.3) ( 1) = 113 mg/dl. This result is not likely to be a good predicted value because the multiple regression equation is not a good model (based on the results from Exercise 7). 9. HWY (highway fuel consumption) because it has the best combination of small P-value (0.000) and highest adjusted R (0.90). Copyright 014 Pearson Education, Inc.

189 Chapter 10: Correlation and Regression WT (weight) and HWY (highway fuel consumption) because they have the best combination of small P- value (0.000) and highest adjusted R (0.935). 11. CITY = HWY. That equation has a low P-value of and its adjusted R value of 0.90 isn t very much less than the values of 0.98 and that use two predictor variables, so in this case it is better to use the one predictor variable instead of two. 1. Predicted city fuel consumption is CITY = ( 36) = 6.3 mi/gal (based on the result from Exercise 11). The predicted value is a good estimate, but it might not be very accurate because the sample consists of only 1 cars. 13. The best regression equation is yˆ = x x, where x 1 represents tar and x represents carbon monoxide. It is best because it has the highest adjusted R value of 0.97 and the lowest P-value of It is a good regression equation for predicting nicotine content because it has a high value of adjusted R and a low P-value. Predictor Coef SE Coef T P Constant Tar S = R-Sq = 88.% R-Sq(adj) = 87.7% Predictor Coef SE Coef T P Constant CO S = R-Sq = 46.0% R-Sq(adj) = 43.7% Predictor Coef SE Coef T P Constant Tar CO S = R-Sq = 93.3% R-Sq(adj) = 9.7% 14. The best regression equation is yˆ = x x, where x 1 represents tar and x represents carbon monoxide. It is best because it has the highest adjusted R value of and the lowest P-value of It is a good regression equation for predicting nicotine content because it has a high value of adjusted R and a low P-value. Predictor Coef SE Coef T P Constant Menth Tar S = R-Sq = 76.% R-Sq(adj) = 75.% Predictor Coef SE Coef T P Constant Menth CO S = R-Sq = 31.3% R-Sq(adj) = 8.3% Predictor Coef SE Coef T P Constant Menth Tar Menth CO S = R-Sq = 91.5% R-Sq(adj) = 90.8% Copyright 014 Pearson Education, Inc.

190 188 Chapter 10: Correlation and Regression 15. The best regression equation is yˆ = x1, where x 1 represents volume. It is best because it has the highest adjusted R value of and the lowest P-value of The three regression equations all have adjusted values of R that are very close to 0, so none of them are good for predicting IQ. It does not appear that people with larger brains have higher IQ scores. Predictor Coef SE Coef T P Constant VOL S = R-Sq = 0.4% R-Sq(adj) = 0.0% Predictor Coef SE Coef T P Constant WT S = R-Sq = 0.0% R-Sq(adj) = 0.0% Predictor Coef SE Coef T P Constant VOL WT S = R-Sq = 0.4% R-Sq(adj) = 0.0% 16. The best regression equation is yˆ = x x, where x 1 represents verbal IQ score and R value of and the x represents performance IQ score. It is best because it has the highest adjusted lowest P-value of Because the adjusted very accurate. Predictor Coef S E Coef T P Constant IQV S = R-Sq = 76.4% R-Sq(adj) = 76.% Predictor Coef SE Coef T P Constant IQP S = R-Sq = 81.6% R-Sq(adj) = 81.4% Predictor Coef SE Coef T P Constant IQV IQP S = R-Sq = 99.9% R-Sq(adj) = 99.9% R is so close to 1, it is likely that predicted values will be For H 0 : β 1 = 0, the test statistic is t = = 5.486, the P-value is 0.000, and the critical values are t =±.110, so reject H 0 and conclude that the regression coefficient of b 1 = should be kept. For H 0 : β = 0, the test statistic is t = = 1.9, the P-value is 0.13, and the critical values are t =±.110, so fail to reject H 0 and conclude that the regression coefficient of b = should be omitted. It appears that the regression equation should include the height of the mother as a predictor variable, but the height of the father should be omitted. Copyright 014 Pearson Education, Inc.

191 1 Chapter 10: Correlation and Regression < β1 < ; < β < The confidence interval for β includes 0, suggesting that the father s height be eliminated as a predictor variable. CI for β 1 b1 E< β1 < b1+ E b1 tα/s1 < β1 < b1 + tα/s < β1 < < β < CI for β b E< β < b + E b tα/s < β < b + tα/s < β < < β < yˆ = x1+.91x, where x 1 represents sex and x represents age. Female: y ˆ = ( 0) +.91( 0) = 61 lb ; male: y ˆ = () ( 0) = 144 lb. The sex of the bear does appear to have an effect on its weight. The regression equation indicates that the predicted weight of a male bear is about 8 lb more than the predicted weight of a female bear with other characteristics being the same. Predictor Coef SE Coef T P Constant SEX AGE Section Since the area of a square is the square of its side, the best model is y = x ; quadratic; R = 1. Quadratic is best because it has the highest R value, but this is not a good model because the value of R is so low. Using the models discussed in this section, it appears that we cannot make accurate predictions of the numbers of points scored in future Super Bowl games. Common sense suggests that no such model could be found % of the variation in Super Bowl points can be explained by the quadratic model that relates the variable of year and the variable of points scored. Because such a small percentage of the variation is explained by the model, the model is not very useful. 4. Instead of showing a pattern that approximates the graph of the quadratic equation, the points are scattered about with no obvious pattern. The points do not fit the graph of the quadratic equation well, so the value of R = is very low. Copyright 014 Pearson Education, Inc.

192 190 Chapter 10: Correlation and Regression 5. Quadratic: d = 4.88t t+ 300 Model R Linear 0.96 Quadratic Logarithmic Exponential Power y = x x R = 1 6. Power: y= 35x 3 Model R Linear 0.99 Quadratic Logarithmic 0.88 Exponential Power y = x 3 R = 1 7. Exponential: y = 100( 1.03 x ) The value of R is slightly higher for the exponential model. Model R Linear Quadratic Logarithmic Exponential Power y = 100(1.03) x R = 1 8. Logarithmic: y= ln x Model R Linear Quadratic Logarithmic Exponential Power y = 4.346Ln(x) R = 1 Copyright 014 Pearson Education, Inc.

193 Chapter 10: Correlation and Regression Power: y= 65.7x. Prediction for the nd day: y = 65.7( ) = $3.5 million, which isn t very close to the actual amount of $. million. The model does not take into account the fact that movies do better on weekend days. Model R Linear 0.56 Quadratic Logarithmic 0.80 Exponential 0.79 y = 65.7x Power 0.84 R = Quadratic: y x x = ,084. (with 1975 coded as 1). Projected value for 00: y = 33( 10) ( 10) + 45, 084 = 6,434 (Tech: 6,454). Model R Linear Quadratic 0.65 Logarithmic Exponential Power y = -3.77x x R = Logarithmic: y= ln x Model R Linear 0.60 Quadratic Logarithmic Exponential Power y = 0.933Ln(x) R = Copyright 014 Pearson Education, Inc.

194 19 Chapter 10: Correlation and Regression 1. Power: y= x Model R Linear Quadratic Logarithmic Exponential Power y = x R = Exponential: y = 10( x ) Model R Linear Quadratic Logarithmic Exponential Power 0.97 y = 10() x R = Exponential: y = 115( 0.938) x. (Result is based on 1980 coded as 1.) With R = 0.53, this best model is not a good model. There is a cyclical pattern that does not fit any of the five models included in this section. Model R Linear 0.18 Quadratic 0.18 Logarithmic 0.5 Exponential 0.53 Power 0.33 y = (0.938) x R = Copyright 014 Pearson Education, Inc.

195 Chapter 10: Correlation and Regression Quadratic: = The projected value for 010 is ( ) ( ) y x x y = = 49,344 (Tech: 49,31), which is dramatically greater than the actual value of 11,655. Model R Linear Quadratic Logarithmic Exponential Power y = 14.93x x R = Quadratic: y x x = The projected value for 010 is y = 31.4( 1) + 133( 1) + 80 = 1,36 (Tech: 1,345), which is not dramatically different from the actual value of 11,655. Model R Linear Quadratic Logarithmic 0.8 Exponential Power y = x x R = ( 1) 17. a. Exponential: y x = [or y = ( ) x for an initial value of 1 that doubles every 1.5 years]. b. Exponential: y = ( ) x, where 1971 is coded as 1. Model R Linear Quadratic 0.55 Logarithmic Exponential Power y = (1.4774) x R = c. Moore s law does appear to be working reasonably well. With very good. R = 0.990, the model appears to be Copyright 014 Pearson Education, Inc.

196 194 Chapter 10: Correlation and Regression 18. a b. 73. c. The quadratic sum of squares of residuals (73.) is less than the sum of squares of residuals from the linear model (6641.8). Chapter Quick Quiz 1. r =± Based on the critical values of ± (assuming a 0.05 significance level), conclude that there is not sufficient evidence to support the claim of a linear correlation between systolic and diastolic readings. 3. The best predicted diastolic reading is 90.6, which is the mean of the five sample diastolic readings. 4. The best predicted diastolic reading is y ˆ = ( 15) = 85.3, which is found by substituting 15 for x in the regression equation. 5. r = False; there could be another relationship. 7. False, correlation does not imply causation. 8. r = 1 9. Because r must be between 1 and 1 inclusive, the value of is the result of an error in the calculations. 10. r = 1 Review Exercises 1. a. r = Critical values: r =± (assuming a 0.05 significance level). P-value = There is sufficient evidence to support the claim that there is a linear correlation between duration and interval-after time. Pearson correlation of After and Duration = 0.96 P-Value = b. r = ( 0.96) = = 85.7% c. yˆ = x Predictor Coef SE Coef T P Constant Duration Scatterplot of After vs Duration After Duration Copyright 014 Pearson Education, Inc.

197 Chapter 10: Correlation and Regression (continued) Scatterplot of Residuals vs Duration Probability Plot of Residuals Normal Residuals Probability Duration Residuals 5 10 d. y ˆ = ( 00) = 81.6 min. a. The scatterplot suggests that there is not sufficient sample evidence to support the claim of a linear correlation between heights of eruptions and interval-after times. 100 Scatterplot of After vs Height 90 After Height b. r = Critical values: r =± (assuming a 0.05 significance level). P-value = There is not sufficient evidence to support the claim that there is a linear correlation between height and interval-after time. Pearson correlation of Height and After = 0.69 P-Value = c. yˆ = x Predictor Coef SE Coef T P Constant Height y ˆ = = 78.9 min d. ( ) Copyright 014 Pearson Education, Inc.

198 196 Chapter 10: Correlation and Regression 3. a. The scatterplot suggests that there is not sufficient sample evidence to support the claim of a linear correlation between duration and height. 150 Scatterplot of Height vs Duration 140 Height Duration b. r = Critical values: r =± (assuming a 0.05 significance level). P-value = There is not sufficient evidence to support the claim that there is a linear correlation between duration and height. Pearson correlation of Height and Duration = P-Value = c. yˆ = x Predictor Coef SE Coef T P Constant Duration d. The regression line does not fit the points well, so the best predicted height y = 18.8 ft. 4. r = Critical values: r =± 0.63 (assuming a 0.05 significance level). P-value = There is not sufficient evidence to support the claim that there is a linear correlation between time and height. Although there is no linear correlation between time and height, the scatterplot shows a very distinct pattern revealing that time and height are associated by some function that is not linear. Pearson correlation of Height(m) and Time(sec) = P-Value = Scatterplot of Height(m) vs Time(sec) Height(m) Time(sec) AFTER = Duration BEFORE, or yˆ = x x. R = 0.87; adjusted R = 0.80; P-value = With high values of R and adjusted R and a small P-value of 0.006, it appears that the regression equation can be used to predict the time interval after an eruption given the duration of the eruption and the time interval before that eruption. Predictor Coef SE Coef T P Constant Duration Before S = R-Sq = 87.% R-Sq(adj) = 8.0% Copyright 014 Pearson Education, Inc.

199 Chapter 10: Correlation and Regression 197 Cumulative Review Exercises 1. x = 3.3 lb, s = 5.7 lb The highest weight before the diet is 1 lb, which converts to z = = The highest weight 1.0 is not unusual because its z score of 1.55 shows that it is within standard deviations of the mean. 3. H : 0 0 μ d =. H : 0 0 μ d >. Test statistic: t = Critical value: t = P-value > 0.05 (Tech: 0.075). Fail to reject H 0. There is not sufficient evidence to support the claim that the diet is effective. Paired T for Before - After 95% lower bound for mean difference: T-Test of mean difference = 0 (vs > 0): T-Value = 1.61 P-Value = lb < μ < lb. We have 95% confidence that the interval limits of lb and lb contain the true value of the mean of the population of all subjects before the diet. Variable N Mean StDev SE Mean 95% CI Before (161.79, ) 5. a. r = Critical values: r =± (assuming a 0.05 significance level). P-value = There is sufficient evidence to support the claim that there is a linear correlation between before and after weights. Pearson correlation of Before and After = P-Value = b. r = 1 c. r = 1 d. The effectiveness of the diet is determined by the amounts of weight lost, but the linear correlation coefficient is not sensitive to different amounts of weight loss. Correlation is not a suitable tool for testing the effectiveness of the diet. 6. a z = = 0.16; P( z> 0.16) = 43.64%. (Tech: 43.58%) 495 b. 10th percentile: x = μ + z σ = = g (Tech: g) c z = = 1.96; P( z< 1.96) = z = = 1.96; P( z> 1.96) = = = 5.00%. Yes, many of the babies do require special treatment. 7. a. H 0 : p = 0.5. H 1 : p > 0.5. Test statistic: z = Critical value: z = P-value: Reject H 0. There is sufficient evidence to support the claim that the majority of us say that honesty is always the best policy. Test of p = 0.5 vs p > % Lower Sample X N Sample p Bound Z-Value P-Value b. The sample is a voluntary response (or self-selected) sample. This type of sample suggests that the results given in part (a) are not necessarily valid. Copyright 014 Pearson Education, Inc.

200 198 Chapter 10: Correlation and Regression 8. a. Nominal b. Ratio c. Discrete d = e. Parameter a. b. c. d. 304 = = = = % 59 = = Number Protestant Catholic Jewish Mormon Members of Congress Other Copyright 014 Pearson Education, Inc.

201 Chapter 11: Goodness-of-Fit and Contingency Tables 199 Chapter 11: Goodness-of-Fit and Contingency Tables Section The test is to determine whether the observed frequency counts agree with the claimed uniform distribution so that frequencies for the different days are equally likely E =, or , for each of the seven days of the week. For Sunday, O = 53 and E = Because the given frequencies differ substantially from frequencies that are all about the same, the χ test statistic should be large and the P-value should be small. 4. df = 6. Critical value: 5. Test statistic: χ = χ = Critical value: χ = P-value = There is sufficient evidence to warrant rejection of the claim that the days of the week are selected with a uniform distribution with all days having the same chance of being selected. 6. Test statistic: χ = 6.6. Critical value: χ = P-value = There is not sufficient evidence to support the claim that the sample is from a population of heights in which the last digits do not occur with the same frequency. 7. Critical value: χ = P-value > 0.10 (Tech: 0.516). There is not sufficient evidence to warrant rejection of the claim that the observed outcomes agree with the expected frequencies. The slot machine appears to be functioning as expected. 8. Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: P-value = 0.04). There is not sufficient evidence to warrant rejection of the claim that the tires selected by the students are equally likely. It appears that students do not have the ability to select the same tire. Tire O E O E ( O E) ( O E) E Left Front = Right Front = Left Rear = Right Rear = Sum Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.497). There is not sufficient evidence to warrant rejection of the claim that homicides in New York City are equally likely for each of the 1 months. There is not sufficient evidence to support the police commissioner s claim that homicides occur more often in the summer when the weather is better. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that American born major league baseball players are born in different months with the same frequency. The sample data appear to support Gladwell s claim. N DF Chi-Sq P-Value Copyright 014 Pearson Education, Inc.

202 00 Chapter 11: Goodness-of-Fit and Contingency Tables 11. Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: P-value = 0.30). There is not sufficient evidence to support the claim that the outcomes are not equally likely. The outcomes appear to be equally likely, so the loaded die does not appear to behave differently from a fair die. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < 0.01 (Tech: ). There is sufficient evidence to warrant rejection of the claim that births occur on the days of the week with equal frequency. Because many births are induced or involve Caesarean section, they are scheduled for days other than Saturday or Sunday, so those two days have smaller numbers of births. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.14). There is not sufficient evidence to warrant rejection of the claim that the likelihood of winning is the same for the different post positions. Based on these results, post position should not be considered when betting on the Kentucky Derby race. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.53). There is not sufficient evidence to warrant rejection of the claim that the digits are selected in a way that they are equally likely. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that the different days of the week have the same frequencies of police calls. The highest numbers of calls appear to fall on Friday and Saturday, and these are weekend days with disproportionately more partying and drinking. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that the different days of the week have the same frequencies of police calls. Because March has 31 days, three of the days of the week occur more often than the other days of the week, so the comparison does not make sense with the given data. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.05 (Tech: 0.056). There is not sufficient evidence to warrant rejection of the claim that the actual numbers of games fit the distribution indicated by the proportions listed in the given table. Copyright 014 Pearson Education, Inc.

203 Chapter 11: Goodness-of-Fit and Contingency Tables (continued) Games Played O E O E ( O E) ( O E) E = = = = Sum Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.467). There is not sufficient evidence to warrant rejection of the claim that the actual eliminations agree with the expected numbers. The leadoff singers do appear to be at a disadvantage because 0 of them were eliminated compared to the expected value of 1.9 eliminations, but that result is not significant in the context of the available sample data. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = (assuming a 0.05 significance level). P-value > 0.10 (Tech: 0.45). There is not sufficient evidence to warrant rejection of the claim that the color distribution is as claimed. Color O E O E ( O E) ( O E) E Red = Orange = Yellow = Brown = Blue = Green = Sum Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.913). There is not sufficient evidence to warrant rejection of the claim that the actual frequencies fit a Poisson distribution. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford s law. It does appear that the checks are the result of fraud (although the results cannot confirm that fraud is the cause of the discrepancy between the observed results and the expected results). N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.05 (Tech: 0.071). There is not sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford s law. The author s check amounts appear to be legitimate. Copyright 014 Pearson Education, Inc.

204 0 Chapter 11: Goodness-of-Fit and Contingency Tables. (continued) N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.988). There is not sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford s law. The tax entries do appear to be legitimate. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.45). There is not sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford s law. N DF Chi-Sq P-Value a. 6, 13, 15, 6 b z = = 1; P( z< 1) = ; z = = 0; P( 1< z< 0) = = ; z = = 1; P( 0 < z< 1) = = : z = = 1; P( z> 1) = (Tech: , , , ) c = 6.348, = 13.65,, = 13.65, = (Tech: 6.348, 13.65, , 6.344) d. Test statistic: χ = 0.0 (Tech: 0.01). Critical value: χ = P-value > 0.10 (Tech: 0.977). There is not sufficient evidence to warrant rejection of the claim that heights were randomly selected from a normally distributed population. The test suggests that the data are from a normally distributed population. Height O E O E ( O E) ( O E) E Less than Greater than Sum N DF Chi-Sq P-Value Copyright 014 Pearson Education, Inc.

205 Section 11-3 Chapter 11: Goodness-of-Fit and Contingency Tables Because the P-value of 0.16 is not small (such as 0.05 or lower), fail to reject the null hypothesis of independence between the treatment and whether the subject stops smoking. This suggests that the choice of treatment doesn t appear to make much of a difference.. In this context, the word contingency refers to a dependency of one variable on another, and we use a test of independence between the row variable and the column variable to determine whether one variable appears to be contingent on the other. We use the terminology of two-way table because the frequency counts are arranged in a table with two variables: the row variable and the column variable. 3. df = ( 3 1)( 1) = and the critical value is χ = The test is right-tailed. The test statistic is based on differences between observed frequencies and the frequencies expected with the assumption of independence between the row and column variables. Only large values of the test statistic correspond to substantial differences between the observed and expected values, and such large values are located in the right tail of the distribution. 5. Test statistic: χ = Critical value: χ = P-value > 0.05 (Tech: ). There is not sufficient evidence to warrant rejection of the claim that the form of the 100-Yuan gift is independent of whether the money was spent. There is not sufficient evidence to support the claim of a denomination effect. 6. Test statistic: χ = Critical value: χ = P-value < (Tech: 0.00). There is sufficient evidence to warrant rejection of the claim that success is independent of the type of treatment. The results suggest that the surgery treatment is better. 7. Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that whether a subject lies is independent of the polygraph test indication. The results suggest that polygraphs are effective in distinguishing between truths and lies, but there are many false positives and false negatives, so they are not highly reliable. Expected counts are printed below observed counts No (Did Yes Not Lie) (Lied) Total Total Chi-Sq = 5.571, DF = 1, P-Value = Test statistic: χ = Critical value: χ = P-value > 0.05 (Tech: ). There is not sufficient evidence to support the claim that the results are discriminatory. Expected counts are printed below observed counts Passed Failed Total Total Chi-Sq = 4.43, DF = 1, P-Value = Copyright 014 Pearson Education, Inc.

206 04 Chapter 11: Goodness-of-Fit and Contingency Tables 9. Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that the sentence is independent of the plea. The results encourage pleas for guilty defendants. Expected counts are printed below observed counts Guilty Not Guilty Plea Plea Total Total Chi-Sq = 4.557, DF = 1, P-Value = Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that deaths on shifts are independent of whether Gilbert was working. The results favor the guilt of Gilbert. Expected counts are printed below observed counts With Death Without Death Total Total Chi-Sq = , DF = 1, P-Value = Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.686). There is not sufficient evidence to warrant rejection of the claim that the gender of the tennis player is independent of whether the call is overturned. Expected counts are printed below observed counts Yes No Total Total Chi-Sq = 0.164, DF = 1, P-Value = Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.43). There is not sufficient evidence to warrant rejection of the claim that left-handedness is independent of gender. Expected counts are printed below observed counts Yes No Total Total Chi-Sq = 1.364, DF = 1, P-Value = 0.43 Copyright 014 Pearson Education, Inc.

207 Chapter 11: Goodness-of-Fit and Contingency Tables Test statistic: χ = Critical value: χ = P-value < 0.01 (Tech: ). There is sufficient evidence to warrant rejection of the claim that the direction of the kick is independent of the direction of the goalkeeper jump. The results do not support the theory that because the kicks are so fast, goalkeepers have no time to react, so the directions of their jumps are independent of the directions of the kicks. Chi-Sq = , DF = 4, P-Value = Test statistic: χ = Critical value: χ = (assuming a 0.05 significance level). P-value > 0.10 (Tech: 0.715). There is not sufficient evidence to warrant rejection of the claim that the amount of smoking is independent of seat belt use. The theory is not supported by the given data. Chi-Sq = 1.358, DF = 3, P-Value = Test statistic: χ =.95. Critical value: χ = P-value > 0.10 (Tech: 0.3). There is not sufficient evidence to warrant rejection of the claim that getting a cold is independent of the treatment group. The results suggest that echinacea is not effective for preventing colds. Chi-Sq =.95, DF =, P-Value = Test statistic: χ = Critical value: χ = (assuming a 0.05 significance level). P-value < 0.05 (Tech: 0.041). There is sufficient evidence to warrant rejection of the claim that injuries are independent of helmet color. It appears that motorcycle drivers should use yellow or orange helmets. Chi-Sq = 9.971, DF = 4, P-Value = Test statistic: χ = Critical value: χ = P-value < (Tech: ). There is sufficient evidence to warrant rejection of the claim that cooperation of the subject is independent of the age category. The age group of 60 and over appears to be particularly uncooperative. Chi-Sq = 0.71, DF = 5, P-Value = Test statistic: χ = Critical value: χ = P-value > 0.05 (Tech: ). There is sufficient evidence to warrant rejection of the claim that months of births of baseball players are independent of whether they are born in America. The data do appear to support Gladwell s claim. Chi-Sq = 0.054, DF = 11, P-Value = Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.856). There is not sufficient evidence to warrant rejection of the claim that getting an infection is independent of the treatment. The atorvastatin treatment does not appear to have an effect on infections. Chi-Sq = 0.773, DF = 3, P-Value = Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that left-handedness is independent of parental handedness. It appears that handedness of the parents has an effect on handedness of the offspring, so left-handedness appears to be an inherited trait. Copyright 014 Pearson Education, Inc.

208 06 Chapter 11: Goodness-of-Fit and Contingency Tables 0. (continued) Expected counts are printed below observed counts Yes No Total Total Chi-Sq = , DF = 3, P-Value = Test statistics: χ = and z = , so that z =± 1.96, so z = χ (approximately). Expected counts are printed below observed counts Purchased Kept the Gum Money Total Total Chi-Sq = 1.16, DF = 1, P-Value = Without Yates s correction, the test statistic is z = χ. Critical values: χ = and Difference = p (1) - p () Estimate for difference: % CI for difference: ( , ) Test for difference = 0 (vs not = 0): Z = 3.49 P-Value = χ = With Yates s correction, the test statistic is = Yates s correction decreases the test statistic so that sample data must be more extreme in order to reject the null hypothesis of independence. Without Yates s correction ( ) ( ) ( ) ( ) χ = = With Yates s Correction ( ) ( ) ( ) ( ) χ = = Chapter Quick Quiz 1. H 0 : p1 = p = p3 = p4 = p5. H 1 : At least one of the probabilities is different from the others O = 3 and E = = Right-tailed. 4. df = 4 and the critical value is χ = There is not sufficient evidence to warrant rejection of the claim that occupation injuries occur with equal frequency on the different days of the week. χ Copyright 014 Pearson Education, Inc.

209 Chapter 11: Goodness-of-Fit and Contingency Tables H 0 : Response to the question is independent of gender. H 1 : Response to the question and gender are dependent. 7. Chi-square distribution. 8. Right-tailed. 9. df = ( 1)( 3 1) = and the critical value is χ = There is not sufficient evidence to warrant rejection of the claim that response is independent of gender. Review Exercises 1. Test statistic: χ = Critical value: χ = P-value: There is sufficient evidence to warrant rejection of the claim that auto fatalities occur on the different days of the week with the same frequency. Because people generally have more free time on weekends and more drinking occurs on weekends, the days of Friday, Saturday, and Sunday appear to have disproportionately more fatalities.. Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.689). There is not sufficient evidence to warrant rejection of the claim that the last digits of 0, 1,,..., 9 occur with the same frequency. It does appear that the weights were obtained through measurements. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value < (Tech: 0.000). There is sufficient evidence to warrant rejection of the claim that weather-related deaths occur in the different months with the same frequency. The summer months appear to have disproportionately more weather-related deaths, and that is probably due to the fact that vacations and outdoor activities are much greater during those months. N DF Chi-Sq P-Value Test statistic: χ = Critical value: χ = P-value: There is sufficient evidence to warrant rejection of the claim that wearing a helmet has no effect on whether facial injuries are received. It does appear that a helmet is helpful in preventing facial injuries in a crash. 5. Test statistic: χ = Critical value: χ = P-value < 0.05 (Tech: 0.060). There is sufficient evidence to warrant rejection of the claim that when flipping or spinning a penny, the outcome is independent of whether the penny was flipped or spun. It appears that the outcome is affected by whether the penny is flipped or spun. If the significance level is changed to 0.01, the critical value changes to 6.635, and we fail to reject the given claim, so the conclusion does change. All expected counts are greater than 5. Expected counts are printed below observed counts Chi-Square contributions are printed below expected counts Heads Tails Total Total Chi-Sq = 4.955, DF = 1, P-Value = 0.06 Copyright 014 Pearson Education, Inc.

210 08 Chapter 11: Goodness-of-Fit and Contingency Tables 6. Test statistic: χ = Critical value: χ = P-value > 0.10 (Tech: 0.19). There is not sufficient evidence to warrant rejection of the claim that home/visitor wins are independent of the sport. Expected counts are printed below observed counts Chi-Square contributions are printed below expected counts Basketball Baseball Hockey Football Total Total Chi-Sq = 4.737, DF = 3, P-Value = 0.19 Cumulative Review Exercises 1. H 0 : p = 0.5. H 1 : p 0.5. Test statistic: z = 7.8. Critical values: z =± P-value: (Tech: ). Reject H 0. There is sufficient evidence to warrant rejection of the claim that among those who die in weather-related deaths, the percentage of males is equal to 50%. Test of p = 0.5 vs p not = 0.5 Exact Sample X N Sample p 95% CI P-Value ( , ) % < p < 65.0%. Because the confidence interval does not include 50% (or half ), we should reject the stated claim. Sample X N Sample p 95% CI ( , ) 3. x = 53.7 years, median = 60.0 years, s = 16.1 years. Because an age of 16 differs from the mean by more than standard deviations, it is an unusual age years < μ < 65. years. Yes, the confidence interval limits do contain the value of 65.0 years that was found from a sample of 969 ICU patients. Variable N Mean StDev SE Mean 95% CI AGES (4.19, 65.1) 5. a. r = Critical values: r =± P-value = There is not sufficient evidence to support the claim that there is a linear correlation between the numbers of boats and the numbers of manatee deaths. Pearson correlation of Boats and Manatee Deaths = P-Value = b. yˆ = x The regression equation is Manatee Deaths = Boats Predictor Coef SE Coef T P Constant Boats Copyright 014 Pearson Education, Inc.

211 5. (continued) Chapter 11: Goodness-of-Fit and Contingency Tables 09 c. y ˆ = ( 84) = 84.6 manatee deaths (the value of y). The predicted value is not very accurate because it is not very close to the actual value of 78 manatee deaths. 6. a. 5th percentile: x = μ + z σ = = 630 mm b z = = 1.06 and P( z< 1.06) = 14.46% (Tech: 14.48%). That percentage is too high, 34 because too many women would not be accommodated. c z = = and P( z> 0.706) = 76.11% (Tech: ). Groups of 16 women do not occupy a cockpit; because individual women occupy the cockpit, this result has no effect on the design. 7. a. Statistic. b. Quantitative. c. Discrete. d. The sampling is conducted so that all samples of the same size have the same chance of being selected. e. The sample is a voluntary response sample (or self-selected sample), and those with strong feelings about the topic are more likely to respond, so it is not a valid sampling plan. 8. a. ( 0.6) 4 = b = 0.4 Copyright 014 Pearson Education, Inc.

212

213 Chapter 1: Analysis of Variance 11 Chapter 1: Analysis of Variance Section 1-1. a. The chest deceleration measurements are categorized according to the one characteristic of size. b. The terminology of analysis of variance refers to the method used to test for equality of the three population means. That method is based on two different estimates of a common population variance.. As we increase the number of individual tests of significance, we increase the risk of finding a difference by chance alone (instead of a real difference in the means). The risk of a type I error finding a difference in one of the pairs when no such difference actually exists is too high. The method of analysis of variance helps us avoid that particular pitfall (rejecting a true null hypothesis) by using one test for equality of several means, instead of several tests that each compare two means at a time. 3. The test statistic is F = 3.88, and the F distribution applies. 4. The P-value is Because the P-value is greater than the significance level of 0.05, we fail to reject the null hypothesis of equal means. There is not sufficient evidence to warrant rejection of the claim that the three different categories of car sizes have the same mean chest deceleration in the standard car crash test. 5. Test statistic: F = P-value: Fail to reject H 0 : μ1 = μ = μ3. There is not sufficient evidence to warrant rejection of the claim that the three categories of blood lead level have the same mean verbal IQ score. Exposure to lead does not appear to have an effect on verbal IQ scores. 6. Test statistic: F = P-value: Fail to reject H 0 : μ1 = μ = μ3. There is not sufficient evidence to warrant rejection of the claim that the three categories of blood lead level have the same mean full IQ score. Exposure to lead does not appear to have an effect on full IQ scores. 7. Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three size categories have the same mean highway fuel consumption. The size of a car does appear to affect highway fuel consumption. 8. Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three size categories have the same mean city fuel consumption. The size of a car does appear to affect city fuel consumption. 9. Test statistic: F = P-value: Fail to reject H 0 : μ1 = μ = μ3. There is not sufficient evidence to warrant rejection of the claim that the three size categories have the same mean head injury measurement. The size of a car does not appear to affect head injuries. 10. Test statistic: F = P-value: Fail to reject H 0 : μ1 = μ = μ3. There is not sufficient evidence to warrant rejection of the claim that the three size categories have the same mean pelvis injury measurement. The size of a car does not appear to affect pelvis injuries. 11. Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three different miles have the same mean time. These data suggest that the third mile appears to take longer, and a reasonable explanation is that the third lap has a hill. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups E Within Groups Total Copyright 014 Pearson Education, Inc.

214 1 Chapter 1: Analysis of Variance 1. Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three books have the same mean Flesch Reading Ease score. The data suggest that the books appear to have mean scores that are not all the same. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups Within Groups Total Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3 = μ4. There is sufficient evidence to warrant rejection of the claim that the four treatment categories yield poplar trees with the same mean weight. Although not justified by the results from analysis of variance, the treatment of fertilizer and irrigation appears to be most effective. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups Within Groups Total Test statistic: F = P-value: Fail to reject H 0 : μ1 = μ = μ3 = μ4. There is not sufficient evidence to warrant rejection of the claim that the four treatment categories yield poplar trees with the same mean weight. In the sandy and dry region, there does not appear to be a treatment that is more effective than the others. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups Within Groups Total Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three different types of cigarettes have the same mean amount of nicotine. Given that the king-size cigarettes have the largest mean of 1.6 mg per cigarette, compared to the other means of 0.87 mg per cigarette and 0.9 mg per cigarette, it appears that the filters do make a difference (although this conclusion is not justified by the results from analysis of variance). EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups E Within Groups Total Copyright 014 Pearson Education, Inc.

215 Chapter 1: Analysis of Variance Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3 = μ4. There is sufficient evidence to warrant rejection of the claim that the three samples are from populations with the same mean. It appears that cotinine levels are greater with more exposure to tobacco smoke. (The samples do not appear to be from normally distributed populations, but ANOVA is robust against departures from normality.) EXCEL ANOVA Source of Variation SS df MS F P-value Between Groups E-08 Within Groups Total SMOKER Dotplot of SMOKER, ETS, NOETS ETS NOETS Data Each symbol represents up to observations. 17. The Tukey test results show different P-values, but they are not dramatically different. The Tukey results suggest the same conclusions as the Bonferroni test. 18. a. In Exercise 13 we reject the null hypothesis of equal means. The displayed Bonferroni results show that with a P-value of 0.039, there is a significant difference between the mean of the no treatment group (group 1) and the mean of the group treated with both fertilizer and irrigation (group 4). b. The test statistic is t = P-value = 6( ) = Reject the null hypothesis that the mean weight from the irrigation treatment group is equal to the mean from the group treated with both fertilizer and irrigation. Section The load values are categorized using two different factors of (1) femur (left or right) and () size of car (small, midsize, large).. No. To use two individual tests of one-way analysis of variance is to completely ignore the very important feature of the possible effect from an interaction between femur and size. If there is an interaction, it doesn t make sense to consider the effects of one factor without the other. 3. An interaction between two factors or variables occurs if the effect of one of the factors changes for different categories of the other factor. If there is an interaction effect, we should not proceed with individual tests for effects from the row factor and column factor. If there is an interaction, we should not consider the effects of one factor without considering the effects of the other factor. 4. Yes, the result is a balanced design because each cell has the same number (7) of values. Copyright 014 Pearson Education, Inc.

216 14 Chapter 1: Analysis of Variance 5. For interaction, the test statistic is F = 1.7 and the P-value is 0.194, so there is not sufficient evidence to conclude that there is an interaction effect. For the row variable of femur (right, left), the test statistic is F = 1.39 and the P-value is 0.46, so there is not sufficient evidence to conclude that whether the femur is right or left has an effect on measured load. For the column variable of size of the car, the test statistic is F =.3 and the P-value is 0.1, so there is not sufficient evidence to conclude that the car size category has an effect on the measured load. 6. For interaction, the test statistic is F = 0.34 and the P-value is 0.717, so there is not sufficient evidence to conclude that there is an interaction effect. For the row variable of type (foreign, domestic), the test statistic is F = 5.44 and the P-value is 0.038, so there is sufficient evidence to conclude that the type of car (foreign, domestic) has an effect on measured chest deceleration. For the column variable of size of the car, the test statistic is F = 3.58 and the P-value is 0.060, so there is not sufficient evidence to conclude that the car size category has an effect on the measured chest deceleration. 7. For interaction, the test statistic is F = 1.05 and the P-value is 0.365, so there is not sufficient evidence to conclude that there is an interaction effect. For the row variable of sex, the test statistic is F = 4.58 and the P-value is 0.043, so there is sufficient evidence to conclude that the sex of the subject has an effect on verbal IQ score. For the column variable of blood lead level (LEAD), the test statistic is F = 0.14 and the P- value is 0.871, so there is not sufficient evidence to conclude that blood lead level has an effect on verbal IQ score. It appears that only the sex of the subject has an effect on verbal IQ score. 8. For interaction, the test statistic is F = and the P-value is 0.000, so there is sufficient evidence to conclude that there is an interaction effect. The ratings appear to be affected by an interaction between the use of a supplement and the amount of whey. Because there appears to be an interaction effect, we should not proceed with individual tests of the row factor (supplement) and the column factor (amount of whey). EXCEL ANOVA Source of Variation SS df MS F P-value F crit Sample Columns E Interaction E Within Total For interaction, the test statistic is F = and the P-value is 0.091, so there is sufficient evidence to conclude that there is an interaction effect. The measures of self-esteem appear to be affected by an interaction between the self-esteem of the subject and the self-esteem of the target. Because there appears to be an interaction effect, we should not proceed with individual tests of the row factor (target s selfesteem) and the column factor (subject s self-esteem). EXCEL ANOVA Source of Variation SS df MS F P-value F crit Sample Columns Interaction Within Total Copyright 014 Pearson Education, Inc.

217 Chapter 1: Analysis of Variance For interaction, the test statistic is F = and the P-value is 0.068, so there is not sufficient evidence to conclude that there is an interaction effect. For the row variable of gender (male, female), the test statistic is F = and the P-value is 0.00, so there is sufficient evidence to conclude that gender has an effect on pulse rate. For the column variable of age bracket, the test statistic is F = and the P-value is , so there is sufficient evidence to conclude that the age bracket has an effect on pulse rate. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Sample Columns Interaction Within Total a. Test statistics and P-values do not change. b. Test statistics and P-values do not change. c. Test statistics and P-values do not change. d. An outlier can dramatically affect and change test statistics and P-values. Chapter Quick Quiz 1. H 0 : μ1 = μ = μ3. Because the displayed P-value of is small, reject H 0.. No. Because we reject the null hypothesis of equal means, it appears that the three different power sources do not produce the same mean voltage level, so we cannot expect electrical appliances to behave the same way when run from the three different power sources. 3. Right-tailed. 4. Test statistic: F = In general, larger test statistics result in smaller P-values. 5. The sample voltage measurements are categorized using only one factor: the source of the voltage. 6. Test a null hypothesis that three or more samples are from populations with equal means. 7. With one-way analysis of variance, the different samples are categorized using only one factor, but with two-way analysis of variance, the sample data are categorized into different cells determined by two different factors. 8. For interaction, the test statistic is F = 0.19 and the P-value is Fail to reject the null hypothesis of no interaction. There does not appear to be an effect due to an interaction between sex and major. 9. The test statistic is F = 0.78 and the P-value is There is not sufficient evidence to support a claim that the length estimates are affected by the sex of the subject. 10. The test statistic is F = 0.13 and the P-value is There is not sufficient evidence to support a claim that the length estimates are affected by the subject s major. Review Exercises 1. H 0 : 1 3 μ μ μ = =. Test statistic: F = P-value: Reject the null hypothesis. There is sufficient evidence to warrant rejection of the claim that 4-cylinder cars, 6-cylinder cars, and 8-cylinder cars have the same mean highway fuel consumption amount. Copyright 014 Pearson Education, Inc.

218 16 Chapter 1: Analysis of Variance. For interaction, the test statistic is F = 0.17 and the P-value is 0.915, so there is not sufficient evidence to conclude that there is an interaction effect. For the row variable of site, the test statistic is F = 0.81 and the P-value is 0.374, so there is not sufficient evidence to conclude that the site has an effect on weight. For the column variable of treatment, the test statistic is F = 7.50 and the P-value is 0.001, so there is sufficient evidence to conclude that the treatment has an effect on weight. 3. Test statistic: F = P-value: Reject H 0 : μ1 = μ = μ3. There is sufficient evidence to warrant rejection of the claim that the three different types of cigarettes have the same mean amount of tar. Given that the king-size cigarettes have the largest mean of 1.1 mg per cigarette, compared to the other means of 1.9 mg per cigarette and 13. mg per cigarette, it appears that the filters do make a difference (although this conclusion is not justified by the results from analysis of variance). EXCEL ANOVA Source of Variation SS df MS F P-value F crit Between Groups E Within Groups Total For interaction, the test statistic is F = and the P-value is , so there does not appear to be an effect from an interaction between gender and whether the subject smokes. For gender, the test statistic is F = and the P-value is , so gender does not appear to have an effect on body temperature. For smoking, the test statistic is F = and the P-value is 0.108, so there does not appear to be an effect from smoking on body temperature. EXCEL ANOVA Source of Variation SS df MS F P-value F crit Sample(Gender) Columns(Smokes) Interaction Within Total Cumulative Review Exercises 1. a years, 13.1 years,.7 years b. 9.7 years, 9.0 years, 18.6 years c years, 80.3 years, years d. Ratio.. Test statistic: t = Critical values t =±.160 (assuming a 0.05 significance level). (Tech: P-value = ) Fail to reject H 0 : μ1 = μ. There is not sufficient evidence to support the claim that there is a difference between the means for the two groups. Difference = mu (Presidents) - mu (Monarchs) Estimate for difference: % CI for difference: (-18.33, 3.90) T-Test of difference = 0 (vs not =): T-Value = P-Value = DF = 15 Copyright 014 Pearson Education, Inc.

219 Chapter 1: Analysis of Variance Normal, because the histogram is approximately bell-shaped (or the points in a normal quantile plot are reasonably close to a straight-line pattern with no other pattern that is not a straight-line pattern). Histogram of Presidents Frequency Presidents years < μ < 18.7 years One-Sample T: Presidents Variable N Mean StDev SE Mean 95% CI Presidents (1.30, 18.70) 5. a. H 0 : μ1 = μ = μ3 b. Because the P-value of is greater than the significance level of 0.05, fail to reject the null hypothesis of equal means. There is not sufficient evidence to warrant rejection of the claim that the three means are equal. The three populations do not appear to have means that are significantly different. 6. a. r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between September weights and the subsequent April weights. 7. a. Correlations: September, April Pearson correlation of September and April = P-Value = b. yˆ = x Regression Analysis: April versus September The regression equation is April = September Predictor Coef SE Coef T P Constant September c. y ˆ = ( 94) = 86.6 kg, which is not very close to the actual April weight of 105 kg. b. c z = = 1; P( z> 1) = z = = 1; P( 1< z< 1) = = (Tech: 0.687) z = = 3; P( z< 3) = / 5 d. 80th percentile: x = μ + z σ = = (Tech: 334.7) Copyright 014 Pearson Education, Inc.

220 18 Chapter 1: Analysis of Variance 8. a = b < p < 0.5 Test and CI for One Proportion Sample X N Sample p 95% CI ( , ) c. Yes. The confidence interval shows us that we have 95% confidence that the true population proportion is contained within the limits of and 0.5, and 1/4 is not included within that range. 9. a. The distribution should be uniform, with a flat shape. The given histogram agrees (approximately) with the uniform distribution that we expect. b. No. A normal distribution is approximately bell-shaped, but the given histogram is far from being bellshaped. 10. Test statistic: χ = Critical value: χ = (assuming a 0.05 significance level). P-value > 0.10 (Tech: 0.319). There is not sufficient evidence to warrant rejection of the claim that the digits are selected from a population in which the digits are all equally likely. There does not appear to be a problem with the lottery. Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: C1 N DF Chi-Sq P-Value Copyright 014 Pearson Education, Inc.

221 Chapter 13: Nonparametric Statistics Section 13- Chapter 13: Nonparametric Statistics The only requirement for the matched pairs is that they constitute a simple random sample. There is no requirement of a normal distribution or any other specific distribution. The sign test is distribution free in the sense that it does not require a normal distribution or any other specific distribution.. There are positive signs, 7 negative signs, 1 tie, n = 9, and the test statistic is x = (the smaller of and 7). 3. H 0 : There is no difference between the populations of September weights and April weights. H 1 : There is a difference between the populations of September weights and April weights. The sample data do not contradict H 1 because the numbers of positive signs () and negative signs (7) are not exactly the same. 4. The efficiency of 0.63 indicates that with all other things being equal, the sign test requires 100 sample observations to achieve the same results as 63 sample observations analyzed through a parametric test. 5. The test statistic of x = 1 is less than or equal to the critical value of (from Table A-7.) There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. 6. The test statistic of x = 5 is less than or equal to the critical value of 5 (from Table A-7.) There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference ( ) 7. The test statistic of z = =.18 falls in the critical region bounded by z = 1.96 and There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. 611 ( ) 8. The test statistic of z = = falls in the critical region bounded by z = 1.96 and There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. 9. There are 9 positive signs, 1 negative signs, 0 ties, and n = 10. The test statistic of x = 1 is less than or equal to the critical value of 1 (from Table A-7). There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. 80 ( ) 10. The test statistic of z = = 5.5 falls in the critical region bounded by z =± There is 80 sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. 3 ( ) 11. The test statistic of z = = 0.53 does not fall in the critical region bounded by z =± There is not sufficient evidence to warrant rejection of the claim of no difference. There does not appear to be a difference. Sign Test for Median: Ht - HtOpp- Sign test of median = versus not = N Below Equal Above P Median C There are 1 positive signs, 0 negative signs, 0 ties, and n = 1. The test statistic of x = 0 is less than or equal to the critical value of (from Table A-7). There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. Copyright 014 Pearson Education, Inc.

222 0 Chapter 13: Nonparametric Statistics 91 ( ) 13. The test statistic of z = = is in the critical region bounded by z =±.575. There 91 is sufficient evidence to warrant rejection of the claim of no difference. The YSORT method appears to have an effect on the gender of the child. (Because so many more boys were born than would be expected with no effect, it appears that the YSORT method is effective in increasing the likelihood that a baby will be a boy.) 104 ( ) 14. The test statistic of z = = 0.88 is not in the critical region bounded by z =± There 104 is not sufficient evidence to warrant rejection of the claim of no difference. It appears that women do not have the ability to predict the sex of their babies. 80 ( ) 15. The test statistic of z = = 1.97 is not in the critical region bounded by z =± There is not sufficient evidence to warrant rejection of the claim that the touch therapists make their selections with a method equivalent to random guesses. The touch therapists do not appear to be effective in selecting the correct hand. 638 ( ) 16. The test statistic of z = = 6.06 is in the critical region bounded by z =± There is 638 sufficient evidence to warrant rejection of the claim that respondents did not have a strong opinion one way or the other. They do appear to favor the opinion that NFL games are not too long. However, the validity of the results is highly questionable because the sample is a voluntary response sample. 40 ( ) 17. The test statistic of z = =.37 is not in the critical region bounded by z =±.575. There 40 is not sufficient evidence to warrant rejection of the claim that the median is equal to g. The quarters appear to be minted according to specifications. Sign Test for Median: Post-1964 Quarters Sign test of median = versus not = N Below Equal Above P Median Post-1964 Quarters ( ) 18. The test statistic of z = = 0.88 is not in the critical region bounded by z =± There is not sufficient evidence to warrant rejection of the claim that the median is equal to Sign Test for Median: MAGNITUDE Sign test of median = versus not = N Below Equal Above P Median MAG ( ) 19. The test statistic of z = = 5.3 is in the critical region bounded by z =± There is 34 sufficient evidence to warrant rejection of the claim that the median amount of Coke is equal to 1 oz. Consumers are not being cheated because they are generally getting more than 1 oz of Coke, not less. Copyright 014 Pearson Education, Inc.

223 Chapter 13: Nonparametric Statistics (continued) Sign Test for Median: CKREGVOL Sign test of median = 1.00 versus not = 1.00 N Below Equal Above P Median CKREGVOL ( ) 0. The test statistic of z = =.70 is in the critical region bounded by z =± There is 79 sufficient evidence to warrant rejection of the claim that the median age is 30 years. Sign Test for Median: Actresses Sign test of median = versus not = N Below Equal Above P Median Actresses ( ) 1. Second approach: The test statistic of z = = 4.9 is in the critical region bounded by 105 z = 1.645, so the conclusions are the same as in Example ( ) Third approach: The test statistic of z = =.8 is in the critical region bounded by 106 z = 1.645, so the conclusions are the same as in Example 4. The different approaches can lead to very different results; as seen in the test statistics of 4.1, 4.9, and.8. The conclusions are the same in this case, but they could be different in other cases.. The column entries are *, *, *, *, *, 0, 0, 0. Section The only requirements are that the matched pairs be a simple random sample and the population of differences be approximately symmetric. There is no requirement of a normal distribution or any other specific distribution. The Wilcoxon signed-ranks test is distribution free in the sense that it does not require a normal distribution or any other specific distribution.. a. 1, 1, 4, 3, 0, 1, 5, 8, 3, 1 b..5,.5, 7, 5.5,.5, 8, 9, 5.5,.5 c..5,.5, 7, 5.5,.5, 8, 9, 5.5,.5 d. 5 and 40 e. T = 5 f. Critical value of T is The sign test uses only the signs of the differences, but the Wilcoxon signed-ranks test uses ranks that are affected by the magnitudes of the differences. 4. The efficiency of 0.95 indicates that with all other things being equal, the Wilcoxon signed-ranks test requires 100 sample observations to achieve the same results as 95 sample observations analyzed through a parametric test. 5. Test statistic: T = 6. Critical value: T = 8. Reject the null hypothesis that the population of differences has a median of 0. There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median AGE DIFF Copyright 014 Pearson Education, Inc.

224 Chapter 13: Nonparametric Statistics 80( ) Convert T = to the test statistic z = 4 = ( )( 80+ 1) 4 Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the population of differences has a median of 0. There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median AGE DIFF ( 3 + 1) Convert T = 47 to the test statistic z = 4 = ( 3+ 1)( 3+ 1) 4 Critical values: z =± (Tech: P-value = ) Fail to reject the null hypothesis that the population of differences has a median of 0. There is not sufficient evidence to warrant rejection of the claim of no difference. There does not appear to be a difference. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median HtDiff Test statistic: T = 0. Critical value: T = 14. Reject the null hypothesis that the population of differences has a median of 0. There is sufficient evidence to warrant rejection of the claim of no difference. There does appear to be a difference. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median IN-OUT ( 40+ 1) Convert T = 196 to the test statistic z = 4 = ( )( ) 4 Critical values: z =± (Tech: P-value = ) There is sufficient evidence to warrant rejection of the claim that the median is equal to g. The quarters do not appear to be minted according to specifications. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median Post-1964 Quarters ( ) Convert T = 376 to the test statistic z = 4 = ( )( ) 4 Critical values: z =±.57. (Tech: P-value = ) There is not sufficient evidence to warrant rejection of the claim that the median is equal to Copyright 014 Pearson Education, Inc.

225 10. (continued) Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median Mag Chapter 13: Nonparametric Statistics 3 34( 34+ 1) Convert T = 15.5 to the test statistic z = 4 = ( 34+ 1)( 34+ 1) 4 Critical values: z =± (Tech: P-value = ) There is sufficient evidence to warrant rejection of the claim that the median amount of Coke is equal to 1 oz. Consumers are not being cheated because they are generally getting more than 1 oz of Coke, not less. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median Volume ( ) Convert T = 67.5 to the test statistic z = 4 = ( )( ) 4 Critical values: z =± (Tech: P-value = ) There is sufficient evidence to warrant rejection of the claim that the median age is 30 years. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median AGE a. Min: 0 and Max: = 850 b. 850 = 145 c = 000 d. nn ( + 1) k Section Yes. The two samples are independent because the flight data are not matched. The samples are simple random samples. Each sample has more than 10 values.. R = H 0 : Arrival delay times from Flights 19 and 1 have the same median. There are three different possible alternative hypotheses: H 1 : Arrival delay times from Flights 19 and 1 have different medians. H 1 : Arrival delay times from Flight 19 have a median greater than the median of arrival delay times from Flight 1. H 1 : Arrival delay times from Flight 19 have a median less than the median of arrival delay times from Flight The efficiency rating of 0.95 indicates that with all other factors being the same, the Wilcoxon rank-sum test requires 100 sample observations to achieve the same results as 95 observations with the parametric t test of Section 9-3, assuming that the stricter requirements of the parametric t test are satisfied. Copyright 014 Pearson Education, Inc.

226 4 Chapter 13: Nonparametric Statistics ( + + ) ( ) R 1 = 137.5, R = 16.5, μ R = = 150, σ R = = 17.31, test statistic: z = = 0.7. Critical values: z =± (Tech: P-value = ) Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that Flights 19 and 1 have the same median arrival delay time. ( + + ) ( ) R 1 = 18, R = 17 R = 17, μ R = = 150, σ R = = 17.31, test statistic: z = = 1.7. Critical values: z =± (Tech: P-value = ) Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that Flights 19 and 1 have the same median taxi-out time. ( + + ) ( ) R 1 = 53.5, R = 14.5, μ R = = 18, σ R = = 0.607, test statistic: z = = Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. There is sufficient evidence to reject the claim that for those treated with 0 mg of atorvastatin and those treated with 80 mg of atorvastatin, changes in LDL cholesterol have the same median. It appears that the dosage amount does have an effect on the change in LDL cholesterol. ( + + ) ( ) R 1 = 194.5, R = 105.5, μ R = = 150, σ R = = 17.31, test statistic: z = =.57. Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. There is sufficient evidence to reject the claim that the median amount of strontium-90 from Pennsylvania residents is the same as the median from New York residents. ( + + ) ( ) R 1 = 501, R = 445, μ R = = 484, σ R = = , test statistic: z = = Critical value: z = (Tech: P-value = ) Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to support the claim that subjects with medium lead levels have full IQ scores with a higher median than the median full IQ score for subjects with high lead levels. It does not appear that lead level affects full IQ scores. ( + + ) ( ) R 1 = 4178, R = 77, μ R = = 3900, σ R = = , test statistic: z = =.38. Critical value: z = (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. There is sufficient evidence to support the claim that subjects with low lead levels have performance IQ scores with a higher median than the median performance IQ score for subjects with high lead levels. It appears that exposure to lead does have an adverse effect. Copyright 014 Pearson Education, Inc.

227 ( + + ) Chapter 13: Nonparametric Statistics 5 ( ) R 1 = 40, R = 80, μ R = = 160, σ R = = , test statistic: z = = Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. It appears that the design of quarters changed in ( + + ) ( ) R 1 = 1958, R = 670, μ R = = 1314, σ R = = 88.79, test statistic: z = = 7.5. Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. It appears that cans of regular Coke have weights different from cans of diet Coke. The cans of regular Coke appear to weigh more because they include more sugar. ( ) Using U = = 86.5, we get the same value with opposite sign. 14. a. Rank Sum for Rank Treatment A A A B B 3 A B A B 4 A B B A 5 B B A A 7 B A A B 5 B A B A z = = 1.6. The test statistic is 1 11 ( ) 1 b. The R values of 3, 4, 5, 6, 7 have probabilities of 1 6, 1 6, 6, 1 6, and 1 6, respectively c. No, none of the probabilities for the values of R would be less than Section R 1 = = 36.5, R = = 5.5, R 3 = = 47 Low Lead Level Medium Lead Level High Lead Level 70 (1) 7 (3) 8 (7) 85 (10) 90 (14) 93 (16) 86 (1.5) 9 (15) 85 (10) 76 (5) 71 () 75 (4) 84 (8) 86 (1.5) 85 (10) 79 (6). Yes. The samples are independent simple random samples, and each sample has at least five data values. 3. n 1 = 5, n = 6, n 3 = 5, and N = = 16. Copyright 014 Pearson Education, Inc.

228 6 Chapter 13: Nonparametric Statistics 4. The efficiency rating of 0.95 indicates that with all other factors being the same, the Kruskal-Wallis test requires 100 sample observations to achieve the same results as 95 observations with the parametric oneway analysis of variance test, assuming that the stricter requirements of the parametric test are satisfied Test statistic: H = ( + 1) = ( ) Critical value: χ = (Tech: P-value = ) Reject the null hypothesis of equal medians. The data suggest that the different miles present different levels of difficulty Test statistic: H = 3( 36 1) ( 36 1) = Critical value: + χ = (Tech: P-value = ) Reject the null hypothesis of equal medians. The data suggest that the books have levels of reading difficulty that are not all the same Test statistic: H = + + 3( 1+ 1) = ( 1+ 1) Critical value: χ = (Tech: P-value = ) Fail to reject the null hypothesis of equal medians. The data do not suggest that larger cars are safer Test statistic: H = + + 3( 1+ 1) = ( 1+ 1) Critical value: χ = (Tech: P-value = ) Reject the null hypothesis of equal medians. The size of a car does appear to affect highway fuel consumption Test statistic: H = 3( 11 1) ( 11 1) = Critical value: + χ = (Tech: P-value = ) Fail to reject the null hypothesis of equal medians. The data do not suggest that lead exposure has an adverse effect Test statistic: H = 3( 10 1) ( 10 1) = Critical value: + χ = (Tech: P-value = ) Reject the null hypothesis of equal medians. The data suggest that the amounts of nicotine absorbed by smokers is different from the amounts absorbed by people who don t smoke Test statistic: H = 375 ( 1) ( 75 1) = Critical value: + χ = (Tech: P-value: ) Reject the null hypothesis of equal medians. There is sufficient evidence to warrant rejection of the claim that the three different types of cigarettes have the same median amount of nicotine. It appears that the filters do make a difference Test statistic: H = + + 3( 1+ 1) = ( 1+ 1) Critical value: χ = (Tech: P-value = ) Fail to reject the null hypothesis of equal medians. The data do not suggest that larger cars are safer. Copyright 014 Pearson Education, Inc.

229 Chapter 13: Nonparametric Statistics Using ΣT = 16,836 (see table below) and N = = 75, the corrected value of H is =9.0701, which is not substantially different from the value found in Exercise , In this case, the large numbers of ties do not appear to have a dramatic effect on the test statistic H. 3 Nicotine Level Rank t t t Section SUM 16, The methods of Section 10-3 should not be used for predictions. The regression equation is based on a linear correlation between the two variables, but the methods of this section do not require a linear relationship. The methods of this section could suggest that there is a correlation with paired data associated by some nonlinear relationship, so the regression equation would not be a suitable model for making predictions.. Data at the nominal level of measurement have no ordering that enables them to be converted to ranks, so data at the nominal level of measurement cannot be used with the methods of rank correlation. 3. r represents the linear correlation coefficient computed from sample paired data; ρ represents the parameter of the linear correlation coefficient computed from a population of paired data; r s denotes the rank correlation coefficient computed from sample paired data; ρ s represents the rank correlation coefficient computed from a population of paired data. The subscript s is used so that the rank correlation coefficient can be distinguished from the linear correlation coefficient r. The subscript does not represent the standard deviation s. It is used in recognition of Charles Spearman, who introduced the rank correlation method. 4. The efficiency rating of 0.91 indicates that with all other factors being the same, rank correlation requires 100 pairs of sample observations to achieve the same results as 91 pairs of observations with the parametric test using linear correlation, assuming that the stricter requirements for using linear correlation are met. 5. r s = 1. Critical values are r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0 sufficient evidence to support a claim of a correlation between distance and time. 6. r s = 1. Critical values are r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0 sufficient evidence to support a claim of a correlation between altitude and time.. There is. There is Copyright 014 Pearson Education, Inc.

230 8 Chapter 13: Nonparametric Statistics 7. r s = Critical values: r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0. There is sufficient evidence to support the claim of a correlation between the quality scores and prices. These results do suggest that you get better quality by spending more. Pearson correlation of Rank Price and Rank Quality = r s = Critical values: r s =± (From Table A-9.) Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to support the claim of a correlation between the quality scores and prices. These results do not suggest that you get better quality by spending more. Pearson correlation of Rank Quality and Rank Price = r s = Critical values: r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0. There is sufficient evidence to support the claim of a correlation between the two judges. Examination of the results shows that the first and third judges appear to have opposite rankings. Pearson correlation of First and Second = r s = Critical values: r s =± (From Table A-9.) Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to support the claim of a correlation between the two judges. The two judges appear to rank the bands very differently. Pearson correlation of First and Third = r s = 1. Critical values: r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0. There is sufficient evidence to conclude that there is a correlation between overhead widths of seals from photographs and the weights of the seals. Pearson correlation of Rank Width and Rank Weight = r s = Critical values: r s =± (From Table A-9.) Reject the null hypothesis of ρ s = 0. There is sufficient evidence to conclude that there is a correlation between the number of chirps in 1 min and the temperature. Pearson correlation of Rank Chirps and Rank Temp = r s = Critical values: r s =± =± Reject the null hypothesis of ρ s = 0. There is 40 1 sufficient evidence to conclude that there is a correlation between the systolic and diastolic blood pressure levels in males. Pearson correlation of RANK SYS and RANK DIAS = r s = Critical values: r s =± (From Table A-9.) Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to conclude that there is a correlation between brain volumes and IQ scores. Pearson correlation of RANK VOL and RANK IQ = r s = Critical values: r s =± =± Reject the null hypothesis of ρ s = 0. There is 48 1 sufficient evidence to conclude that there is a correlation between departure delay times and arrival delay times. Pearson correlation of RANK DEPART and RANK ARRIVE = Copyright 014 Pearson Education, Inc.

231 Chapter 13: Nonparametric Statistics r s = Critical values: r s =± =± Fail to reject the null hypothesis of ρ s = 0. There 50 1 is not sufficient evidence to conclude that there is a correlation between magnitudes and depths of earthquakes. Pearson correlation of RANK MAG and RANK DEPTH = a..447 r s =± =± is not very close to the values of r s =± found in Table A-9. b. Section r s =± =± is quite close to the values of r s =± found in Table A No. The runs test can be used to determine whether the sequence of World Series wins by American League teams and National League teams is not random, but the runs test does not show whether the proportion of wins by the American League is significantly greater than n 1 = 1, n = 8, G = 9 3. a. Answers vary, but here is a sequence that leads to rejection of randomness because the number of runs is, which is very low: W W W W W W W W W W W W E E E E E E E E b. Answers vary, but here is a sequence that leads to rejection of randomness because the number of runs is 17, which is very high: W E W E W E W E W E W E W E W E W W W W 4. No. It is very possible that the sequence of data appears to be random, yet the sampling method (such as voluntary response sampling) might be very unsuitable for statistical methods. 5. n 1 = 19, n = 15, G = 16, critical values: 11, 4 (From Table A-10.) Fail to reject randomness. There is not sufficient evidence to support the claim that we elect Democrats and Republicans in a sequence that is not random. Randomness seems plausible here. 6. n 1 = 17, n = 13, G = 15, critical values: 10, (From Table A-10.) Fail to reject randomness. There is not sufficient evidence to warrant rejection of the claim that odd and even digits occur in random order. 7. n 1 = 0, n = 10, G = 16, critical values: 9, 0 (From Table A-10.) Fail to reject randomness. There is not sufficient evidence to reject the claim that the dates before and after July 1 are randomly selected. 8. n 1 = 10, n = 10, G =, critical values: 6, 16 (From Table A-10.) Reject randomness. The numbers of daily newspapers do not appear to be in a random sequence. Because all of the values above the median occur in the beginning and all of the values below the median occur at the end, there appears to be a downward trend in the numbers of daily newspapers. 9. n 1 = 4, n = 1, G = 17, μ G 41 = + 1 = 3.4, σ G ( ) ( 4 + 1) ( ) = = Test statistic: z = = Critical values: z =± (Tech: P-value = ) Fail to reject randomness. There is not sufficient evidence to reject randomness. The runs test does not test for disproportionately more occurrences of one of the two categories, so the runs test does not suggest that either conference is superior. Copyright 014 Pearson Education, Inc.

232 30 Chapter 13: Nonparametric Statistics 10. n 1 = 6, n = 44, G = 6, μ G 644 = + 1= , σ G ( ) ( 6+ 44) ( ) = = Test statistic: z = = 1.9. Critical values: z =± (Tech: P-value = ) Fail to reject randomness. There is not sufficient evidence to reject randomness The median is 453, n 1 = 3, n = 3, G = 4, μ G = + 1= 4, ( ) 4 4 σ G = = Test statistic: z = = Critical values: ( 3+ 3) ( ) z =± (Tech: P-value = ) Reject randomness. The sequence does not appear to be random when considering values above and below the median. There appears to be an upward trend, so the stock market appears to be a profitable investment for the long term, but it has been more volatile in recent years The mean is C, n 1 = 6, n = 4, G = 8, μ G = + 1= 5.96, ( 6 4) σ G = = Test statistic: z = = Critical values: ( 6 + 4) ( ) z =± (Tech: P-value = ) Reject randomness. The sequence does not appear to be random when considering values above and below the mean. There appears to be an upward trend, so global warming appears to be occurring. 13. a. No solution provided. b. The 84 sequences yield these results: sequences have runs, 7 sequences have 3 runs, 0 sequences have 4 runs, 5 sequences have 5 runs, 0 sequences have 6 runs, and 10 sequences have 7 runs. c. With P( runs) = /84, P(3 runs) = 7/84, P(4 runs) = 0/84, P(5 runs) = 5/84, P(6 runs) = 0/84, and P(7 runs) = 10/84, each of the G values of 3, 4, 5, 6, 7 can easily occur by chance, whereas G = is unlikely because P( runs) is less than The lower critical value of G is therefore, and there is no upper critical value that can be equaled or exceeded. d. Critical value of G = agrees with Table A-10. The table lists 8 as the upper critical value, but it is impossible to get 8 runs using the given elements. Chapter Quick Quiz 1. Distribution-free test. 57 has rank =, 58 has rank 4, and 61 has rank The efficiency rating of 0.91 indicates that with all other factors being the same, rank correlation requires 100 pairs of sample observations to achieve the same results as 91 pairs of observations with the parametric test for linear correlation, assuming that the stricter requirements for using linear correlation are met. 4. The Wilcoxon rank-sum test does not require that the samples be from populations having a normal distribution or any other specific distribution. 5. G = 4 6. Because there are only two runs, all of the values below the mean occur at the beginning and all of the values above the mean occur at the end, or vice versa. This indicates an upward (or downward) trend. 7. Sign test and Wilcoxon signed-ranks test 8. Rank correlation Copyright 014 Pearson Education, Inc.

233 Chapter 13: Nonparametric Statistics Kruskal-Wallis test 10. Test claims involving matched pairs of data; test claims involving nominal data; test claims about the median of a single population Review Exercises 106 ( ) 1. The test statistic of z = = 1.65 is not less than or equal to the critical value of 106 z = Fail to reject the null hypothesis of p = 0.5. There is not sufficient evidence to warrant rejection of the claim that in each World Series, the American League team has a 0.5 probability of winning.. There are 6 positive signs, 0 negative signs, 0 ties, and n = 7. The test statistic of x = 0 is less than or equal to the critical value of 0. There is sufficient evidence to reject the claim of no difference. It appears that there is a difference in cost between flights scheduled 1 day in advance and those scheduled 30 days in advance. Because all of the flights scheduled 30 days in advance cost less than those scheduled 1 day in advance, it is wise to schedule flights 30 days in advance. 3. The test statistic of T = 0 is less than or equal to the critical value of 0. There is sufficient evidence to reject the claim that differences between fares for flights scheduled 1 day in advance and those scheduled 30 days in advance have a median equal to 0. Because all of the flights scheduled 1 day in advance have higher fares than those scheduled 30 days in advance, it appears that it is generally less expensive to schedule flights 30 days in advance instead of 1 day in advance. Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median One Day 30 Days The sample mean is 54.8 years. n 1 = 19, n = 19, and the number of runs is G = 18. The critical values are 13 and 7 (From Table A-10.) Fail to reject the null hypothesis of randomness. There is not sufficient evidence to warrant rejection of the claim that the sequence of ages is random relative to values above and below the mean. The results do not suggest that there is an upward trend or a downward trend. 5. r s = Critical values: r s =± (From Table A-9.) Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to support the claim that there is a correlation between the student ranks and the magazine ranks. When ranking colleges, students and the magazine do not appear to agree. Pearson correlation of Student Ranks and USNEWS Ranks = ( ) 6. The test statistic of z = = 0.88 is not in the critical region bounded by z =± There 3 is not sufficient evidence to warrant rejection of the claim that the population of differences has a median of zero. Based on the sample data, it appears that the predictions are reasonably accurate, because there does not appear to be a difference between the actual high temperatures and the predicted high temperatures. 3( 3 + 1) Convert T = 30.5 to the test statistic z = 4 = ( 3+ 1)( 3+ 1) 4 Critical values: z =± (Tech: P-value = ) There is not sufficient evidence to warrant rejection of the claim that the population of differences has a median of zero. Based on the sample data, it appears that the predictions are reasonably accurate, because there does not appear to be a difference between the actual high temperatures and the predicted high temperatures. Copyright 014 Pearson Education, Inc.

234 3 Chapter 13: Nonparametric Statistics 7. (continued) Test of median = versus median not = N for Wilcoxon Estimated N Test Statistic P Median TEMP Test statistic: H = 3( 7 1) ( 7 1) = Critical value: χ = (Tech: P-value = ) Reject the null hypothesis of equal medians. Interbreeding of cultures is suggested by the data. ( + + ) ( ) R 1 = 60, R = 111, μ R = = 85.5, σ R = = , test statistic: z = =.5. Critical values: z =± (Tech: P-value = ) Reject the null hypothesis that the populations have the same median. Skull breadths from 4000 b.c. appear to have a different median than those from a.d r s = Critical values: r s =± Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to support the claim that there is a correlation between weights of plastic and weights of food. Pearson correlation of Rank Plastic and Rank Rood = Cumulative Review Exercises 1. x = 14.6 hours, median = 15.0 hours, s = 1.7 hours, s =.9 hour, range = 6.0 hours. a. Convenience sample b. Because the sample is from one class of statistics students, it is not likely to be representative of the population of all fulltime college students. c. Discrete d. Ratio 3. The data meet the requirement of being from a normal distribution. H 0: μ = 14 hours. H 1 : μ > 14 hours Test statistic: t = = Critical value: t = 1.79 (assuming a 0.05 significance level). P / 0 value > 0.05 (Tech: 0.08). Fail to reject H 0. There is not sufficient evidence to support the claim that the mean is greater than 14 hours Copyright 014 Pearson Education, Inc.

235 Chapter 13: Nonparametric Statistics The test statistic of x = 5 is not less than or equal to the critical value of 4 (from Table A-7.) There is not sufficient evidence to support the claim that the sample is from a population with a median greater than 14 hours hours < μ < 15.3 hours. We have 95% confidence that the limits of 13.8 hours and 15.3 hours contain the true value of the population mean < μ < r = Critical values: r =± P-value = There is not sufficient evidence to support the claim of a linear correlation between price and quality score. It appears that you don t get better quality by paying more. Pearson correlation of Price and Quality = 0.05 P-Value = r s = Critical values: r s =± Fail to reject the null hypothesis of ρ s = 0. There is not sufficient evidence to support the claim that there is a correlation between price and rank. Pearson correlation of Rank and Price Rank = P-Value = < p < Because the value of 0.5 is not included in the range of values in the confidence interval, the result suggests that the percentage of all such telephones that are not functioning is different from 5% ( )( ) ( )( ) < p < [ z ] ( ) / ( 0.5) 9. n = α = = 169 (Tech: 1691) E There must be an error, because the rates of 13.7% and 10.6% are not possible with samples of size 100. Copyright 014 Pearson Education, Inc.

236

237 Chapter 14: Statistical Process Control 35 Chapter 14: Statistical Process Control Section No. If we know that the manufacture of quarters is within statistical control, we know that the three out-ofcontrol criteria are not violated, but we know nothing about whether the specification of g is being met. It is possible to be within statistical control by manufacturing quarters with weights that are very far from the desired target of g.. An x control chart is a plot of sample means and it includes a centerline as well as a line representing an upper control limit and a line representing a lower control limit. An R control chart is a plot of sample ranges and it also includes a centerline as well as a line representing an upper control limit and a line representing a lower control limit. x denotes the mean of all of the 0 sample means, R denotes the mean of the 0 ranges, UCL denotes the value used to locate the upper control limit in a control chart, and LCL denotes the value used to locate the lower control limit in a control chart. 3. To use an x chart without an R chart is to ignore variation, and amounts of variation that are too large will result in too many defective goods or services, even though the mean might appear to be acceptable. To use an R chart without an x chart is to ignore the central tendency, so the goods or services might not vary much, but the process could be drifting so that daily process data do not vary much, but the daily means are steadily increasing or decreasing. 4. The range and mean are both out of statistical control. The control charts show that the elevations are decreasing substantially, so Lake Mead is becoming shallower. The variation is becoming more stable, so the elevations are not varying as much as they did earlier. 5. x = lb, R = lb, n = 7. For the R chart: LCL = DR 3 = = 4.18 lb and UCL = DR 4 = = lb. For the x chart: LCL = x A R = = lb and UCL = x+ A R = = lb. 6. The run chart does not reveal any patterns suggesting problems that need correcting. Copyright 014 Pearson Education, Inc.

238 36 Chapter 14: Statistical Process Control 7. The R chart does not violate any of the out-of-control criteria, so the variation of the process appears to be within statistical control 8. The x chart does not violate any of the out-of-control criteria, so the mean of the process appears to be within statistical control. 9. x = C, R = C, n = 10. For the R chart: LCL = DR 3 = = 0.09 C and UCL = DR 4 = = C. For the x chart: LCL = x A R = = 14.1 C and UCL = x+ A R = = C. 10. The R chart does not violate any of the out-of-control criteria, so the variation of the process appears to be within statistical control. Copyright 014 Pearson Education, Inc.

239 Chapter 14: Statistical Process Control Because there is a pattern of an upward trend and there are points lying beyond the control limits, the x chart shows that the process is out of statistical control. 1. The run chart reveals a very clear pattern of an upward trend, so this graph provides evidence in favor of the theory that we are undergoing global warming. 13. s = g, n = 5. The R chart and the s chart are very similar in their pattern. LCL = Bs 3 = = 0 g and UCL = Bs 4 = = g. Copyright 014 Pearson Education, Inc.

240 38 Chapter 14: Statistical Process Control 14. x = g, s = g, n = 5. The two x charts are very similar. LCL = x A s = = g and 3 UCL = x+ A s = = g. 3 Section No, the process does not appear to be within statistical control. There is a downward trend, there are at least 8 consecutive points all lying above the centerline, and there are at least 8 consecutive points all lying below the centerline. Because the proportions of defects are decreasing, the manufacturing process is not deteriorating; it is improving.. p is the pooled estimate of the proportion of defective items. It is obtained by finding the total number of defects in all samples combined and dividing that total by the total number of items sampled. 3. LCL denotes the lower control limit. Because the value of is negative and the actual proportion of defects cannot be less than 0, we should replace that value by No. It is very possible that the proportions of defects in repeated samplings behave in a way that makes the process appear to be within statistical control, but the actual proportions of defects could be very high (such as 90%) so that almost all of the dimes fail to meet the manufacturing specifications. Upper and lower control limits of a control chart for the proportion of defects are based on the actual behavior of the process, not the desired behavior. 5. The process appears to be within statistical control. Copyright 014 Pearson Education, Inc.

241 Chapter 14: Statistical Process Control 39 pq ( 0.8)( 0.718) 6. p = 0.8, LCL = p 3 = = n 100 pq ( 0.8)( 0.718) LCL = p + 3 = = n 100 The process appears to be within statistical control. The control chart is the same as the control chart from Exercise 5, except for the values used to locate the centerline and control limits. In this exercise, the proportions of defects are very high. Even though the process is within statistical control, this manufacturing process is yielding far too many defective dimes, so corrective action should be taken to lower the defect rate. pq ( )( ) 7. p = , LCL = p 3 = = n 100,000 pq ( )( ) UCL = p + 3 = = n 100, 000 Because there appears to be a pattern of a downward shift and there are at least 8 consecutive points all lying above the centerline, the process is not within statistical control. pq ( )( ) 8. p = , LCL = p 3 = = n 100,000 pq ( )( ) UCL = p + 3 = = n 100, 000 Copyright 014 Pearson Education, Inc.

242 40 Chapter 14: Statistical Process Control 8. (continued) There is a pattern of a downward trend and there are at least 8 consecutive points all below the centerline, so the process is not within statistical control. Because there is a downward trend in the rate of violent crimes, we are becoming safer. pq ( )( ) 9. p = , LCL = p 3 = = n 1000 pq ( )( ) UCL = p + 3 = = n 1000 The process is out of control because there are points lying beyond the control limits and there are at least 8 points all lying below the centerline. The percentage of voters started to increase in recent years, and it should be much higher than any of the rates shown. pq ( )( ) 10. p = , LCL = p 3 = = n 1000 pq ( )( ) UCL = p + 3 = = n 1000 The process appears to be within statistical control. Ideally, there would be an upward trend due to increasing rates of college enrollments among high school graduates. Copyright 014 Pearson Education, Inc.

243 Chapter 14: Statistical Process Control 41 pq ( 0.068)( 0.973) 11. p = 0.068, LCL = p 3 = = n 500 pq ( 0.068)( 0.973) UCL = p + 3 = = n 500 There is a pattern of a downward trend and there are at least 8 consecutive points all below the centerline, so the process does not appear to be within statistical control. Because the rate of defects is decreasing, the process is actually improving and we should investigate the cause of that improvement so that it can be continued. pq ( )( ) 1. p = , LCL = p 3 = = n 00 pq ( )( ) UCL = p + 3 = = n 00 There appears to be a pattern of increasing variation, so the process is not within statistical control. The cause of the increasing variation should be identified and corrected. Copyright 014 Pearson Education, Inc.

244 4 Chapter 14: Statistical Process Control 13. np = 10, = 1.6, LCL = np 3 npq = ,000( )( ) = UCL = np + 3 npq = ,000( )( ) = 3.4 Except for the vertical scale, the control chart is identical to the one obtained for Example 1. Chapter Quick Quiz 1. Process data are data arranged according to some time sequence. They are measurements of a characteristic of goods or services that result from some combination of equipment, people, materials, methods, and conditions.. Random variation is due to chance, but assignable variation results from causes that can be identified, such as defective machinery or untrained employees. 3. There is a pattern, trend, or cycle that is obviously not random. There is a point lying outside of the region between the upper and lower control limits. There are at least 8 consecutive points all above or all below the centerline. 4. An R chart uses ranges to monitor variation, but an x chart uses sample means to monitor the center (mean) of a process. 5. No. The R chart has at least 8 consecutive points all lying below the centerline and there are points lying beyond the upper control limit. Also, there is a pattern showing that the ranges have jumped in value for the most recent samples. 6. R = 5.8 ft. In general, a value of R is found by first finding the range for the values within each individual subgroup; the mean of those ranges is the value of R. 7. No. The x chart has a point lying below the lower control limit. 8. x = 3.95 ft. In general, a value of x is found by first finding the mean of the values within each individual subgroup; the mean of those subgroup means is the value of x. 9. A p chart is a control chart of the proportions of some attribute, such as defective items. 10. Because there is a downward trend, the process is not within statistical control, but the rate of defects is decreasing, so we should investigate and identify the cause of that trend so that it can be continued. Review Exercises 1. x = kwh, R = kwh, n = 6. For the R chart: LCL = DR 3 = = 0 kwh and UCL = DR 4 = = kwh. For the x chart: LCL = x A R= = kwh and UCL = x A R= = kwh. Copyright 014 Pearson Education, Inc.

245 Chapter 14: Statistical Process Control 43. R = kwh, n = 6. The process variation is within statistical control. LCL = DR 3 = = kwh and UCL = DR 4 = = kwh. 3. x = kwh, R = kwh, n = 6. The process mean is within statistical control. LCL = x A R= = kwh and UCL = x+ A R= = kwh. 4. There does not appear to be a pattern suggesting that the process is not within statistical control. There is 1 point that appears to be exceptionally low. (The author s power company made an error in recording and reporting the energy consumption for that time period.) Copyright 014 Pearson Education, Inc.

246 44 Chapter 14: Statistical Process Control pq ( 0.056)( 0.944) 5. p = 0.056, LCL = p 3 = = ; use 0 n 100 pq ( 0.056)( 0.944) LCL = p + 3 = = n 100 Because there are 8 consecutive points above the centerline and there is an upward trend, the process does not appear to be within statistical control. Cumulative Review Exercises < p < Because all of the values in the confidence interval estimate of the population proportion are greater than 0.5, it does appear that the majority of adults believe that it is not appropriate to wear shorts at work. Sample X N Sample p 95% CI ( , ). a = 0.45 b. ( 0.55) 5 = c. 1 ( 0.55) 5 = r = Critical values: r =± P-value = There is sufficient evidence to support the claim that there is a linear correlation between yields from regular seed and kiln-dried seed. The purpose of the experiment was to determine whether there is a difference in yield from regular seed and kiln-dried seed (or whether kiln-dried seed produces a higher yield), but results from a test of correlation do not provide us with the information we need to address that issue. Pearson correlation of Regular and Kiln-dried = 0.80 P-Value = H 0 : μ d = 0. H 1 : μ d < 0. Test statistic: t = Critical value: t = 1.81 (assuming a 0.05 significance level). P-value < 0.05 (Tech: ). Fail to reject H 0. There is not sufficient evidence to support the claim that kiln-dried seed is better in the sense that it produces a higher mean yield than regular seed. (The sign test can be used to arrive at the same conclusion; the test statistic is x = 3 and the critical value is 1. Also, the Wilcoxon signed-ranks test can be used; the test statistic is T = 13.5 and the critical value is 8.) Minitab Paired T for Regular - Kiln-dried 95% upper bound for mean difference: 0.00 T-Test of mean difference = 0 (vs < 0): T-Value = P-Value = Copyright 014 Pearson Education, Inc.

247 Chapter 14: Statistical Process Control For the sample of yields from regular seed, x = 0.0 and for the sample of yields from kiln-dried seed, x = 1.0, so there does not appear to be a significant difference. For the sample of yields from regular seed, s = 3.4 and for the sample of yields from kiln-dried seed, s = 4.1, so there does not appear to be a significant difference. pq ( 0.1)( 0.878) 6. p = 0.1, LCL = p 3 = = ; use 0 n 50 pq ( 0.1)( 0.878) LCL = p + 3 = = n There appears to be a pattern of an upward trend, so the process is not within statistical control. 8. a z = = 0.7; P( z> 0.7) = 3.58%. With 3.58% of males with head breadths greater than.5 17 cm, too many males would be excluded. b. 5th percentile: x = μ + z σ = = cm 95th percentile: x = μ + z σ = = 19.3 cm 9. With a voluntary response sample, the subjects decide themselves whether to be included. With a simple random sample, subjects are selected through some random process in such a way that all samples of the same size have the same chance of being selected. A simple random sample is generally better for use with statistical methods. 10. Sampling method (part c) Copyright 014 Pearson Education, Inc.

MATH 103/GRACEY PRACTICE EXAM/CHAPTERS 2-3. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MATH 103/GRACEY PRACTICE EXAM/CHAPTERS 2-3. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. MATH 3/GRACEY PRACTICE EXAM/CHAPTERS 2-3 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) The frequency distribution

More information

Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

More information

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Final Exam Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) A researcher for an airline interviews all of the passengers on five randomly

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Exam Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) The government of a town needs to determine if the city's residents will support the

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

6. Decide which method of data collection you would use to collect data for the study (observational study, experiment, simulation, or survey):

6. Decide which method of data collection you would use to collect data for the study (observational study, experiment, simulation, or survey): MATH 1040 REVIEW (EXAM I) Chapter 1 1. For the studies described, identify the population, sample, population parameters, and sample statistics: a) The Gallup Organization conducted a poll of 1003 Americans

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Chapter 1 Review 1. As part of survey of college students a researcher is interested in the variable class standing. She records a 1 if the student is a freshman, a 2 if the student

More information

Common Tools for Displaying and Communicating Data for Process Improvement

Common Tools for Displaying and Communicating Data for Process Improvement Common Tools for Displaying and Communicating Data for Process Improvement Packet includes: Tool Use Page # Box and Whisker Plot Check Sheet Control Chart Histogram Pareto Diagram Run Chart Scatter Plot

More information

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

How To Write A Data Analysis

How To Write A Data Analysis Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

More information

STAB22 section 1.1. total = 88(200/100) + 85(200/100) + 77(300/100) + 90(200/100) + 80(100/100) = 176 + 170 + 231 + 180 + 80 = 837,

STAB22 section 1.1. total = 88(200/100) + 85(200/100) + 77(300/100) + 90(200/100) + 80(100/100) = 176 + 170 + 231 + 180 + 80 = 837, STAB22 section 1.1 1.1 Find the student with ID 104, who is in row 5. For this student, Exam1 is 95, Exam2 is 98, and Final is 96, reading along the row. 1.2 This one involves a careful reading of the

More information

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

More information

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: What do the data look like? Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

More information

Variables. Exploratory Data Analysis

Variables. Exploratory Data Analysis Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Statistics Chapter 2

Statistics Chapter 2 Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

More information

Section 1.1 Exercises (Solutions)

Section 1.1 Exercises (Solutions) Section 1.1 Exercises (Solutions) HW: 1.14, 1.16, 1.19, 1.21, 1.24, 1.25*, 1.31*, 1.33, 1.34, 1.35, 1.38*, 1.39, 1.41* 1.14 Employee application data. The personnel department keeps records on all employees

More information

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

More information

Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics. Measurement. Scales of Measurement 7/18/2012 Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Shape of Data Distributions

Shape of Data Distributions Lesson 13 Main Idea Describe a data distribution by its center, spread, and overall shape. Relate the choice of center and spread to the shape of the distribution. New Vocabulary distribution symmetric

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous

Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous Chapter 2 Overview Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classify as categorical or qualitative data. 1) A survey of autos parked in

More information

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. C) (a) 2. (b) 1.5. (c) 0.5-2.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. C) (a) 2. (b) 1.5. (c) 0.5-2. Stats: Test 1 Review Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the given frequency distribution to find the (a) class width. (b) class

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

Interpreting Data in Normal Distributions

Interpreting Data in Normal Distributions Interpreting Data in Normal Distributions This curve is kind of a big deal. It shows the distribution of a set of test scores, the results of rolling a die a million times, the heights of people on Earth,

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

More information

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014 UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014 STAB22H3 Statistics I Duration: 1 hour and 45 minutes Last Name: First Name: Student number: Aids

More information

Describing, Exploring, and Comparing Data

Describing, Exploring, and Comparing Data 24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter

More information

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data A Few Sources for Data Examples Used Introduction to Environmental Statistics Professor Jessica Utts University of California, Irvine [email protected] 1. Statistical Methods in Water Resources by D.R. Helsel

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Statistics 151 Practice Midterm 1 Mike Kowalski

Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and

More information

2 Describing, Exploring, and

2 Describing, Exploring, and 2 Describing, Exploring, and Comparing Data This chapter introduces the graphical plotting and summary statistics capabilities of the TI- 83 Plus. First row keys like \ R (67$73/276 are used to obtain

More information

Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures

Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures Introductory Statistics Lectures Visualizing Data Descriptive Statistics I Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the

More information

List of Examples. Examples 319

List of Examples. Examples 319 Examples 319 List of Examples DiMaggio and Mantle. 6 Weed seeds. 6, 23, 37, 38 Vole reproduction. 7, 24, 37 Wooly bear caterpillar cocoons. 7 Homophone confusion and Alzheimer s disease. 8 Gear tooth strength.

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability

DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from tables* Compares

More information

Determine whether the data are qualitative or quantitative. 8) the colors of automobiles on a used car lot Answer: qualitative

Determine whether the data are qualitative or quantitative. 8) the colors of automobiles on a used car lot Answer: qualitative Name Score: Math 227 Review Exam 1 Chapter 2& Fall 2011 ********************************************************************************************************************** SHORT ANSWER. Show work on

More information

Chapter 2: Frequency Distributions and Graphs

Chapter 2: Frequency Distributions and Graphs Chapter 2: Frequency Distributions and Graphs Learning Objectives Upon completion of Chapter 2, you will be able to: Organize the data into a table or chart (called a frequency distribution) Construct

More information

MATH 103/GRACEY PRACTICE QUIZ/CHAPTER 1. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MATH 103/GRACEY PRACTICE QUIZ/CHAPTER 1. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. MATH 103/GRACEY PRACTICE QUIZ/CHAPTER 1 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use common sense to determine whether the given event

More information

The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math. P (x) = 5! = 1 2 3 4 5 = 120. The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

More information

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

Chapter 4. Probability Distributions

Chapter 4. Probability Distributions Chapter 4 Probability Distributions Lesson 4-1/4-2 Random Variable Probability Distributions This chapter will deal the construction of probability distribution. By combining the methods of descriptive

More information

IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

More information

DesCartes (Combined) Subject: Mathematics Goal: Data Analysis, Statistics, and Probability

DesCartes (Combined) Subject: Mathematics Goal: Data Analysis, Statistics, and Probability DesCartes (Combined) Subject: Mathematics Goal: Data Analysis, Statistics, and Probability RIT Score Range: Below 171 Below 171 171-180 Data Analysis and Statistics Data Analysis and Statistics Solves

More information

2-7 Exploratory Data Analysis (EDA)

2-7 Exploratory Data Analysis (EDA) 102 C HAPTER 2 Describing, Exploring, and Comparing Data 2-7 Exploratory Data Analysis (EDA) This chapter presents the basic tools for describing, exploring, and comparing data, and the focus of this section

More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

Math and Science Bridge Program. Session 1 WHAT IS STATISTICS? 2/22/13. Research Paperwork. Agenda. Professional Development Website

Math and Science Bridge Program. Session 1 WHAT IS STATISTICS? 2/22/13. Research Paperwork. Agenda. Professional Development Website Math and Science Bridge Program Year 1: Statistics and Probability Dr. Tamara Pearson Assistant Professor of Mathematics Research Paperwork Informed Consent Pre-Survey After you complete the survey please

More information

DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1

DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1 DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1 OVERVIEW STATISTICS PANIK...THE THEORY AND METHODS OF COLLECTING, ORGANIZING, PRESENTING, ANALYZING, AND INTERPRETING DATA SETS SO AS TO DETERMINE THEIR ESSENTIAL

More information

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 [email protected] 1. Descriptive Statistics Statistics

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. STATISTICS/GRACEY PRACTICE TEST/EXAM 2 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Identify the given random variable as being discrete or continuous.

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

Probability Distributions

Probability Distributions Learning Objectives Probability Distributions Section 1: How Can We Summarize Possible Outcomes and Their Probabilities? 1. Random variable 2. Probability distributions for discrete random variables 3.

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Using SPSS, Chapter 2: Descriptive Statistics

Using SPSS, Chapter 2: Descriptive Statistics 1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

More information

Darton College Online Math Center Statistics. Chapter 2: Frequency Distributions and Graphs. Presenting frequency distributions as graphs

Darton College Online Math Center Statistics. Chapter 2: Frequency Distributions and Graphs. Presenting frequency distributions as graphs Chapter : Frequency Distributions and Graphs 1 Presenting frequency distributions as graphs In a statistical study, researchers gather data that describe the particular variable under study. To present

More information

IBM SPSS Statistics for Beginners for Windows

IBM SPSS Statistics for Beginners for Windows ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

More information

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph. MBA/MIB 5315 Sample Test Problems Page 1 of 1 1. An English survey of 3000 medical records showed that smokers are more inclined to get depressed than non-smokers. Does this imply that smoking causes depression?

More information

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

More information

AP * Statistics Review. Descriptive Statistics

AP * Statistics Review. Descriptive Statistics AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

More information

Data Analysis, Statistics, and Probability

Data Analysis, Statistics, and Probability Chapter 6 Data Analysis, Statistics, and Probability Content Strand Description Questions in this content strand assessed students skills in collecting, organizing, reading, representing, and interpreting

More information

How To: Analyse & Present Data

How To: Analyse & Present Data INTRODUCTION The aim of this How To guide is to provide advice on how to analyse your data and how to present it. If you require any help with your data analysis please discuss with your divisional Clinical

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

DesCartes (Combined) Subject: Mathematics 2-5 Goal: Data Analysis, Statistics, and Probability

DesCartes (Combined) Subject: Mathematics 2-5 Goal: Data Analysis, Statistics, and Probability DesCartes (Combined) Subject: Mathematics 2-5 Goal: Data Analysis, Statistics, and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Ch. 1 Introduction to Statistics 1.1 An Overview of Statistics 1 Distinguish Between a Population and a Sample Identify the population and the sample. survey of 1353 American households found that 18%

More information

Bar Graphs and Dot Plots

Bar Graphs and Dot Plots CONDENSED L E S S O N 1.1 Bar Graphs and Dot Plots In this lesson you will interpret and create a variety of graphs find some summary values for a data set draw conclusions about a data set based on graphs

More information

A Correlation of. to the. South Carolina Data Analysis and Probability Standards

A Correlation of. to the. South Carolina Data Analysis and Probability Standards A Correlation of to the South Carolina Data Analysis and Probability Standards INTRODUCTION This document demonstrates how Stats in Your World 2012 meets the indicators of the South Carolina Academic Standards

More information

Statistics and Probability

Statistics and Probability Statistics and Probability TABLE OF CONTENTS 1 Posing Questions and Gathering Data. 2 2 Representing Data. 7 3 Interpreting and Evaluating Data 13 4 Exploring Probability..17 5 Games of Chance 20 6 Ideas

More information

Measurement & Data Analysis. On the importance of math & measurement. Steps Involved in Doing Scientific Research. Measurement

Measurement & Data Analysis. On the importance of math & measurement. Steps Involved in Doing Scientific Research. Measurement Measurement & Data Analysis Overview of Measurement. Variability & Measurement Error.. Descriptive vs. Inferential Statistics. Descriptive Statistics. Distributions. Standardized Scores. Graphing Data.

More information

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple. Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers 1.3 Measuring Center & Spread, The Five Number Summary & Boxplots Describing Quantitative Data with Numbers 1.3 I can n Calculate and interpret measures of center (mean, median) in context. n Calculate

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Ch. 4 Discrete Probability Distributions 4.1 Probability Distributions 1 Decide if a Random Variable is Discrete or Continuous 1) State whether the variable is discrete or continuous. The number of cups

More information

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Organizing Your Approach to a Data Analysis

Organizing Your Approach to a Data Analysis Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize

More information

Mind on Statistics. Chapter 8

Mind on Statistics. Chapter 8 Mind on Statistics Chapter 8 Sections 8.1-8.2 Questions 1 to 4: For each situation, decide if the random variable described is a discrete random variable or a continuous random variable. 1. Random variable

More information

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams? 1. Phone surveys are sometimes used to rate TV shows. Such a survey records several variables listed below. Which ones of them are categorical and which are quantitative? - the number of people watching

More information

Name: Date: Use the following to answer questions 2-3:

Name: Date: Use the following to answer questions 2-3: Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

Midterm Review Problems

Midterm Review Problems Midterm Review Problems October 19, 2013 1. Consider the following research title: Cooperation among nursery school children under two types of instruction. In this study, what is the independent variable?

More information

WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information