AP Statistics Review Chapters 4-5 Practice Problems
Name
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Would you expect the distribution of this variable to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why.
1) Ages of patients who had their tonsils removed at a hospital over the course of a year.
A) The distribution would likely be unimodal and symmetric. The procedure is much more common among young people, so most patients would be younger, perhaps 8-12 years old. The distribution would be symmetric, since it is possible to have this procedure done earlier or later than the average age.
B) The distribution would likely be bimodal and symmetric. The procedure is much more common among young people, so most patients would be younger, perhaps 8-12 years old. Eight-year-olds would be at one mode, and twelve-year-olds would be at the other mode. The distribution would be symmetric, since it is possible to have this procedure done earlier or later than the average age.
C) The distribution would likely be unimodal and skewed left. The procedure is much more common among young people, so most patients would be younger, perhaps 8-12 years old. The distribution would be skewed left, since it is possible to have a greater variety of ages among younger people.
D) The distribution would likely be unimodal and skewed right. The procedure is much more common among young people, so most patients would be younger, perhaps 8-12 years old. The distribution would be skewed right, since it is possible to have a greater variety of ages among older people, while there is a natural left endpoint to the distribution at zero years of age.
E) The distribution would likely be bimodal and skewed right. The procedure is much more common among young people, so most patients would be younger, perhaps 8-12 years old. Eight-year-olds would be at one mode, and twelve-year-olds would be at the other mode.
The distribution would be skewed right, since it is possible to have a greater variety of ages among older people, while there is a natural left endpoint to the distribution at zero years of age.
Describe the distribution (shape, center, spread, unusual features).
2) A university instructor created a website for her Organic Chemistry course. The students in her class were encouraged to use the website as an additional resource for the course. At the end of the semester, the instructor asked each student how many times he or she visited the website and recorded the counts. Based on the histogram below, describe the distribution of website use.
A) The distribution of the number of visits to the course website by each student for the semester is skewed to the left, with the number of visits ranging from 1 to 16 visits. The distribution is centered at about 14 visits, with many students visiting 15 times. There is an outlier in the distribution, two students who visited the site only once. The next highest number of visits was 8.
B) The distribution of the number of visits to the course website by each student for the semester is skewed to the left, with the number of visits ranging from 1 to 15 visits. The distribution is centered at about 12 visits, with many students visiting 15 times. There is an outlier in the distribution, two students who visited the site only once. The next highest number of visits was 8.
C) The distribution of the number of visits to the course website by each student for the semester is skewed to the right, with the number of visits ranging from 1 to 15 visits. The distribution is centered at about 14 visits, with many students visiting 15 times. There is an outlier in the distribution, two students who visited the site only once. The next highest number of visits was 8.
D) The distribution of the number of visits to the course website by each student for the semester is skewed to the left, with the number of visits ranging from 1 to 15 visits. The distribution is centered at about 14 visits, with many students visiting 15 times. There is an outlier in the distribution, two students who visited the site only once. The next highest number of visits was 8.
E) The distribution of the number of visits to the course website by each student for the semester is skewed to the left, with the number of visits ranging from 1 to 15 visits. The distribution is centered at about 14 visits, with many students visiting 15 times.
3) A business owner recorded her annual profits for the first 12 years since opening her business in 1993.
The stem-and-leaf display below shows the annual profits in thousands of dollars. Use both the stemplot and timeplot to describe the distribution.
Annual Profit Totals
8 | 0 1 3 3
7 | 9
6 |
5 | 0 2 3
4 | 8
3 |
2 | 2 4
1 | 7
Key: 7 | 9 = $79,000 profit
A) The distribution of the business owner's profits is skewed to the right, and is multimodal, with gaps in between. One mode is at around $80,000, another at around $50,000, and a third mode at around $20,000. The timeplot shows that the profits grew from 1993 to 1994, and were relatively steady from 1994 to 1998. After 1998, the profits declined significantly compared with those between 1993 and 1998.
B) The distribution of the business owner's profits is skewed to the left, and is multimodal, with gaps in between. One mode is at around $80,000, another at around $50,000, and a third mode at around $20,000. The timeplot shows that the profits grew from 1993 to 1994, and were relatively steady from 1994 to 1998. After 1998, the profits declined significantly compared with those between 1993 and 1998.
C) The distribution of the business owner's profits is skewed to the left, and is unimodal, with gaps in between. The center is at around $50,000. The timeplot shows that the profits grew from 1993 to 1994, and were relatively steady from 1994 to 1998. After 1998, the profits declined significantly compared with those between 1993 and 1998.
D) The distribution of the business owner's profits is skewed to the left, and is multimodal, with gaps in between. One mode is at around $80,000, another at around $50,000, and a third mode at around $20,000. The timeplot shows that the profits grew from 1993 to 1994, and were relatively steady from 1994 to 2001. After 2001, the profits declined significantly compared with those between 1994 and 2001.
E) The distribution of the business owner's profits is skewed to the left, and is unimodal, with gaps in between. The center is at around $50,000. The timeplot shows that the profits grew from 1993 to 1994, and were relatively steady from 1994 to 2001. After 2001, the profits declined significantly compared with those between 1994 and 2001.
4) The histogram displays the body fat percentages of 65 students taking a college health course. In addition to describing the distribution, give a reason to account for the shape of this distribution.
A) The distribution of body fat percentages is bimodal, with a cluster of body fat percentages around 12% and another cluster of body fat percentages around 28%. The upper cluster shows a bit of a skew to the right. Most students in the lower cluster have body fat percentages between 12% and 18%, and most students in the upper cluster have body fat percentages between 22% and 28%. Men and women have different body fat percentages: the lower cluster would likely represent male students, and the upper cluster would likely represent female students.
B) The distribution of body fat percentages is bimodal, with a cluster of body fat percentages around 16% and another cluster of body fat percentages around 26%. The upper cluster shows a bit of a skew to the right. Most students in the lower cluster have body fat percentages between 12% and 18%, and most students in the upper cluster have body fat percentages between 22% and 28%. Men and women have different body fat percentages: the lower cluster would likely represent male students, and the upper cluster would likely represent female students.
C) The distribution of body fat percentages is unimodal, with a bit of a skew to the right. The body fat percentages are centered around 20%, with a range of 10% to 35%. Most students have body fat percentages between 12% and 28%. Men and women have different body fat percentages, but the average of body fat percentages for men and women would be around 20%.
D) The distribution of body fat percentages is unimodal, with a bit of a skew to the right. The body fat percentages are centered around 24%, with a range of 10% to 34%. Most students have body fat percentages between 12% and 28%. Men and women have different body fat percentages, but the average of body fat percentages for men and women would be around 24%.
E) The distribution of body fat percentages is bimodal, with a cluster of body fat percentages around 16% and another cluster of body fat percentages around 26%. The upper cluster shows a bit of a skew to the right. Most students in the lower cluster have body fat percentages between 16% and 20%, and most students in the upper cluster have body fat percentages between 22% and 26%. Men and women have different body fat percentages: the lower cluster would likely represent male students, and the upper cluster would likely represent female students.
Compare the distributions (shape, center, spread, unusual features).
5) The back-to-back dotplot shows the number of fatalities per year caused by tornadoes in a certain state for two periods: 1950-1974 and 1975-1999. In addition to comparing these distributions, state a reason explaining any differences.
A) The distribution of the number of fatalities per year for the period 1950-1974 is unimodal and approximately symmetric. The center of the distribution is about 2 fatalities per year. The number of fatalities per year ranges from 0 to 5 deaths. For the period 1975-1999, the distribution of the number of fatalities per year is also unimodal, but skewed to the left. A typical number of fatalities for this distribution is 0 fatalities, with a range of 0 to 5 deaths. Before 1975, there were more fatalities as a result of tornadoes. Higher construction standards, better warning systems, or medical advancements could all account for this difference.
B) The distribution of the number of fatalities per year for the period 1950-1974 is unimodal and skewed to the right. The center of the distribution is about 3 fatalities per year. The number of fatalities per year ranges from 0 to 5 deaths. For the period 1975-1999, the distribution of the number of fatalities per year is also unimodal and skewed to the right. A typical number of fatalities for this distribution is 0 fatalities, with a range of 0 to 5 deaths.
C) The distribution of the number of fatalities per year for the period 1950-1974 is unimodal and approximately symmetric. The center of the distribution is about 2 fatalities per year. The number of fatalities per year ranges from 0 to 5 deaths. For the period 1975-1999, the distribution of the number of fatalities per year is also unimodal, but skewed to the left. A typical number of fatalities for this distribution is 0 fatalities, with a range of 0 to 5 deaths.
D) The distribution of the number of fatalities per year for the period 1950-1974 is unimodal and approximately symmetric. The center of the distribution is about 2 fatalities per year. The number of fatalities per year ranges from 0 to 5 deaths. For the period 1975-1999, the distribution of the number of fatalities per year is also unimodal, but skewed to the right. A typical number of fatalities for this distribution is 0 fatalities, with a range of 0 to 5 deaths. Before 1975, there were more fatalities as a result of tornadoes. Higher construction standards, better warning systems, or medical advancements could all account for this difference.
E) The distribution of the number of fatalities per year for the period 1950-1974 is unimodal and skewed to the right. The center of the distribution is about 3 fatalities per year. The number of fatalities per year ranges from 0 to 5 deaths. For the period 1975-1999, the distribution of the number of fatalities per year is also unimodal and skewed to the right.
A typical number of fatalities for this distribution is 0 fatalities, with a range of 0 to 5 deaths. Before 1975, there were more fatalities as a result of tornadoes. Higher construction standards, better warning systems, or medical advancements could all account for this difference.
Provide an appropriate response.
6) Here is a histogram of the assets (in millions of dollars) of 71 companies. What aspect of this distribution makes it difficult to summarize, or to discuss, the center and spread? What could be done with these data to make it easier to discuss the distribution?
A) The distribution of assets of the 71 companies is heavily skewed to the right. The vast majority of the companies have assets represented in the first bar of the histogram, 0 to 4000 dollars. This makes the discussion of the distribution meaningless. Re-expressing these data using logs or squares might make the distribution nearly symmetric, and a meaningful discussion of center and spread might be possible.
B) The distribution of assets of the 71 companies is heavily skewed to the right. The vast majority of the companies have assets represented in the first bar of the histogram, 0 to 4 billion dollars. This makes the discussion of the distribution meaningless. Re-expressing these data using logs or square roots might make the distribution nearly symmetric, and a meaningful discussion of center and spread might be possible.
C) The distribution of assets of the 71 companies is heavily skewed to the right. The vast majority of the companies have assets represented in the first bar of the histogram, 0 to 4000 dollars. This makes the discussion of the distribution meaningless. Re-expressing these data using logs or square roots might make the distribution nearly symmetric, and a meaningful discussion of center and spread might be possible.
D) The distribution of assets of the 71 companies is heavily skewed to the right. The vast majority of the companies have assets represented in the first bar of the histogram, 0 to 4 billion dollars. This makes the discussion of the distribution meaningless. Re-expressing these data using logs or squares might make the distribution nearly symmetric, and a meaningful discussion of center and spread might be possible.
E) The distribution of assets of the 71 companies is heavily skewed to the left. The vast majority of the companies have assets represented in the first bar of the histogram, 0 to 4 billion dollars. This makes the discussion of the distribution meaningless. Re-expressing these data using logs or square roots might make the distribution nearly symmetric, and a meaningful discussion of center and spread might be possible.
Find the mean of the data.
7) John liked to order the all-you-can-eat shrimp at his favorite restaurant. Here are the numbers of shrimp he ate during his last five visits to the restaurant.
12, 14, 20, 12, 16
A) 16 shrimp B) 14.8 shrimp C) 18.5 shrimp D) 14 shrimp E) 12 shrimp
Find the median of the data.
8) The annual incomes, in dollars, of several doctors are listed below.
130,000 119,000 163,000 213,000 244,000 144,000 140,000 754,000 201,000 166,000
A) $252,000 B) $166,000 C) $163,000 D) $164,500 E) $227,000
Solve the problem.
9) The test scores of 19 students are listed below. Find the interquartile range (IQR) by hand.
91 49 86 68 61 64 97 55 90 76 82 83 53 88 75 43 92 94 66
A) 28.5 B) 26.5 C) 25 D) 29.5 E) 29
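A by-hand IQR for question 9 can be checked with a short script. The sketch below assumes the convention in which, for an odd number of values, the median is counted in both halves when the quartiles are found. Other conventions (and many software defaults) can give slightly different quartiles. The function name `quartiles_inclusive` is illustrative and not part of any library.

```python
from statistics import median

def quartiles_inclusive(values):
    """Return (Q1, median, Q3), counting the median in both halves when n is odd.

    Assumed by-hand convention; other quartile rules exist.
    """
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2        # size of each half (median shared when n is odd)
    lower = data[:half]        # smallest `half` values
    upper = data[n - half:]    # largest `half` values
    return median(lower), median(data), median(upper)

scores = [91, 49, 86, 68, 61, 64, 97, 55, 90, 76,
          82, 83, 53, 88, 75, 43, 92, 94, 66]
q1, med, q3 = quartiles_inclusive(scores)
iqr = q3 - q1
print(q1, med, q3, iqr)
```

Under this convention the script agrees with the keyed answer. Excluding the median from both halves would give a different IQR, which is why the convention should be stated whenever quartiles are computed by hand.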
10) The test scores of 19 students are listed below. Find the range.
91 99 86 54 72 85 97 91 90 66 82 83 78 88 77 80 92 94 98
A) 33 B) (66, 99) C) 44 D) 45 E) (54, 99)
11) Here are the commutes (in miles) for a group of six employees. Find the standard deviation.
19.4 16.9 42.0 39.7 12.1 10.5
A) 40.9 B) 3294.7 C) 12.1 D) 4258.7 E) 13.89
12) Which set has the largest standard deviation?
Set 1: 1 6 6 6 11
Set 2: 1 5 6 7 11
A) Set 1, because 5 and 7 in set 1 are farther from 6 than 6 and 6 in set 2.
B) Set 1, because 6 and 6 in set 1 are farther from 6 than 5 and 7 in set 2.
C) Set 2, because 6 and 6 in set 2 are farther from 6 than 5 and 7 in set 1.
D) Set 2, because 5 and 7 in set 2 are farther from 6 than 6 and 6 in set 1.
E) Neither, because set 1 and set 2 have the same standard deviation.
13) Here are summary statistics for the last four digits of the social security numbers of 500 customers, corresponding to the following histogram.
Count 500
Mean 4950
StdDev 1531
Median 5009
IQR 2009
Q1 4028
Q3 6037
Is the mean or median a "better" summary of the center of the distribution?
A) Median, because of the outliers.
B) Neither, because these are not categorical data.
C) Median, because the IQR is smaller than the standard deviation.
D) Neither, because these are not quantitative data.
E) Mean, because the distribution is quite symmetric.
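The standard deviations in questions 11 and 12 can be verified with Python's standard `statistics` module. This is a minimal sketch that assumes the sample standard deviation (divisor n - 1) is intended, which is what `statistics.stdev` computes. The variable names are illustrative.

```python
from statistics import mean, stdev

# Question 11: commutes in miles.
commutes = [19.4, 16.9, 42.0, 39.7, 12.1, 10.5]
print(round(mean(commutes), 2), round(stdev(commutes), 2))

# Question 12: both sets have mean 6, so only the distances from 6 matter.
set_1 = [1, 6, 6, 6, 11]
set_2 = [1, 5, 6, 7, 11]
print(round(stdev(set_1), 3), round(stdev(set_2), 3))
```

Because the two sets share the same mean and the same extremes, the comparison comes down to the middle values: 5 and 7 each add a squared deviation of 1, while 6 and 6 add none.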
14) A small company employs a supervisor at $1200 a week, an inventory manager at $800 a week, 5 stock boys at $300 a week, and 3 drivers at $700 a week. Which measure of center best describes a typical wage at this company, the mean at $560 or the median at $500?
A) Mean, because there are no outliers.
B) Median, because of the outlier $1200.
C) Median, because of the outliers $800 and $1200.
D) Mean, because the distribution is symmetric.
E) Median, because the distribution is skewed to the left.
15) A small company employs a supervisor at $1200 a week, an inventory manager at $800 a week, 6 stock boys at $400 a week, and 4 drivers at $700 a week. Which measure of spread would best describe the payroll: the range, the IQR, or the standard deviation?
A) Range, because it would be least sensitive to the outlier at $1200.
B) IQR, because it would be least sensitive to the outliers at $800 and $1200.
C) IQR, because the distribution is symmetric.
D) Standard deviation, because it would be least sensitive to the outlier at $1200.
E) IQR, because it would be least sensitive to the outlier at $1200.
Find the five-number summary for the given data by hand.
16) A small company employs a supervisor at $1400 a week, an inventory manager at $800 a week, 5 stock boys at $400 a week, and 3 drivers at $600 a week.
A) 400, 400, 500, 800, 1400 dollars
B) 400, 400, 1000, 600, 1400 dollars
C) 2000, 400, 500, 1800, 1400 dollars
D) 400, 400, 500, 600, 1400 dollars
E) 1400, 400, 500, 600, 400 dollars
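The mean and median quoted in question 14 can be reproduced by expanding the payroll into one wage per employee. The sketch below uses only the standard library. The `payroll` list of (wage, head count) pairs is an illustrative representation of the data in the question.

```python
from statistics import mean, median

# Question 14 payroll as (weekly wage in dollars, number of employees).
payroll = [(1200, 1), (800, 1), (300, 5), (700, 3)]

# Expand to one entry per employee so each wage is weighted by head count.
wages = [wage for wage, count in payroll for _ in range(count)]

print(len(wages), mean(wages), median(wages))
```

With ten employees the median is the average of the fifth and sixth sorted wages, so it is unaffected by how large the top wage is, whereas the mean moves with it.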
Create a boxplot that represents the given data.
17) Here are the test scores of 32 students:
32 37 41 44 46 48 53 55 56 57 59 63 65 66 68 69 70 71 74 74 75 77 78 79 80 82 83 86 89 92 95 99
[Boxplots I, II, III, IV, and V]
A) I B) II C) III D) IV E) V
Identify potential outliers, if there are any, in the given data.
18) The normal annual precipitation (in inches) is given below for 21 different U.S. cities.
32.4 30.5 34.6 63.9 22.1 31.8 16.6 27.9 36.2 59.3 25.8 47.2 45.6 8.6 26.6 18.9 14.3 31.4 24.2 12.4 35.4
A) 59.3, 63.9 B) 8.6, 59.3, 63.9 C) 63.9 D) 25.8 E) None
Solve the problem.
19) The boxplots display case prices (in dollars) of white wines produced by three vineyards in the western United States. Describe these wine prices.
A) Vineyards A and B have different average prices, but a similar spread. Vineyard C has lower prices except for one low outlier, and a more consistent pricing as shown by the smaller IQR.
B) Vineyards A and B have about the same average price; the boxplots show similar medians and similar IQRs. Vineyard C has consistently higher prices except for one low outlier, and a more consistent pricing as shown by the larger IQR.
C) Vineyards A and B have about the same average price; the boxplots show similar medians and similar IQRs. Vineyard C has higher prices except for one low outlier, and a more consistent pricing as shown by the smaller IQR.
D) Vineyards A and B have about the same average price; the boxplots show similar medians and similar IQRs. Vineyard C has higher prices except for one low outlier, and a less consistent pricing as shown by the larger IQR.
E) Vineyards A and B have about the same average price; the boxplots show similar medians and similar IQRs. Vineyard C has higher prices except for one low outlier, and a more consistent pricing as shown by the smaller IQR. The three distributions are roughly symmetric.
Three statistics classes (50 students each) took the same test. Shown below are histograms of the scores for the classes. Use the histograms to answer the question.
20) Which class had the largest standard deviation?
A) Class 3, because the shape is symmetric.
B) Class 2, because the shape is skewed.
C) Class 3, because the shape has the highest number of students.
D) Class 1, because the shape is not perfectly symmetric.
E) None, because the classes had the same standard deviation.
21) Which class do you think performed better on the test?
A) Class 2, because it has the highest median and 50% of class 2 scored at or above the medians of 1 and 3.
B) Class 1, because it has the smallest median and 70% of class 1 scored at or above the medians of 2 and 3.
C) Class 2, because it has the highest median and 70% of class 2 scored at or above the medians of 1 and 3.
D) Class 3, because 74% of class 3 scored at or above the medians of 1 and 2.
E) Class 2, because it has different mean and median and 70% of class 2 scored at or above the medians of 1 and 3.
Solve the problem.
22) Here are summary statistics for the normal monthly precipitation (in inches) in August for 20 different U.S. cities.
Count  Mean  Median  StdDev  Min  Max  Q1   Q3
20     3.23  3.45    1.2     0.4  7.0  2.1  3.8
Write a few sentences about the normal monthly precipitation in August.
A) The 20 precipitations range in size between 0.4 and 7 inches. The median amount is 3.45 inches, so half are larger and half are smaller. The middle 50% of these precipitations ranges between 2.1 and 3.8 inches. The distribution is skewed to the right, with at least the outlier 0.4 inch.
B) The 20 precipitations range in size between 0.4 and 7 inches. The median amount is 3.45 inches, so half are larger and half are smaller. The middle 50% of these precipitations ranges between 2.1 and 3.8 inches. The distribution is skewed to the right, with at least the outlier 7 inches.
C) The 20 precipitations range in size between 0.4 and 7 inches. The median amount is 3.45 inches, so half are larger and half are smaller. The middle 50% of these precipitations ranges between 2.1 and 3.8 inches. The distribution is skewed to the right, with no outliers.
D) The 20 precipitations range in size between 0.4 and 7 inches. The median amount is 3.23 inches, so half are larger and half are smaller. The middle 50% of these precipitations ranges between 2.1 and 3.45 inches. The distribution is skewed to the right, with at least the outlier 7 inches.
E) The 20 precipitations range in size between 0.4 and 7 inches. The median amount is 3.45 inches, so half are larger and half are smaller. The middle 50% of these precipitations ranges between 0.4 and 3.8 inches. The distribution is skewed to the left, with at least the outlier 0.4 inch.
23) Shown below are the boxplot, the histogram and summary statistics for the weekly salaries (in dollars) of 24 randomly selected employees of a company:
Count  Mean   Median  StdDev  Min  Max   Q1   Q3
24     978.8  705     765.7   310  3700  510  1225
Write a few sentences describing the distribution.
A) The distribution is bimodal and skewed to the right. As shown in the boxplot, there are two outliers, weekly salaries of $2500 and about $3700. The median was 705, while the mean was 978.8, above the median score. The middle 50% of the weekly salaries were between $705 and $1225 for an IQR of $520.
B) The distribution is unimodal and skewed to the right. As shown in the boxplot, there are two outliers, weekly salaries of $2500 and about $3700. The median was 978.8, while the mean was 705, above the median score. The middle 50% of the weekly salaries were between $510 and $1225 for an IQR of $715.
C) The distribution is unimodal and skewed to the left. As shown in the boxplot, there are two outliers, weekly salaries of $2500 and about $3700. The median was 705, while the mean was 978.8, above the median score. The middle 50% of the weekly salaries were between $510 and $1225 for an IQR of $715.
D) The distribution is unimodal and skewed to the left. As shown in the boxplot, there are two outliers, weekly salaries of $2500 and about $3700. The median was 705, while the mean was 978.8, above the median score. The middle 50% of the weekly salaries were between $705 and $1225 for an IQR of $520.
E) The distribution is unimodal and skewed to the right. As shown in the boxplot, there are two outliers, weekly salaries of $2500 and about $3700. The median was 705, while the mean was 978.8, above the median score. The middle 50% of the weekly salaries were between $510 and $1225 for an IQR of $715.
Provide an appropriate response.
24) A professor has kept records on grades that students have earned in his class.
If he wants to examine the percentage of students earning the grades A, B, C, D, and F during the most recent term, which kind of plot could he make?
A) pie chart B) boxplot C) timeplot D) dotplot E) histogram
25) Which is true of the data shown in the histogram?
I. The distribution is approximately symmetric.
II. The mean and median are approximately equal.
III. The median and IQR summarize the data better than the mean and standard deviation.
A) III only B) I and III C) I only D) I and II E) I, II, and III
26) Which is true of the data whose distribution is shown?
I. The distribution is skewed to the right.
II. The mean is probably smaller than the median.
III. We should summarize with mean and standard deviation.
A) I and II B) I, II, and III C) II only D) I only E) II and III
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Create the requested display for the data.
27) The numbers of days off that 30 police officers took in a given year are provided below. Create a histogram of the data using bins 2 days wide. Describe the main features of the histogram.
10 1 3 5 4 7 5 1 0 9 11 1 5 4 1 7 7 11 0 6 6 1 5 7 10 1 1 5 6 0
28) The weights, in pounds, of the members of the varsity football team are listed below. Create a stem-and-leaf display of the data. Do not use split stems.
144 152 142 151 160 152 131 164 141 153 140 149 144 135 156 147 133 172 159 135 159 148 171 163
Provide an appropriate response.
29) An automobile service shop reported the summary statistics shown for repair bills (in $) for their customers last month.
Min 27
Q1 88
Median 132
Q3 308
Max 1442
Mean 284
SD 140
Were any of the bills outliers? Show how you made your decision.
Answer Key
Testname: CHAPTERS 4-5 REVIEW
1) D
2) D
3) B
4) B
5) D
6) B
7) B
8) D
9) B
10) D
11) E
12) D
13) D
14) B
15) E
16) D
17) B
18) A
19) C
20) B
21) C
22) B
23) E
24) A
25) D
26) D
27) [Histogram]
28)
17 | 1 2
16 | 0 3 4
15 | 1 2 2 3 6 9 9
14 | 0 1 2 4 4 7 8 9
13 | 1 3 5 5
Key: 14 | 2 = 142 pounds
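The stem-and-leaf display for question 28 can be regenerated from the raw weights. The sketch below is one possible approach, assuming non-negative whole-number data, tens-and-above digits as stems, and units digits as leaves. The function name `stem_and_leaf` and the `|` separator are illustrative choices.

```python
from collections import defaultdict

def stem_and_leaf(values, descending=True):
    """Return stemplot lines: stem = value // 10, leaf = value % 10 (no split stems)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Include every stem between the smallest and largest so gaps stay visible.
    stems = range(min(leaves), max(leaves) + 1)
    if descending:
        stems = reversed(stems)
    return [f"{stem} | {' '.join(str(d) for d in leaves[stem])}".rstrip()
            for stem in stems]

weights = [144, 152, 142, 151, 160, 152, 131, 164, 141, 153, 140, 149,
           144, 135, 156, 147, 133, 172, 159, 135, 159, 148, 171, 163]
for line in stem_and_leaf(weights):
    print(line)
```

Listing the largest stem first matches the layout in the key. Passing `descending=False` prints the smallest stem first instead.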
29) Yes. IQR = 308 - 88 = 220. The upper fence for outliers is one and a half IQRs above the third quartile, or 308 + 1.5(220) = 638. The maximum repair bill was $1442, well above $638, so it is certainly an outlier.
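The fence calculation in answer 29 can be written as a small helper that works from the quartiles alone. This is a minimal sketch. The function name `outlier_fences` and its `k` parameter are illustrative, with `k = 1.5` matching the rule used in the answer.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Summary statistics from question 29 (repair bills in dollars).
low, high = outlier_fences(q1=88, q3=308)
print(low, high)
print("max is an outlier:", 1442 > high)
print("min is an outlier:", 27 < low)
```

The lower fence is negative here, so no bill can fall below it. Only the high end of the distribution can contain outliers under this rule.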