DATA DESCRIPTION AND PROBABILITY DISTRIBUTIONS


DATA DESCRIPTION AND PROBABILITY DISTRIBUTIONS
TO ACCOMPANY COLLEGE MATHEMATICS for Business, Economics, Life Sciences, and Social Sciences, Tenth Edition
RAYMOND A. BARNETT, MICHAEL R. ZIEGLER, KARL E. BYLEEN
Upper Saddle River, New Jersey

COLLEGE MATHEMATICS FOR BUSINESS, ECONOMICS, LIFE SCIENCES, AND SOCIAL SCIENCES, Tenth Edition
Raymond A. Barnett, Merritt College
Michael R. Ziegler, Marquette University
Karl E. Byleen, Marquette University
Upper Saddle River, New Jersey 07458

Executive Acquisitions Editor: Petra Recter
Editor in Chief: Sally Yagan
Project Manager: Jacquelyn Riotto Zupic
Vice President/Director of Production and Manufacturing: David W. Riccardi
Executive Managing Editor: Kathleen Schiaparelli
Senior Managing Editor: Linda Mihatov Behrens
Production Editor: Barbara Mack
Assistant Manufacturing Manager/Buyer: Michael Bell
Manufacturing Manager: Trudy Pisciotti
Marketing Manager: Krista M. Bettino
Marketing Assistant: Annett Uebel
Editorial Assistant/Print Supplements Editor: Joanne Wendelken
Art Director: Jonathan Boylan
Interior and Cover Designer: Geoffrey Cassar
Art Editor: Thomas Benfatti
Creative Director: Carole Anson
Director of Creative Services: Paul Belfanti
Manager, Cover Visual Research and Permissions: Karen Sanatar
Cover Photo: PunchStock/Brand X Pictures
Art Studio: Scientific Illustrators
Composition: Interactive Composition Corporation
Part and Chapter Opening Photos: Getty Images, Inc.

Copyright 2005, 2002, 1999, 1996, 1993, 1990, 1987, 1984, 1981, 1979 Pearson Education, Inc., Pearson Prentice Hall, Upper Saddle River, New Jersey 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without written permission from the publisher. Pearson Prentice Hall is a trademark of Pearson Education, Inc.

Printed in the United States of America

Pearson Education LTD., London
Pearson Education Australia PTY, Limited, Sydney
Pearson Education Singapore, Pte. Ltd
Pearson Education North Asia Ltd, Hong Kong
Pearson Education Canada, Ltd., Toronto
Pearson Educación de Mexico, S.A. de C.V.
Pearson Education Japan, Tokyo
Pearson Education Malaysia, Pte. Ltd


OBJECTIVES

1. Display data using bar graphs, broken-line graphs, and pie graphs.
2. Display quantitative data using histograms, frequency polygons, and cumulative frequency polygons.
3. Compute the mean, median, and mode of a data set to measure central tendency.
4. Compute the range, variance, and standard deviation of a data set to measure variation.
5. Calculate the probability distribution of the number of successes in a sequence of Bernoulli trials.
6. Find the mean and standard deviation of a binomial distribution.
7. Calculate probabilities associated with normal distributions.
8. Approximate a binomial distribution with an appropriately chosen normal distribution.

CHAPTER PROBLEM

Most colleges and universities require applicants to submit their scores on at least one of two very popular entrance exams, the ACT (American College Testing Program) or the SAT (Scholastic Assessment Test). In the year that two students, Jack and Jill, took one of these exams, the national mean and standard deviation for the ACT were 2.8 and 4.5, respectively, and the national mean and standard deviation for the SAT were 1173 and 1, respectively. Jack scored 26 on the ACT and Jill scored 126 on the SAT. What percentage of ACT exam participants scored less than Jack? What percentage of SAT exam participants scored less than Jill? Whose percentage is higher, Jack's or Jill's?

Chapter 8 Data Description and Probability Distributions

8-1 Graphing Data
8-2 Measures of Central Tendency
8-3 Measures of Dispersion
8-4 Bernoulli Trials and Binomial Distributions
8-5 Normal Distributions
Chapter 8 Review
Review Exercise
Group Activity 1: Analysis of Data on Student Lifestyle
Group Activity 2: Survival Rates for a Heart Transplant

INTRODUCTION

In this chapter we study various techniques for analyzing and displaying data. We use bar graphs, broken-line graphs, and pie graphs to present visual interpretations or comparisons of data. We use measures of central tendency (the mean, median, and mode) and measures of dispersion (the range, variance, and standard deviation) to describe and compare data sets.

Data collected from different sources (IQ scores and measurements of manufactured parts, for example) often exhibit surprising similarity. We might express such similarity by saying that both data sets exhibit characteristics of a normal distribution. In this chapter we develop theoretical probability distributions (the binomial distributions and the normal distributions) that can be used as models of empirical data.

Section 8-1 Graphing Data

Bar Graphs, Broken-Line Graphs, and Pie Graphs
Frequency Distributions
Comments on Statistics
Histograms
Frequency Polygons and Cumulative Frequency Polygons

Television, newspapers, magazines, books, and reports make substantial use of graphics to visually communicate complicated sets of data to the viewer. In this section we look at bar graphs, broken-line graphs, and pie graphs and the techniques for producing them. It is important to remember that graphs are

visual aids and should be prepared with care. The object is to provide the viewer with the maximum amount of information while minimizing the time and effort required to read the information from the graph.

Bar Graphs, Broken-Line Graphs, and Pie Graphs

Bar graphs are widely used because they are easy to construct and easy to read. They are effective in presenting visual interpretations or comparisons of data. Consider Tables 1 and 2. Bar graphs are well suited to describe these two data sets. Vertical bars are usually used for time series, that is, data that changes over time, as in Table 1. The labels on the horizontal axis are then units of time (hours, days, years, and so on, whichever is appropriate), as shown in Figure 1. Horizontal bars are generally used for data that changes by category, as in Table 2, because of the ease of labeling categories on the vertical axis of the bar graph (see Fig. 2). To increase clarity, a space is left between the bars. Bar graphs for the data in Tables 1 and 2 are illustrated in Figures 1 and 2. Two additional variations on bar graphs, the double bar graph and the divided bar graph, are illustrated in Figures 3 and 4, respectively.

TABLE 1 U.S. Public Debt (columns: Year; Debt in billions of dollars)

TABLE 2 Traffic at Busiest U.S. Airports, 2001
Airport              Arrivals and Departures (million passengers)
Atlanta              76
Chicago (O'Hare)     67
Los Angeles          61
Dallas/Ft. Worth     55
Denver               36

FIGURE 1 Vertical bar graph: U.S. Public Debt (billion dollars, by year)

FIGURE 2 Horizontal bar graph: Traffic at Busiest U.S. Airports, 2001 (arrivals and departures, million passengers)
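As a rough illustration of the horizontal bar graph idea, the following Python sketch prints the Table 2 data as text bars, one row per category with the labels down the left side. The function name `horizontal_bars` and the choice of one `#` per two million passengers are assumptions made for this example only; a spreadsheet or plotting package would normally be used instead.

```python
# Table 2 from the text: arrivals and departures, in millions of passengers.
traffic = {
    "Atlanta": 76,
    "Chicago (O'Hare)": 67,
    "Los Angeles": 61,
    "Dallas/Ft. Worth": 55,
    "Denver": 36,
}


def horizontal_bars(data, scale=2):
    """Return one text line per category; each '#' stands for `scale` units."""
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / scale)
        # Left-align the labels so that all bars start in the same column.
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines


for line in horizontal_bars(traffic):
    print(line)
```

Because the categories are listed vertically, long labels such as "Dallas/Ft. Worth" fit without crowding, which is the reason the text gives for preferring horizontal bars for categorical data.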

FIGURE 3 Double bar graph: Education and Income, 1999 (mean annual income in dollars for females and males; categories: bachelor's degree, associate degree, some college, high school diploma, some high school)

FIGURE 4 Divided bar graph: Population of World's Largest Cities, 2000 and 2015 (projected) (million people; Tokyo, Mexico City, Bombay, São Paulo, New York, Los Angeles)

1. (A) Using Figure 3, estimate the mean annual income of a male with some college, and of a female who holds a bachelor's degree. Within which educational category is there the greatest difference between male and female income? The least difference?
(B) Using Figure 4, estimate the population of São Paulo in the years 2000 and 2015. Which of the six largest cities is projected to have the greatest increase in population from 2000 to 2015? The least increase? Which city would you conjecture to be the largest in the world in 2035? Explain.

A broken-line graph can be obtained from a vertical bar graph by joining the midpoints of the tops of consecutive bars with straight lines. For example, using Figure 1 we obtain the broken-line graph in Figure 5.

FIGURE 5 Broken-line graph: U.S. Public Debt (billion dollars, by year)

FIGURE 6 Broken-line graphs: Revenue and Cost (million dollars, by year; regions between the revenue and cost curves are marked Profit or Loss)

FIGURE 7 Broken-line graphs: Projections of U.S. Energy Consumption (quadrillion BTU, by year; petroleum products, coal, natural gas, nuclear, renewable)

Broken-line graphs are particularly useful when we want to emphasize the change in one or more variables relative to time. Figures 6 and 7 illustrate two additional variations of broken-line graphs.

2. (A) Using Figure 6, estimate the revenue and costs in 2000. In which years is a profit realized? In which year is the greatest loss experienced?
(B) Using Figure 7, estimate the U.S. consumption of each of the five sources of energy in 2010. Estimate the percentage of total consumption that will come from nuclear energy in the year 2010.

A pie graph is generally used to show how a whole is divided among several categories. The amount in each category is expressed as a percentage, and then a circle is divided into segments (pieces of pie) proportional to the percentages of each category. The central angle of a segment is the percentage of 360° corresponding to the percentage of that category (see Fig. 8). In constructing pie graphs, we use relatively few categories, arrange the segments in ascending or descending order of size around the circle, and label each part.

FIGURE 8 Pie graphs
(A) Active U.S. Military Personnel, 2002: Army 33% (119°), Navy 27% (97°), Air Force 25% (91°), Marine Corps 12% (43°), Coast Guard 3% (11°)
(B) World Refugees, 2001: Middle East 46%, Africa 20%, South/Central Asia 18%, Europe 7%, East Asia/Pacific 5%, Americas/Caribbean 4%

Bar graphs, broken-line graphs, and pie graphs are easily constructed using a spreadsheet. After the data is entered (see Fig. 9 for the data corresponding

to Fig. 8A) and the type of display (bar, broken-line, pie) is chosen, the graph is drawn automatically. Various options for axes, gridlines, patterns, and text are available to improve the clarity of the visual display.

FIGURE 9

Frequency Distributions

Observations that are measured on a numerical scale are referred to as quantitative data. Weights, ages, bond yields, the length of a part in a manufacturing process, test scores, and so on, are all examples of quantitative data. Out of the total population of entering freshmen at a large university, a random sample of 100 students is selected and their entrance examination scores are recorded (see Table 3).

TABLE 3 Entrance Examination Scores of 100 Entering Freshmen

The mass of raw data in Table 3 certainly does not elicit much interest or exhibit much useful information. The data must be organized in some way so that it is comprehensible. This can be done by constructing a frequency table. We generally choose five to twenty class intervals of equal length to cover the data range (the more data, the greater the number of intervals) and tally the data relative to these intervals. The data range in Table 3 is 447 (found by subtracting the smallest value in the data from the largest). If we choose ten intervals, each of length 50, we will be able to cover all the scores. Table 4 shows the result of this tally.

TABLE 4 Frequency Table (columns: Class Interval; Tally; Frequency; Relative Frequency)
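The tallying procedure can be sketched in a few lines of Python. This is only an illustration: the ten scores below are hypothetical stand-ins (the 100 scores of Table 3 are not reproduced here), the function name `frequency_table` is an assumption, and the starting boundary 299.5 follows the convention of carrying class boundaries to one more decimal place than the raw data.

```python
from collections import Counter


def frequency_table(data, start, width, classes):
    """Tally data into `classes` intervals of equal `width` beginning at `start`.

    Returns a list of (lower, upper, frequency, relative_frequency) rows.
    """
    counts = Counter()
    for x in data:
        index = int((x - start) // width)  # which class interval x falls in
        if not 0 <= index < classes:
            raise ValueError(f"{x} lies outside the class intervals")
        counts[index] += 1
    n = len(data)
    return [
        (start + i * width, start + (i + 1) * width, counts[i], counts[i] / n)
        for i in range(classes)
    ]


# Hypothetical scores, for illustration only (not the data of Table 3).
scores = [341, 412, 478, 503, 529, 547, 566, 581, 612, 788]

table = frequency_table(scores, start=299.5, width=50, classes=10)
for lower, upper, freq, rel in table:
    print(f"{lower:.1f}-{upper:.1f}  {freq:2d}  {rel:.2f}")
```

Since no whole-number score can equal a boundary ending in .5, every score lands in exactly one class and the relative frequencies sum to 1.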

At first it might seem appropriate to start at 300 and form the class intervals 300–350, 350–400, 400–450, and so on. But if we do this, where will we place 350 or 400? We could, of course, adopt a convention of placing a score falling on an upper boundary of a class in the next higher class (and some people do exactly this); however, to avoid confusion, we will always use one decimal place more for class boundaries than appears in the raw data. Thus, in this case, we chose the class intervals 299.5–349.5, 349.5–399.5, and so on, so that each score could be assigned to one and only one class interval.

The number of measurements that fall within a given class interval is called the class frequency, and the set of all such frequencies associated with their corresponding classes is called a frequency distribution. Thus, Table 4 represents a frequency distribution of the set of raw scores in Table 3. If we divide each frequency by the total number of items in the original data set (in our case 100), we obtain the relative frequency of the data falling in each class interval, that is, the percentage of the whole that falls in each class interval (see the last column in Table 4).

The relative frequencies also can be interpreted as probabilities associated with the experiment, "A score is drawn at random out of the 100 in the sample." An appropriate sample space for this experiment would be the set of simple outcomes

$e_1$ = a score falls in the first class interval
$e_2$ = a score falls in the second class interval
...
$e_{10}$ = a score falls in the tenth class interval

The set of relative frequencies is then referred to as the probability distribution for the sample space.
EXAMPLE 1 Determining Probabilities from a Frequency Table

Referring to the probability distribution just described and Table 4, determine the probability that
(A) A randomly drawn score is between and
(B) A randomly drawn score is between and

Solution
(A) Since the relative frequency associated with the class interval is .21, the probability that a randomly drawn score (from the sample of 100) falls in this interval is .21.
(B) Since a score falling in the interval is a compound event, we simply add the probabilities for the simple events whose union is this compound event. Thus, we add the probabilities corresponding to each class interval from to to obtain .7.

Matched Problem 1
Repeat Example 1 for the following intervals: (A) (B)
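The addition of simple-event probabilities can be sketched as follows. The class frequencies used here are assumed values: the body of Table 4 is not reproduced in this text, so the numbers were chosen to sum to 100 and to agree with the figures that are quoted (a relative frequency of .21 for one class, and 78% of scores below 649.5). The function name `probability_between` is likewise hypothetical.

```python
# Assumed class frequencies, keyed by the lower boundary of each class.
# Illustrative only: chosen to be consistent with the totals quoted in the text.
frequencies = {
    299.5: 1, 349.5: 2, 399.5: 5, 449.5: 10, 499.5: 21,
    549.5: 20, 599.5: 19, 649.5: 11, 699.5: 7, 749.5: 4,
}


def probability_between(freqs, low, high, width=50):
    """Probability that a randomly drawn score lies between two class boundaries.

    Adds the frequencies of every class contained in [low, high] and divides by
    the sample size, which equals the sum of the simple-event probabilities.
    """
    n = sum(freqs.values())
    inside = sum(
        f for lower, f in freqs.items() if low <= lower and lower + width <= high
    )
    return inside / n


# A single class interval is a simple event.
print(probability_between(frequencies, 499.5, 549.5))
# Several adjacent class intervals form a compound event.
print(probability_between(frequencies, 449.5, 649.5))
```

The approach only works when `low` and `high` are class boundaries; a frequency table carries no information about how scores are spread inside a class.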

FIGURE 10 Inferential statistics: Based on information obtained from a sample, the goal of statistics is to make inferences about the population as a whole. (Diagram: POPULATION, the set of all measurements of interest to the sampler; SAMPLE, a subset of the population.)

Comments on Statistics

Now, of course, what we are really interested in is whether the probability distribution for the sample of 100 entrance examination scores has anything to do with the total population of entering freshmen in the university in question. This is a problem for the important branch of mathematics called statistics, which deals with the process of making inferences about a total population based on random samples drawn from the population. Figure 10 schematically illustrates the inferential statistical process. We will not go too far into inferential statistics in this book, since the subject is studied in detail in any course in statistics, but our work in probability provides a good foundation for this study.

Intuitively, in the entrance examination example, we would expect that the larger the sample size, the more closely the probability distribution for the sample will approximate that for the total population. That is about all that we can say at the moment.

Histograms

A histogram is a special kind of vertical bar graph. In fact, if you rotate Table 4 counterclockwise 90°, the tally marks in the table take on the appearance of a bar graph. Histograms have no space between the bars, class boundaries are located on the horizontal axis, and frequencies are associated with the vertical axis. Figure 11 is a histogram for the frequency distribution in Table 4. Note that we have included both frequencies and relative frequencies on the vertical scale. You can include either one or the other, or both, depending on what needs to be emphasized. The histogram is the most common graphical representation of frequency distributions.

FIGURE 11 Histogram (horizontal axis: entrance examination scores; vertical axes: frequency and relative frequency)

3. (A) Draw a histogram for the following data set using a class interval width of .5 starting at

(B) Draw a histogram for the same data set using the same class interval width but starting at .75.
(C) The histograms of parts (A) and (B) represent the same set of data. How do they differ? Which of the two gives a better description of the data set? Explain.

EXAMPLE 2 Constructing Histograms with a Graphing Utility

Twenty vehicles were chosen at random upon arrival at a vehicle emissions inspection station, and the time elapsed (in minutes) from arrival to completion of the emissions test was recorded for each of the vehicles:
(A) Use a graphing utility to draw a histogram of the data, choosing the five class intervals , , and so on.
(B) What is the probability that for a vehicle chosen at random from the sample, the time required at the inspection station is less than 12.5 minutes? That it exceeds 22.5 minutes?

Solution
(A) Various kinds of statistical plots can be drawn by most graphing utilities. To draw a histogram we enter the data as a list, specify a histogram from among the various statistical plotting options, set the window variables, and graph. Figure 12 shows the data entered as a list, the settings of the window variables, and the resulting histogram for a particular graphing calculator. For details consult your manual.

FIGURE 12

(B) From the histogram in Figure 12 we see that the first class has frequency 3 and the second has frequency 8. The upper boundary of the second class is 12.5, and the total number of data items is 20. Therefore, the probability that the time required is less than 12.5 minutes is

$$\frac{3 + 8}{20} = \frac{11}{20} = .55$$

Similarly, since the frequency of the last class is 1, the probability that the time required exceeds 22.5 minutes is

$$\frac{1}{20} = .05$$

Matched Problem 2
The weights (in pounds) were recorded for 20 kindergarten children chosen at random:

(A) Use a graphing utility to draw a histogram of the data, choosing the five class intervals , , and so on.
(B) What is the probability that a kindergarten child chosen at random from the sample weighs less than 42.5 pounds? More than 42.5 pounds?

Frequency Polygons and Cumulative Frequency Polygons

A frequency polygon is a broken-line graph where successive midpoints of the tops of the bars in a histogram are joined by straight lines. To draw a frequency polygon for a frequency distribution, you do not need to draw a histogram first; you can just locate the midpoints and join them with straight lines. Figure 13 is a frequency polygon for the frequency distribution in Table 4. If the amount of data becomes very large and we substantially increase the number of classes, the frequency polygon will take on the appearance of a smooth curve called a frequency curve.

FIGURE 13 Frequency polygon (horizontal axis: entrance examination scores; vertical axes: frequency and relative frequency)

If we are interested in how many or what percentage of a total sample lies above or below a particular measurement, a cumulative frequency table and polygon are useful. Using the frequency distribution in Table 4, we accumulate the frequencies by starting with the first class and adding frequencies as we move down the column. The results are shown in Table 5. (How is the last column formed?)

TABLE 5 Cumulative Frequency Table (columns: Class Interval; Frequency; Cumulative Frequency; Relative Cumulative Frequency)
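Accumulating the frequencies is a running sum, which the sketch below carries out with `itertools.accumulate`. The class frequencies are assumed values, since the bodies of Tables 4 and 5 are not reproduced in this text; they were chosen to total 100 and to agree with the 78% below 649.5 quoted in the discussion of the cumulative frequency polygon. The name `cumulative_table` is hypothetical.

```python
from itertools import accumulate

# Upper class boundaries 349.5, 399.5, ..., 799.5 (class width 50).
upper_boundaries = [349.5 + 50 * i for i in range(10)]

# Assumed class frequencies (illustrative only; see the lead-in above).
frequencies = [1, 2, 5, 10, 21, 20, 19, 11, 7, 4]


def cumulative_table(boundaries, freqs):
    """Return rows of (upper boundary, frequency, cumulative, relative cumulative)."""
    n = sum(freqs)
    rows = []
    for upper, freq, cum in zip(boundaries, freqs, accumulate(freqs)):
        # The last column is the cumulative frequency divided by the sample size.
        rows.append((upper, freq, cum, cum / n))
    return rows


table = cumulative_table(upper_boundaries, frequencies)
for upper, freq, cum, rel in table:
    print(f"below {upper:.1f}: {freq:3d} {cum:4d} {rel:.2f}")
```

Each cumulative value is paired with the upper boundary of its class because it counts every measurement lying below that boundary.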

To form a cumulative frequency polygon, or ogive as it is also called, the cumulative frequency is plotted over the upper boundary of the corresponding class. Figure 14 is the cumulative frequency polygon for the cumulative frequency table in Table 5. Notice that we can easily see that 78% of the students scored below 649.5, while only 18% scored below We also can conclude that the probability of a randomly selected score from the sample of 100 lying below 649.5 is .78 and above 649.5 is 1 − .78 = .22.

FIGURE 14 Cumulative frequency polygon (ogive) (horizontal axis: entrance examination scores; vertical axes: cumulative frequency and relative cumulative frequency)

Insight
Above each class interval in Figure 14, the cumulative frequency polygon is linear. Such a function is said to be piecewise linear. The slope of each piece of a cumulative frequency polygon is greater than or equal to zero (that is, the graph is never falling). The piece with the greatest slope corresponds to the class interval that has the greatest frequency. In fact, for any class interval, the frequency is equal to the slope of the cumulative frequency polygon multiplied by the width of the class interval.

4. (A) Construct a histogram for the data set whose cumulative frequency polygon is shown in Figure 15.

FIGURE 15

(B) Can the original data set be reconstructed from the cumulative frequency polygon? Explain.
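The relation stated in the Insight (frequency equals slope times class width) can be checked numerically. In this sketch the vertices of the ogive are hypothetical values invented for illustration, not the data of Figure 14 or Figure 15, and `frequencies_from_ogive` is an assumed name.

```python
def frequencies_from_ogive(points):
    """Recover class frequencies from the vertices of a cumulative frequency polygon.

    `points` is a list of (class boundary, cumulative frequency) pairs sorted by
    boundary, beginning at the lower boundary of the first class with cumulative 0.
    """
    freqs = []
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        width = x1 - x0
        slope = (c1 - c0) / width
        if slope < 0:
            raise ValueError("a cumulative frequency polygon is never falling")
        freqs.append(round(slope * width))  # frequency = slope x class width
    return freqs


# Hypothetical ogive with three classes of width 2.
ogive = [(0.5, 0), (2.5, 3), (4.5, 8), (6.5, 10)]
print(frequencies_from_ogive(ogive))
```

The steepest piece, between 2.5 and 4.5, belongs to the class with the greatest frequency. The recovered frequencies are enough to draw a histogram, but they say nothing about the individual values inside each class.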

Answers to Matched Problems

1. (A) .11 (B) (A) 8 (B) .45;

Exercise

1. (A) Construct a frequency table and a histogram for the following data set using a class interval width of 2 starting at
(B) Construct a frequency table and a histogram for the following data set using a class interval width of 2 starting at
(C) How are the two histograms of parts (A) and (B) similar? How are the two data sets different?

2. (A) Construct a frequency table and a histogram for the data set of part (A) of Problem 1 using a class interval width of 1 starting at .5.
(B) Construct a frequency table and a histogram for the data set of part (B) of Problem 1 using a class interval width of 1 starting at .5.
(C) How are the histograms of parts (A) and (B) different?

3. The graphing utility command shown in Figure A generated a set of 4 random integers from 2 to 24, stored as list $L_1$. The statistical plot in Figure B is a histogram of $L_1$, using a class interval width of 1 starting at 1.5.
(A) Explain how the window variables can be changed to display a histogram of the same data set using a class interval width of 2 starting at 1.5. A width of 4 starting at 1.5.
(B) Describe the effect of increasing the class interval width on the shape of the histogram.

Figure for 3: (A) graphing utility command; (B) histogram

4. An experiment consists of rolling a pair of dodecahedral (twelve-sided) dice and recording their sum (the sides of each die are numbered from 1 to 12). The command shown in Figure A simulated 5 rolls of the dodecahedral dice. The statistical plot in Figure B is a histogram of the 5 sums using a class interval width of 1 starting at 1.5.
(A) Explain how the window variables can be changed to display a histogram of the same data set using a class interval width of 2 starting at 1.5. A width of 3 starting at -.5.
(B) Describe the effect of increasing the class interval width on the shape of the histogram.

Figure for 4: (A) graphing utility command; (B) histogram

Applications

Business & Economics

5. Gross national product. Graph the data in the following table using a bar graph.
Gross National Product (GNP) (columns: Year; GNP in billion $)

6. Corporation revenues. Graph the data in the following table using a bar graph.
Corporation Revenues, 2001
Corporation          Revenue (million $)
Wal-Mart             219,812
Exxon Mobil          191,581
General Motors       177,26
Ford Motor           162,412
Enron                138,718
General Electric     125,

7. Gold production. Use the double bar graph on world gold production to determine the country that showed the greatest increase in gold production from 1990 to 2000. Which country showed the greatest percentage increase? Was more gold produced in North America or in South Africa in 1990? In 2000?
Figure for 7: World Gold Production, double bar graph (gold production in million troy ounces; South Africa, United States, Canada, Australia, China)

8. Gasoline prices. Graph the data in the following table using a divided bar graph.
Global Gasoline Prices, July 2000 (columns: Country; Price Before Tax in $ per gallon; Tax. Countries: United States, Canada, Japan, United Kingdom, Germany)

9. Railroad freight. Graph the data in the following table using a broken-line graph.
Annual Railroad Carloadings in the United States (columns: Year; Carloadings in millions)

10. Railroad freight. Refer to Problem 9. If the data were presented in a bar graph, would horizontal bars or vertical bars be used? Could the data be presented in a pie graph? Explain.

11. Federal income. Graph the data in the following table using a pie graph:
Federal Income by Source, 2
Source                   Income (billion $)
Personal income tax      1,137
Social insurance taxes   64
Corporate income tax     236
Excise tax               55
Other

12. Gasoline prices. In October 2000, the average price of a gallon of gasoline in the United States was $ Of this amount, 75.1 cents was the cost of crude oil, 15.3 cents the cost of refining, 21.4 cents the cost of distribution and marketing, and 41.4 cents the amount of tax. Use a pie graph to present this data.

13. Starting salaries. The starting salaries (in thousands of dollars) of 20 graduates, chosen at random from the graduating class of an urban university, were determined and recorded in the table:
Starting Salaries
(A) Construct a frequency and relative frequency table using a class interval width of 4 starting at 20.5.
(B) Construct a histogram.
(C) What is the probability that a graduate chosen from the sample will have a starting salary above $32,500? Below $28,500?
(D) Construct a histogram using a graphing utility.

14. Commute times. Thirty-two persons were chosen at random from among the employees of a large corporation and their commute times (in hours) from home to work were determined and recorded in the table:
Commute Times
(A) Construct a frequency and relative frequency table using a class interval width of .2 starting at .15.
(B) Construct a histogram.
(C) What is the probability that a person chosen at random from the sample will have a commuting time of at least an hour? Of at most half an hour?
(D) Construct a histogram using a graphing utility.

15. Common stocks. The table shows price earnings ratios of 100 common stocks chosen at random from the New York Stock Exchange.
Price Earnings (PE) Ratios
(A) Construct a frequency and relative frequency table using a class interval of 5 starting at -.5.
(B) Construct a histogram.
(C) Construct a frequency polygon.
(D) Construct a cumulative frequency and relative cumulative frequency table. What is the probability of a price earnings ratio drawn at random from the sample lying between 4.5 and 14.5?
(E) Construct a cumulative frequency polygon.

Life Sciences

16. Mouse weights. One hundred healthy mice were weighed at the beginning of an experiment with the following results:
Mouse Weights (Grams)
(A) Construct a frequency and relative frequency table using a class interval of 2 starting at
(B) Construct a histogram.
(C) Construct a frequency polygon.
(D) Construct a cumulative frequency and relative cumulative frequency table. What is the probability of a mouse weight drawn at random from the sample lying between 45.5 and 53.5?
(E) Construct a cumulative frequency polygon.

17. Population growth. Graph the data in the following table using a broken-line graph.
Annual World Population Growth (columns: Year; Growth in millions)

18. AIDS epidemic. One way to gauge the toll of the AIDS epidemic in Sub-Saharan Africa is to compare life expectancies with the figures that would have been projected in the absence of AIDS. Use the broken-line graphs shown to estimate the life expectancy of a child born in the year 2002. What

would the life expectancy of the same child be in the absence of AIDS? For which years of birth is the life expectancy less than 50 years? If there were no AIDS epidemic, for which years of birth would the life expectancy be less than 50 years?
Figure for 18: Life Expectancy in Sub-Saharan Africa, broken-line graphs (life expectancy in years, by year of birth; without AIDS and with AIDS)

19. Nutrition. Graph the data in the following table using a double bar graph.
Recommended Daily Allowances (columns: Grams of; Males; Females. Rows: Carbohydrate, Protein, Fat)

20. Greenhouse gases. The U.S. Department of Energy estimates that halocarbons account for 20%, methane for 15%, nitrous oxide for 5%, and carbon dioxide for 60% of the enhanced heat-trapping effects of greenhouse gases. Use a pie graph to present this data. Find the central angles of the graph.

21. Nutrition. Graph the nutritional information in the following table using a double bar graph.
Fast-Food Burgers: Nutritional Information (columns: Calories; Calories From Fat. Rows: 2-oz burger, plain; additional oz of beef; slice of cheese; slices of bacon; tbsp. mayonnaise)

22. Nutrition. Refer to Problem 21. Suppose that you are trying to limit the fat in your diet to at most 30% of your calories, and your calories to 2,000 per day. Should you order the quarter-pound bacon cheeseburger with mayo for lunch? How would such a lunch affect your choice of breakfast and dinner? Discuss.

Social Sciences

23. Education. In the United States in 1960, 86.4% of school-age children were enrolled in public schools, 12.6% in Catholic schools, and 1.0% in other private schools. In 1998, 86.8% were enrolled in public schools, 4.7% in Catholic schools, 6.5% in other private schools, and 2.0% were home-schooled. Use two pie graphs to present this data.

24. Study abroad. Would a pie graph be more effective or less effective than the bar graph shown in presenting information on the most popular destinations of U.S. college students who study abroad? Justify your answer.
Figure for 24: Destinations of U.S. Students Studying Abroad, bar graph (number of students in thousands; United Kingdom, Spain, Italy, France, Mexico, Australia, Germany, Costa Rica, Ireland, Japan)

25. Median age. Use the broken-line graph shown to estimate the median age in 1900 and 1990. In which decades did the median age increase? In which did it decrease? Discuss the factors that may have contributed to the increases and decreases.
Figure for 25: Median Age in the United States, broken-line graph

26. State prisoners. In 1980 in the United States, 6% of the inmates of state prisons were incarcerated for drug offenses, 30% for property crimes, 4% for public order offenses, and 59% for violent crimes; in 1998 the percentages were 21%, 21%, 10%, and 48%, respectively. Present the data using two pie graphs. Discuss factors that may account for the shift in percentages between 1980 and 1998.

27. Grade-point averages. One hundred seniors were chosen at random from a graduating class at a university and their grade-point averages recorded:
Grade-Point Averages (GPA)
(A) Construct a frequency and relative frequency table using a class interval of .2 starting at
(B) Construct a histogram.
(C) Construct a frequency polygon.
(D) Construct a cumulative frequency and relative cumulative frequency table. What is the probability of a GPA drawn at random from the sample being over 2.95?
(E) Construct a cumulative frequency polygon.

Section 8-2 Measures of Central Tendency

Mean
Median
Mode

In the preceding section, we found that graphic techniques contributed substantially to our comprehension of large masses of raw data. In this and the next section, we discuss several important numerical measures that are used to describe sets of data. These numerical descriptions are generally of two types:

1. Measures that indicate the approximate center of a distribution, called measures of central tendency
2. Measures that indicate the amount of scatter about a central point, called measures of dispersion

In this section we look at three widely used measures of central tendency, and in the next section we consider measures of dispersion.

Mean

When we speak of the average yield of 1-year municipal bonds, the average number of smog-free days per year in a certain city, or the average SAT score for students at a university, we usually interpret these averages to be arithmetic averages, or means. In general, we define the mean of a set of quantitative data as follows:

DEFINITION Mean: Ungrouped Data
The mean of a set of quantitative data is equal to the sum of all the measurements in the data set divided by the total number of measurements in the set.
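The verbal definition translates directly into a short computation. The sketch below applies it to the sample measurements 3, 5, 1, 8, 6, 5, 4, and 6; the function name `mean` is simply a convenient label, and Python's standard `statistics.mean` is shown alongside as a cross-check.

```python
import statistics


def mean(measurements):
    """Sum of all the measurements divided by the number of measurements."""
    if not measurements:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(measurements) / len(measurements)


sample = [3, 5, 1, 8, 6, 5, 4, 6]

print(sum(sample), len(sample))   # the sum and the count that enter the formula
print(mean(sample))               # 38 / 8 = 4.75
print(statistics.mean(sample))    # the library routine gives the same value
```

Every measurement enters the sum, which is why a single unusually large or small value can shift the mean noticeably.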

The mean is a single number that, in a sense, represents the entire data set. It involves all the measurements in the set, it is easily computed, and it enters readily into other useful formulas. Because of these and other desirable properties, the mean is the most widely used measure of central tendency.

In statistics, we are concerned with both a sample mean and the mean of the corresponding population (the sample mean is often used as an estimator for the population mean), so it is important to use different symbols to represent these two means. It is customary to use a letter with an overbar, such as \(\bar{x}\), to represent a sample mean and the Greek letter \(\mu\) ("mu") to represent a population mean.

NOTATION Mean
\(\bar{x}\) = sample mean
\(\mu\) = population mean

Before considering examples, let us formulate the concept symbolically using the summation symbol \(\Sigma\) (see Appendix B-1). If \(x_1, x_2, \ldots, x_n\) represents a set of n measurements, then the sum \(x_1 + x_2 + \cdots + x_n\) is compactly and conveniently represented by

\[ \sum_{i=1}^{n} x_i \]

We now can express the mean in symbolic form.

DEFINITION Mean: Ungrouped Data
If \(x_1, x_2, \ldots, x_n\) is a set of n measurements, then the mean of the set of measurements is given by

\[ [\text{mean}] = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \tag{1} \]

where
\(\bar{x}\) = [mean] if data set is a sample
\(\mu\) = [mean] if data set is the population

EXAMPLE 1 Finding the Mean
Find the mean for the sample measurements 3, 5, 1, 8, 6, 5, 4, and 6.

Solution Solve using formula (1):

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{3 + 5 + 1 + 8 + 6 + 5 + 4 + 6}{8} = \frac{38}{8} = 4.75 \]

Matched Problem 1 Find the mean for the sample measurements 3.2, 4.5, 2.8, 5.0, and 3.6.

If data have been grouped in a frequency table, such as Table 4 (Section 8-1), an alternative formula for the mean is generally used:

DEFINITION Mean: Grouped Data
A data set of n measurements is grouped into k classes in a frequency table. If \(x_i\) is the midpoint of the ith class interval and \(f_i\) is the ith class frequency, the mean for the grouped data is given by

\[ [\text{mean}] = \frac{\sum_{i=1}^{k} x_i f_i}{n} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{n} \tag{2} \]

where
\(n = \sum_{i=1}^{k} f_i\) = total number of measurements
\(\bar{x}\) = [mean] if data set is a sample
\(\mu\) = [mean] if data set is the population

Caution It is important to note that n is the total number of measurements in the entire data set, not the number of classes!

The mean computed by formula (2) is a weighted average of the midpoints of the class intervals. In general, this will be close to, but not exactly the same as, the mean computed by formula (1) for ungrouped data.

EXAMPLE 2 Finding the Mean for Grouped Data
Find the mean for the sample data summarized in Table 4, Section 8-1.

Solution We repeat part of Table 4 here, adding columns for the class midpoints \(x_i\) and the products \(x_i f_i\) (Table 1).

[TABLE 1 Entrance Examination Scores. Columns: Class Interval, Midpoint \(x_i\), Frequency \(f_i\), Product \(x_i f_i\). Row entries not recoverable. Totals: \(n = \sum f_i = 100\), \(\sum x_i f_i = 57{,}900\).]

Thus, the average entrance examination score for the sample of 100 entering freshmen is

\[ \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{n} = \frac{57{,}900}{100} = 579 \]

If the histogram for the data in Table 1 (Fig. 11, Section 8-1) were drawn on a piece of wood of uniform thickness and the wood cut around the outside of the figure, the resulting object would balance exactly at the mean \(\bar{x} = 579\), as shown in Figure 1.

[FIGURE 1 The balance point on the histogram is \(\bar{x} = 579\).]
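Formulas (1) and (2) translate directly into a few lines of code. The sketch below is only an illustration: the function names `mean` and `grouped_mean` are made up for this example, the first data set is the one from Example 1, and the small frequency table is hypothetical (it is not Table 1, whose entries are not available here).

```python
def mean(values):
    """Formula (1): sum of the measurements divided by their count."""
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Formula (2): weighted average of the class midpoints."""
    # n is the total number of measurements, not the number of classes.
    n = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n


# Sample measurements from Example 1.
example_1 = [3, 5, 1, 8, 6, 5, 4, 6]
print(mean(example_1))  # 4.75

# Hypothetical frequency table (assumed for illustration only):
# three classes with midpoints 10, 20, 30 and frequencies 2, 5, 3.
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]
print(grouped_mean(midpoints, frequencies))  # 21.0
```

Dividing by `sum(frequencies)` rather than `len(frequencies)` is the point of the Caution above: the divisor is the number of measurements, not the number of classes.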

Matched Problem 2 Compute the mean for the grouped sample data listed in Table 2.

[TABLE 2: columns Class Interval and Frequency; entries not recoverable]

Insight The mean for ungrouped data and the mean for grouped data can be interpreted as the expected values of appropriately chosen random variables (see Section 7-5).

Consider a set of n measurements \(x_1, x_2, \ldots, x_n\) (ungrouped data). Let S be the sample space consisting of n simple events (the n measurements), each equally likely. Let X be the random variable that assigns the numerical value \(x_i\) to each simple event in S. Then each measurement \(x_i\) has probability \(p_i = \frac{1}{n}\). The expected value of X is given by

\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = x_1 \cdot \frac{1}{n} + x_2 \cdot \frac{1}{n} + \cdots + x_n \cdot \frac{1}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} = [\text{mean}] \]

Similarly, consider a set of n measurements grouped into k classes in a frequency table (grouped data). Let S be the sample space consisting of n simple events (the n measurements), each equally likely. Let X be the random variable that assigns the midpoint \(x_i\) of the ith class interval to the measurements that belong to that class interval. Then each midpoint \(x_i\) has probability \(p_i = \frac{f_i}{n}\), where \(f_i\) denotes the frequency of the ith class interval. The expected value of X is given by

\[ E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = x_1 \cdot \frac{f_1}{n} + x_2 \cdot \frac{f_2}{n} + \cdots + x_k \cdot \frac{f_k}{n} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{n} = [\text{mean}] \]

Median

Occasionally, the mean can be misleading as a measure of central tendency. Suppose the annual salaries of seven people in a small company are $17,000, $20,000, $28,000, $18,000, $18,000, $120,000, and $24,000. The mean salary is

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\$245{,}000}{7} = \$35{,}000 \]

Six of the seven salaries are below the average! The one large salary distorts the results. A measure of central tendency that is not influenced by extreme values is the median. The following definition of median makes precise our intuitive notion of the middle element when a set of measurements is arranged in ascending or descending order. Some sets of measurements, for example, 5, 7, 8, 13, 21, have a middle element. Other sets, for example, 9, 10, 15, 20, 23, 24, have no middle element, or you might prefer to say they have two middle elements. For any number between 15 and 20, half the measurements fall above the number and half fall below.

DEFINITION Median
1. If the number of measurements in a set is odd, the median is the middle measurement when the measurements are arranged in ascending or descending order.
2. If the number of measurements in a set is even, the median is the mean of the two middle measurements when the measurements are arranged in ascending or descending order.

EXAMPLE 3 Finding the Median
Find the median salary in the preceding list of seven salaries.

Solution Arrange the salaries in increasing order and choose the middle one:

SALARY
$ 17,000
  18,000
  18,000
  20,000   <- Median ($20,000)
  24,000
  28,000
           <- Mean ($35,000)
 120,000

In this case, the median is a better measure of central tendency than the mean.

Matched Problem 3 Add the salary $100,000 to those in Example 3 and compute the median and mean for these eight salaries.

The median, as we have defined it, is easy to determine and is not influenced by extreme values. Our definition does have some minor handicaps, however. First, if the measurements we are analyzing were carried out in a laboratory and presented to us in a frequency table, we may not have access to the individual measurements. In that case we would not be able to compute the median using the above definition.
Second, a set like 4, 4, 6, 7, 7, 7, 9 would have median 7 by our definition, but 7 does not possess the symmetry we expect of a middle element, since there are three measurements below 7 but only one above. To overcome these handicaps, we define a second concept, the median for grouped data. To guarantee that the median for grouped data exists and is unique, we assume that the frequency table for the grouped data has no classes of frequency 0.
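The two cases of the median definition for ungrouped data can be sketched in code as follows. The function name `median` is chosen for this illustration, and the salary figures are taken to be $17,000, $20,000, $28,000, $18,000, $18,000, $120,000, and $24,000, which is the reading consistent with the stated total of $245,000.

```python
def median(values):
    """Median of ungrouped data: middle value (odd count) or the
    mean of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Seven salaries from the text (assumed values, see the note above).
salaries = [17_000, 20_000, 28_000, 18_000, 18_000, 120_000, 24_000]
mean_salary = sum(salaries) / len(salaries)
print(median(salaries), mean_salary)  # 20000 35000.0

# A set with an even number of measurements has two middle elements.
print(median([9, 10, 15, 20, 23, 24]))  # 17.5

# The set 4, 4, 6, 7, 7, 7, 9 has median 7 even though only one
# measurement lies above 7.
print(median([4, 4, 6, 7, 7, 7, 9]))  # 7
```

The single large salary pulls the mean up to $35,000 while the median stays at $20,000, which is the contrast Example 3 draws.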

DEFINITION Median for Grouped Data
The median for grouped data with no classes of frequency 0 is the number such that the histogram has the same area to the left of the median as to the right of the median (see Fig. 2).

[FIGURE 2 The area to the left of the median equals the area to the right.]

EXAMPLE 4 Finding the Median for Grouped Data
Compute the median for the grouped data of Table 3.

[TABLE 3: columns Class Interval and Frequency; entries not recoverable]

Solution We first draw the histogram of the data (Fig. 3). The total area of the histogram is 15, which is just the sum of the frequencies, since all rectangles have a base of length 1. The area to the left of the median must be half the total area, that is, \(\frac{15}{2} = 7.5\). Looking at Figure 3 we see that the median M lies between 6.5 and 7.5. Thus, the area to the left of M, which is the sum of the blue shaded areas in Figure 3, must be 7.5:

\[ (1)(3) + (1)(1) + (1)(2) + (M - 6.5)(4) = 7.5 \]

Solving for M gives \(M = 6.875\). That is, the median for the grouped data in Table 3 is 6.875.

[FIGURE 3: histogram of the data in Table 3]

Matched Problem 4 Find the median for the grouped data in the following table:

[Table: columns Class Interval and Frequency; entries not recoverable]
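The equal-area idea behind the median for grouped data can be sketched as a short routine: accumulate rectangle areas from the left until half the total area is reached, then interpolate inside that class. The function name `grouped_median` is made up for this illustration. The first four frequencies (3, 1, 2, 4) and the total of 15 come from the equation in Example 4; the starting boundary 3.5 and the last two frequencies (3 and 2) are assumptions, chosen only so that the total is 15.

```python
def grouped_median(boundaries, frequencies):
    """Point that splits the histogram into two equal areas.

    boundaries has one more entry than frequencies; class i runs from
    boundaries[i] to boundaries[i + 1]. All frequencies are assumed
    to be positive, as the definition requires.
    """
    classes = list(zip(boundaries, boundaries[1:], frequencies))
    half = sum((hi - lo) * f for lo, hi, f in classes) / 2
    cumulative = 0.0
    for lo, hi, f in classes:
        area = (hi - lo) * f
        if cumulative + area >= half:
            # Solve cumulative + (M - lo) * f = half for M.
            return lo + (half - cumulative) / f
        cumulative += area
    raise ValueError("frequency table is empty")


# Assumed reconstruction of the Example 4 histogram (see note above).
boundaries = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
frequencies = [3, 1, 2, 4, 3, 2]
print(grouped_median(boundaries, frequencies))  # 6.875
```

How the remaining 5 measurements are split among the classes to the right of 7.5 does not affect the result, since only the total area and the areas to the left of M enter the equation.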

We have given geometric interpretations of the mean for grouped data as the balance point of a histogram (Fig. 1), and of the median for grouped data as the point that divides a histogram into equal areas (Fig. 2).
(A) Give an example of a simple histogram in which the balance point (the mean) and the point marking equal areas (the median) are two different points. Explain how other examples in which the mean does not equal the median could be constructed.
(B) Give an example of a simple histogram in which the balance point (the mean) and the point marking equal areas (the median) are the same point. Discuss properties of a histogram that would guarantee that the mean and the median are equal.
(C) Explain why the median for grouped data is not influenced by extreme values.

Mode

A third measure of central tendency is the mode.

DEFINITION Mode
The mode is the most frequently occurring measurement in a data set. There may be a unique mode, several modes, or, if no measurement occurs more than once, essentially no mode.

EXAMPLE 5 Finding Mode, Median, and Mean

Data Set                                   Mode    Median   Mean
(A) 4, 5, 5, 5, 6, 6, 7, 8, [rest of row not recoverable]
(B) 1, 2, 3, 3, 3, 5, 6, 7, 7, 7, 23       3, 7    5        6.09
(C) 1, 3, 5, 6, 7, 9, 11, 15, 16           None    [not recoverable]

Data set (B) in Example 5 is referred to as bimodal, since there are two modes. Since no measurement in data set (C) occurs more than once, we say that it has no mode.

Matched Problem 5 Compute the mode(s), median, and mean for each data set:
(A) 2, 1, 2, 1, 1, 5, 1, 9, 4
(B) 2, 5, 1, 4, 9, 8, 7
(C) 8, 2, 6, 8, 3, 3, 1, 5, 1, 8, 3

The mode, median, and mean can be computed in various ways with the aid of a graphing utility. In Figure 4A the data set of Example 5B is entered as a list, and its median and mean are computed. The histogram in Figure 4B shows the two modes of the same data set.

[FIGURE 4: graphing utility screens (A) and (B)]

As with the median, the mode is not influenced by extreme values. Suppose, in the data set of Example 5B, we replace 23 by 8. The modes remain 3 and 7 and the median is still 5, but the mean changes to 4.73. The mode is most useful for large data sets because it emphasizes data concentration. For example, a clothing retailer would be interested in the modal sizes of the various items stocked in a store, since these reflect customer demand.

The mode also can be used for qualitative attributes, that is, attributes that are not numerical. The mean and median are not suitable in these cases. For example, the mode can be used to give an indication of a favorite brand of ice cream or the worst movie of the year. Figure 5 shows the results of a random survey of 1,000 people on entree preferences when eating dinner out. According to this survey, we would say that the modal preference is beef. Note that the mode is the only measure of central tendency (location) that can be used for this type of data; the mean and median make no sense.

[FIGURE 5 The modal preference for an entree is beef. Categories: Fish, Shellfish, Beef, Lamb, Pork.]

In actual practice, the mean is used the most, the median next, and the mode a distant third.

For many sets of measurements the median lies between the mode and the mean. But this is not always so.
(A) In a class of seven students the scores on an exam were 52, 89, 89, 92, 93, 96, 99. Show that the mean is less than the mode, and that the mode is less than the median.
(B) Construct hypothetical sets of exam scores to show that all possible orders among the mean, median, and mode can occur.

Answers to Matched Problems
1. \(\bar{x} \approx 3.8\)
2. \(\bar{x} \approx 1.1\)
3. Median = $22,000; mean \(\approx\) $43,000
4. Median for grouped data = 6.8
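The definition of the mode, including the bimodal and "no mode" cases, can be illustrated with a short routine. This is a minimal sketch: the function name `modes` is chosen for this example, the numeric data are sets (B) and (C) of Example 5, and the short list of entree answers is hypothetical (it is not the survey behind Figure 5).

```python
from collections import Counter


def modes(values):
    """All most-frequent values, or an empty list when no value
    occurs more than once (essentially no mode)."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


data_b = [1, 2, 3, 3, 3, 5, 6, 7, 7, 7, 23]
print(modes(data_b))  # [3, 7]

# Replacing the extreme value 23 by 8 leaves the modes unchanged
# but moves the mean.
data_b_changed = [8 if x == 23 else x for x in data_b]
print(modes(data_b_changed))  # [3, 7]
print(round(sum(data_b) / len(data_b), 2))  # 6.09
print(round(sum(data_b_changed) / len(data_b_changed), 2))  # 4.73

print(modes([1, 3, 5, 6, 7, 9, 11, 15, 16]))  # []

# The mode also applies to qualitative data (hypothetical answers).
print(modes(["beef", "fish", "beef", "pork"]))  # ['beef']
```

Returning a list rather than a single value keeps the bimodal case explicit instead of silently picking one of the two modes.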

28 Section 8-2 Measures of Central Tendency First, arrange each set of data in ascending order: Data Set Mode Median Mean (A) 1, 1, 1, 1, 2, 2, 4, 5, (B) 1, 2, 4, 5, 7, 8, 9 None (C) 1, 1, 2, 3, 3, 3, 5, 6, 8, 8, 8 3, Exercise 8-2 A Find the mean, median, and mode for the sets of ungrouped data given in Problems 1 and 2. B 1. 1, 2, 2, 3, 3, 3, 3, 4, 4, , 1, 1, 1, 2, 3, 4, 5, 5, 5 Find the mean, median, and/or mode, whichever are applicable, in Problems 3 and Flavor Number Preferring Vanilla 139 Chocolate 376 Strawberry 89 Pistachio 15 Cherry 63 Almond mocha 228 Car Color Number Preferring Red 1,324 White 3,84 Black 1,617 Blue 2,33 Brown 2,718 Gold 1,992 Find the mean for the sets of grouped data in Problems 5 and Interval Frequency Interval Frequency Which single measure of central tendency the mean, median, or mode would you say best C describes the following set of measurements? Discuss the factors that justify your preference Which single measure of central tendency the mean, median, or mode would you say best describes the following set of measurements? Discuss the factors that justify your preference A data set is formed by recording the results of 1 rolls of a fair die. (A) What would you expect the mean of the data set to be? The median? (B) Form such a data set by using a graphing utility to simulate 1 rolls of a fair die, and find its mean and median. 1. A data set is formed by recording the sums on 2 rolls of a pair of fair dice. (A) What would you expect the mean of the data set to be? The median? (B) Form such a data set by using a graphing utility to simulate 2 rolls of a pair of fair dice, and find the mean and median of the set. 11. (A) Construct a set of four numbers that has mean 3, median 25, and mode 175. (B) Let m 1 7 m 2 7 m 3. Devise and discuss a procedure for constructing a set of four numbers that has mean m 1, median m 2, and mode m (A) Construct a set of five numbers that has mean 2, median 15, and mode 5. (B) Let m 1 7 m 2 7 m 3. 
Devise and discuss a procedure for constructing a set of five numbers that has mean m 1, median m 2, and mode m 3.

29 516 Chapter 8 Data Description and Probability Distributions Applications Business & Economics 13. Price earnings ratios. Find the mean, median, and mode for the data in the following table. Price Earnings Ratios for Eight Stocks in a Portfolio Gasoline tax. Find the mean, median, and mode for the data in the following table. State Gasoline Tax, 22 Tax State (Cents) Wisconsin 31.1 New York 3.25 Connecticut 25 Nebraska 24.5 Kansas 23 Texas 2 California 18 Florida Light bulb lifetime. Find the mean and median for the data in the following table. Life (Hours) of 5 Randomly Selected Light Bulbs Interval Frequency , ,99.5 1, , Price earnings ratios. Find the mean and median for the data in the following table. Price Earnings Ratios of 1 Randomly Chosen Stocks from The New York Stock Exchange Interval Frequency Financial aid. Find the mean, median, and mode for the data on federal student financial assistance in the following table. Average Federal Work Study Award Award Year ($) 18. Tourism. Find the mean, median, and mode for the data in the following table. Life Sciences , , , , , , ,252 International Tourism Receipts, 21 Receipts Country (billion $) United States 72.3 Spain 32.9 France 29.6 Italy 25.9 China 17.8 Germany 17.2 United Kingdom 15.9 Austria 12. Canada 1.8 Greece Mouse weights. Find the mean and median for the data in the following table. Mouse Weights (Grams) Interval Frequency

Blood cholesterol levels. Find the mean and median for the data in the following table: Blood Cholesterol Levels (Milligrams per Deciliter) Interval Frequency

Social Sciences

21. Immigration. Find the mean, median, and mode for the data in the following table. Top Ten Countries of Birth of U.S. Foreign-Born Population, 2 Number Country (thousands) Mexico 7,841 Philippines 1,222 China 1,67 India 1,7 Cuba 952 Vietnam 863 El Salvador 765 Korea 71 Dominican Republic 692 Great Britain

Grade-point averages. Find the mean and median for the grouped data in the following table. Graduating Class Grade-Point Averages Interval Frequency

Entrance examination scores. Compute the median for the grouped data of entrance examination scores given in Table

Presidents. Find the mean and median for the grouped data in the following table. U.S. Presidents Ages at Inauguration Age Number

Section 8-3 Measures of Dispersion

Range, Standard Deviation: Ungrouped Data, Standard Deviation: Grouped Data, Significance of Standard Deviation

A measure of central tendency gives us a typical value that can be used to describe a whole set of data, but this measure does not tell us whether the data are tightly clustered or widely dispersed. We now consider two measures of variation, range and standard deviation, that will give some indication of data scatter.

Range

A measure of dispersion, or scatter, that is easy to compute and is easily understood is the range. The range for a set of ungrouped data is the difference between the largest and the smallest values in the data set. The range for a frequency distribution is the difference between the upper boundary of the highest class and the lower boundary of the lowest class.

[FIGURE 1: three histograms, (A), (B), and (C), each with the same mean and the same range]

Consider the histograms in Figure 1. We see that the range adds only a little information about the amount of variation in a data set. The graphs clearly show that even though each data set has the same mean and range, all three sets differ in the amount of scatter, or variation, of the data relative to the mean. The data set in part (A) is tightly clustered about the mean; the data set in part (B) is dispersed away from the mean; and the data set in part (C) is uniformly distributed over its range. Since the range depends only on the extreme values of the data, it does not give us any information about the dispersion of the data between these extremes. We need a measure of dispersion that will give us some idea of how the data are clustered or scattered relative to the mean. The standard deviation is such a measure.

Standard Deviation: Ungrouped Data

We will develop the concepts of variance and standard deviation, both measures of variation, through a simple example. Suppose that a random sample of five stamped parts is selected from a manufacturing process, and these parts are found to have the following lengths (in centimeters):

5.2, 5.3, 5.2, 5.5, 5.3

Computing the sample mean, we obtain

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{5.2 + 5.3 + 5.2 + 5.5 + 5.3}{5} = 5.3 \text{ centimeters} \]

How much variation exists between the sample mean and all measurements in the sample? As a first attempt at measuring the variation, let us represent the deviation of a measurement from the mean by \((x_i - \bar{x})\). Table 1 lists all the deviations for this sample.

TABLE 1
\(x_i\)    \((x_i - \bar{x})\)
5.2    -0.1
5.3     0.0
5.2    -0.1
5.5     0.2
5.3     0.0

Using these deviations, what kind of formula can we find that will give us a single measure of variation? It appears that the average of the deviations might be a good measure. But look what happens when we add the second column in Table 1. We get 0! It turns out that this will always happen for any data set. Now what?
We could take the average of the absolute values of the deviations; however, this approach leads to problems relative to statistical inference. Instead, to get around the sign problem, we will take the average of the squares of the deviations and call this number the variance of the data set:

\[ [\text{variance}] = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} \tag{1} \]

Calculating the variance using the entries in Table 1, we have

\[ [\text{variance}] = \frac{\sum_{i=1}^{5} (x_i - 5.3)^2}{5} = 0.012 \text{ square centimeter} \]

We still have a problem, because the units in the variance are square centimeters instead of centimeters (the units of the original data set). To obtain the units of the original data set, we take the positive square root of the variance and call the result the standard deviation of the data set:

\[ [\text{standard deviation}] = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} = \sqrt{\frac{\sum_{i=1}^{5} (x_i - 5.3)^2}{5}} \approx 0.11 \text{ centimeter} \tag{2} \]

The sample variance is usually denoted by \(s^2\) and the population variance by \(\sigma^2\) (\(\sigma\) is the Greek lowercase letter "sigma"). The sample standard deviation is usually denoted by \(s\) and the population standard deviation by \(\sigma\).

In inferential statistics the sample variance \(s^2\) is often used as an estimator for the population variance \(\sigma^2\) and the sample standard deviation \(s\) for the population standard deviation \(\sigma\). It can be shown that one can obtain better estimates of the population parameters in terms of the sample parameters (particularly when using small samples) if the divisor n is replaced by n - 1 when computing sample variances or sample standard deviations. With these modifications, we have the following:

DEFINITION Variance: Ungrouped Data*
The sample variance \(s^2\) of a set of n sample measurements \(x_1, x_2, \ldots, x_n\) with mean \(\bar{x}\) is given by

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \tag{3} \]

If \(x_1, x_2, \ldots, x_n\) is the whole population with mean \(\mu\), then the population variance \(\sigma^2\) is given by

\[ \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n} \]

* In this section we restrict our interest to the sample variance.

The standard deviation is just the positive square root of the variance. Therefore, we have the following formulas:

DEFINITION Standard Deviation: Ungrouped Data*
The sample standard deviation \(s\) of a set of n sample measurements \(x_1, x_2, \ldots, x_n\) with mean \(\bar{x}\) is given by

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \tag{4} \]

If \(x_1, x_2, \ldots, x_n\) is the whole population with mean \(\mu\), then the population standard deviation \(\sigma\) is given by

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} \]

* In this section we restrict our interest to the sample standard deviation.
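The difference between dividing by n and dividing by n - 1 can be checked numerically. The sketch below is an illustration only: the function names `variance` and `std_dev` and the `sample` flag are made up for this example, and the five part lengths are taken to be 5.2, 5.3, 5.2, 5.5, and 5.3, where the fifth value is an assumption inferred from the stated mean of 5.3 centimeters.

```python
import math


def variance(values, sample=True):
    """Average squared deviation from the mean.

    sample=True divides by n - 1 (sample variance);
    sample=False divides by n (population variance).
    """
    n = len(values)
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def std_dev(values, sample=True):
    """Positive square root of the variance."""
    return math.sqrt(variance(values, sample))


# Part lengths in centimeters (fifth value assumed, see note above).
lengths = [5.2, 5.3, 5.2, 5.5, 5.3]

print(round(variance(lengths, sample=False), 3))  # 0.012
print(round(std_dev(lengths, sample=False), 2))   # 0.11
print(round(std_dev(lengths, sample=True), 2))    # 0.12
```

For real work, Python's standard `statistics` module provides `pstdev` (divisor n) and `stdev` (divisor n - 1), which should agree with these results.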

Computing the standard deviation for the original sample measurements (Table 1), we now obtain

\[ s = \sqrt{\frac{\sum_{i=1}^{5} (x_i - 5.3)^2}{5 - 1}} \approx 0.12 \text{ centimeter} \]

EXAMPLE 1 Finding the Standard Deviation
Find the standard deviation for the sample measurements 1, 3, 5, 4, 3.

Solution To find the standard deviation for the data set, we can utilize a table or use a calculator. Most will prefer the latter. Here is what we compute:

\[ \bar{x} = \frac{1 + 3 + 5 + 4 + 3}{5} = 3.2 \]

\[ s = \sqrt{\frac{(1 - 3.2)^2 + (3 - 3.2)^2 + (5 - 3.2)^2 + (4 - 3.2)^2 + (3 - 3.2)^2}{5 - 1}} \approx 1.48 \]

Matched Problem 1 Find the standard deviation for the sample measurements 1.2, 1.4, 1.7, 1.3, 1.5.

Remark Many calculators and graphing utilities can compute \(\bar{x}\) and \(s\) directly after the sample measurements are entered, a helpful feature, especially when the sample is fairly large. This shortcut is illustrated in Figure 2 for a particular graphing calculator, where the data from Example 1 are entered as a list and several different one-variable statistics are immediately calculated. Included among these statistics are the mean \(\bar{x}\), the sample standard deviation \(s\) (denoted by Sx in Fig. 2B), the population standard deviation \(\sigma\) (denoted by \(\sigma x\)), the number n of measurements, the smallest element of the data set (denoted by minX), the largest element of the data set (denoted by maxX), the median (denoted by Med), and several statistics we have not discussed.

[FIGURE 2: graphing calculator screens (A) Data, (B) Statistics, (C) Statistics (continued)]

Insight If the sample measurements in Example 1 are considered to constitute the whole population, then the population standard deviation \(\sigma x\) is approximately equal to 1.33 [see Fig. 2(B)]. The computation of \(\sigma x\) is the same as that of Example 1, except that the denominator n - 1 (= 5 - 1) under the radical sign is replaced by n (= 5). Consequently, Sx \(\approx\) 1.48 is greater than \(\sigma x \approx\) 1.33. Formulas (4) and (2) produce nearly the same results when the sample size n is large. The law of large numbers states that we can make a sample standard deviation \(s\) as close to the population standard deviation \(\sigma\) as we like by making the sample sufficiently large.

(A) When is the sample standard deviation of a set of measurements equal to 0?
(B) Can the sample standard deviation of a set of measurements ever be greater than the range? Explain why or why not.

Standard Deviation: Grouped Data

Formula (4) for sample standard deviation is extended to grouped sample data as described in the following box:

DEFINITION Standard Deviation: Grouped Data*
Suppose a data set of n sample measurements is grouped into k classes in a frequency table, where \(x_i\) is the midpoint and \(f_i\) is the frequency of the ith class interval. Then the sample standard deviation \(s\) for the grouped data is

\[ s = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{n - 1}} \tag{5} \]

where \(n = \sum_{i=1}^{k} f_i\) = total number of measurements. If the data set is the whole population with mean \(\mu\), then the population standard deviation \(\sigma\) is given by

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \mu)^2 f_i}{n}} \]

* In this section we restrict our interest to the sample standard deviation.

EXAMPLE 2 Finding the Standard Deviation for Grouped Data
Find the standard deviation for each set of grouped sample data.

[Histograms (A) and (B): class midpoints 8, 9, 10, 11, 12; mean 10 in each]

Solution
(A)
\[ s = \sqrt{\frac{(8 - 10)^2 (1) + (9 - 10)^2 (2) + (10 - 10)^2 (4) + (11 - 10)^2 (2) + (12 - 10)^2 (1)}{10 - 1}} = \sqrt{\frac{12}{9}} \approx 1.15 \]

(B)
\[ s = \sqrt{\frac{(8 - 10)^2 (4) + (9 - 10)^2 (1) + (10 - 10)^2 (0) + (11 - 10)^2 (1) + (12 - 10)^2 (4)}{10 - 1}} = \sqrt{\frac{34}{9}} \approx 1.94 \]
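Formula (5) can be sketched in code and checked against both parts of Example 2. The function name `grouped_sample_std` is chosen for this illustration; the midpoints and frequencies are the ones that appear in the two solutions above.

```python
import math


def grouped_sample_std(midpoints, frequencies):
    """Formula (5): sample standard deviation for grouped data."""
    n = sum(frequencies)  # total number of measurements
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    weighted_squares = sum(
        (x - mean) ** 2 * f for x, f in zip(midpoints, frequencies)
    )
    return math.sqrt(weighted_squares / (n - 1))


midpoints = [8, 9, 10, 11, 12]
freq_a = [1, 2, 4, 2, 1]  # data clustered near the mean
freq_b = [4, 1, 0, 1, 4]  # data pushed away from the mean

print(round(grouped_sample_std(midpoints, freq_a), 2))  # 1.15
print(round(grouped_sample_std(midpoints, freq_b), 2))  # 1.94
```

Both tables have the same mean (10) and the same range, so the larger value for (B) reflects only how far the measurements sit from the mean.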

Comparing the results of parts (A) and (B) in Example 2, we find that the larger standard deviation is associated with the data that deviate furthest from the mean.

Matched Problem 2 Find the standard deviation for the grouped sample data shown below.

[Histogram for Matched Problem 2: mean 10; frequencies not recoverable]

Remark Figure 3 illustrates the shortcut computation of the mean and standard deviation on a particular graphing calculator when the data are grouped. The sample data of Example 2A are entered. List L1 contains the midpoints of the class intervals, and list L2 contains the corresponding frequencies. The mean, standard deviation, and other one-variable statistics are then calculated immediately.

[FIGURE 3: graphing calculator screens (A) and (B)]

Significance of Standard Deviation

The standard deviation can give us additional information about a frequency distribution of a set of raw data. Suppose we draw a smooth curve through the midpoints of the tops of the rectangles forming a histogram for a fairly large frequency distribution (see Fig. 4). If the resulting curve is approximately bell-shaped, then it can be shown that approximately 68% of the data will lie in the interval from \(\bar{x} - s\) to \(\bar{x} + s\), about 95% of the data will lie in the interval from \(\bar{x} - 2s\) to \(\bar{x} + 2s\), and almost all the data will lie in the interval from \(\bar{x} - 3s\) to \(\bar{x} + 3s\). We will have much more to say about this in Section 8-5.

[FIGURE 4: bell-shaped curve over a histogram, with the horizontal axis marked at \(\bar{x} - 3s\), \(\bar{x} - 2s\), \(\bar{x} - s\), \(\bar{x}\), \(\bar{x} + s\), \(\bar{x} + 2s\), \(\bar{x} + 3s\); the region from \(\bar{x} - s\) to \(\bar{x} + s\) is labeled 68%]
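The 68%, 95%, and "almost all" figures can be explored with simulated data. The sketch below is only an illustration under stated assumptions: the function name `fraction_within` is made up, and the 10,000 "scores" are hypothetical values drawn from a normal distribution with mean 500 and standard deviation 100, not data from the text.

```python
import random
import statistics


def fraction_within(values, k):
    """Fraction of the values lying within k sample standard
    deviations of the sample mean."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # divisor n - 1, as in formula (4)
    inside = sum(1 for x in values if m - k * s <= x <= m + k * s)
    return inside / len(values)


# Hypothetical bell-shaped data (assumed parameters, fixed seed so
# the run is repeatable).
rng = random.Random(0)
scores = [rng.gauss(500, 100) for _ in range(10_000)]

for k in (1, 2, 3):
    print(k, round(fraction_within(scores, k), 3))
```

For a sample this large the three printed fractions should come out close to 0.68, 0.95, and 1.00. Applying the same function to a data set that is far from bell-shaped can give quite different proportions.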

36 Section 8-3 Measures of Dispersion (A) Verify that the following sample of 21 measurements has mean x = 4.95 and sample standard deviation s = Therefore, 15 of the 21 measurements, or 71% of the data, lie in the interval , that is, within 1 standard deviation of the mean. What proportion of the data lies within 2 standard deviations of the mean? Within 3 standard deviations? (B) What proportion of the following data set lies within 1 standard deviation of the mean? Within 2 standard deviations? Within 3 standard deviations? (C) Based on your answers to parts (A) and (B), which of the two data sets would have a histogram that is approximately bell shaped? Confirm your conjecture by constructing a histogram with class interval width 1, starting at -.5, for each data set. Answers to Matched Problems 1. s L s L 1.49 (a value between those found in Example 2, as expected) Exercise 8-3 A In Problems 1 and 2, find the standard deviation for each set of ungrouped sample data using formula (4). 1. 1, 2, 2, 3, 3, 3, 3, 4, 4, , 1, 1, 1, 2, 3, 4, 5, 5, 5 3. (A) What proportion of the following sample of ten measurements lies within 1 standard deviation of the mean? Within 2 standard deviations? Within 3 standard deviations? (B) Based on your answers to part (A), would you conjecture that the histogram is approximately bell shaped? Explain. (C) To confirm your conjecture, construct a histogram with class interval width 1, starting at (A) What proportion of the following sample of ten measurements lies within 1 standard deviation of the mean? Within 2 standard deviations? Within 3 standard deviations? (B) Based on your answers to part (A), would you conjecture that the histogram is approximately bell shaped? Explain. (C) To confirm your conjecture, construct a histogram with class interval width 1, starting at.5. B In Problems 5 and 6, find the standard deviation for each set of grouped sample data using formula (5) Interval Frequency Interval Frequency

37 524 Chapter 8 Data Description and Probability Distributions C In Problems 7 and 8, discuss the validity of each statement. If the statement is always true, explain why. If not, give a counterexample. 7. (A) The sample variance of a set of n sample measurements is always greater than or equal to the sample standard deviation. (B) The population variance of x 1, x 2, p, x n is always greater than or equal to. 8. (A) The sample variance of a set of n sample measurements is always positive. (B) For a sample x 1, x 2 of size two, the sample variance is equal to (x 1 - x 2 ) 2 9. A data set is formed by recording the sums in 1 rolls of a pair of dice. A second data set is formed by 2 recording the results of 1 draws of a ball from a box containing 11 balls numbered 2 through 12. (A) Which of the two data sets would you expect to have the smaller standard deviation? Explain. (B) To obtain evidence for your answer to part (A), use a graphing utility to simulate both experiments, and compute the standard deviations of each of the two data sets. 1. A data set is formed by recording the results of rolling a fair die 2 times. A second data set is formed by rolling a pair of dice 2 times, each time recording the minimum of the two numbers. (A) Which of the two data sets would you expect to have the smaller standard deviation? Explain. (B) To obtain evidence for your answer to part (A), use a graphing utility to simulate both experiments, and compute the standard deviations of each of the two data sets. Applications Find the mean and standard deviation for each of the sample data sets given in Problems Use the suggestions in the remarks following Examples 1 and 2 to perform some of the computations. Business & Economics 11. Earnings per share. The earnings per share (in dollars) for 12 companies selected at random from the list of Fortune 5 companies are Checkout times. 
The checkout times (in minutes) for 12 randomly selected customers at a large supermarket during the store s busiest time are Quality control. The lives (in hours of continuous use) of 1 randomly selected flashlight batteries are Interval Frequency Stock analysis. The price earnings ratios of 1 randomly selected stocks from the New York Stock Exchange are Interval Frequency Life Sciences 15. Medicine. The reaction times (in minutes) of a drug given to a random sample of 12 patients are Nutrition: animals. The mouse weights (in grams) of a random sample of 1 mice involved in a nutrition experiment are Interval Frequency

Social Sciences

17. Reading scores. The grade-level reading scores from a reading test given to a random sample of 12 students in an urban high school graduating class are

Grade-point average. The grade-point averages of a random sample of 1 students from the graduating class of a large university are Interval Frequency

Section 8-4 Bernoulli Trials and Binomial Distributions

Bernoulli Trials, Binomial Formula: Brief Review, Binomial Distribution, Application

In Section 8-1 we discussed frequency and relative frequency distributions, which were represented by tables and histograms (see Table 4, page 497, and Fig. 11, page 499). Frequency distributions and their corresponding probability distributions based on actual observations are empirical in nature. But there are many situations in which it is of interest (and possible) to determine the kind of relative frequency distribution we might expect before any data have actually been collected. What we have in mind is a theoretical, or hypothetical, probability distribution, that is, a probability distribution based on assumptions and theory rather than actual observations or measurements. Theoretical probability distributions are used to approximate properties of real-world distributions, assuming the theoretical and empirical distributions are closely matched.

There are many interesting theoretical probability distributions. One of particular interest because of its widespread use is the binomial distribution. The reason for the name binomial distribution is that the distribution is closely related to the binomial expansion of \((q + p)^n\), where n is a natural number. We start the discussion with a particular type of experiment called a Bernoulli experiment, or trial.

Bernoulli Trials

If we toss a coin, either a head occurs or it does not. If we roll a die, either a 3 shows or it fails to show.
If you are vaccinated for smallpox, either you contract smallpox or you do not. What do all these situations have in common? All can be classified as experiments with two possible outcomes, each the complement of the other. An experiment for which there are only two possible outcomes, $E$ or $E'$, is called a Bernoulli experiment, or trial, named after Jacob Bernoulli, the Swiss scientist and mathematician who was one of the first to study systematically the probability problems related to a two-outcome experiment.

In a Bernoulli experiment or trial, it is customary to refer to one of the two outcomes as a success $S$ and to the other as a failure $F$. If we designate the probability of success by

$$P(S) = p$$

then the probability of failure is

$$P(F) = 1 - p = q \qquad \text{Note: } p + q = 1$$

EXAMPLE 1 Probability of Success in a Bernoulli Trial
Suppose that we roll a fair die and ask for the probability of a 6 turning up. This can be viewed as a Bernoulli trial by identifying a success with a 6 turning up and a failure with any of the other numbers turning up. Thus,

$$p = \tfrac{1}{6} \quad \text{and} \quad q = 1 - \tfrac{1}{6} = \tfrac{5}{6}$$

Matched Problem 1
Find $p$ and $q$ for a single roll of a fair die, where a success is a number divisible by 3 turning up.

Now, suppose that a Bernoulli trial is repeated a number of times. It becomes of interest to determine the probability of a given number of successes out of the given number of trials. For example, we might be interested in the probability of obtaining exactly three 5's in six rolls of a fair die or the probability that 8 people will not catch influenza out of the 10 who have been inoculated.

Suppose that a Bernoulli trial is repeated five times so that each trial is completely independent of any other and $p$ is the probability of success on each trial. Then, because the trials are independent (see Section 7-3), the probability of the outcome SSFFS would be

$$P(SSFFS) = P(S)P(S)P(F)P(F)P(S) = ppqqp = p^3q^2$$

In general, we define a sequence of Bernoulli trials as follows:

DEFINITION Bernoulli Trials
A sequence of experiments is called a sequence of Bernoulli trials, or a binomial experiment, if
1. Only two outcomes are possible on each trial.
2. The probability of success $p$ for each trial is a constant (probability of failure is then $q = 1 - p$).
3. All trials are independent.

The reason for calling a sequence of Bernoulli trials a binomial experiment will be made clear shortly.
EXAMPLE 2 Probability of an Outcome of a Binomial Experiment
If we roll a fair die five times and identify a success in a single roll with a 1 turning up, what is the probability of the sequence SFFSS occurring?

Solution
$$p = \tfrac{1}{6} \qquad q = 1 - p = \tfrac{5}{6}$$
$$P(SFFSS) = pqqpp = p^3q^2 = \left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2 \approx .003$$
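The multiplication in Example 2 can be checked with a few lines of Python. This is only a sketch: the helper name `sequence_probability` is made up for illustration, and exact fractions are used so that no rounding enters until the last step.

```python
from fractions import Fraction

def sequence_probability(sequence: str, p: Fraction) -> Fraction:
    """Probability of one specific sequence of independent Bernoulli trials.

    `sequence` is a string of 'S' (success) and 'F' (failure) characters;
    `p` is the probability of success on each trial.
    """
    q = 1 - p
    probability = Fraction(1)
    for outcome in sequence:
        if outcome not in "SF":
            raise ValueError("sequence may contain only 'S' and 'F'")
        # Independence lets us multiply the single-trial probabilities.
        probability *= p if outcome == "S" else q
    return probability

p = Fraction(1, 6)                      # success = a 1 turns up on a fair die
sffss = sequence_probability("SFFSS", p)
ssffs = sequence_probability("SSFFS", p)
print(sffss, round(float(sffss), 3))    # exact value, then rounded to 3 places
print(sffss == ssffs)                   # same count of S's and F's, same probability
```

Any sequence with three S's and two F's gives the same product $p^3q^2$; only the order of the factors changes.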

Matched Problem 2
In Example 2, find the probability of the outcome FSSSF.

If we roll a fair die five times, what is the probability of obtaining exactly three 1's? Notice how this problem differs from Example 2. In that example we looked at only one way three 1's can occur. Then in Matched Problem 2 we saw another way. Thus, exactly three 1's may occur in the following two sequences (among others):

SFFSS    FSSSF

We found that the probability of each sequence occurring is the same, namely,

$$\left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2$$

How many more sequences will produce exactly three 1's? To answer this question, think of the number of ways the following five blank positions can be filled with three S's and two F's:

$$b_1 \quad b_2 \quad b_3 \quad b_4 \quad b_5$$

A given sequence is determined, of course, once the S's are located. Thus, we are interested in the number of ways three blank positions can be selected for the S's out of the five available blank positions $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$. This problem should sound familiar: it is just the problem of finding the number of combinations of 5 objects taken 3 at a time; that is, $C_{5,3}$. Thus, the number of different sequences of successes and failures that produce exactly three successes (exactly three 1's) is

$$C_{5,3} = \frac{5!}{3!\,2!} = 10$$

Since the probability of each sequence is the same, $p^3q^2 = \left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2$, and there are 10 mutually exclusive sequences that produce exactly three 1's, we have

$$P(\text{exactly three successes}) = C_{5,3}\left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2 = \frac{5!}{3!\,2!}\left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2 = (10)\left(\tfrac{1}{6}\right)^3\left(\tfrac{5}{6}\right)^2 \approx .032$$

Reasoning in essentially the same way, the following important theorem can be proved:

THEOREM 1 Probability of x Successes in n Bernoulli Trials
The probability of exactly $x$ successes in $n$ independent repeated Bernoulli trials, with the probability of success of each trial $p$ (and of failure $q$), is

$$P(x \text{ successes}) = C_{n,x}\,p^x q^{n-x} \qquad (1)$$

EXAMPLE 3 Probability of x Successes in n Bernoulli Trials
If a fair die is rolled five times, what is the probability of rolling
(A) Exactly two 3's?
(B) At least two 3's?

Solution
(A) Use formula (1) with $n = 5$, $x = 2$, and $p = \tfrac{1}{6}$:

$$P(x = 2) = C_{5,2}\left(\tfrac{1}{6}\right)^2\left(\tfrac{5}{6}\right)^3 = \frac{5!}{2!\,3!}\left(\tfrac{1}{6}\right)^2\left(\tfrac{5}{6}\right)^3 \approx .161$$

(B) Notice how this problem differs from part (A). Here we have

$$P(x \ge 2) = P(x = 2) + P(x = 3) + P(x = 4) + P(x = 5)$$

It is actually easier to compute the probability of the complement of this event, $P(x < 2)$, and use

$$P(x \ge 2) = 1 - P(x < 2)$$

where

$$P(x < 2) = P(x = 0) + P(x = 1)$$

We now compute $P(x = 0)$ and $P(x = 1)$:

$$P(x = 0) = C_{5,0}\left(\tfrac{1}{6}\right)^0\left(\tfrac{5}{6}\right)^5 = \left(\tfrac{5}{6}\right)^5 \approx .402$$
$$P(x = 1) = C_{5,1}\left(\tfrac{1}{6}\right)^1\left(\tfrac{5}{6}\right)^4 = \frac{5!}{1!\,4!}\left(\tfrac{1}{6}\right)^1\left(\tfrac{5}{6}\right)^4 \approx .402$$

Thus,

$$P(x < 2) = .402 + .402 = .804$$

and

$$P(x \ge 2) = 1 - .804 = .196$$

Matched Problem 3
Using the same die experiment as in Example 3, what is the probability of rolling
(A) Exactly one 3?
(B) At least one 3?

Binomial Formula: Brief Review

Before extending Bernoulli trials to binomial distributions it is worthwhile to review briefly the binomial formula. (A more detailed discussion of this formula can be found in Appendix B-3.) To start, let us calculate directly the first five natural number powers of $a + b$:

$$(a + b)^1 = a + b$$
$$(a + b)^2 = a^2 + 2ab + b^2$$
$$(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$$
$$(a + b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$$
$$(a + b)^5 = a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5$$

In general, it can be shown that a binomial expansion is given by the well-known binomial formula:

RESULT Binomial Formula
For $n$ a natural number,

$$(a + b)^n = C_{n,0}\,a^n + C_{n,1}\,a^{n-1}b + C_{n,2}\,a^{n-2}b^2 + \cdots + C_{n,n}\,b^n$$
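As a numerical check on Theorem 1 and the binomial formula, the following Python sketch recomputes the two answers of Example 3 and lists the coefficients $C_{5,k}$ that appear in the expansion of $(a + b)^5$. The function name `binomial_probability` is an illustrative choice, not something taken from the text.

```python
from fractions import Fraction
from math import comb  # comb(n, x) is the number of combinations C(n, x)

def binomial_probability(n: int, x: int, p: Fraction) -> Fraction:
    """P(exactly x successes in n Bernoulli trials), formula (1)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

p = Fraction(1, 6)  # success = a 3 turns up on one roll of a fair die

# Example 3(A): exactly two 3's in five rolls.
exactly_two = binomial_probability(5, 2, p)

# Example 3(B): at least two 3's, computed through the complement.
fewer_than_two = binomial_probability(5, 0, p) + binomial_probability(5, 1, p)
at_least_two = 1 - fewer_than_two

print(round(float(exactly_two), 3), round(float(at_least_two), 3))

# Coefficients of (a + b)^5 from the binomial formula.
coefficients = [comb(5, k) for k in range(6)]
print(coefficients)

# With a = q and b = p the expansion sums to (q + p)^5 = 1.
total = sum(binomial_probability(5, x, p) for x in range(6))
print(total)
```

Working with `Fraction` keeps every intermediate value exact, so the final rounding can be compared directly with the three-decimal values in the solution.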

EXAMPLE 4 Finding Binomial Expansions
Use the binomial formula to expand $(q + p)^3$.

Solution
$$(q + p)^3 = C_{3,0}\,q^3 + C_{3,1}\,q^2p + C_{3,2}\,qp^2 + C_{3,3}\,p^3 = q^3 + 3q^2p + 3qp^2 + p^3$$

Matched Problem 4
Use the binomial formula to expand $(q + p)^4$.

Binomial Distribution

We now generalize the discussion of Bernoulli trials to binomial distributions. We start by considering a sequence of three Bernoulli trials. Let the random variable $X_3$ (see Section 7-5) represent the number of successes in three trials, 0, 1, 2, or 3. We are interested in the probability distribution for this random variable. Which outcomes of an experiment consisting of a sequence of three Bernoulli trials lead to the random variable values 0, 1, 2, and 3, and what are the probabilities associated with these values? Table 1 answers these questions.

TABLE 1

Simple Event   Probability of Simple Event   X_3 (x successes in 3 trials)   P(X_3 = x)
FFF            qqq = q^3                     0                               q^3
FFS            qqp = q^2 p                   1                               3q^2 p
FSF            qpq = q^2 p
SFF            pqq = q^2 p
FSS            qpp = qp^2                    2                               3qp^2
SFS            pqp = qp^2
SSF            ppq = qp^2
SSS            ppp = p^3                     3                               p^3

The terms in the last column of Table 1 are the terms in the binomial expansion of $(q + p)^3$, as we saw in Example 4. The last two columns in Table 1 provide a probability distribution for the random variable $X_3$. Note that both conditions for a probability distribution (see Section 7-5) are met:

1. $0 \le P(X_3 = x) \le 1$, $x \in \{0, 1, 2, 3\}$
2. Recalling that $q + p = 1$,
$$1 = 1^3 = (q + p)^3 = C_{3,0}\,q^3 + C_{3,1}\,q^2p + C_{3,2}\,qp^2 + C_{3,3}\,p^3 = q^3 + 3q^2p + 3qp^2 + p^3 = P(X_3 = 0) + P(X_3 = 1) + P(X_3 = 2) + P(X_3 = 3)$$

Reasoning in the same way for the general case, we see why the probability distribution of a random variable associated with the number of successes in a sequence of $n$ Bernoulli trials is called a binomial distribution: the probability of each number is a term in the binomial expansion of $(q + p)^n$.
For this reason, a sequence of Bernoulli trials is often referred to as a binomial experiment. In terms of a formula, which we already discussed from another point of view (see Theorem 1), we have

DEFINITION Binomial Distribution

$$P(X_n = x) = P(x \text{ successes in } n \text{ trials}) = C_{n,x}\,p^x q^{n-x} \qquad x \in \{0, 1, 2, \ldots, n\}$$

where $p$ is the probability of success and $q$ is the probability of failure on each trial. Informally, we will write $P(x)$ in place of $P(X_n = x)$.

EXAMPLE 5 Constructing Tables and Histograms for Binomial Distributions
Suppose a fair die is rolled three times and a success on a single roll is considered to be rolling a number divisible by 3.
(A) Write the probability function for the binomial distribution.
(B) Construct a table for this binomial distribution.
(C) Draw a histogram for this binomial distribution.

Solution
(A) $p = \tfrac{1}{3}$, since two numbers out of six are divisible by 3. Hence, $q = 1 - p = \tfrac{2}{3}$ and $n = 3$, so

$$P(x) = P(x \text{ successes in 3 trials}) = C_{3,x}\left(\tfrac{1}{3}\right)^x\left(\tfrac{2}{3}\right)^{3-x}$$

(B)
x    P(x)
0    C_{3,0} (1/3)^0 (2/3)^3 ≈ .30
1    C_{3,1} (1/3)^1 (2/3)^2 ≈ .44
2    C_{3,2} (1/3)^2 (2/3)^1 ≈ .22
3    C_{3,3} (1/3)^3 (2/3)^0 ≈ .04

(C) [Figure 1: histogram of P(x) against the number of successes, x]

If we actually performed the binomial experiment described in Example 5 a large number of times with a fair die, we would find that we would roll no number divisible by 3 in three rolls of a die about 30% of the time, one number divisible by 3 in three rolls about 44% of the time, two numbers divisible by 3 in three rolls about 22% of the time, and three numbers divisible by 3 in three rolls only 4% of the time. Note that the sum of all the probabilities is 1, as it should be.

The graphing utility command in Figure 2A simulates 100 repetitions of the binomial experiment in Example 5. The number of successes in each repetition is stored in list $L_1$. From Figure 2B, which shows a histogram of $L_1$, we note that the empirical probability of rolling one number divisible by 3 in three rolls is $\tfrac{40}{100} = 40\%$, close to the theoretical probability of 44%. The empirical probabilities of 0, 2, or 3 successes also would be close to the corresponding theoretical probabilities.
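The graphing-utility simulation described above can be imitated in Python. The sketch below is not the calculator command from Figure 2A; it is an assumed stand-in built on the standard `random` module, with hypothetical helper names and an arbitrary seed. Each repetition performs three Bernoulli trials with $p = \tfrac{1}{3}$ and records the number of successes.

```python
import random
from collections import Counter

def simulate_binomial(n, p, repetitions, seed=None):
    """Return the number of successes in each repetition of n Bernoulli trials."""
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    return [sum(rng.random() < p for _ in range(n)) for _ in range(repetitions)]

def relative_frequencies(successes, n):
    """Empirical probability of 0, 1, ..., n successes."""
    counts = Counter(successes)
    total = len(successes)
    return [counts[x] / total for x in range(n + 1)]

# Theoretical probabilities from Example 5 (n = 3, p = 1/3).
theoretical = [8 / 27, 12 / 27, 6 / 27, 1 / 27]

# A small run, comparable in size to a classroom simulation.
empirical = relative_frequencies(simulate_binomial(3, 1 / 3, 100, seed=2024), 3)

for x in range(4):
    print(x, round(empirical[x], 2), round(theoretical[x], 2))
```

With only a small number of repetitions the empirical values can differ from the theoretical ones by several percentage points; increasing `repetitions` brings them closer together.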

[Figure 2: panels (A) and (B)]

Matched Problem 5
Repeat Example 5, where the binomial experiment consists of two rolls of a die instead of three rolls.

Let $X$ be a random variable with probability distribution

x_i    x_1   x_2   ...   x_n
p_i    p_1   p_2   ...   p_n

In Section 7-5 we defined the expected value of $X$ to be

$$E(X) = x_1p_1 + x_2p_2 + \cdots + x_np_n$$

The expected value of $X$ is also called the mean of the random variable $X$, often denoted by $\mu$. The standard deviation of a random variable $X$ having mean $\mu$ is defined by

$$\sigma = \sqrt{(x_1 - \mu)^2p_1 + (x_2 - \mu)^2p_2 + \cdots + (x_n - \mu)^2p_n}$$

If a random variable has a binomial distribution, where $n$ is the number of Bernoulli trials, $p$ is the probability of success, and $q$ the probability of failure, then the mean and standard deviation are given by the following formulas.

RESULT Mean and Standard Deviation (Random Variable in a Binomial Distribution)
Mean: $\mu = np$
Standard deviation: $\sigma = \sqrt{npq}$

Insight
Let the random variable $X_3$ denote the number $x$ of successes in a sequence of three Bernoulli trials. Then $x = 0$, 1, 2, or 3. The expected value of $X_3$ is given by

$$E(X_3) = 0 \cdot P(0) + 1 \cdot P(1) + 2 \cdot P(2) + 3 \cdot P(3)$$
$$= 0 + 1 \cdot 3q^2p + 2 \cdot 3qp^2 + 3 \cdot p^3 \qquad \text{See Table 1.}$$
$$= 3p(q^2 + 2qp + p^2) \qquad \text{Factor out } 3p.$$
$$= 3p(q + p)^2 \qquad \text{Factor the perfect square.}$$
$$= 3p \qquad q + p = 1$$

This proves the formula $\mu = np$ in the case $n = 3$. Similar but more complicated computations can be used to justify the general formulas $\mu = np$ and $\sigma = \sqrt{npq}$ for the mean and standard deviation of random variables having binomial distributions.
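The formulas $\mu = np$ and $\sigma = \sqrt{npq}$ can also be checked numerically against the definitions of expected value and standard deviation. The Python sketch below does this for the distribution of Example 5 ($n = 3$, $p = \tfrac{1}{3}$); the function names are illustrative assumptions.

```python
from math import comb, isclose, sqrt

def binomial_table(n, p):
    """List of P(x) for x = 0, 1, ..., n in a binomial distribution."""
    q = 1 - p
    return [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

def mean_and_sd(probabilities):
    """Mean and standard deviation computed directly from the definitions."""
    mu = sum(x * px for x, px in enumerate(probabilities))
    variance = sum((x - mu) ** 2 * px for x, px in enumerate(probabilities))
    return mu, sqrt(variance)

n, p = 3, 1 / 3
mu, sigma = mean_and_sd(binomial_table(n, p))

print(round(mu, 2), round(n * p, 2))                         # definition vs. np
print(round(sigma, 2), round(sqrt(n * p * (1 - p)), 2))      # definition vs. sqrt(npq)
```

Changing `n` and `p` shows the same agreement for other binomial distributions, which is what the general formulas assert.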

EXAMPLE 6 Computing the Mean and Standard Deviation of a Binomial Distribution
Compute the mean and standard deviation for the random variable in Example 5.

Solution
$$n = 3 \qquad p = \tfrac{1}{3} \qquad q = 1 - \tfrac{1}{3} = \tfrac{2}{3}$$
$$\mu = np = 3\left(\tfrac{1}{3}\right) = 1 \qquad \sigma = \sqrt{npq} = \sqrt{3\left(\tfrac{1}{3}\right)\left(\tfrac{2}{3}\right)} \approx .82$$

Matched Problem 6
Compute the mean and standard deviation for the random variable in Matched Problem 5.

1 Let X 1 denote the number of successes of 1 Bernoulli trials, each with probability of success p. (A) For what values of p would the mean of be equal to? 5? 1? (B) For what values of p would the standard deviation of X 1 be equal to? 5? 1?

Application

Binomial experiments are associated with a wide variety of practical problems: industrial sampling, drug testing, genetics, epidemics, medical diagnosis, opinion polls, analysis of social phenomena, qualifying tests, and so on. Several types of applications are included in Exercise 8-4. We will now consider one application in detail.

EXAMPLE 7 Patient Recovery
The probability of recovering after a particular type of operation is .5. Let us investigate the binomial distribution involving eight patients undergoing this operation.
(A) Write the function defining this distribution.
(B) Construct a table for the distribution.
(C) Construct a histogram for the distribution.
(D) Find the mean and standard deviation for the distribution.

Solution
(A) Letting a recovery be a success, we have
$$p = .5 \qquad q = 1 - p = .5 \qquad n = 8$$
Hence,
$$P(x) = P(\text{exactly } x \text{ successes in 8 trials}) = C_{8,x}(.5)^x(.5)^{8-x} = C_{8,x}(.5)^8$$

(B)
x    P(x)
0    C_{8,0} (.5)^8 ≈ .004
1    C_{8,1} (.5)^8 ≈ .031
2    C_{8,2} (.5)^8 ≈ .109
3    C_{8,3} (.5)^8 ≈ .219
4    C_{8,4} (.5)^8 ≈ .273
5    C_{8,5} (.5)^8 ≈ .219
6    C_{8,6} (.5)^8 ≈ .109
7    C_{8,7} (.5)^8 ≈ .031
8    C_{8,8} (.5)^8 ≈ .004
     Sum: .999 ≈ 1

The discrepancy in the sum is due to round-off errors.

(C) [Histogram of P(x) against the number of successes, x]

46 Section 8-4 Bernoulli Trials and Binomial Distributions 533 Matched Problem 7 (D) = np = 8(.5) = 4 = 2npq = 28(.5)(.5) L 1.41 Repeat Example 7 for four patients. 2 The mean of a random variable is its expected value. Use the distribution tables for the random variables of Examples 5 and 7 to compute the expected values. Do your answers agree with the results obtained using the formula = np? Explain. Answers to Matched Problems 1. p = 1 3, q = 2 2. p 3 q 2 = A 1 6B 3 A 5 3 6B 2 L.3 3. (A).42 (B) 1 - P(x = ) = = C 4, q 4 + C 4,1 q 3 p + C 4,2 q 2 p 2 + C 4,3 qp 3 + C 4,4 p 4 = q 4 + 4q 3 p + 6q 2 p 2 + 4qp 3 + p 4 5. (A) P(x) = P(x successes in 2 trials) = C 2,x A 1 3B x A 2 3B 2 - x, x {, 1, 2} (B) (C) P(x) x P(x) L L L Number of successes, x 6. L.67; L (A) P(x) = P(exactly x successes in 4 trials) = C 4,x (.5) 4 (B) (C) P(x) x P(x) Number of successes, x (D) = 2; = 1 Exercise 8-4 A Evaluate C n,x p x q n - x for the values of n, x, and p given in Problems n = 5, x = 1, p = n = 5, x = 2, p = n = 6, x = 3, p =.4 4. n = 6, x = 6, p =.4 5. n = 4, x = 3, p = n = 4, x = 3, p = 1 3 In Problems 7 1, a fair coin is tossed four times. What is the probability of obtaining 7. A head on the first toss and tails on each of the other tosses? 8. Exactly one head? 9. At least three tails? 1. Tails on each of the first three tosses? 11. No heads? 12. Four heads? In Problems 13 18, construct a histogram for the binomial distribution P(x) = C n, x p x q n - x, and compute the mean and standard deviation if 13. n = 3, p = n = 3, p = n = 4, p = n = 5, p = n = 5, p = 18. n = 4, p = 1

47 534 Chapter 8 Data Description and Probability Distributions B In Problems 19 24, a fair die is rolled three times. What is the probability of obtaining 19. A 6, 5, and 6, in that order? 2. A 6, 5, and 6, in any order? 21. At least two 6 s? 22. Exactly one 6? 23. No 6 s? 24. At least one 5? 25. If a baseball player has a batting average of.35, what is the probability that the player will get the following number of hits in the next four times at bat? (A) Exactly 2 hits (B) At least 2 hits 26. If a true false test with 1 questions is given, what is the probability of scoring (A) Exactly 7% just by guessing? (B) 7% or better just by guessing? 27. A multiple-choice test consists of 1 questions, each with choices A, B, C, D, E (of which exactly one choice is correct). Which is more likely if you simply guess at each question: that all your answers are wrong, or that at least half are right? Explain. 28. If 6% of the electorate supports the mayor, what is the probability that in a random sample of 1 voters, fewer than half support her? Construct a histogram for each of the binomial distributions in Problems Compute the mean and standard deviation for each distribution. 29. P(x) = C 6, x (.4) x (.6) 6 - x 3. P(x) = C 6, x (.6) x (.4) 6 - x 31. P(x) = C 8, x (.3) x (.7) 8 - x 32. P(x) = C 8, x (.7) x (.3) 8 - x In Problems 33 and 34, use a graphing utility to construct a probability distribution table. 33. A random variable represents the number of successes in 2 Bernoulli trials, each with probability of success p =.85. (A) Find the mean and standard deviation of the random variable. (B) Find the probability that the number of successes lies within 1 standard deviation of the mean. 34. A random variable represents the number of successes in 2 Bernoulli trials, each with probability of success p =.45. (A) Find the mean and standard deviation of the random variable. (B) Find the probability that the number of successes lies within 1 standard deviation of the mean. 
C In Problems 35 and 36, a coin is loaded so that the 3 probability of a head occurring on a single toss is 4. In five tosses of the coin, what is the probability of getting 35. All heads or all tails? 36. Exactly 2 heads or exactly 2 tails? 37. Toss a coin three times or toss three coins simultaneously, and record the number of heads. Repeat the binomial experiment 1 times and compare your relative frequency distribution with the theoretical probability distribution. 38. Roll a die three times or roll three dice simultaneously, and record the number of 5 s that occur. Repeat the binomial experiment 1 times and compare your relative frequency distribution with the theoretical probability distribution. 39. Find conditions on p that guarantee the histogram for a binomial distribution is symmetrical about x = n 2. Justify your answer. 4. Consider two binomial distributions for 1, repeated Bernoulli trials the first for trials with p =.15, and the second for trials with p =.85. How are the histograms for the two distributions related? Explain. 41. A random variable represents the number of heads in ten tosses of a coin. (A) Find the mean and standard deviation of the random variable. (B) Use a graphing utility to simulate 2 repetitions of the binomial experiment, and compare the mean and standard deviation of the numbers of heads from the simulation to the answers for part (A). 42. A random variable represents the number of 7 s or 11 s in ten rolls of a pair of dice. (A) Find the mean and standard deviation of the random variable. (B) Use a graphing utility to simulate 1 repetitions of the binomial experiment, and compare the mean and standard deviation of the numbers of 7 s or 11 s from the simulation to the answers for part (A).

48 Section 8-4 Bernoulli Trials and Binomial Distributions 535 Applications Business & Economics 43. Management training. Each year a company selects a number of employees for a management training program given by a nearby university. On the average, 7% of those sent complete the program. Out of 7 people sent by the company, what is the probability that (A) Exactly 5 complete the program? (B) 5 or more complete the program? 44. Employee turnover. If the probability of a new employee in a fast-food chain still being with the company at the end of 1 year is.6, what is the probability that out of 8 newly hired people (A) 5 will still be with the company after 1 year? (B) 5 or more will still be with the company after 1 year? 45. Quality control. A manufacturing process produces, on the average, 6 defective items out of 1. To control quality, each day a sample of 1 completed items is selected at random and inspected. If the sample produces more than 2 defective items, then the whole day s output is inspected and the manufacturing process is reviewed. What is the probability of this happening, assuming that the process is still producing 6% defective items? 46. Guarantees. A manufacturing process produces, on the average, 3% defective items. The company ships 1 items in each box and wishes to guarantee no more than 1 defective item per box. If this guarantee accompanies each box, what is the probability that the box will fail to satisfy the guarantee? 47. Quality control. A manufacturing process produces, on the average, 5 defective items out of 1. To control quality, each day a random sample of 6 completed items is selected and inspected. If a success on a single trial (inspection of 1 item) is finding the item defective, then the inspection of each of the 6 items in the sample constitutes a binomial experiment, which has a binomial distribution. (A) Write the function defining the distribution. (B) Construct a table for the distribution. (C) Draw a histogram. 
(D) Compute the mean and standard deviation. 48. Management training. Each year a company selects 5 employees for a management training program given at a nearby university. On the average, 4% of those sent complete the course in the top 1% of their class. If we consider an employee finishing in the top 1% of the class a success in a binomial experiment, then for the 5 employees entering the program there exists a binomial distribution involving P(x successes out of 5). (A) Write the function defining the distribution. (B) Construct a table for the distribution. (C) Draw a histogram. (D) Compute the mean and standard deviation. Life Sciences 49. Medical diagnosis. A person with tuberculosis is given a chest x ray. Four tuberculosis x-ray specialists examine each x ray independently. If each specialist can detect tuberculosis 8% of the time when it is present, what is the probability that at least 1 of the specialists will detect tuberculosis in this person? 5. Harmful side effects of drugs. A pharmaceutical laboratory claims that a drug it produces causes serious side effects in 2 people out of 1,, on the average. To check this claim, a hospital administers the drug to 1 randomly chosen patients and finds that 3 suffer from serious side effects. If the laboratory s claims are correct, what is the probability of the hospital obtaining these results? 51. Genetics. The probability that brown-eyed parents, both with the recessive gene for blue, will have a child with brown eyes is.75. If such parents have 5 children, what is the probability that they will have (A) All blue-eyed children? (B) Exactly 3 children with brown eyes? (C) At least 3 children with brown eyes? 52. Gene mutations. The probability of gene mutation under a given level of radiation is 3 * 1-5. What is the probability of the occurrence of at least 1 gene mutation if 1 5 genes are exposed to this level of radiation? 53. Epidemics. 
If the probability of a person contracting influenza on exposure is.6, consider the binomial distribution for a family of 6 that has been exposed. (A) Write the function defining the distribution. (B) Construct a table for the distribution. (C) Draw a histogram. (D) Compute the mean and standard deviation. 54. Side effects of drugs. The probability that a given drug will produce a serious side effect in a person using the drug is.2. In the binomial distribution for 45 people using the drug, what are the mean and standard deviation?

49 536 Chapter 8 Data Description and Probability Distributions Social Sciences 55. Testing. A multiple-choice test is given with 5 choices (only one is correct) for each of 1 questions. What is the probability of passing the test with a grade of 7% or better just by guessing? 56. Opinion polls. An opinion poll based on a small sample can be unrepresentative of the population. To see why, let us assume that 4% of the electorate favors a certain candidate. If a random sample of 7 is asked their preference, what is the probability that a majority will favor this candidate? 57. Testing. A multiple-choice test is given with 5 choices (only one is correct) for each of 5 questions. Answering each of the 5 questions by guessing constitutes a binomial experiment with an associated binomial distribution. (A) Write the function defining the distribution. (B) Construct a table for the distribution. (C) Draw a histogram. (D) Compute the mean and standard deviation. 58. Sociology. The probability that a marriage will end in divorce within 1 years is.4. What are the mean and standard deviation for the binomial distribution involving 1, marriages? 59. Sociology. If the probability is.6 that a marriage will end in divorce within 2 years after its start, what is the probability that out of 6 couples just married, in the next 2 years (A) None will be divorced? (B) All will be divorced? (C) Exactly 2 will be divorced? (D) At least 2 will be divorced? Section 8-5 Normal Distributions Normal Distribution Areas Under Normal Curves Approximating a Binomial Distribution with a Normal Distribution Normal Distribution If we take the histogram for a binomial distribution, say, the one we drew for Example 7, Section 8-4 (n = 8, p =.5), and join the midpoints of the tops of the rectangles with a smooth curve, we obtain the bell-shaped curve in Figure 1. The mathematical foundation for this type of curve was established by Abraham De Moivre ( ), Pierre Laplace ( ), and Carl Gauss ( ). 
The bell-shaped curves studied by these famous mathematicians are called normal curves or normal probability distributions, and their equations are completely determined by the mean $\mu$ and standard deviation $\sigma$ of the distribution. Figure 2 illustrates three normal curves with different means and standard deviations.

[Figure 1: Binomial distribution and bell-shaped curve; P(x) against the number of successes, x]

[Figure 2: Normal probability distributions]

Insight
The equation for a normal curve is fairly complicated:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x - \mu)^2/(2\sigma^2)}$$

where $\pi \approx 3.1416$ and $e \approx 2.7183$. Given the values of $\mu$ and $\sigma$, however, the function is completely specified, and we could plot points or use a graphing calculator to produce its graph. Substituting $x + h$ for $x$ produces an equation of the same form but with a different value of $\mu$. Therefore, in the terminology of Section 1-2, any horizontal translation of a normal curve is another normal curve.

Until now we have dealt with discrete random variables, that is, random variables that assume a finite or a countably infinite number of values (we have dealt only with the finite case). Random variables associated with normal distributions are continuous in nature; that is, they assume all values over an interval on a real number line. These are called continuous random variables. Random variables associated with people's heights, light bulb lifetimes, or the lengths of time between breakdowns of an office copier are continuous. The following is a list of some of the important properties of normal curves (normal probability distributions of a continuous random variable):

PROPERTIES Normal Curves
1. Normal curves are bell shaped and are symmetrical with respect to a vertical line.
2. The mean is at the point where the axis of symmetry intersects the horizontal axis.
3. The shape of a normal curve is completely determined by its mean and standard deviation: a small standard deviation indicates a tight clustering about the mean and thus a tall, narrow curve; a large standard deviation indicates a large deviation from the mean and thus a broad, flat curve (see Fig. 2).

4. Irrespective of the shape, the area between the curve and the x axis is always 1.
5. Irrespective of the shape, 68.3% of the area will lie within an interval of 1 standard deviation on either side of the mean, 95.4% within 2 standard deviations on either side, and 99.7% within 3 standard deviations on either side (see Fig. 3).

[Figure 3: Normal curve areas: 68.3% within $\mu \pm 1\sigma$, 95.4% within $\mu \pm 2\sigma$, 99.7% within $\mu \pm 3\sigma$]

The normal probability distribution is the most important of all theoretical distributions. It is at the heart of a great deal of statistical theory, and it is also a useful tool in its own right for solving practical problems. Not only does a normal curve provide a good approximation for a binomial distribution for large $n$, but it also approximates many other relative frequency distributions. For example, normal curves often provide good approximations for the relative frequency distributions for heights and weights of people, measurements of manufactured parts, scores on IQ tests, college entrance examinations, civil service tests, and measurements of errors in laboratory experiments.

Areas Under Normal Curves

To use normal curves in practical problems, we must be able to determine areas under different parts of a normal curve. Remarkably, the area under a normal curve between a mean $\mu$ and a given number of standard deviations $\sigma$ to the right (or left) of $\mu$ is the same, regardless of the shape of the normal curve. For example, the area under the normal curve with $\mu = 3$, $\sigma = 5$ from $\mu = 3$ to $\mu + 1.5\sigma = 10.5$ is equal to the area under the normal curve with $\mu = 15$, $\sigma = 2$ from $\mu = 15$ to $\mu + 1.5\sigma = 18$ (see Fig. 4, noting that the shaded regions have the same areas, or equivalently, the same numbers of pixels). Therefore, such areas for any normal curve can be easily determined from the areas for the standard normal curve, that is, the normal curve with mean 0 and standard deviation 1.

[Figure 4]
In fact, if $z$ represents the number of standard deviations that a measurement $x$ is from a mean $\mu$, then the area under a normal curve from $\mu$ to $\mu + z\sigma$ equals the area under the standard normal curve from 0 to $z$ (see Fig. 5). Table I in Appendix C lists those areas for the standard normal curve.
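The claim that areas depend only on the number of standard deviations can be explored with Python's standard library. This is a sketch, not a replacement for Table I: `standard_area` is a hypothetical helper that uses the error function `math.erf`, and `statistics.NormalDist` (Python 3.8 or later) supplies the cumulative areas for two illustrative normal curves whose parameters are chosen here only for demonstration.

```python
from math import erf, sqrt
from statistics import NormalDist

def standard_area(z: float) -> float:
    """Area under the standard normal curve from 0 to z (for z >= 0)."""
    return 0.5 * erf(z / sqrt(2))

def area_from_mean(mu: float, sigma: float, z: float) -> float:
    """Area under the normal curve with mean mu and s.d. sigma from mu to mu + z*sigma."""
    curve = NormalDist(mu, sigma)
    return curve.cdf(mu + z * sigma) - curve.cdf(mu)

z = 1.5
# Two differently shaped curves, each measured 1.5 standard deviations from its mean.
wide = area_from_mean(3, 5, z)      # from 3 to 10.5
narrow = area_from_mean(15, 2, z)   # from 15 to 18
print(round(wide, 4), round(narrow, 4), round(standard_area(z), 4))

# Areas within 1, 2, and 3 standard deviations on either side of the mean.
within = [round(2 * standard_area(k), 3) for k in (1, 2, 3)]
print(within)
```

The three printed areas in the first line agree, which is why a single table for the standard normal curve is enough for every normal curve.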

[Figure 5: Areas and z values]

EXAMPLE 1 Finding Probabilities for a Normal Distribution
A manufacturing process produces light bulbs with life expectancies that are normally distributed with a mean of 500 hours and a standard deviation of 100 hours. What percentage of the light bulbs can be expected to last between 500 and 670 hours?

Solution
To answer this question, we first determine how many standard deviations 670 is from 500, the mean. This is easily done by dividing the distance between 500 and 670 by 100, the standard deviation. Thus,

$$z = \frac{670 - 500}{100} = \frac{170}{100} = 1.7$$

That is, 670 is 1.7 standard deviations from 500, the mean. Referring to Table I, Appendix C, we see that .4554 corresponds to $z = 1.7$. And since the total area under a normal curve is 1, we conclude that 45.54% of the light bulbs produced will last between 500 and 670 hours (see Fig. 6).

[Figure 6: Light bulb life expectancy: positive z; shaded area .4554, or 45.54%]

Matched Problem 1
What percentage of the light bulbs can be expected to last between 500 and 750 hours?

In general, to find how many standard deviations a measurement $x$ is from a mean $\mu$, first determine the distance between $x$ and $\mu$ and then divide by $\sigma$:

RESULT
$$z = \frac{\text{distance between } x \text{ and } \mu}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$
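The z computation and table lookup of Example 1 can be sketched in Python. The values used below (mean 500 hours, standard deviation 100 hours, upper limit 670 hours) are the reading of the example that is consistent with its stated $z = 1.7$ and area .4554; the function names are illustrative, and `math.erf` stands in for Table I.

```python
from math import erf, sqrt

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

def area_mean_to_z(z: float) -> float:
    """Area under the standard normal curve between 0 and z (sign of z ignored)."""
    return 0.5 * erf(abs(z) / sqrt(2))

mu, sigma = 500, 100          # assumed light bulb parameters, in hours
z = z_score(670, mu, sigma)
area = area_mean_to_z(z)
print(z, round(area, 4))      # z value and the corresponding area
```

Because `area_mean_to_z` takes the absolute value of z, the same function handles measurements to the left of the mean, mirroring the symmetry argument used with the table.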

EXAMPLE 2 Finding Probabilities for a Normal Distribution
From all light bulbs produced (see Example 1), what is the probability of a light bulb chosen at random lasting between 380 and 500 hours?

Solution To answer this, we first find z:

$z = \frac{x - \mu}{\sigma} = \frac{380 - 500}{100} = -1.20$

It is usually a good idea to draw a rough sketch of a normal curve and insert relevant data (see Fig. 7).

[Figure 7: Light bulb life expectancy: negative z (shaded area .3849, or 38.49%)]

Table I in Appendix C does not include negative values for z, but because normal curves are symmetrical with respect to a vertical line through the mean, we simply use the absolute value (positive value) of z for the table. Thus, the area corresponding to z = -1.20 is the same as the area corresponding to z = 1.20, which is .3849. And since the area under the whole normal curve is 1, we conclude that the probability of a light bulb chosen at random lasting between 380 and 500 hours is .3849.

Matched Problem 2 What is the probability of a light bulb chosen at random lasting between 400 and 500 hours?

[Figure 8: (A) graphing utility commands and statistics; (B) histogram of L1]

The first graphing utility command in Figure 8A simulates the life expectancies of 100 light bulbs by generating 100 random numbers from the normal distribution with $\mu = 500$, $\sigma = 100$ of Example 1. The numbers are stored in list L1. Note from Figure 8A that the mean and standard deviation of L1 are close to the mean and standard deviation of the normal distribution. From Figure 8B, which shows a histogram of L1, we note that the empirical probability that a light bulb lasts between 380 and 500 hours is .37, which is close to the theoretical probability of .3849 computed in Example 2.

Several important properties of a continuous random variable with normal distribution are listed in the box below.

PROPERTIES Normal Probability Distribution
1. $P(a \le x \le b)$ = area under the normal curve from a to b
2. $P(-\infty < x < \infty) = 1$ = total area under the normal curve
3. $P(x = c) = 0$

In Example 2, what is the probability of a light bulb chosen at random having a life of exactly 621 hours? The area above 621 and below the normal curve at x = 621 is 0 (a line has no width). Thus, the probability of a light bulb chosen at random having a life of exactly 621 hours is 0. However, if the number 621 is the result of rounding a number between 620.5 and 621.5 (which is most likely the case), then the answer to the question is

$P(620.5 \le x \le 621.5)$ = area under the normal curve from 620.5 to 621.5

The area is found using the procedures outlined in Example 2.

We have just pointed out an important distinction between a continuous random variable and a discrete random variable: For a probability distribution of a continuous random variable, the probability of x assuming a single value is always 0. On the other hand, for a probability distribution of a discrete random variable, the probability of x assuming a particular value from the set of permissible values is usually a positive number between 0 and 1. Continuous probability distributions are covered in greater depth in a course in calculus.
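The graphing utility simulation described above can be imitated in Python. The sketch below is an assumption-laden stand-in, not the calculator commands of Figure 8: it assumes a sample of 100 bulbs drawn from a normal distribution with mean 500 and standard deviation 100, and the helper names and the seed are hypothetical.

```python
import random
import statistics

def simulate_lifetimes(n, mu, sigma, seed=None):
    """Generate n simulated lifetimes from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

def empirical_probability(data, low, high):
    """Fraction of the data lying between low and high, inclusive."""
    count = sum(1 for x in data if low <= x <= high)
    return count / len(data)

# Assumed values: 100 bulbs, mean 500 hours, standard deviation 100 hours.
lifetimes = simulate_lifetimes(100, 500, 100, seed=1)
print("sample mean:", statistics.mean(lifetimes))
print("sample standard deviation:", statistics.stdev(lifetimes))
print("empirical P(380 <= x <= 500):", empirical_probability(lifetimes, 380, 500))
```

With only 100 simulated bulbs the empirical probability varies noticeably from run to run; a larger sample should settle closer to the theoretical value found in Example 2.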
Approximating a Binomial Distribution with a Normal Distribution
You no doubt found in some of the problems in Exercise 8-4 that when a binomial random variable assumes a large number of values (that is, when n is large), the use of the probability distribution formula

$P(x \text{ successes in } n \text{ trials}) = C_{n,x}\, p^x q^{n-x}$

became very tedious. It would be very helpful if there were an easily computed approximation of this distribution for large n. Such an approximation is found in the form of an appropriately selected normal distribution.

To clarify ideas and relationships, let us consider an example of a normal distribution approximation of a binomial distribution with a relatively small value of n. Then we will consider an example with a large value of n.
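For comparison with the normal approximation, the binomial formula itself can be evaluated directly by software. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function names and the sample values n = 20 and p = .4 are illustrative choices, not taken from this passage.

```python
import math

def binomial_pmf(n, x, p):
    """P(x successes in n trials) = C(n, x) * p**x * q**(n - x)."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

def binomial_range(n, p, a, b):
    """P(a <= X <= b), summing the formula over x = a, ..., b."""
    return sum(binomial_pmf(n, x, p) for x in range(a, b + 1))

# Illustrative values only: 20 trials with success probability .4.
print(binomial_range(20, 0.4, 6, 12))
```

Summing many terms like this is exactly the work that becomes tedious by hand, which is the motivation for the normal approximation.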

EXAMPLE 3 Market Research
A credit card company claims that their card is used by 40% of the people buying gasoline in a particular city. A random sample of 20 gasoline purchasers is made. If the company's claim is correct, what is the probability that
(A) From 6 to 12 people in the sample use the card?
(B) Fewer than 4 people in the sample use the card?

Solution We begin by drawing a normal curve with the same mean and standard deviation as the binomial distribution (Fig. 9). A histogram superimposed on this normal curve can be used to approximate the histogram for the binomial distribution. The mean and standard deviation of the binomial distribution are

$\mu = np = (20)(.4) = 8$   (n = sample size)
$\sigma = \sqrt{npq} = \sqrt{(20)(.4)(.6)} \approx 2.19$   (p = .4, from the 40% claim)

[Figure 9: Normal curve with superimposed binomial histogram]

(A) To approximate the probability that 6 to 12 people in the sample use the credit card, we find the area under the normal curve from 5.5 to 12.5. We use 5.5 rather than 6, because the rectangle in the histogram corresponding to 6 extends from 5.5 to 6.5; and, reasoning in the same way, we use 12.5 instead of 12. To use Table I in Appendix C, we split the area into two parts: $A_1$ to the left of the mean and $A_2$ to the right of the mean. The sketch in Figure 10 is helpful.

[Figure 10: Areas A1 and A2 on either side of the mean]

Areas $A_1$ and $A_2$ are found as follows:

$z_1 = \frac{x - \mu}{\sigma} = \frac{5.5 - 8}{2.19} \approx -1.14$, so $A_1 = .3729$
$z_2 = \frac{x - \mu}{\sigma} = \frac{12.5 - 8}{2.19} \approx 2.05$, so $A_2 = .4798$
Total area $= A_1 + A_2 = .8527$

Thus, the approximate probability that the sample will contain between 6 and 12 users of the credit card is .85 (assuming that the firm's claim is correct).

(B) To use the normal curve to approximate the probability that the sample contains fewer than 4 users of the credit card, we must find the area $A_1$ under the normal curve to the left of 3.5. The sketch in Figure 11 is useful. Since the total area under either half of the normal curve is .5, we first use Table I in Appendix C to find the area $A_2$ under the normal curve from 3.5 to the mean 8, and then subtract $A_2$ from .5:

$z = \frac{x - \mu}{\sigma} = \frac{3.5 - 8}{2.19} \approx -2.05$, so $A_2 = .4798$
$A_1 = .5 - A_2 = .5 - .4798 = .0202$

Thus, the probability that the sample contains fewer than 4 users of the credit card is approximately .02 (assuming that the company's claim is correct).

[Figure 11: Area A1 to the left of 3.5 and area A2 from 3.5 to the mean]

Matched Problem 3 In Example 3 use the normal curve to approximate the probability that in the sample there are
(A) From 5 to 9 users of the credit card
(B) More than 10 users of the card

You no doubt are wondering how large n should be before a normal distribution provides an adequate approximation for a binomial distribution. Without getting too involved, the following rule-of-thumb provides a good test:

RESULT Rule-of-Thumb Test
Use a normal distribution to approximate a binomial distribution only if the interval $[\mu - 3\sigma, \mu + 3\sigma]$ lies entirely in the interval from 0 to n.
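The two approximations in Example 3 can be reproduced with a short Python sketch. It assumes the Example 3 values are n = 20 and p = .4, and it uses the error function in place of Table I, so the results differ slightly from the table values because z is not rounded to two decimal places. The function names are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """Area under the normal curve with mean mu, std sigma, to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def normal_approx_binomial(n, p, a, b):
    """Approximate P(a <= X <= b) for a binomial X.

    Uses the area under the matching normal curve from a - .5 to b + .5.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return normal_cdf(b + 0.5, mu, sigma) - normal_cdf(a - 0.5, mu, sigma)

# Assumed Example 3 values: n = 20, p = .4.
n, p = 20, 0.4
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

part_a = normal_approx_binomial(n, p, 6, 12)   # from 6 to 12 users
part_b = normal_cdf(3.5, mu, sigma)            # fewer than 4 users
print(f"(A) {part_a:.4f}   (B) {part_b:.4f}")
```

Part (B) uses the whole area to the left of 3.5, as in the solution above, rather than stopping at a lower limit.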

Note that in Example 3 the interval $[\mu - 3\sigma, \mu + 3\sigma] = [1.43, 14.57]$ lies entirely within the interval from 0 to 20; hence, the use of the normal distribution was justified.

1 (A) Show that if $n \ge 30$ and $.25 \le p \le .75$ for a binomial distribution, then it passes the rule-of-thumb test.
(B) Give an example of a binomial distribution that passes the rule-of-thumb test but does not satisfy the conditions of part (A).

EXAMPLE 4 Quality Control
A company manufactures 50,000 ballpoint pens each day. The manufacturing process produces 50 defective pens per 1,000, on the average. A random sample of 400 pens is selected from each day's production and tested. What is the probability that the sample contains
(A) At least 14 and no more than 25 defective pens?
(B) 33 or more defective pens?

Solution Is it appropriate to use a normal distribution to approximate this binomial distribution? The answer is yes, since the rule-of-thumb test passes with ease:

$p = \frac{50}{1{,}000} = .05$
$\mu = np = 400(.05) = 20$
$\sigma = \sqrt{npq} = \sqrt{400(.05)(.95)} \approx 4.36$
$[\mu - 3\sigma, \mu + 3\sigma] = [6.92, 33.08]$

This interval is well within the interval from 0 to 400.

(A) To find the approximate probability of the number of defective pens in a sample being at least 14 and not more than 25, we find the area under the normal curve from 13.5 to 25.5. To use Table I in Appendix C, we split the area into an area to the left of the mean and an area to the right of the mean, as shown in Figure 12.

[Figure 12: Areas A1 and A2 on either side of the mean]

$z_1 = \frac{x - \mu}{\sigma} = \frac{13.5 - 20}{4.36} \approx -1.49$, so $A_1 = .4319$
$z_2 = \frac{x - \mu}{\sigma} = \frac{25.5 - 20}{4.36} \approx 1.26$, so $A_2 = .3962$
Total area $= A_1 + A_2 = .8281$

Thus, the approximate probability of the number of defective pens in the sample being at least 14 and not more than 25 is .83.
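The rule-of-thumb test applied at the start of Example 4 is easy to automate. The sketch below is illustrative only; the function name is hypothetical, and the sample values (n = 20, p = .4 and n = 400, p = .05) are the values assumed for Examples 3 and 4.

```python
import math

def rule_of_thumb(n, p):
    """Return (passes, (low, high)) for the rule-of-thumb test.

    The test passes when [mu - 3*sigma, mu + 3*sigma] lies entirely
    within the interval from 0 to n.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return (0 <= low and high <= n), (low, high)

# Assumed values for Examples 3 and 4.
for n, p in [(20, 0.4), (400, 0.05)]:
    passes, (low, high) = rule_of_thumb(n, p)
    print(f"n = {n}, p = {p}: [{low:.2f}, {high:.2f}] passes = {passes}")
```

A distribution with small n and p near 0 or 1 fails the test because the lower or upper end of the interval falls outside 0 to n.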

(B) Since the total area under a normal curve from the mean on is .5, we find the area $A_1$ (see Fig. 13) from Table I in Appendix C and subtract it from .5 to obtain $A_2$.

[Figure 13: Area A1 from the mean to 32.5 and area A2 to the right of 32.5]

$z = \frac{x - \mu}{\sigma} = \frac{32.5 - 20}{4.36} \approx 2.87$, so $A_1 = .4979$
$A_2 = .5 - A_1 = .5 - .4979 = .0021 \approx .002$

Thus, the approximate probability of finding 33 or more defective pens in the sample is .002. If a random sample of 400 included more than 33 defective pens, then the management would conclude that either a rare event has happened and the manufacturing process is still producing only 50 defective pens per 1,000, on the average, or something is wrong with the manufacturing process and it is producing more than 50 defective pens per 1,000, on the average. The company might very well have a policy of checking the manufacturing process whenever 33 or more defective pens are found in a sample rather than believing a rare event has happened and that the manufacturing process is still running smoothly.

Matched Problem 4 Suppose in Example 4 that the manufacturing process produces 40 defective pens per 1,000, on the average. What is the approximate probability that in the sample of 400 pens there are
(A) At least 10 and no more than 20 defective pens?
(B) 27 or more defective pens?

When to Use the .5 Adjustment
If we are assuming a normal probability distribution for a continuous random variable (such as that associated with heights or weights of people), then we find $P(a \le x \le b)$, where a and b are real numbers, by finding the area under the corresponding normal curve from a to b (see Example 2). However, if we use a normal probability distribution to approximate a binomial probability distribution, then we find $P(a \le x \le b)$, where a and b are nonnegative integers, by finding the area under the corresponding normal curve from a - .5 to b + .5 (see Examples 3 and 4).

2 (A) Construct a histogram of the binomial distribution with n = 8 and p = .1.
(B) Does the binomial distribution of part (A) satisfy the rule-of-thumb test?
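To see why the .5 adjustment matters when a normal curve stands in for a binomial distribution, one can compare the exact binomial probability with the normal area taken with and without the adjustment. The sketch below is a hedged illustration, assuming Python 3.8 or later and the values n = 20, p = .4 assumed for Example 3(A); the function names are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """Area under the normal curve with mean mu, std sigma, to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def compare_approximations(n, p, a, b):
    """Return (exact, adjusted, unadjusted) values for P(a <= X <= b)."""
    q = 1 - p
    mu = n * p
    sigma = math.sqrt(n * p * q)
    # Exact binomial probability from the formula C(n, x) p^x q^(n - x).
    exact = sum(math.comb(n, x) * p**x * q**(n - x) for x in range(a, b + 1))
    # Normal area from a - .5 to b + .5 (with the .5 adjustment).
    adjusted = normal_cdf(b + 0.5, mu, sigma) - normal_cdf(a - 0.5, mu, sigma)
    # Normal area from a to b (no adjustment).
    unadjusted = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
    return exact, adjusted, unadjusted

exact, adjusted, unadjusted = compare_approximations(20, 0.4, 6, 12)
print(f"exact = {exact:.4f}, with .5 = {adjusted:.4f}, without .5 = {unadjusted:.4f}")
```

For these values the adjusted area lands much closer to the exact probability, since the unadjusted interval cuts the end rectangles of the histogram in half.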

59 546 Chapter 8 Data Description and Probability Distributions (C) Use a graphing utility to graph the normal distribution that has the same mean and standard deviation as the binomial distribution of part (A). How does the graph compare to the histogram? (D) Is the normal distribution a good approximation to the binomial distribution in this case? Explain. Answers to Matched Problems % (A).7 (B) (A).83 (B).4 Exercise 8-5 A In Problems 1 6, use Table I in Appendix C to find the area under the standard normal curve from to the indicated measurement Given a normal distribution with mean 1 and standard deviation 1, in Problems 7 12 find the number of standard deviations each measurement is from the mean. Express the answer as a positive number Using the normal distribution described for Problems 7 12 and Table I in Appendix C, find the area under the normal curve from the mean to the indicated measurement in Problems B Given a normal distribution with mean 7 and standard deviation 8, find the area under the normal curve above the intervals in Problems or larger or larger or smaller or smaller In Problems 27 and 28, discuss the validity of each statement. If the statement is always true, explain why. If not, give a counterexample. 27. (A) All normal distributions have the same shape. (B) The area above the x axis and below the normal curve is the same for all normal distributions. 28. (A) The distribution of final exam scores from a statistics class of 9 students is a normal distribution. (B) In a normal distribution, the probability is that a score lies more than 4 standard deviations away from the mean. In Problems 29 36, use the rule-of-thumb test to check whether a normal distribution (with the same mean and standard deviation as the binomial distribution) is a suitable approximation for the binomial distribution with 29. n = 15, p =.7 3. 
n = 12, p = n = 15, p = n = 2, p = n = 1, p = n = 2, p = n = 5, p = n = 4, p = A Bernoulli trial has probability of success p =.1. Explain how to determine the number of repeated trials necessary to obtain a binomial distribution that passes the rule-of-thumb test for using a normal distribution as a suitable approximation. 38. For a binomial distribution with n = 1, explain how to determine the smallest and largest values of p that pass the rule-of-thumb test for using a normal distribution as a suitable approximation. C A binomial experiment consists of 5 trials with the probability of success for each trial.4. What is the probability of obtaining the number of successes indicated in Problems 39 46? Approximate these probabilities to two decimal places using a normal curve. (This binomial experiment easily passes the rule-of-thumb test, as you can check. When computing the probabilities, adjust the intervals as in Examples 3 and 4.) or more or more

60 Section 8-5 Normal Distributions or less or less To graph Problems 47 5, use a graphing utility and refer to the normal probability distribution function with mean and standard deviation : 1 f(x) = 12 - ) e-(x (1) 47. Graph equation (1) with = 5 and (A) = 1 (B) = 15 (C) = 2 Graph all three in the same viewing window with Xmin =-1, Xmax = 4, Ymin =, and Ymax = Graph equation (1) with = 4 and (A) = 8 (B) = 12 (C) = 16 Graph all three in the same viewing window with Xmin =-5, Xmax = 3, Ymin =, and Ymax = Graph equation (1) with = 2 and (A) = 2 (B) = 4 Graph both in the same viewing window with Xmin =, Xmax = 4, Ymin =, and Ymax = Graph equation (1) with = 18 and (A) = 3 (B) = 6 Graph both in the same viewing window with Xmin =, Xmax = 4, Ymin =, and Ymax = (A) If 12 scores are chosen from a normal distribution with mean 75 and standard deviation 8, how many scores x would be expected to satisfy 67 x 83? (B) Use a graphing utility to generate 12 scores from the normal distribution with mean 75 and standard deviation 8. Determine the number of scores x such that 67 x 83, and compare your results with the answer to part (A). 52. (A) If 25 scores are chosen from a normal distribution with mean 1 and standard deviation 1, how many scores x would be expected to be greater than 11? (B) Use a graphing utility to generate 25 scores from the normal distribution with mean 1 and standard deviation 1. Determine the number of scores greater than 11, and compare your results with the answer to part (A). Applications Business & Economics 53. Sales. Salespeople for a business machine company have average annual sales of $2,, with a standard deviation of $2,. What percentage of the salespeople would be expected to make annual sales of $24, or more? Assume a normal distribution. 54. Guarantees. The average lifetime for a car battery of a certain brand is 17 weeks, with a standard deviation of 1 weeks. 
If the company guarantees the battery for 3 years, what percentage of the batteries sold would be expected to be returned before the end of the warranty period? Assume a normal distribution. 55. Quality control. A manufacturing process produces a critical part of average length 1 millimeters, with a standard deviation of 2 millimeters. All parts deviating by more than 5 millimeters from the mean must be rejected. What percentage of the parts must be rejected, on the average? Assume a normal distribution. 56. Quality control. An automated manufacturing process produces a component with an average width of 7.55 centimeters, with a standard deviation of.2 centimeter. All components deviating by more than.5 centimeter from the mean must be rejected. What percentage of the parts must be rejected, on the average? Assume a normal distribution. 57. Marketing claims. A company claims that 6% of the households in a given community use their product. A competitor surveys the community, using a random sample of 4 households, and finds only 15 households out of the 4 in the sample using the product. If the company s claim is correct, what is the probability of 15 or fewer households using the product in a sample of 4? Conclusion? Approximate a binomial distribution with a normal distribution. 58. Labor relations. A union representative claims 6% of the union membership will vote in favor of a particular settlement. A random sample of 1 members

61 548 Chapter 8 Data Description and Probability Distributions is polled, and out of these, 47 favor the settlement. What is the approximate probability of 47 or fewer in a sample of 1 favoring the settlement when 6% of all the membership favor the settlement? Conclusion? Approximate a binomial distribution with a normal distribution. Life Sciences 59. Medicine. The average healing time of a certain type of incision is 24 hours, with standard deviation of 2 hours. What percentage of the people having this incision would heal in 8 days or less? Assume a normal distribution. 6. Agriculture. The average height of a hay crop is 38 inches, with a standard deviation of 1.5 inches. What percentage of the crop will be 4 inches or more? Assume a normal distribution. 61. Genetics. In a family with 2 children, the probability that both children are girls is approximately.25. In a random sample of 1, families with 2 children, what is the approximate probability that 22 or fewer will have 2 girls? Approximate a binomial distribution with a normal distribution. 62. Genetics. In Problem 61, what is the approximate probability of the number of families with 2 girls in the sample being at least 225 and not more than 275? Approximate a binomial distribution with a normal distribution. Social Sciences 63. Testing. Scholastic Aptitude Tests are scaled so that the mean score is 5 and the standard deviation is 1. What percentage of the students taking this test should score 7 or more? Assume a normal distribution. 64. Politics. Candidate Harkins claims a private poll shows that she will receive 52% of the vote for governor. Her opponent, Mankey, secures the services of another pollster, who finds that 47 out of a random sample of 1, registered voters favor Harkins. If Harkins s claim is correct, what is the probability that only 47 or fewer will favor her in a random sample of 1,? Conclusion? Approximate a binomial distribution with a normal distribution. 65. Grading on a curve. 
An instructor grades on a curve by assuming the grades on a test are normally distributed. If the average grade is 70 and the standard deviation is 8, find the test scores for each grade interval if the instructor wishes to assign grades as follows: 10% A's, 20% B's, 40% C's, 20% D's, and 10% F's.

66. Psychology. A test devised to measure aggressive-passive personalities was standardized on a large group of people. The scores were normally distributed, with a mean of 50 and a standard deviation of 10. If we want to designate the highest 10% as aggressive, the next 20% as moderately aggressive, the middle 40% as average, the next 20% as moderately passive, and the lowest 10% as passive, what ranges of scores will be covered by these five designations?

Chapter 8 Review
Important Terms, Symbols, and Concepts

8-1 Graphing Data
Bar graphs, broken-line graphs, and pie graphs are used to present visual interpretations or comparisons of data. Large sets of quantitative data can be organized in a frequency table, generally constructed by choosing five to twenty class intervals of equal length to cover the data range. The number of measurements that fall in a given class interval is called the class frequency, and the set of all such frequencies associated with their respective classes is called a frequency distribution. The relative frequency of a class is its frequency divided by the total number of items in the data set.

A histogram is a vertical bar graph used to represent a frequency distribution. A frequency polygon is a broken-line graph obtained by joining successive midpoints of the tops of the bars in a histogram. A cumulative frequency polygon, or ogive, is obtained by plotting the cumulative frequency over the upper boundary of the corresponding class.
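The construction of a frequency table described in Section 8-1 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and its parameters are hypothetical, the class boundaries are chosen at half-units so that no measurement falls on a boundary, and the data are the ten sample measurements 1, 1, 2, 2, 2, 3, 3, 4, 4, 5.

```python
def frequency_table(data, start, width, k):
    """Build a frequency table with k equal-width class intervals.

    Returns a list of (lower, upper, frequency, relative_frequency).
    Measurements outside the k classes are not counted, so the classes
    should be chosen to cover the whole data range.
    """
    counts = [0] * k
    for x in data:
        i = int((x - start) // width)   # index of the class containing x
        if 0 <= i < k:
            counts[i] += 1
    n = len(data)
    return [(start + i * width, start + (i + 1) * width, c, c / n)
            for i, c in enumerate(counts)]

# Illustrative data and class intervals of width 1 starting at 0.5.
sample = [1, 1, 2, 2, 2, 3, 3, 4, 4, 5]
for lower, upper, freq, rel in frequency_table(sample, 0.5, 1, 5):
    print(f"{lower}-{upper}: frequency {freq}, relative frequency {rel:.1f}")
```

The frequencies in the third column are the bar heights of the corresponding histogram.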
8-2 Measures of Central Tendency

Mean
Ungrouped Data: If $x_1, x_2, \ldots, x_n$ is a set of n measurements, then the mean is given by

$[\text{mean}] = \frac{x_1 + x_2 + \cdots + x_n}{n}$

The mean is denoted by $\bar{x}$ if the data set is a sample, and by $\mu$ if the data set is an entire population.

Grouped Data: If a data set of n measurements is grouped into k classes, and $x_i$ is the midpoint of the ith class interval and $f_i$ is the ith class frequency, then the mean for the grouped data is given by

$[\text{mean}] = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{n}$

The mean for grouped data is denoted by $\bar{x}$ if the data set is a sample, and by $\mu$ if the data set is an entire population.

Median
Ungrouped Data: Arrange the measurements in ascending or descending order. If the number of measurements is odd, the median is the middle measurement. If the number of measurements is even, the median is the mean of the two middle measurements.
Grouped Data: The median for grouped data with no classes of frequency zero is the number such that the histogram has the same area to the left of the median as to the right of the median.

Mode
The mode is the most frequently occurring measurement in a data set. There may be a unique mode, several modes, or, if no measurement occurs more than once, essentially no mode.

8-3 Measures of Dispersion

Range
The range for a set of ungrouped data is the difference between the largest and the smallest values in the data set. The range for a frequency distribution is the difference between the upper boundary of the highest class and the lower boundary of the lowest class.

Standard Deviation
Ungrouped Data: The sample standard deviation s of a set of n sample measurements $x_1, x_2, \ldots, x_n$ with mean $\bar{x}$ is given by

$s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$

The square of the sample standard deviation, $s^2$, is called the sample variance.

Grouped Data: Suppose a data set of n sample measurements is grouped into k classes in a frequency table, where $x_i$ is the midpoint, $f_i$ is the frequency of the ith class interval, and $\bar{x}$ is the mean of the grouped data. Then the sample standard deviation s for the grouped data is

$s = \sqrt{\frac{(x_1 - \bar{x})^2 f_1 + (x_2 - \bar{x})^2 f_2 + \cdots + (x_k - \bar{x})^2 f_k}{n - 1}}$

8-4 Bernoulli Trials and Binomial Distributions

Bernoulli Trials
A sequence of experiments is called a sequence of Bernoulli trials, or a binomial experiment, if
1. Only two outcomes are possible on each trial.
2. The probability of success p is the same for each trial (the probability of failure is then q = 1 - p).
3. All trials are independent.

Binomial Distributions
Let the random variable $X_n$ represent the number of successes in n Bernoulli trials. The probability distribution of $X_n$, called a binomial distribution, is given by

$P(X_n = x) = P(x \text{ successes in } n \text{ trials}) = C_{n,x}\, p^x q^{n-x}, \quad x = 0, 1, \ldots, n$

where p is the probability of success and q is the probability of failure on each trial. The mean and standard deviation of a binomial distribution are given by the formulas $\mu = np$ and $\sigma = \sqrt{npq}$, respectively.

8-5 Normal Distributions
Normal probability distributions, or normal curves, are continuous curves that approximate the relative frequency distributions of measurements such as IQ scores, heights and weights of people, and errors in laboratory experiments.

Properties of Normal Curves
1. Normal curves are bell-shaped and are symmetrical with respect to a vertical line.
2. The mean is at the point where the axis of symmetry intersects the horizontal axis.
3. The shape of a normal curve is completely determined by its mean and standard deviation.
4. The area between a normal curve and the horizontal axis is always 1.
5. 68.3% of the area under a normal curve lies within 1 standard deviation of the mean, 95.4% within 2 standard deviations, and 99.7% within 3 standard deviations.

Areas under Normal Curves
In any normal distribution, the probability $P(a \le x \le b)$ that x lies between a and b is equal to the area under the normal curve from a to b. Such areas can be found from a table of areas associated with the standard normal curve, that is, the normal curve with mean 0 and standard deviation 1. The area under any normal curve from $\mu$ to $x = \mu + z\sigma$ is equal to the area under the standard normal curve from 0 to z, where $z = \frac{x - \mu}{\sigma}$.

Approximating Binomial Distributions with Normal Curves
To approximate a binomial distribution that is associated with a sequence of n Bernoulli trials, each having probability of success p, we choose a normal distribution with mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$. As a rule of thumb, the normal distribution provides a reasonable approximation if the interval $[\mu - 3\sigma, \mu + 3\sigma]$ lies entirely within the interval [0, n].
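The percentages quoted for normal curves in Section 8-5 (the areas within 1, 2, and 3 standard deviations of the mean) can be checked numerically. The short sketch below is illustrative only; it uses the error function from the Python standard library rather than a table of areas, and the function name is hypothetical.

```python
import math

def area_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * area_within(k):.1f}%")
```

Because the area depends only on z, the same three percentages hold for every normal curve, whatever its mean and standard deviation.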

63 55 Chapter 8 Data Description and Probability Distributions Review Exercise Work through all the problems in this chapter review and check your answers in the back of the book.answers to all review problems are there along with section numbers in italics to indicate where each type of problem is discussed. Where weaknesses show up, review appropriate sections in the text. A 1. Graph the following data using a bar graph and a broken-line graph. Voter Turnout in Presidential Elections Year Percentage of Voting Age Population Graph the data in the following table using two pie graphs, one for men and one for women. Living Arrangements of the Elderly (65 Years and Over) Men (%) Women (%) Alone With spouse With relatives With nonrelatives (A) Draw a histogram for the binomial distribution P(x) = C 3, x (.4) x (.6) 3 - x (B) What are the mean and standard deviation? 4. For the set of sample measurements 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, find the (A) Mean (B) Median (C) Mode (D) Standard deviation 5. If a normal distribution has a mean of 1 and a standard deviation of 1, then (A) How many standard deviations is 118 from the mean? (B) What is the area under the normal curve between the mean and 118? B 6. Given the sample of 25 quiz scores listed in the table below from a class of 5 students: (A) Construct a frequency table using a class interval of width 2 starting at 9.5. (B) Construct a histogram. (C) Construct a frequency polygon. (D) Construct a cumulative frequency and relative cumulative frequency table. (E) Construct a cumulative frequency polygon. Quiz Scores For the set of grouped sample data given in the table, (A) Find the mean. (B) Find the standard deviation. (C) Find the median. Interval Frequency (A) Construct a histogram for the binomial distribution P(x) = C 6, x (.5) x (.5) 6 - x (B) What are the mean and standard deviation? 9. What are the mean and standard deviation for a binomial distribution with p =.6 and n = 1,? 
In Problems 1 and 11, discuss the validity of each statement. If the statement is always true, explain why. If not, give a counterexample. 1. (A) If the data set x 1, x 2, p, x n has mean x, then the data set x 1 + 5, x 2 + 5, p, x n + 5 has mean x + 5. (B) If the data set x 1, x 2, p, x n has standard deviation s, then the data set x 1 + 5, x 2 + 5, p, x n + 5 has standard deviation s + 5.

64 Review Exercise 551 C 11. (A) If X represents a binomial random variable with mean, then P(X ) =.5. (B) If X represents a normal random variable with mean, then P(X ) =.5. (C) The area of a histogram of a binomial distribution is equal to the area above the x axis and below a normal curve. 12. If the probability of success in a single trial of a binomial experiment with 1, trials is.6, what is the probability of obtaining at least 55 and no more than 65 successes in 1, trials? [Hint: Approximate with a normal distribution.] 13. Given a normal distribution with mean 5 and standard deviation 6, find the area under the normal curve: (A) Between 41 and 62 (B) From 59 on 14. A data set is formed by recording the sums when a pair of dice is rolled 1 times. A second data set is formed by again rolling a pair of dice 1 times, but recording the product, not the sum, of the two numbers. (A) Which of the two data sets would you expect to have the smaller standard deviation? Explain. (B) To obtain evidence for your answer to part (A), use a graphing utility to simulate both experiments, and compute the standard deviations of each of the two data sets. 15. For the sample quiz scores in Problem 6 above, find the mean and standard deviation using the data: (A) Without grouping (B) Grouped, with class interval of width 2 starting at A fair die is rolled five times. What is the probability of rolling: (A) Exactly three 6 s? (B) At least three 6 s? 17. Two dice are rolled three times. What is the probability of getting a sum of 7 at least once? 18. Ten students take an exam worth 1 points. (A) Construct a hypothetical set of exam scores for the ten students in which both the median and the mode are 3 points higher than the mean. (B) Could the median and the mode both be 5 points higher than the mean? Explain. 19. In the last presidential election, 39% of the registered voters in a certain city actually cast ballots. 
(A) In a random sample of 2 registered voters from that city, what is the probability that exactly 8 voted in the last presidential election? (B) Verify by the rule-of-thumb test that the normal distribution with mean 7.8 and standard deviation 2.18 is a good approximation of the binomial distribution with n = 2 and p =.39. (C) For the normal distribution of part (B), P(x = 8) =. Explain the discrepancy between this result and your answer from part (A). 2. A random variable represents the number of wins in a 12-game season for a football team that has a probability of.9 of winning any of its games. (A) Find the mean and standard deviation of the random variable. (B) Find the probability that the team wins each of its 12 games. (C) Use a graphing utility to simulate 1 repetitions of the binomial experiment associated with the random variable, and compare the empirical probability of a perfect season with the answer to part (B). Applications Business & Economics 21. Retail sales. The daily number of bad checks received by a large department store in a random sample of 1 days out of the past year was 15, 12, 17, 5, 5, 8, 13, 5, 16, and 4. Find the (A) Mean (B) Median (C) Mode (D) Standard deviation 22. Preference survey. Find the mean, median, and/or mode, whichever are applicable, for the following employee cafeteria service survey: Drink Ordered with Meal Number Coffee 435 Tea 137 Milk 298 Soft drink 522 Milk shake 392

65 552 Chapter 8 Data Description and Probability Distributions 23. Plant safety. The weekly record of reported accidents in a large auto assembly plant in a random sample of 35 weeks from the past 1 years is listed below: (A) Construct a frequency and relative frequency table using class intervals of width 2 and starting at (B) Construct a histogram and frequency polygon. (C) Find the mean and standard deviation for the grouped data. 24. Personnel screening. The scores on a screening test for new technicians are normally distributed with mean 1 and standard deviation 1. Find the approximate percentage of applicants taking the test who score (A) Between 92 and 18 (B) 115 or higher 25. Market research. A newspaper publisher claims that 7% of the people in a community read their newspaper. Doubting the assertion, a competitor randomly surveys 2 people in the community. Based on the publisher s claim (and assuming a binomial distribution): (A) Compute the mean and standard deviation of the binomial distribution. (B) Determine whether the rule-of-thumb test warrants the use of a normal distribution to approximate this binomial distribution. (C) Calculate the approximate probability of finding at least 13 and no more than 155 readers in the sample. (D) Determine the approximate probability of finding 125 or fewer readers in the sample. (E) Use a graphing utility to graph the relevant normal distribution. Life Sciences 26. Health care. A small town has three doctors on call for emergency service. The probability that any one doctor will be available when called is.9. What is the probability that at least one doctor will be available for an emergency call? 
Group Activity 1 Group Activity 2 Analysis of Data on Student Lifestyle (A) Select several quantitative variables related to student life or the economic impact of students on the surrounding community for example, the number of hours spent studying outside of class per week, the number of long-distance telephone calls made per month, the number of ounces of alcoholic beverages consumed per week, the number of dollars spent on recreational activities per week, and so on. Interview a total of approximately 4 students to obtain data on each of the variables you have selected. Discuss how the sample of students should be selected in order to obtain an approximately random sample. Discuss how the interviews should be conducted in order to obtain reliable information. (B) Compute the mean, median, and standard deviation for the data set corresponding to each of your quantitative variables. Compute the proportion of the sample that lies within 1, 2, and 3 standard deviations of the mean. Which of your data sets most closely approximates a normal distribution? (C) Use histograms and/or tables and other graphs to present the results of your study to those outside your group. Survival Rates for a Heart Transplant In recent years approximately 2,4 heart transplant operations have been performed annually in the United States. The American Heart Association reported that the 1-year survival rate for a heart transplant is 82.4%, the 2-year survival rate is 78.2%, and the 3-year rate is 74.6%.

Ten patients are currently awaiting a heart transplant at St. Luke's Hospital. Assume that each of the ten undergoes the transplant surgery.
(A) Construct probability distribution tables for the random variables $X_1$, $X_2$, and $X_3$, where $X_k$ represents the number from among the ten patients who survive for at least $k$ years. (Assume that $X_k$ is binomial.)
(B) What is the probability that eight or more of the ten heart transplant recipients will survive at least 1 year? 2 years? 3 years?
(C) Use a graphing utility to simulate 1 repetitions of the binomial experiments associated with $X_1$, $X_2$, and $X_3$, and compare the results of the simulations with the answers to part (B).
(D) If 2,500 heart transplants are performed in the United States this year, what is the probability that at least 2,000 of the recipients will survive at least 1 year? 2 years? 3 years? (Approximate the appropriate binomial distributions with normal distributions.)
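Parts (A) to (C) can be explored with a short Python sketch in place of a graphing utility. It builds the binomial distribution for ten patients at each reported survival rate, sums the probabilities for eight or more survivors, and compares that with a simple simulation. The function names, the fixed random seed, and the repetition count of 100 are all assumptions made for illustration; the activity's own repetition count is not legible in this copy.

```python
from math import comb
import random


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for a binomial(n, p) variable."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def tail_probability(pmf, at_least):
    """P(X >= at_least), summed directly from the distribution table."""
    return sum(pmf[at_least:])


def simulate_tail(n, p, at_least, repetitions, rng):
    """Estimate P(X >= at_least) by repeating the n-trial experiment."""
    hits = 0
    for _ in range(repetitions):
        survivors = sum(rng.random() < p for _ in range(n))
        if survivors >= at_least:
            hits += 1
    return hits / repetitions


# Survival rates quoted in the activity, keyed by years survived.
survival_rates = {1: 0.824, 2: 0.782, 3: 0.746}
REPETITIONS = 100          # hypothetical choice, not taken from the text
rng = random.Random(0)     # fixed seed so a run can be repeated

exact_tails = {}
simulated_tails = {}
for years, p in survival_rates.items():
    pmf = binomial_pmf(10, p)
    exact_tails[years] = tail_probability(pmf, 8)
    simulated_tails[years] = simulate_tail(10, p, 8, REPETITIONS, rng)
    print(f"{years} year(s): exact {exact_tails[years]:.3f}, "
          f"simulated {simulated_tails[years]:.3f}")
```

With only 100 repetitions the simulated proportions can differ noticeably from the exact values; increasing the repetition count should bring them closer.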


ANSWERS

Chapter 8

Exercise 8-1
1. (A) and (B) [Frequency table: class interval, tally, frequency, relative frequency. Histogram: frequency and relative frequency.] (C) The frequency tables and histograms are identical, but the data set in part (B) is more spread out than that of part (A).
3. (A) Let Xmin = 1.5, Xmax = 25.5, change Xscl from 1 to 2, and multiply Ymax and Yscl by 2; change Xscl from 1 to 4, and multiply Ymax and Yscl by 4. (B) The shape becomes more symmetrical and more rectangular.
5. [Graph: Gross National Product; billions of dollars by year]
7. China; China; South Africa; North America
9. [Graph: Annual Railroad Carloadings in the United States; millions by year]
11. [Pie chart: Federal Income by Source, 2 1] Personal income tax (54%); Social insurance taxes (31%); Corporate income tax (11%); Excise tax (3%); Other (1%)
13. (A) (B) 8 (C) .45; .2 (D) [Table and histogram: class interval, tally, frequency, relative frequency]
15. (A), (B), (C) [Table: class interval, frequency, relative frequency. Graphs: frequency and relative frequency of price-earnings ratios.]

(D) [Table: class interval, frequency, cumulative frequency, relative cumulative frequency] (E) [Graph: cumulative frequency and relative cumulative frequency of price-earnings ratios] P(PE ratio between 4.5 and 14.5) =
17. [Graph: Annual World Population Growth; millions by year]
19. Males, age ; Females, age
[Chart: Carbohydrate, Protein, Fat; RDA (g); Burger, Extra patty, Cheese, Bacon, Mayonnaise; Calories; Calories from fat]
and 33; the median age decreased in the 1950s and 1960s, but increased in the other decades.
[Pie charts: Public and Private Schooling in the U.S.] Public 86.4%, Catholic 12.6%, Other private 1.0%; Public 86.8%, Catholic 4.7%, Other private 6.5%, Homeschooled 2.0%
27. (A) [Table: class interval, frequency, relative frequency] (B) [Graph: frequency and relative frequency of grade-point averages] (C) [Graph: frequency, relative frequency, and relative cumulative frequency of grade-point averages]

(D) [Table: class interval, frequency, cumulative frequency, relative cumulative frequency] (E) [Graph: cumulative frequency and relative cumulative frequency of grade-point averages] P(GPA ) = .2

Exercise 8-2
1. Mean = 3; median = 3; mode = 3
3. Modal preference is chocolate.
5. Mean =
7. The median
9. (A) Close to 3.5; close to 3.5 (B) Answer depends on results of simulation.
11. (A) 175, 175, 325, 525 (B) Let the four numbers be u, v, w, x, where u and v are both equal to $m_3$. Choose w so that the mean of w and $m_3$ is $m_2$; then choose x so that the mean of u, v, w, and x is $m_1$.
13. Mean ≈ 14.7; median = 11.5; mode =
15. Mean = 1,45.5 hr; median = 1,49.5 hr
17. Mean ≈ $1,211; median = $1,228; mode = $1,
19. Mean = 5.5 g; median = 5.55 g
21. Mean = 1,572,; median = 97,5; no mode
23. Median = 577

Exercise 8-3
(A) 7%; 1%; 1% (B) Yes (C)
(A) False (B) True
9. (A) The first data set. It is more likely that the sum is close to 7, for example, than to 2 or (B) Answer depends on results of simulation.
11. $\bar{x}$ = $4.35; s = $
13. $\bar{x}$ = 8.7 hr; s = .6 hr
15. $\bar{x}$ = 5.1 min; s = .9 min
17. $\bar{x}$ = 11.1; s = 2.3

Exercise 8-4
$\mu = .75$; $\sigma =$
$\mu = 1.333$; $\sigma =$
[Three histograms: P(x) against x]
(A) .311 (B) It is more likely that all answers are wrong (.17) than that at least half are right (.33).
29. $\mu = 2.4$; $\sigma =$ [Histogram: P(x) against x]
31. $\mu = 2.4$; $\sigma =$ [Histogram: P(x) against x]

33. (A) $\mu = 17$; $\sigma =$ (B) The theoretical probability distribution is given by $P(x) = C_{3,x}(.5)^x(.5)^{3-x} = C_{3,x}(.5)^3$, with $p = .5$. [Histogram: P(x) against x. Table: Frequency of Heads in 100 Tosses of Three Coins; columns are number of heads, theoretical frequency, actual frequency; first theoretical frequency 12.5. List your experimental results here.]
(A) $\mu = 5$; $\sigma =$ (B) Answer depends on results of simulation.
43. (A) .318 (B)
(A) $P(x) = C_{6,x}(.05)^x(.95)^{6-x}$ (B) [Table: x, P(x)] (C) [Histogram: P(x) against x] (D) $\mu = .3$; $\sigma = .53$
(A) .1 (B) .264 (C)
(A) $P(x) = C_{6,x}(.6)^x(.4)^{6-x}$ (B) [Table: x, P(x)] (C) [Histogram: P(x) against x] (D) $\mu = .36$; $\sigma =$
(A) $P(x) = C_{5,x}(.2)^x(.8)^{5-x}$ (B) [Table: x, P(x)] (C) [Histogram: P(x) against x] (D) $\mu = 1$; $\sigma = .89$
(A) .41 (B) .467 (C) .138 (D)

Exercise 8-5
(A) False (B) True
29. No
31. Yes
33. No
35. Yes
37. Solve the inequality $np - 3\sqrt{npq} \ge 0$ to obtain n
(A) Approx. 82 (B) Answer depends on results of simulation.
% % ; either a rare event has happened or the company's claim is false % %
65. A's, 8.2 or greater; B's, ; C's, ; D's, ; F's, 59.8 or lower

Chapter 8 Review Exercise
1. [Graphs: Percentage of voting age population voting in the U.S. presidential election, by year] (8-1)
2. [Pie charts] Men: Alone 16%, With spouse 74.3%, With relatives 7.4%, With nonrelatives 2.3%. Women: Alone 42%, With spouse 39.7%, With relatives 16.2%, With nonrelatives %. (8-1)
3. (A) .432 (B) $\mu = 1.2$; $\sigma = .85$ (8-4)
4. (A) $\bar{x}$ = 2.7 (B) 2.5 (C) 2 (D) s = 1.34 (8-2, 8-3)
5. (A) 1.8 (B) .4641 (8-5)
6. (A) [Table: class interval, frequency, relative frequency] (B) [Graph: frequency and relative frequency] (C) [Graph: frequency and relative frequency] (D) [Table: class interval, frequency, cumulative frequency, relative cumulative frequency] (E) [Graph: cumulative frequency and relative cumulative frequency] (8-1)
7. (A) $\bar{x}$ = 7 (B) s = 2.45 (C) 7.14 (8-2, 8-3)
8. (A) [Histogram: P(x) against x] (B) $\mu = 3$; $\sigma = 1.22$ (8-4)
9. $\mu = 6$; $\sigma =$ (8-4)
10. (A) True (B) False (8-2, 8-3)
11. (A) False (B) True (C) True (8-4, 8-5)
12. (8-5)
13. (A) .914 (B) .668 (8-5)
14. (A) The first data set. Sums range from 2 to 12, but products range from 1 to (B) Answers depend on results of simulation. (8-3)
15. (A) $\bar{x}$ = 14.6; s = 1.83 (B) $\bar{x}$ = 14.6; s = 1.78 (8-2, 8-3)
16. (A) .322 (B) .355 (8-4)
17. (8-4)
18. (A) 1, 1, 2, 2, 9, 9, 9, 9, 9, 9 (B) No (8-2)
19. (A) .179 (C) The normal distribution is continuous, not discrete, so the correct analogue of part (A) is $P(7.5 \le x \le 8.5) \approx .18$ (using Table I in Appendix C). (8-5)
20. (A) $\mu = 1.8$; $\sigma = 1.39$ (B) .282 (C) Answer depends on results of simulation. (8-4)
21. (A) $\bar{x}$ = 1 (B) 1 (C) 5 (D) s = 5.14 (8-2, 8-3)
22. Modal preference is soft drink (8-2)

23. (A) [Table: class interval, frequency, relative frequency] (B) [Frequency and relative frequency histogram; frequency and relative frequency polygon] (C) $\bar{x}$ = 34.61; s = 2.22 (8-1, 8-2, 8-3)
24. (A) 57.62% (B) 6.68% (8-5)
25. (A) $\mu = 140$; $\sigma = 6.48$ (B) Yes (C) .939 (D) .0125 (E) [Graph of the normal distribution] (8-4, 8-5)
26. (8-4)
