Sampling Distributions


 Abner Gibbs
CHAPTER 11 Sampling Distributions

What is the average income of American households? Each March, the government's Current Population Survey asks detailed questions about income. The 98,105 households contacted in March 2007 had a mean total money income of $66,570 in 2006. (The median income was of course lower, $48,201.) That $66,570 describes the sample, but we use it to estimate the mean income of all households. This is an example of statistical inference: we use information from a sample to infer something about a wider population.

Because the results of random samples and randomized comparative experiments include an element of chance, we can't guarantee that our inferences are correct. What we can guarantee is that our methods usually give correct answers. The reasoning of statistical inference rests on asking, "How often would this method give a correct answer if I used it very many times?" If our data come from random sampling or randomized comparative experiments, the laws of probability answer the question "What would happen if we did this many times?" This chapter presents some facts about probability that help answer this question.

IN THIS CHAPTER WE COVER...
Parameters and statistics
Statistical estimation and the law of large numbers
Sampling distributions
The sampling distribution of x̄
The central limit theorem
Parameters and statistics

As we begin to use sample data to draw conclusions about a wider population, we must take care to keep straight whether a number describes a sample or a population. Here is the vocabulary we use.

PARAMETER, STATISTIC
A parameter is a number that describes the population. In statistical practice, the value of a parameter is not known because we cannot examine the entire population.
A statistic is a number that can be computed from the sample data without making use of any unknown parameters. In practice, we often use a statistic to estimate an unknown parameter.

EXAMPLE 11.1 Household earnings
The mean income of the sample of 98,105 households contacted by the Current Population Survey was x̄ = $66,570. The number $66,570 is a statistic because it describes this one Current Population Survey sample. The population that the poll wants to draw conclusions about is all 116 million U.S. households. The parameter of interest is the mean income of all of these households. We don't know the value of this parameter.

Remember s and p: statistics come from samples, and parameters come from populations. As long as we were just doing data analysis, the distinction between population and sample was not important. Now, however, it is essential. The notation we use must reflect this distinction. We write μ (the Greek letter mu) for the mean of a population. This is a fixed parameter that is unknown when we use a sample for inference. The mean of the sample is the familiar x̄, the average of the observations in the sample. This is a statistic that would almost certainly take a different value if we chose another sample from the same population. The sample mean x̄ from a sample or an experiment is an estimate of the mean μ of the underlying population.

APPLY YOUR KNOWLEDGE
11.1 Effects of caffeine. How does caffeine affect our bodies?
In a matched pairs experiment, subjects pushed a button as quickly as they could after taking a caffeine pill and also after taking a placebo pill. The mean pushes per minute were 283 for the placebo and 311 for caffeine. Is each of the boldface numbers a parameter or a statistic?

11.2 Florida voters. Florida has played a key role in recent presidential elections. Voter registration records show that 41% of Florida voters are registered as Democrats and 37% as Republicans. (Most of the others did not choose a party.) To test a random digit dialing device, you use it to call 250 randomly chosen residential telephones in Florida. Of the registered voters contacted, 33% are registered Democrats. Is each of the boldface numbers a parameter or a statistic?
11.3 Ancient projectile points. Most of what we know about North America before Columbus comes from artifacts such as fragments of clay pottery and stone projectile points. Locations and cultures can be distinguished by the types of artifacts found. At one site in North Carolina, 82% of the projectile points unearthed came from the Middle Archaic period (6000 to 3000 B.C.) and the remaining 18% from the Late Archaic period (3000 to 1000 B.C.). Is each of the boldface numbers a parameter or a statistic?

Statistical estimation and the law of large numbers

Statistical inference uses sample data to draw conclusions about the entire population. Because good samples are chosen randomly, statistics such as x̄ are random variables. We can describe the behavior of a sample statistic by a probability model that answers the question "What would happen if we did this many times?" Here is an example that will lead us toward the probability ideas most important for statistical inference.

EXAMPLE 11.2 Does this wine smell bad?
Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine. DMS causes off-odors in wine, so winemakers want to know the odor threshold, the lowest concentration of DMS that the human nose can detect. Different people have different thresholds, so we start by asking about the mean threshold μ in the population of all adults. The number μ is a parameter that describes this population.

To estimate μ, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they identify the spiked wine. Here are the odor thresholds (measured in micrograms of DMS per liter of wine) for 10 randomly chosen subjects:

The mean threshold for these subjects is x̄ = 27.4. It seems reasonable to use the sample result x̄ = 27.4 to estimate the unknown μ.
An SRS should fairly represent the population, so the mean x̄ of the sample should be somewhere near the mean μ of the population. Of course, we don't expect x̄ to be exactly equal to μ. We realize that if we choose another SRS, the luck of the draw will probably produce a different x̄.

If x̄ is rarely exactly right and varies from sample to sample, why is it nonetheless a reasonable estimate of the population mean μ? Here is one answer: if we keep on taking larger and larger samples, the statistic x̄ is guaranteed to get closer and closer to the parameter μ. We have the comfort of knowing that if we can afford to keep on measuring more subjects, eventually we will estimate the mean odor threshold of all adults very accurately. This remarkable fact is called the law of large numbers. It is remarkable because it holds for any population, not just for some special class such as Normal distributions.
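The behavior described above can be watched numerically. The following is a minimal sketch, not part of the original text: it assumes a hypothetical Normal population with mean 25 and standard deviation 7 (values chosen for illustration), draws subjects one at a time, and records the mean of the first n observations. The function name `running_means` and the seed are illustrative choices.

```python
import random


def running_means(mu=25.0, sigma=7.0, n=10_000, seed=1):
    """Return the mean of the first k draws for k = 1, ..., n."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += rng.gauss(mu, sigma)  # measure one more randomly chosen subject
        means.append(total / k)        # mean of the first k observations
    return means


means = running_means()
for k in (1, 10, 100, 1_000, 10_000):
    print(f"n = {k:>6}: mean of first n observations = {means[k - 1]:.2f}")
```

With a different seed the early values wander along a different path, but the later values should again settle near the assumed population mean.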
LAW OF LARGE NUMBERS
Draw observations at random from any population with finite mean μ. As the number of observations drawn increases, the mean x̄ of the observed values gets closer and closer to the mean μ of the population.

High-tech gambling
There are twice as many slot machines as bank ATMs in the United States. Once upon a time, you put in a coin and pulled the lever to spin three wheels, each with 20 symbols. No longer. Now the machines are video games with flashy graphics and outcomes produced by random number generators. Machines can accept many coins at once, can pay off on a bewildering variety of outcomes, and can be networked to allow common jackpots. Gamblers still search for systems, but in the long run the law of large numbers guarantees the house its 5% profit.

FIGURE 11.1 The law of large numbers in action: as we take more observations, the sample mean x̄ always approaches the mean μ of the population. (Horizontal axis: number of observations, n. Vertical axis: mean of first n observations.)

The law of large numbers can be proved mathematically starting from the basic laws of probability. The behavior of x̄ is similar to the idea of probability. In the long run, the proportion of outcomes taking any value gets close to the probability of that value, and the average outcome gets close to the population mean. Figure 10.1 (page 263) shows how proportions approach probability in one example. Here is an example of how sample means approach the population mean.

EXAMPLE 11.3 The law of large numbers in action
In fact, the distribution of odor thresholds among all adults has mean 25. The mean μ = 25 is the true value of the parameter we seek to estimate. Figure 11.1 shows how the sample mean x̄ of an SRS drawn from this population changes as we add more subjects to our sample.

The first subject in Example 11.2 had threshold 28, so the line in Figure 11.1 starts there. Averaging the thresholds of the first two subjects gives the next value of x̄.
This is the second point on the graph. At first, the graph shows that the mean of the sample changes as we take more observations. Eventually, however, the mean of the observations gets close to the population mean μ = 25 and settles down at that value.

If we started over, again choosing people at random from the population, we would get a different path from left to right in Figure 11.1. The law of large numbers says that whatever path we get will always settle down at 25 as we draw more and more people.

The Law of Large Numbers applet animates Figure 11.1 in a different setting. You can use the applet to watch x̄ change as you average more observations until it eventually settles down at the mean μ.

The law of large numbers is the foundation of such business enterprises as gambling casinos and insurance companies. The winnings (or losses) of a gambler on a few plays are uncertain; that's why some people find gambling exciting. In Figure 11.1, the mean of even 100 observations is not yet very close to μ. It is only in the long run that the mean outcome is predictable. The house plays tens of thousands of times. So the house, unlike individual gamblers, can count on the long-run regularity described by the law of large numbers. The average winnings of the house on tens of thousands of plays will be very close to the mean of the distribution of winnings. Needless to say, this mean guarantees the house a profit. That's why gambling can be a business.

APPLY YOUR KNOWLEDGE
11.4 The law of large numbers made visible. Roll two balanced dice and count the spots on the up faces. The probability model appears in Example 10.5 (page 267). You can see that this distribution is symmetric with 7 as its center, so it's no surprise that the mean is μ = 7. This is the population mean for the idealized population that contains the results of rolling two dice forever.
The law of large numbers says that the average x̄ from a finite number of rolls gets closer and closer to 7 as we do more and more rolls.
(a) Click "More dice" once in the Law of Large Numbers applet to get two dice. Click "Show mean" to see the mean 7 on the graph. Leaving the number of rolls at 1, click "Roll dice" three times. How many spots did each roll produce? What is the average for the three rolls? You see that the graph displays at each point the average number of spots for all rolls up to the last one. This is exactly like Figure 11.1.
(b) Set the number of rolls to 100 and click "Roll dice." The applet rolls the two dice 100 times. The graph shows how the average count of spots changes as we make more rolls. That is, the graph shows x̄ as we continue to roll the dice. Sketch (or print out) the final graph.
(c) Repeat your work from (b). Click "Reset" to start over, then roll two dice 100 times. Make a sketch of the final graph of the mean x̄ against the number of rolls. Your two graphs will often look very different. What they have in common is that the average eventually gets close to the population mean
μ = 7. The law of large numbers says that this will always happen if you keep on rolling the dice.

11.5 Insurance. The idea of insurance is that we all face risks that are unlikely but carry high cost. Think of a fire destroying your home. Insurance spreads the risk: we all pay a small amount, and the insurance policy pays a large amount to those few of us whose homes burn down. An insurance company looks at the records for millions of homeowners and sees that the mean loss from fire in a year is μ = $250 per person. (Most of us have no loss, but a few lose their homes. The $250 is the average loss.) The company plans to sell fire insurance for $250 plus enough to cover its costs and profit. Explain clearly why it would be unwise to sell only 12 policies. Then explain why selling thousands of such policies is a safe business.

Sampling distributions

The law of large numbers assures us that if we measure enough subjects, the statistic x̄ will eventually get very close to the unknown parameter μ. But the odor threshold study in Example 11.2 had just 10 subjects. What can we say about estimating μ by x̄ from a sample of 10 subjects? Put this one sample in the context of all such samples by asking, "What would happen if we took many samples of 10 subjects from this population?" Here's how to answer this question:
Take a large number of samples of size 10 from the population.
Calculate the sample mean x̄ for each sample.
Make a histogram of the values of x̄.
Examine the shape, center, and spread of the distribution displayed in the histogram.

In practice it is too expensive to take many samples from a large population such as all adult U.S. residents. But we can imitate many samples by using software. Using software to imitate chance behavior is called simulation.

EXAMPLE 11.4 What would happen in many samples?
Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean μ = 25 micrograms per liter and standard deviation σ = 7 micrograms per liter. We call this the population distribution of odor threshold.

Figure 11.2 illustrates the process of choosing many samples and finding the sample mean threshold x̄ for each one. Follow the flow of the figure from the population at the left, to choosing an SRS and finding the x̄ for this sample, to collecting together the x̄'s from many samples. The first sample yields one value of x̄; the second sample contains a different 10 people, with x̄ = 24.28, and so on. The histogram at the right of the figure shows the distribution of the values of x̄ from 1000 separate SRSs of size 10. This histogram displays the sampling distribution of the statistic x̄.
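The simulation described in this example can be imitated in a few lines. This is a rough sketch rather than the procedure actually used for the figure: it treats independent Normal draws as a stand-in for an SRS from a very large population, and the seed and variable names are arbitrary.

```python
import random
import statistics

MU, SIGMA = 25.0, 7.0      # population mean and standard deviation from Example 11.4
N, REPS = 10, 1000         # sample size and number of simulated samples

rng = random.Random(2024)  # arbitrary seed, only to make the run repeatable


def sample_mean(n):
    """Mean of n independent draws from the assumed Normal population."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))


# One sample mean per simulated sample of size N
xbars = [sample_mean(N) for _ in range(REPS)]

print(f"mean of the {REPS} sample means: {statistics.fmean(xbars):.3f}")
print(f"SD of the {REPS} sample means:   {statistics.stdev(xbars):.3f}")
```

A histogram of `xbars` approximates the sampling distribution; its center should sit near 25 and its spread should be clearly smaller than the population value of 7.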
FIGURE 11.2 The idea of a sampling distribution: take many samples from the same population, collect the x̄'s from all the samples, and display the distribution of the x̄'s. The histogram shows the results of 1000 samples. (Figure annotations: population with mean μ; SRSs of size 10, each with its own x̄; take many SRSs and collect their means x̄; the distribution of all the x̄'s is close to Normal.)

POPULATION DISTRIBUTION, SAMPLING DISTRIBUTION
The population distribution of a variable is the distribution of values of the variable among all the individuals in the population.
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Be careful: The population distribution describes the individuals that make up the population. A sampling distribution describes how a statistic varies in many samples from the population.

Strictly speaking, the sampling distribution is the ideal pattern that would emerge if we looked at all possible samples of size 10 from our population. A distribution obtained from a fixed number of trials, like the 1000 trials in Figure 11.2, is only an approximation to the sampling distribution. One of the uses of probability theory in statistics is to obtain sampling distributions without simulation. The interpretation of a sampling distribution is the same, however, whether we obtain it by simulation or by the mathematics of probability.

We can use the tools of data analysis to describe any distribution. Let's apply those tools to Figure 11.2. What can we say about the shape, center, and spread of this distribution?
Shape: It looks Normal! Detailed examination confirms that the distribution of x̄ from many samples is very close to Normal.
Center: The mean of the 1000 x̄'s is very close to the population mean μ = 25.
Spread: The standard deviation of the 1000 x̄'s is 2.217, notably smaller than the standard deviation σ = 7 of the population of individual subjects.

Although these results describe just one simulation of a sampling distribution, they reflect facts that are true whenever we use random sampling.

APPLY YOUR KNOWLEDGE
11.6 Sampling distribution versus population distribution. During World War II, 12,000 able-bodied male undergraduates at the University of Illinois participated in required physical training. Each student ran a timed mile. Their times followed the Normal distribution with mean 7.11 minutes and standard deviation 0.74 minute. An SRS of 100 of these students has mean time x̄ = 7.15 minutes. A second SRS of size 100 has mean x̄ = 6.97 minutes. After many SRSs, the many values of the sample mean x̄ follow the Normal distribution with mean 7.11 minutes and standard deviation 0.074 minute.
(a) What is the population? What values does the population distribution describe? What is this distribution?
(b) What values does the sampling distribution of x̄ describe? What is the sampling distribution?

11.7 Generating a sampling distribution. Let's illustrate the idea of a sampling distribution in the case of a very small sample from a very small population. The population is the scores of 10 students on an exam:

[Table: Student, Score]

The parameter of interest is the mean score μ in this population. The sample is an SRS of size n = 4 drawn from the population. Because the students are labeled 0 to 9, a single random digit from Table B chooses one student for the sample.
(a) Find the mean of the 10 scores in the population. This is the population mean μ.
(b) Use the first digits in row 116 of Table B to draw an SRS of size 4 from this population. What are the four scores in your sample? What is their mean x̄? This statistic is an estimate of μ.
(c) Repeat this process 9 more times, using the first digits in rows 117 to 125 of Table B. Make a histogram of the 10 values of x̄. You are constructing the sampling distribution of x̄. Is the center of your histogram close to μ?
The sampling distribution of x̄

Figure 11.2 suggests that when we choose many SRSs from a population, the sampling distribution of the sample means is centered at the mean of the original population and is less spread out than the distribution of individual observations. Here are the facts.

MEAN AND STANDARD DEVIATION OF A SAMPLE MEAN
Suppose that x̄ is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ. Then the sampling distribution of x̄ has mean μ and standard deviation σ/√n.

These facts about the mean and the standard deviation of the sampling distribution of x̄ are true for any population, not just for some special class such as Normal distributions. They have important implications for statistical inference:

The mean of the statistic x̄ is always equal to the mean μ of the population. That is, the sampling distribution of x̄ is centered at μ. In repeated sampling, x̄ will sometimes fall above the true value of the parameter μ and sometimes below, but there is no systematic tendency to overestimate or underestimate the parameter. This makes the idea of lack of bias in the sense of "no favoritism" more precise. Because the mean of x̄ is equal to μ, we say that the statistic x̄ is an unbiased estimator of the parameter μ.

An unbiased estimator is "correct on the average" in many samples. How close the estimator falls to the parameter in most samples is determined by the spread of the sampling distribution. If individual observations have standard deviation σ, then sample means x̄ from samples of size n have standard deviation σ/√n. That is, averages are less variable than individual observations. Not only is the standard deviation of the distribution of x̄ smaller than the standard deviation of individual observations, but it gets smaller as we take larger samples. The results of large samples are less variable than the results of small samples.
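Because these two facts are claimed for any population, a quick check on a deliberately non-Normal one may be reassuring. The sketch below is an illustration only: the Uniform(0, 1) population (mean 0.5, standard deviation 1/√12), the sample sizes, the number of repetitions, and the seed are all assumptions made here, not choices from the text.

```python
import math
import random
import statistics

rng = random.Random(7)       # arbitrary seed for repeatability
MU = 0.5                     # mean of the assumed Uniform(0, 1) population
SIGMA = 1 / math.sqrt(12)    # its standard deviation


def simulate_xbars(n, reps=10_000):
    """Sample means of `reps` simulated samples, each of size n."""
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(reps)]


results = {}
for n in (4, 16, 64):
    xbars = simulate_xbars(n)
    results[n] = (
        statistics.fmean(xbars),   # simulated mean of x-bar
        statistics.stdev(xbars),   # simulated SD of x-bar
        SIGMA / math.sqrt(n),      # theoretical SD, sigma / sqrt(n)
    )
    mean_sim, sd_sim, sd_theory = results[n]
    print(f"n = {n:>2}: mean = {mean_sim:.4f}, SD = {sd_sim:.4f}, "
          f"sigma/sqrt(n) = {sd_theory:.4f}")
```

Each fourfold increase in n should roughly halve the simulated standard deviation, while the simulated mean stays near 0.5.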
The upshot of all this is that we can trust the sample mean from a large random sample to estimate the population mean accurately. If the sample size n is large, the standard deviation of x̄ is small, and almost all samples will give values of x̄ that lie very close to the true parameter μ. However, the standard deviation of the sampling distribution gets smaller only at the rate √n. To cut the standard deviation of x̄ in half, we must take four times as many observations, not just twice as many. So very accurate estimates may be expensive.

We have described the center and spread of the sampling distribution of a sample mean x̄, but not its shape. The shape of the sampling distribution depends on the shape of the population distribution. In one important case there is a simple
relationship between the two distributions: if the population distribution is Normal, then so is the sampling distribution of the sample mean.

SAMPLING DISTRIBUTION OF A SAMPLE MEAN
If individual observations have the N(μ, σ) distribution, then the sample mean x̄ of an SRS of size n has the N(μ, σ/√n) distribution.

Sample size matters
The new thing in baseball is using statistics to evaluate players, with new measures of performance to help decide which players are worth the high salaries they demand. This challenges traditional subjective evaluation of young players and the usefulness of traditional measures such as batting average. But success has led many major league teams to hire statisticians. The statisticians say that sample size matters in baseball also: the 162-game regular season is long enough for the better teams to come out on top, but 5-game and 7-game playoff series are so short that luck has a lot to say about who wins.

EXAMPLE 11.5 Population distribution, sampling distribution
If we measure the DMS odor thresholds of individual adults, the values follow the Normal distribution with mean μ = 25 micrograms per liter and standard deviation σ = 7 micrograms per liter. This is the population distribution of odor threshold.

Take many SRSs of size 10 from this population and find the sample mean x̄ for each sample, as in Figure 11.2. The sampling distribution describes how the values of x̄ vary among samples. That sampling distribution is also Normal, with mean μ = 25 and standard deviation

σ/√n = 7/√10 = 2.2136

Figure 11.3 contrasts these two Normal distributions. Both are centered at the population mean, but sample means are much less variable than individual observations.

The smaller variation of sample means shows up in probability calculations. You can show (using software or standardizing and using Table A) that about 52% of all adults have odor thresholds between 20 and 30.
But almost 98% of means of samples of size 10 lie in this range.

FIGURE 11.3 The distribution of single observations (the population distribution) compared with the sampling distribution of the means x̄ of 10 observations, for Example 11.5. Both have the same mean, but averages are less variable than individual observations. (Figure annotations: the sampling distribution describes how sample means x̄ vary in repeated samples; the population distribution describes how individuals vary in the population. Horizontal axis: DMS odor threshold.)
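The two percentages quoted in Example 11.5 can be reproduced with software. The following is one possible sketch using Python's standard library; it assumes the Normal model stated in the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N = 25.0, 7.0, 10   # values stated in Example 11.5

population = NormalDist(MU, SIGMA)            # distribution of one adult's threshold
xbar_dist = NormalDist(MU, SIGMA / sqrt(N))   # distribution of the mean of 10 thresholds

# Probability of falling between 20 and 30 under each distribution
p_individual = population.cdf(30) - population.cdf(20)
p_mean = xbar_dist.cdf(30) - xbar_dist.cdf(20)

print(f"SD of x-bar:           {xbar_dist.stdev:.4f}")
print(f"P(20 < X < 30)       = {p_individual:.3f}")
print(f"P(20 < x-bar < 30)   = {p_mean:.3f}")
```

The only difference between the two calculations is the standard deviation, which is exactly the point of the comparison in Figure 11.3.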
APPLY YOUR KNOWLEDGE
11.8 A sample of young men. A government sample survey plans to measure the blood cholesterol level of an SRS of men aged 20 to 34. The researchers will report the mean x̄ from their sample as an estimate of the mean cholesterol level μ in this population.
(a) Explain to someone who knows no statistics what it means to say that x̄ is an "unbiased" estimator of μ.
(b) The sample result x̄ is an unbiased estimator of the population truth μ no matter what size SRS the study uses. Explain to someone who knows no statistics why a large sample gives more trustworthy results than a small sample.

11.9 Larger sample, more accurate estimate. Suppose that in fact the blood cholesterol level of all men aged 20 to 34 follows the Normal distribution with mean μ = 188 milligrams per deciliter (mg/dl) and standard deviation σ = 41 mg/dl.
(a) Choose an SRS of 100 men from this population. What is the sampling distribution of x̄? What is the probability that x̄ takes a value between 185 and 191 mg/dl? This is the probability that x̄ estimates μ within ±3 mg/dl.
(b) Choose an SRS of 1000 men from this population. Now what is the probability that x̄ falls within ±3 mg/dl of μ? The larger sample is much more likely to give an accurate estimate of μ.

11.10 Measurements in the lab. Juan makes a measurement in a chemistry laboratory and records the result in his lab report. The standard deviation of students' lab measurements is σ = 10 milligrams. Juan repeats the measurement 3 times and records the mean x̄ of his 3 measurements.
(a) What is the standard deviation of Juan's mean result? (That is, if Juan kept on making 3 measurements and averaging them, what would be the standard deviation of all his x̄'s?)
(b) How many times must Juan repeat the measurement to reduce the standard deviation of x̄ to 5? Explain to someone who knows no statistics the advantage of reporting the average of several measurements rather than the result of a single measurement.
The central limit theorem

The facts about the mean and standard deviation of x̄ are true no matter what the shape of the population distribution may be. But what is the shape of the sampling distribution when the population distribution is not Normal? It is a remarkable fact that as the sample size increases, the distribution of x̄ changes shape: it looks less like that of the population and more like a Normal distribution. When the sample is large enough, the distribution of x̄ is very close to Normal. This is true no matter what shape the population distribution has, as long as the population has a finite standard deviation σ. This famous fact of probability theory is called the central limit theorem. It is much more useful than the fact that the distribution of x̄ is exactly Normal if the population is exactly Normal.
CENTRAL LIMIT THEOREM
Draw an SRS of size n from any population with mean μ and finite standard deviation σ. The central limit theorem says that when n is large, the sampling distribution of the sample mean x̄ is approximately Normal:

x̄ is approximately N(μ, σ/√n)

The central limit theorem allows us to use Normal probability calculations to answer questions about sample means from many observations even when the population distribution is not Normal.

What was that probability again?
Wall Street uses fancy mathematics to predict the probabilities that fancy investments will go wrong. The probabilities are always too low, sometimes because something was assumed to be Normal but was not. Probability predictions in other areas also go wrong. In mid September 2007, the New York Mets had probability of making the National League playoffs, or so an elaborate calculation said. Then the Mets lost 12 of their final 17 games, the Phillies won 13 of their final 17, and the Mets were out. Maybe next year?

More general versions of the central limit theorem say that the distribution of any sum or average of many small random quantities is close to Normal. This is true even if the quantities are correlated with each other (as long as they are not too highly correlated) and even if they have different distributions (as long as no one random quantity is so large that it dominates the others). The central limit theorem suggests why the Normal distributions are common models for observed data. Any variable that is a sum of many small influences will have approximately a Normal distribution.

How large a sample size n is needed for x̄ to be close to Normal depends on the population distribution. More observations are required if the shape of the population distribution is far from Normal. Here are two examples in which the population is far from Normal.
EXAMPLE 11.6 The central limit theorem in action
In March 2007, the Current Population Survey contacted 98,105 households. Figure 11.4(a) is a histogram of the earnings of the 61,742 households that had earned income greater than zero in 2006. As we expect, the distribution of earned incomes is strongly skewed to the right and very spread out. The right tail of the distribution is even longer than the histogram shows because there are too few high incomes for their bars to be visible on this scale. In fact, we cut off the earnings scale at $400,000 to save space; a few households earned even more than $400,000. The mean earnings for these 61,742 households was $69,750.

Regard these 61,742 households as a population with mean μ = $69,750. Take an SRS of 100 households. The mean earnings in this sample is x̄ = $66,807. That's less than the mean of the population. Take another SRS of size 100. The mean for this sample is x̄ = $70,820. That's higher than the mean of the population. What would happen if we did this many times? Figure 11.4(b) is a histogram of the mean earnings for 500 samples, each of size 100. The scales in Figures 11.4(a) and 11.4(b) are the same, for easy comparison. Although the distribution of individual earnings is skewed and very spread out, the distribution of sample means is roughly symmetric and much less spread out.

Figure 11.4(c) zooms in on the center part of the histogram in Figure 11.4(b) to more clearly show its shape. Although n = 100 is not a very large sample size and the population distribution is extremely skewed, we can see that the distribution of sample means is close to Normal.
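The same experiment can be imitated without the earnings data. The sketch below is only an analogy: it swaps in an exponential population with mean 1 as a convenient strongly right-skewed stand-in (an assumption made here, not the Current Population Survey data), and it uses a simple moment-based skewness measure, defined in the code, to compare individual values with means of samples of size 100.

```python
import random
import statistics

rng = random.Random(11)   # arbitrary seed for repeatability


def skewness(values):
    """Moment-based skewness: average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


# Individual observations from the assumed skewed population (mean 1, SD 1)
individuals = [rng.expovariate(1.0) for _ in range(50_000)]

# Means of 2000 simulated samples, each of size 100, from the same population
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(100))
    for _ in range(2_000)
]

print(f"individuals:  SD = {statistics.stdev(individuals):.3f}, "
      f"skewness = {skewness(individuals):.2f}")
print(f"sample means: SD = {statistics.stdev(sample_means):.3f}, "
      f"skewness = {skewness(sample_means):.2f}")
```

In this sketch the sample means should show both effects described in the example: a much smaller spread than the individual values and a skewness much closer to zero.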
FIGURE 11.4 The central limit theorem in action, for Example 11.6. (a) The distribution of earned income in a population of 61,742 households. (Horizontal axis: household earnings, thousands of dollars. Vertical axis: percent of households.) (b) The distribution of the mean earnings for 500 SRSs of 100 households each from this population. (Horizontal axis: mean household earnings for samples of size 100, thousands of dollars. Vertical axis: percent of samples. Because both histograms use the same scales, you can directly compare this with the histogram above.) (Continued)
FIGURE 11.4 (Continued) (c) The distribution of the sample means in more detail: the shape is close to Normal. This is the same histogram pictured in Figure 11.4(b), drawn in a scale that more clearly shows its shape. (Horizontal axis: means of samples of size 100, thousands of dollars.)

Comparing Figure 11.4(a) with Figures 11.4(b) and 11.4(c) illustrates the two most important ideas of this chapter.

THINKING ABOUT SAMPLE MEANS
Means of random samples are less variable than individual observations.
Means of random samples are more Normal than individual observations.

EXAMPLE 11.7 The central limit theorem in action
The Central Limit Theorem applet allows you to watch the central limit theorem in action. Figure 11.5 presents snapshots from the applet, drawn on the same scales for easy comparison. Figure 11.5(a) shows the population distribution, that is, the density curve of a single observation. This distribution is strongly right-skewed, and the most probable outcomes are near 0. The mean μ of this distribution is 1, and its standard deviation σ is also 1. This particular distribution is called an exponential distribution. Exponential distributions are used as models for the lifetime in service of electronic components and for the time required to serve a customer or repair a machine.

Figures 11.5(b), (c), and (d) are the density curves of the sample means of 2, 10, and 25 observations from this population. As n increases, the shape becomes more Normal. The mean remains at μ = 1, and the standard deviation decreases, taking the value 1/√n. The density curve for 10 observations is still somewhat skewed to the right but already resembles a Normal curve having μ = 1 and σ = 1/√10 = 0.32. The density
15 The central limit theorem 305 FIGURE (a) 0 1 (b) The central limit theorem in action, for Example The distribution of sample means x from a strongly nonnormal population becomes more Normal as the sample size increases. (a) The distribution of 1 observation. (b) The distribution of x for 2 observations. (c) The distribution of x for 10 observations. (d) The distribution of x for 25 observations. 0 1 (c) 0 1 (d) curve for n = 25 is yet more Normal. The contrast between the shapes of the population distribution and of the distribution of the mean of 10 or 25 observations is striking. Let s use Normal calculations based on the central limit theorem to answer a question about the very nonnormal distribution in Figure 11.5(a). EXAMPLE 11.8 Maintaining air conditioners STATE: The time (in hours) that a technician requires to perform preventive maintenance on an airconditioning unit is governed by the exponential distribution whose density curve appears in Figure 11.5(a). The mean time is μ = 1 hour and the standard deviation is σ = 1 hour. Your company has a contract to maintain 70 of these units in an apartment building. You must schedule technicians time for a visit to this building. Is it safe to budget an average of 1.1 hours for each unit? Or should you budget an average of 1.25 hours? S T E P PLAN: We can treat these 70 air conditioners as an SRS from all units of this type. What is the probability that the average maintenance time for 70 units exceeds 1.1 hours? That the average time exceeds 1.25 hours? SOLVE: The central limit theorem says that the sample mean time x spent working on 70 units has approximately the Normal distribution with mean equal to the population
16 306 CHAPTER 11 Sampling Distributions FIGURE 11.6 The exact distribution (dotted) and the Normal approximation from the central limit theorem (solid) for the average time needed to maintain an air conditioner, for Example The probability we want is the area to the right of 1.1. Exact density curve for x. Normal curve from the central limit theorem. 1.1 mean μ = 1 hour and standard deviation σ = 1 = 0.12 hour The distribution of x is therefore approximately N(1, 0.12). This Normal curve is the solid curve in Figure Using this Normal distribution, the probabilities we want are P (x > 1.10 hours) = P (x > 1.25 hours) = Software gives these probabilities immediately, or you can standardize and use Table A. For example, ( ) ( ) x P (x > 1.10) = P > P = P (Z > 0.83) = = with the usual roundoff error. Don t forget to use standard deviation 0.12 in your software or when you standardize x. CONCLUDE: If you budget 1.1 hours per unit, there is a 20% chance that the technicians will not complete the work in the building within the budgeted time. This chance drops to 2% if you budget 1.25 hours. You therefore budget 1.25 hours per unit. Using more mathematics, we can start with the exponential distribution and find the actual density curve of x for 70 observations. This is the dotted curve in
17 Chapter 11 Summary 307 Figure You can see that the solid Normal curve is a good approximation. The exactly correct probability for 1.1 hours is an area to the right of 1.1 under the dotted density curve. It is The central limit theorem Normal approximation is off by only about APPLY YOUR KNOWLEDGE What does the central limit theorem say? Asked what the central limit theorem says, a student replies, As you take larger and larger samples from a population, the histogram of the sample values looks more and more Normal. Is the student right? Explain your answer Detecting gypsy moths. The gypsy moth is a serious threat to oak and aspen trees. A state agriculture department places traps throughout the state to detect the moths. When traps are checked periodically, the mean number of moths trapped is only 0.5, but some traps have several moths. The distribution of moth counts is discrete and strongly skewed, with standard deviation 0.7. (a) What are the mean and standard deviation of the average number of moths x in 50 traps? (b) Use the central limit theorem to find the probability that the average number of moths in 50 traps is greater than More on insurance. An insurance company knows that in the entire population of millions of homeowners, the mean annual loss from fire is μ = $250 and the standard deviation of the loss is σ = $1000. The distribution of losses is strongly rightskewed: most policies have $0 loss, but a few have large losses. If the company sells 10,000 policies, can it safely base its rates on the assumption that its average loss will be no greater than $275? Follow the fourstep process as illustrated in Example Bruce Coleman/Alamy S T E P C H A P T E R 1 1 S U M M A R Y A parameter in a statistical problem is a number that describes a population, such as the population mean μ. To estimate an unknown parameter, use a statistic calculated from a sample, such as the sample mean x. 
The law of large numbers states that the actually observed mean outcome x̄ must approach the mean μ of the population as the number of observations increases.

The population distribution of a variable describes the values of the variable for all individuals in a population. The sampling distribution of a statistic describes the values of the statistic in all possible samples of the same size from the same population.

When the sample is an SRS from the population, the mean of the sampling distribution of the sample mean x̄ is the same as the population mean μ. That is, x̄ is an unbiased estimator of μ.
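A short simulation can make the law of large numbers and the idea of an unbiased estimator concrete. The Python sketch below is only an illustration and is not one of the chapter's applets. It assumes the exponential population with mean μ = 1 from Example 11.7; the function name sample_mean, the seed, and the sample sizes are arbitrary choices made for this sketch.

```python
import random
import statistics

def sample_mean(n, rng):
    """Mean of n independent draws from an exponential population with mean 1."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

rng = random.Random(11)  # fixed seed (an arbitrary choice) so the run is repeatable

# Law of large numbers: a single sample mean settles near mu = 1 as n grows.
for n in (10, 1000, 100_000):
    print(n, round(sample_mean(n, rng), 3))

# Unbiasedness: the average of many sample means (each from n = 10 draws)
# is close to mu = 1, even though individual sample means vary a lot.
means = [sample_mean(10, rng) for _ in range(5000)]
center = statistics.fmean(means)
print("mean of 5000 sample means:", round(center, 3))
```

The first loop follows one sample mean as the sample grows. The second part looks at many samples of the same small size, which is the situation the sampling distribution describes.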
The standard deviation of the sampling distribution of x̄ is σ/√n for an SRS of size n if the population has standard deviation σ. That is, averages are less variable than individual observations.

When the sample is an SRS from a population that has a Normal distribution, the sample mean x̄ also has a Normal distribution.

Choose an SRS of size n from any population with mean μ and finite standard deviation σ. The central limit theorem states that when n is large the sampling distribution of x̄ is approximately Normal. That is, averages are more Normal than individual observations. We can use the N(μ, σ/√n) distribution to calculate approximate probabilities for events involving x̄.

CHECK YOUR SKILLS

The Bureau of Labor Statistics announces that last month it interviewed all members of the labor force in a sample of 60,000 households; 4.9% of the people interviewed were unemployed. The boldface number is a
(a) sampling distribution.
(b) parameter.
(c) statistic.

A study of voting chose 663 registered voters at random shortly after an election. Of these, 72% said they had voted in the election. Election records show that only 56% of registered voters voted in the election. The boldface number is a
(a) sampling distribution.
(b) parameter.
(c) statistic.

Annual returns on the more than 5000 common stocks available to investors vary a lot. In a recent year, the mean return was 8.3% and the standard deviation of returns was 28.5%. The law of large numbers says that
(a) you can get an average return higher than the mean 8.3% by investing in a large number of stocks.
(b) as you invest in more and more stocks chosen at random, your average return on these stocks gets ever closer to 8.3%.
(c) if you invest in a large number of stocks chosen at random, your average return will have approximately a Normal distribution.

Scores on the mathematics part of the SAT exam in a recent year were roughly Normal with mean 515 and standard deviation 114. You choose an SRS of 100 students and average their SAT math scores. If you do this many times, the mean of the average scores you get will be close to
(a) 515.
(b) 515/100 = 5.15.
(c) 515/√100 = 51.5.

Scores on the mathematics part of the SAT exam in a recent year were roughly Normal with mean 515 and standard deviation 114. You choose an SRS of 100 students and average their SAT math scores. If you do this many times, the standard deviation of the average scores you get will be close to
(a) 114.
(b) 114/100 = 1.14.
(c) 114/√100 = 11.4.

A newborn baby has extremely low birth weight (ELBW) if it weighs less than 1000 grams. A study of the health of such children in later years examined a random sample of 219 children. Their mean weight at birth was x̄ = 810 grams. This sample mean is an unbiased estimator of the mean weight μ in the population of all ELBW babies. This means that
(a) in many samples from this population, the mean of the many values of x̄ will be equal to μ.
(b) as we take larger and larger samples from this population, x̄ will get closer and closer to μ.
(c) in many samples from this population, the many values of x̄ will have a distribution that is close to Normal.

The number of hours a light bulb burns before failing varies from bulb to bulb. The distribution of burnout times is strongly skewed to the right. The central limit theorem says that
(a) as we look at more and more bulbs, their average burnout time gets ever closer to the mean μ for all bulbs of this type.
(b) the average burnout time of a large number of bulbs has a distribution of the same shape (strongly skewed) as the distribution for individual bulbs.
(c) the average burnout time of a large number of bulbs has a distribution that is close to Normal.

The length of human pregnancies from conception to birth varies according to a distribution that is approximately Normal with mean 266 days and standard deviation 16 days. The probability that the average pregnancy length for 6 randomly chosen women exceeds 270 days is about
(a) (b) (c)

CHAPTER 11 EXERCISES

Testing glass. How well materials conduct heat matters when designing houses. As a test of a new measurement process, 10 measurements are made on pieces of glass known to have conductivity 1. The average of the 10 measurements is
Is each of the boldface numbers a parameter or a statistic? Explain your answer.

Small classes in school. The Tennessee STAR experiment randomly assigned children to regular or small classes during their first four years of school. When these children reached high school, 40.2% of blacks from small classes took the ACT or SAT college entrance exams. Only 31.7% of blacks from regular classes took one of these exams. Is each of the boldface numbers a parameter or a statistic? Explain your answer.

Roulette. A roulette wheel has 38 slots, of which 18 are black, 18 are red, and 2 are green. When the wheel is spun, the ball is equally likely to come to rest in any of the slots. One of the simplest wagers chooses red or black. A bet of $1 on red returns $2 if the ball lands in a red slot. Otherwise, the player loses his dollar. When gamblers bet on red or black, the two green slots belong to the house. Because the probability of winning $2 is 18/38, the mean payoff from a $1 bet is twice 18/38, or 94.7 cents. Explain what the law of large numbers tells us about what will happen if a gambler makes very many bets on red.
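Before writing an explanation for the roulette exercise, it can help to watch the average payoff over many bets. The Python sketch below is a rough illustration under stated assumptions: slots are numbered 0 to 37 and the first 18 are treated as red, which is a labeling convenience and not how a real wheel is laid out. The function name average_payoff and the seed are hypothetical choices for this sketch.

```python
import random

def average_payoff(n_bets, rng):
    """Average amount returned per $1 bet on red over n_bets spins."""
    total = 0
    for _ in range(n_bets):
        slot = rng.randrange(38)  # slots 0..37, each equally likely
        if slot < 18:             # assumption: label slots 0..17 as the 18 red slots
            total += 2            # a winning $1 bet returns $2
        # otherwise the bet returns nothing
    return total / n_bets

rng = random.Random(38)  # fixed seed (arbitrary) so the run is repeatable
for n in (100, 10_000, 1_000_000):
    print(n, round(average_payoff(n, rng), 4))

print("mean payoff from the exercise:", round(2 * 18 / 38, 4))
```

With few bets the average payoff can land well above or below the mean payoff; the runs with many bets are the ones the law of large numbers speaks to.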
Lightning strikes. The number of lightning strikes on a square kilometer of open ground in a year has mean 6 and standard deviation 2.4. (These values are typical of much of the United States.) The National Lightning Detection Network uses automatic sensors to watch for lightning in a sample of 10 square kilometers. What are the mean and standard deviation of x̄, the mean number of strikes per square kilometer?

Heights of male students. To estimate the mean height μ of male students on your campus, you will measure an SRS of students. Heights of people of the same sex and similar ages are close to Normal. You know from government data that the standard deviation of the heights of young men is about 2.8 inches. Suppose that (unknown to you) the mean height of all male students is 70 inches.
(a) If you choose one student at random, what is the probability that he is between 69 and 71 inches tall?
(b) You measure 25 students. What is the sampling distribution of their average height x̄?
(c) What is the probability that the mean height of your sample is between 69 and 71 inches?

Glucose testing. Shelia's doctor is concerned that she may suffer from gestational diabetes (high blood glucose levels during pregnancy). There is variation both in the actual glucose level and in the blood test that measures the level. A patient is classified as having gestational diabetes if the glucose level is above 140 milligrams per deciliter (mg/dl) one hour after having a sugary drink. Shelia's measured glucose level one hour after the sugary drink varies according to the Normal distribution with μ = 125 mg/dl and σ = 10 mg/dl.
(a) If a single glucose measurement is made, what is the probability that Shelia is diagnosed as having gestational diabetes?
(b) If measurements are made on 4 separate days and the mean result is compared with the criterion 140 mg/dl, what is the probability that Shelia is diagnosed as having gestational diabetes?

Durable press fabrics. "Durable press" cotton fabrics are treated to improve their recovery from wrinkles after washing. Unfortunately, the treatment also reduces the strength of the fabric. The breaking strength of untreated fabric is Normally distributed with mean 58 pounds and standard deviation 2.3 pounds. The same type of fabric after treatment has Normally distributed breaking strength with mean 30 pounds and standard deviation 1.6 pounds.⁴ A clothing manufacturer tests an SRS of 5 specimens of each fabric.
(a) What is the probability that the mean breaking strength of the 5 untreated specimens exceeds 50 pounds?
(b) What is the probability that the mean breaking strength of the 5 treated specimens exceeds 50 pounds?

Glucose testing, continued. Shelia's measured glucose level one hour after a sugary drink varies according to the Normal distribution with μ = 125 mg/dl and σ = 10 mg/dl. What is the level L such that there is probability only 0.05 that the mean glucose level of 4 test results falls above L? (Hint: This requires a backward Normal calculation. See page 83 in Chapter 3 if you need to review.)
Pollutants in auto exhausts. The level of nitrogen oxides (NOX) in the exhaust of cars of a particular model varies Normally with mean 0.2 grams per mile (g/mi) and standard deviation 0.05 g/mi. Government regulations call for NOX emissions no higher than 0.3 g/mi.
(a) What is the probability that a single car of this model fails to meet the NOX requirement?
(b) A company has 25 cars of this model in its fleet. What is the probability that the average NOX level x̄ of these cars is above the 0.3 g/mi limit?

Auto accidents. The number of accidents per week at a hazardous intersection varies with mean 2.2 and standard deviation 1.4. This distribution takes only whole-number values, so it is certainly not Normal.
(a) Let x̄ be the mean number of accidents per week at the intersection during a year (52 weeks). What is the approximate distribution of x̄ according to the central limit theorem?
(b) What is the approximate probability that x̄ is less than 2?
(c) What is the approximate probability that there are fewer than 100 accidents at the intersection in a year? (Hint: Restate this event in terms of x̄.)

Pollutants in auto exhausts, continued. The level of nitrogen oxides (NOX) in the exhaust of cars of a particular model varies Normally with mean 0.2 g/mi and standard deviation 0.05 g/mi. A company has 25 cars of this model in its fleet. What is the level L such that the probability that the average NOX level x̄ for the fleet is greater than L is only 0.01? (Hint: This requires a backward Normal calculation. See page 83 in Chapter 3 if you need to review.)

Returns on stocks. Andrew plans to retire in 40 years. He plans to invest part of his retirement funds in stocks, so he seeks out information on past returns. He learns that over the entire 20th century, the real (that is, adjusted for inflation) annual returns on U.S. common stocks had mean 8.7% and standard deviation 20.2%.⁵ The distribution of annual returns on common stocks is roughly symmetric, so the mean return over even a moderate number of years is close to Normal. What is the probability (assuming that the past pattern of variation continues) that the mean annual return on common stocks over the next 40 years will exceed 10%? What is the probability that the mean return will be less than 5%? Follow the four-step process as illustrated in Example 11.8.

Airline passengers get heavier. In response to the increasing weight of airline passengers, the Federal Aviation Administration in 2003 told airlines to assume that passengers average 190 pounds in the summer, including clothing and carry-on baggage. But passengers vary, and the FAA did not specify a standard deviation. A reasonable standard deviation is 35 pounds. Weights are not Normally distributed, especially when the population includes both men and women, but they are not very non-Normal. A commuter plane carries 19 passengers. What is the approximate probability that the total weight of the passengers exceeds 4000 pounds? Use the four-step process to guide your work. (Hint: To apply the central limit theorem, restate the problem in terms of the mean weight.)

Sampling male students. To estimate the mean height μ of male students on your campus, you will measure an SRS of students. You know from government