# Inference for two Population Means

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example 19.2 on page 545 of the text describes an experiment with pseudoscorpions of the species Cordylochernes scorpiodes which live in the tropics. The females typically mate multiple times, even though mating once provides sufficient sperm for fertilization. Researchers conducted an experiment to examine a possible evolutionary explanation for this behavior. If there is genetic incompatibility between some pairs, a female may be more fertile by having multiple partners. In the experiment, females were randomly assigned to one of two treatment groups. In one group, a female was mated with the same male twice; in the other group, two different males were mated to the same female. The response variable is the number of successful broods. This variable is a small integer, ranging from 0 in some cases to a maximum of observed value of 7. Two Population Means Case Study 2 / 65

2 Data The data set is small enough that we can display it. The mean of the 20 values in the Same treatment group is 2.2. The mean of the 16 values in the Different treatment group is 3.6. Group Number of Successful Broods Same Different Two Population Means Case Study 3 / 65 Questions Do female pseudoscorpions have more successful broods on average, when they have multiple partners? The real biological question of interest involves the pseudoscorpions in their natural environment. The experimental setting seeks to explore the question in a controlled setting. In the experiment, the sample means are 2.2 and 3.6; what does this imply about the populations? Here, there is a single biological population of pseudoscorpions that can be thought of as two statistical populations on the basis of assignment to experimental treatment groups. Two Population Means Case Study 4 / 65

3 A statistical model A statistical model for the experiment is that there are two probability distributions for the number of successful broods, one for each treatment group. The specific distributions are not specified, but each is summarized my a mean or expected value, say µ 1 for the Same treatment group and µ 2 for the Different treatment group. The null hypothesis is µ 1 = µ 2 ; the biologically interesting alternative here is µ 1 < µ 2. How can we test this hypothesis? For the sample estimates, 2.2 < 3.6, but what can we infer about the populations? Two Population Means Case Study 5 / 65 The Big Picture We have two populations with means µ 1 and µ 2. We have independent samples from each sample means x 1 and x 2 of sizes n 1 and n 2 respectively. We want to test H 0 : µ 1 = µ 2 versus the alternative H A : µ 1 < µ 2 with few assumptions about the distributions in the populations. The key idea of a randomization test is to consider the null distribution of the difference in sample means for all possible random samples assuming that the randomization is independent of the observed data. Two Population Means The Big Picture 6 / 65

4 Example In the example, there are 20 females in the Same treatment group and 16 in the Different treatment group. The observed difference in sample means is = (without roundoff). What if the number of successful broods each female pseudoscorpion had was independent of the assignment to treatment group? If this were the case, we could compare the observed difference in sample means to the null distribution of differences in sample means for all other possible results of the randomization. There are ( 36 20) = possible ways to randomly separate 36 individuals into groups of size 20 and 16. Instead of finding exactly how many of these 7 billion+ possible randomization results have differences in sample means at least as extreme as the observed 1.425, we can use the computer to simulate the randomization process and estimate the p-value. Before beginning, we will examine methods to graph the data. Two Population Means The Big Picture 7 / 65 Types of Graphs When comparing two independent samples, we can use extensions of the types of graphs for single samples: density plots; histograms; box-and-whisker plots; dot plots. As this example data set is small, a dot plot is best because there is no compelling reason to summarize the data. Points should be jittered so equal values are not directly on top of one another. Two Population Means Graphics 8 / 65

5 Dot plot of the data 6 broods Different Same Two Population Means Graphics 9 / 65 Comments on the Graphics There is a lot of overlap between the samples. It would be difficult to place an individual in one group or the other on the basis of the number of successful broods. But the centers of the distributions appear to be a bit different with generally larger values for the Different group on average. Two Population Means Graphics 10 / 65

6 Components of a Randomization Test Randomization Tests 1 State hypotheses; 2 Select and calculate a test statistic; 3 Use simulation to find the null distribution of the test statistic; 4 Compare the value of the actual test statistic to its null distribution to compute a p-value; 5 Summarize the results in the context of the problem. Two Population Means Randomization Tests 11 / 65 State Hypotheses Hypotheses are statements about populations; Here we are assuming that the pseudoscorpions in the sample may be treated as if they were randomly sampled from the population of these pseudoscorpions in the wild; In words, the hypotheses are: H 0 : There would be no difference in the mean number of successful broods for each experimental condition among all female pseudoscorpions in the population. H A : The experimental condition with different partners produces a larger mean number of successful broods than the experimental condition with the same partner mating twice. Two Population Means Randomization Tests 12 / 65

7 State Hypotheses (cont.) In symbols, letting µ 1 and µ 2 represent the mean number of successful broods in the population for the Same and Different groups, respectively, the hypotheses are: H 0 : µ 1 = µ 2 H A : µ 1 < µ 2 One could also test the alternative hypotheses H A : µ 1 µ 2 or H A : µ 1 > µ 2 if appropriate for the setting. Two Population Means Randomization Tests 13 / 65 Select a Test Statistic The difference in sample means is the natural test statistic for a hypothesis that compares population means. As we are determining the null distribution by simulation, there is no need to standardize the test statistic so it can be compared to some well-known benchmark distribution (such as standard normal, chi-square, or t). For the observed data, x 1 = 44/20 = 2.2 and x 2 = 58/16 = and the difference is x 1 x 2 = Two Population Means Randomization Tests 14 / 65

8 Compute the Null Distribution Conceptually, we take a random sample of 20 without replacement from the 36, compute its mean and the mean of the 16 remaining values, and take the difference. Repeat this process very many times and see how may differences are or smaller. The proportion of such values is the p-value. Two Population Means Randomization Tests 15 / 65 Graph of Null Distribution 0.6 Density Difference in Sample Means Two Population Means Randomization Tests 16 / 65

9 P-value It is evident from the graph that the observed difference is fairly unusual relative to the sampling distribution. The p-value is the actual proportion of sampled randomizations with a difference at least as extreme as that observed. For the 100,000 randomly selected randomizations, the p-value is estmated to be A different simulation would estimate this differently, but not by too much. The jaggedness of the preceding graph is caused by lots of ties in the randomization distribution of the difference between sample means. Two Population Means Randomization Tests 17 / 65 Conclusions The p-value is fairly small. In the context of the problem, we can say this. There is strong evidence that female pseudoscorpions have fewer successful broods when they mate with only one partner than when they mate with two partners under the given experimental conditions (two independent sample randomization test, p = 0.009, n 1 = 20, n 2 = 16). This result is consistent with the evolutionary explanation that the behavior of having multiple partners as seen in nature may overcome the possibility of genetic incompatibility among some partners. Two Population Means Randomization Tests 18 / 65

10 Summary Randomization tests are useful for comparing population means (or other population characteristics). The method simply considers the test statistic under the null hypothesis of independence between the randomization and the response variable of interest. As simulation determines the null distribution, there is no need to scale the test statistic so that it is comparable to a standard benchmark null distribution. Randomization tests are only practical with a computer. We see that the shape of the null distribution is symmetric and bell-shaped, which suggests that an approximation with a standard distribution may be accurate. Later in these notes we will reexamine this data using a t-test. Two Population Means Randomization Tests 19 / 65 Two Different Designs Example There are two standard designs to compare two treatment groups; 1 In a paired design, there is a single sample of pairs of observations and each treatment is applied to each sampled unit. In a sample of people, scores from vision tests are recorded separately for each eye. There is interest in comparing scores between right and left eyes. Example 2 In a two-independent-sample design, there are two separate samples, and all elements of one sample get one treatment, all elements of the other sample get the other treatment. In a dairy cattle study with two different feed supplements, cattle are randomly separated into two groups and each group is given one of two dietary supplements. Average daily milk yield is compared between the groups. Two Population Means Comparison of Means 20 / 65

11 Butterfat Study This study and the accompaning data is from a former Statistics 571 student. Case Study The butterfat content in milk is an important factor in determining its economic value and in how it is processed to form dairy products such as cheese, ice cream, and butter. In an experiment, a company is interested in comparing the performances of two different labs which measure the butterfat content of milk. Two separate samples were collected from 107 loads of milk, and one sample from each load was sent to one of two labs. Butterfat content changes based on the identity of the cows, the time of milking, the time since the last milking, and other factors, so the percentage butterfat can be expected to vary from load to load, but should be consistent for samples taken from the same load as each load is properly agitated before sampling to promote mixing throughout the load. How should this data be examined to compare the performances of the labs? Two Population Means Case Studies 21 / 65 Horned Lizards Case Study The horned lizard Phrynosoma mcalli has horns it uses for protection. Researchers tested a hypothesis that longer horns are more protective then shorter horns. A predator of these lizards is the loggerhead shrike, a bird that impales the lizards on thorns or barbed wire. Researchers compared the horn lengths of 30 skewered lizards with 154 horned lizards that were living. The average length of the skewered lizards was mm and the average length of the living lizards was mm. Is this evidence that longer horns are more protective? Two Population Means Case Studies 22 / 65

12 Graphs Density plots, histograms, dot plots, and box-and-whisker plots are all useful for graphical comparisons between samples. For paired samples, graphs of differences are also useful. It is very important to graph data and look for patterns or outliers before carrying out statistical inferences. Two Population Means Graphics 23 / 65 Butterfat Study The butterfat percentage data is from a paired data design. There is a single sample of size n = 107. Each sample unit (a load) is measured twice (the two butterfat measurements from the two labs). Paired data design data is analyzed by: taking differences within each pair (same order); analyzing the single sample of differences using one-sample methods. Two Population Means Application 24 / 65

13 Scatter plot For paired data, a scatter plot of one measurement versus the other helps show the relationship and identify potential outliers. Definition An outlier is a single observation that sticks out in a plot as unusual and not part of the general trend. There are multiple ways to deal with outliers in an analysis. Two Population Means Application Graphics 25 / 65 Scatter plot Percentage butterfat from Lab Percentage butterfat from Lab 1 Two Population Means Application Graphics 26 / 65

14 Dot plot Difference in Percentage butterfat (Lab 1 Lab2) Two Population Means Application Graphics 27 / 65 Box-and-Whisker Plot Difference in Percentage butterfat (Lab 1 Lab2) Two Population Means Application Graphics 28 / 65

15 Density Plot 15 Density Difference in Percentage butterfat (Lab 1 Lab2) Two Population Means Application Graphics 29 / 65 Histogram 40 Percent of Total Difference in Percentage butterfat (Lab 1 Lab2) Two Population Means Application Graphics 30 / 65

16 Observations There are four individual measurements that are outliers. There are several possible explanations: 1 data may be recorded incorrectly; 2 the load may not have been agitated properly before sampling; 3 one or both labs occasionally makes a poor measurement; With additional background information, we understand that the second explanation is most plausible, as some of the individuals that do the work of taking samples fail to properly agitate the milk, and as cream rises to the top, there is the possibility that two separate samples from the same load might differ considerably in percentage of butterfat. As these observations are likely not telling us about the performance of the laboratories, but about how a small part of the data is collected, and our desired inference is about the laboratories, it is reasonable to discard the outliers. Two Population Means Application Graphics 31 / 65 What to do with outliers For the purposes of this example in lecture, we will discard the four outliers from further analysis, as explained on the previous slide. For 103 of the 107 data points, differences between measurements are no more than 0.07 percent. For the other loads, the differences are at least twice as large, ranging from 0.15 to If this was my data, I would seek explanations in each case for the discrepancy: (was the data recorded improperly, is there a note to cast suspicion on a particular measurement) Generally, one should not discard data unless there is a good reason. If I had more information that indicated that the observed measurements were genuinely accurate and just unusual, I would not discard the data. This example highlights the importance of graphing data before conducting an analysis for inference. Two Population Means Application Graphics 32 / 65

17 Dot plot (after outliers removed) After Removal of Outliers Difference in Percentage butterfat (Lab 1 Lab2) Two Population Means Application Graphics 33 / 65 Histogram (after outliers removed) After Removal of Outliers Difference in Percentage butterfat (Lab 1 Lab2) Percent of Total Two Population Means Application Graphics 34 / 65

18 Estimation for Paired Designs For paired designs, inference is on the differences in measurements. The data is considered to be a single sample of differences. Confidence intervals use the single sample methods we have seen before; 1 a t distribution method; or 2 the bootstrap Here we will use the t distribution as the sample size is large enough to overcome even extreme nonnormality and the (remaining) data has a fairly symmetric distribution. Two Population Means Application Estimation 35 / 65 Confidence Intervals from Paired Designs Confidence Interval for µ D = µ 1 µ 2 A P% confidence interval for µ D = µ 1 µ 2 has the form D t s < µ < D + t s n n where t is the critical value such that the area between t and t under a t-density with n 1 degrees of freedom is P/100, where n is the sample size. Two Population Means Application Estimation 36 / 65

19 Chalkboard example Example Here are summary statistics: The sample size is 103. The mean difference (Lab 1 - Lab 2) is The standard deviation of the differences (Lab 1 - Lab 2) is Find a 95% confidence interval using the formula on the board < µ D < 7e 04 Two Population Means Application Estimation 37 / 65 Compare to R Solution > t.test(lab1, lab2, paired = T) Paired t-test data: lab1 and lab2 t = , df = 102, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of the differences Two Population Means Application Estimation 38 / 65

20 Interpretation After excluding observations with unexplained unusually large differences, we are 95% confident that the mean difference (Lab1 - Lab 2) in butterfat determination is between and in the common situation where there is an absence of large discrepancies in the measurements. Two Population Means Application Estimation 39 / 65 Comparison with outliers included An analysis with the outliers included would be appropriate if it were determined that the outliers were, in fact, genuinely correct. Notice that the interval still includes 0 (consistent with no differences in the labs), but that the confidence interval is much wider. Paired t-test data: bfat[lab == "Lab1"] and bfat[lab == "Lab2"] t = , df = 106, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of the differences Two Population Means Application Estimation 40 / 65

21 Hypothesis Tests We can also formally test the null hypothesis that the mean difference in measurements in the two labs is zero. H 0 : µ 1 µ 2 = 0 H A : µ 1 µ 2 0 Paired t test If differences D 1, D 2,..., D n are normally distributed, then T = D d 0 s/ n t(n 1) where d 0 (usually 0) is the mean difference in the null hypothesis, s is the sample standard deviation of differences, and n is the sample size. Two Population Means Application Hypothesis Tests 41 / 65 Chalkboard example Example Here are summary statistics: The sample size is 103. The mean difference (Lab 1 - Lab 2) is The standard deviation of the differences (Lab 1 - Lab 2) is Test the hypothesis of no difference between the labs. T = 1.68, p = , df = 102 Two Population Means Application Hypothesis Tests 42 / 65

22 Compare to R Solution > t.test(lab1, lab2, paired = T) Paired t-test data: lab1 and lab2 t = , df = 102, p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of the differences Two Population Means Application Hypothesis Tests 43 / 65 Two Independent Samples The lizard example is not paired. There is no reason to match a specific individual in one sample with an individual in the other. In fact, the two sample sizes are not even equal. Inference methods and graphs are different for independent samples. Two Population Means Two Independent Samples 44 / 65

23 Graphics Separate graphics are produced for each separate sample. The graphs use a layout to make comparisons between them easy. Often, such graphs are side by side or on top of one another. It is best to use common scales on axes. Two Population Means Two Independent Samples Graphics 45 / 65 Histograms Living Percent of Total Killed Horn Length (mm) Two Population Means Two Independent Samples Graphics 46 / 65

24 Dot Plots Horn Length (mm) Killed Living Two Population Means Two Independent Samples Graphics 47 / 65 Density Plots Horn Length (mm) Density Killed Living Two Population Means Two Independent Samples Graphics 48 / 65

25 Box-and-Whisker Plots Living Killed Horn Length (mm) Two Population Means Two Independent Samples Graphics 49 / 65 Violin Plots 30 Horn Length (mm) Killed Living Two Population Means Two Independent Samples Graphics 50 / 65

26 Remarks While there are some extreme values, they do not stick out far from the overall pattern of the data. Both distributions look to be not strongly skewed. Inferences based on t distributions will be accurate, even with moderate nonnormality in the underlying populations. Two Population Means Two Independent Samples Graphics 51 / 65 t Distribution for Independent Samples t Distribution for Independent Samples If two independent samples X 1,..., X n1 and Y 1,..., Y n2 are each normally distributed with respective means µ 1 and µ 2 and if they share a common variance σ 2, then the test statistic T = ( X Ȳ ) (µ 1 µ 2 ) s p 1 n n 2 t(n 1 + n 2 2) where s p = (n 1 1)s (n 2 1)s 2 2 (n 1 1) + (n 2 1) is the pooled sample standard deviation and s 1 and s 2 are the respective single sample standard deviations. Note that s 2 p, the pooled variance, is a weighted average of the sample variances, weighted by the degrees of freedom. Two Population Means Two Independent Samples t Distribution 52 / 65

27 Derivation Using linearity properties of expectation and variances of independent random variables, it is straightforward to show that if X 1,..., X n1 and Y 1,..., Y n2 are independent samples with E(X i ) = µ 1, Var(X i ) = σ1 2, E(Y i ) = µ 2, and Var(Y i ) = σ2 2, then E( X Ȳ ) = µ 1 µ 2 Var( X Ȳ ) = σ2 1 n 1 + σ2 2 n 2 Furthermore, if both distributions are normal, the distribution of the difference in sample means is also normal. Under the additional assumption that σ 1 = σ 2, Var( X Ȳ ) = σ 2 ( 1 n n 2 ) Two Population Means Two Independent Samples t Distribution 53 / 65 Standard Error The standard error for the difference in sample means is σ1 SE( X 2 Ȳ ) = + σ2 2 n 1 n 2 Under the assumption that σ 1 = σ 2, this is estimated as ŜE( X Ȳ ) = s p 1 n n 2 Without this assumption, the estimate is s1 SE( X 2 Ȳ ) = + s2 2 n 1 n 2 Welch s t-test uses SE for the T statistic; in this case the distribution is only approximate and the degrees of freedom is approximated with a messier formula (see page 304 in the textbook for details). Welch s approach is the default in the R function t.test(). Two Population Means Two Independent Samples t Distribution 54 / 65

28 Chalkboard example Example Here are summary statistics for the lizard data: The sample sizes are 154 for the living lizards and 30 for the unfortunate skewered ones. The sample means are for the living lizards and for the killed ones. The standard deviations are respectively 2.63 and Find a 95% confidence interval for the difference in mean horn length for the two groups. Test the hypothesis test of no difference in mean horn size. Two Population Means Two Independent Samples Estimation and Testing 55 / 65 Comparison with R Use t.test() with argument var.equal=t for the conventional two-sample test. > t.test(hornliving, hornkilled, var.equal = T) Two Sample t-test data: hornliving and hornkilled t = , df = 182, p-value = 2.27e-05 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of x mean of y Two Population Means Two Independent Samples Estimation and Testing 56 / 65

29 Comparison with R (Welch s Test) Use t.test() with default arguments for Welch s two-sample test which does not assume equal variances. Note that the approximate degrees of freedom is always at least as big as the smaller of n 1 1 and n 2 1 and no larger than n 1 + n 2 2. > t.test(hornliving, hornkilled) Welch Two Sample t-test data: hornliving and hornkilled t = , df = , p-value = alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: sample estimates: mean of x mean of y Two Population Means Two Independent Samples Estimation and Testing 57 / 65 Interpretation of Confidence Interval We are 95% confident that the mean length of the horns is between 1.21 and 3.38 mm longer in surviving lizards than in lizards killed by the loggerhead shrike in the study population where the data was collected. This is consistent with the biological hypothesis that larger horns offer more protection. There are other possible interpretations, as other variables may be confounded with horn length. Perhaps lizards with longer horns are also older, heavier, faster, or something else that is the real cause of the greater protection. Two Population Means Two Independent Samples Estimation and Testing 58 / 65

30 A one-sided t-test > t.test(hornliving, hornkilled, alternative = "greater") Welch Two Sample t-test data: hornliving and hornkilled t = , df = , p-value = 5.889e-05 alternative hypothesis: true difference in means is greater than 0 95 percent confidence interval: Inf sample estimates: mean of x mean of y Two Population Means Two Independent Samples Estimation and Testing 59 / 65 Interpretation of Hypothesis Test There is strong evidence that the mean horn length in living lizards is greater than that in lizards killed by the loggerhead shrike (p < , t = 4.26, Welch s two-sample independent t-test, n 1 = 154, n 2 = 30). This is consistent with the biological hypothesis that longer horns offer greater protection, but is likewise consistent with explanations of protection offered by other characteristics associated with lizards with longer horns. Two Population Means Two Independent Samples Estimation and Testing 60 / 65

31 Case Study Example Recall the pseudoscorpions example we used for the permutation test. In one group, a female was mated with the same male twice; in the other group, two different males were mated to the same female. The response variable is the number of successful broods. This variable is a small integer, ranging from 0 in some cases to a maximum of observed value of 7. For the 100,000 randomly selected randomizations, the p-value is estmated to be Two Population Means Application 61 / 65 Comparison to Welch s t test Compare the corresponding p-values. There is also evidence that the mean number of successful broods is smaller in the Same treatment group. The p-values differ. > t.test(same, different, alternative = "less") Welch Two Sample t-test data: same and different t = , df = , p-value = alternative hypothesis: true difference in means is less than 0 95 percent confidence interval: -Inf sample estimates: mean of x mean of y Two Population Means Application 62 / 65

32 Cautions and Concerns Always plot data before doing inferences. Handle potential outliers with care. Use the paired or two-sample t methods when appropriate; paired when the design is a single sample with each unit measured twice, independent when the samples are independent of one another. t methods are robust to nonnormality when sample sizes are large enough. Inferences to populations depends on random sampling; be prepared to argue when a sample can be thought of as representative despite nonrandom sampling. The two-sample independent t methods assume equal variances in the samples, but are robust to minor differences. Randomization or permutation methods are alternatives to t tests. Two Population Means Cautions and Concerns 63 / 65 What You Should Know You should know: how to use randomization/permutation methods or t methods for paired and independent sample problems; how to determine of paired or independent sample methods are appropriate; how to graph data in various ways to display differences in distributions; how to examine graphs of data to informally assess method assumptions; how to interpret inferences in context. Two Population Means What you should know 64 / 65

33 Extensions We have moved from one to two populations: what about three or more? ANOVA (Analysis of Variance) is on the way. Two Population Means Extensions 65 / 65

### Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

### Power and Sample Size Determination

Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 Power 1 / 31 Experimental Design To this point in the semester,

### Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

### Statistics 2014 Scoring Guidelines

AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

### Statistical Inference and t-tests

1 Statistical Inference and t-tests Objectives Evaluate the difference between a sample mean and a target value using a one-sample t-test. Evaluate the difference between a sample mean and a target value

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### AP Statistics 2007 Scoring Guidelines

AP Statistics 2007 Scoring Guidelines The College Board: Connecting Students to College Success The College Board is a not-for-profit membership association whose mission is to connect students to college

### AP Statistics 2012 Scoring Guidelines

AP Statistics 2012 Scoring Guidelines The College Board The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the

### AP Statistics 2011 Scoring Guidelines

AP Statistics 2011 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

### Paired vs. 2 sample comparisons. Comparing means. Paired comparisons allow us to account for a lot of extraneous variation.

Comparing means! Tests with one categorical and one numerical variable Paired vs. sample comparisons! Goal: to compare the mean of a numerical variable for different groups. Paired comparisons allow us

BIOSTATISTICS QUIZ ANSWERS 1. When you read scientific literature, do you know whether the statistical tests that were used were appropriate and why they were used? a. Always b. Mostly c. Rarely d. Never

### UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

### AP Statistics 2002 Scoring Guidelines

AP Statistics 2002 Scoring Guidelines The materials included in these files are intended for use by AP teachers for course and exam preparation in the classroom; permission for any other use must be sought

### General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

### Structure of the Data. Paired Samples. Overview. The data from a paired design can be tabulated in this form. Individual Y 1 Y 2 d i = Y 1 Y

Structure of the Data Paired Samples Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 11th November 2005 The data from a paired design can be tabulated

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### Testing: is my coin fair?

Testing: is my coin fair? Formally: we want to make some inference about P(head) Try it: toss coin several times (say 7 times) Assume that it is fair ( P(head)= ), and see if this assumption is compatible

### Sociology 6Z03 Topic 15: Statistical Inference for Means

Sociology 6Z03 Topic 15: Statistical Inference for Means John Fox McMaster University Fall 2016 John Fox (McMaster University) Soc 6Z03: Statistical Inference for Means Fall 2016 1 / 41 Outline: Statistical

### Lecture Notes Module 1

Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

### Unit 26: Small Sample Inference for One Mean

Unit 26: Small Sample Inference for One Mean Prerequisites Students need the background on confidence intervals and significance tests covered in Units 24 and 25. Additional Topic Coverage Additional coverage

### AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

### MTH 140 Statistics Videos

MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### AP Statistics 2000 Scoring Guidelines

AP Statistics 000 Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought

### Box plots & t-tests. Example

Box plots & t-tests Box Plots Box plots are a graphical representation of your sample (easy to visualize descriptive statistics); they are also known as box-and-whisker diagrams. Any data that you can

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Chapter Additional: Standard Deviation and Chi- Square

Chapter Additional: Standard Deviation and Chi- Square Chapter Outline: 6.4 Confidence Intervals for the Standard Deviation 7.5 Hypothesis testing for Standard Deviation Section 6.4 Objectives Interpret

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Experimental Design. Principles of Experimental Design. Biology Education. Case Studies

Experimental Design Principles of Experimental Design Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 15, 2011 Many interesting questions in biology involve

### Hypothesis Testing or How to Decide to Decide Edpsy 580

Hypothesis Testing or How to Decide to Decide Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Hypothesis Testing or How to Decide to Decide

### SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS)

SPSS on two independent samples. Two sample test with proportions. Paired t-test (with more SPSS) State of the course address: The Final exam is Aug 9, 3:30pm 6:30pm in B9201 in the Burnaby Campus. (One

### Business Statistics. Lecture 8: More Hypothesis Testing

Business Statistics Lecture 8: More Hypothesis Testing 1 Goals for this Lecture Review of t-tests Additional hypothesis tests Two-sample tests Paired tests 2 The Basic Idea of Hypothesis Testing Start

### Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

### Two-sample inference: Continuous data

Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### AP Statistics 1998 Scoring Guidelines

AP Statistics 1998 Scoring Guidelines These materials are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought from the Advanced Placement

### Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

### AP Statistics 2013 Scoring Guidelines

AP Statistics 2013 Scoring Guidelines The College Board The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the

### find confidence interval for a population mean when the population standard deviation is KNOWN Understand the new distribution the t-distribution

Section 8.3 1 Estimating a Population Mean Topics find confidence interval for a population mean when the population standard deviation is KNOWN find confidence interval for a population mean when the

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### One-sample normal hypothesis Testing, paired t-test, two-sample normal inference, normal probability plots

1 / 27 One-sample normal hypothesis Testing, paired t-test, two-sample normal inference, normal probability plots Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis

### Using CrunchIt (http://bcs.whfreeman.com/crunchit/bps4e) or StatCrunch (www.calvin.edu/go/statcrunch)

Using CrunchIt (http://bcs.whfreeman.com/crunchit/bps4e) or StatCrunch (www.calvin.edu/go/statcrunch) 1. In general, this package is far easier to use than many statistical packages. Every so often, however,

### Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### 12.5: CHI-SQUARE GOODNESS OF FIT TESTS

125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

### Statistical Intervals. Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

7 Statistical Intervals Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to

### Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### 9.1 (a) The standard deviation of the four sample differences is given as.68. The standard error is SE (ȳ1 - ȳ 2 ) = SE d - = s d n d

CHAPTER 9 Comparison of Paired Samples 9.1 (a) The standard deviation of the four sample differences is given as.68. The standard error is SE (ȳ1 - ȳ 2 ) = SE d - = s d n d =.68 4 =.34. (b) H 0 : The mean

### T adult = 96 T child = 114.

Homework Solutions Do all tests at the 5% level and quote p-values when possible. When answering each question uses sentences and include the relevant JMP output and plots (do not include the data in your

### Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

### One-Sample t-test. Example 1: Mortgage Process Time. Problem. Data set. Data collection. Tools

One-Sample t-test Example 1: Mortgage Process Time Problem A faster loan processing time produces higher productivity and greater customer satisfaction. A financial services institution wants to establish

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Confidence Intervals for the Difference Between Two Means

Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

### AP Statistics 2010 Scoring Guidelines

AP Statistics 2010 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### STAT 350 Practice Final Exam Solution (Spring 2015)

PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

### C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

### Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

### Compare birds living near a toxic waste site with birds living in a pristine area.

STT 430/630/ES 760 Lecture Notes: Chapter 7: Two-Sample Inference 1 February 27, 2009 Chapter 7: Two Sample Inference Chapter 6 introduced hypothesis testing in the one-sample setting: one sample is obtained

### Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Hypothesis tests: the t-tests

Hypothesis tests: the t-tests Introduction Invariably investigators wish to ask whether their data answer certain questions that are germane to the purpose of the investigation. It is often the case that

### Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

### Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### Confidence Intervals for the Mean. Single Sample and Paired Samples t tests

Confidence Intervals for the Mean Single Sample and Paired Samples t tests Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and

### The Wilcoxon Rank-Sum Test

1 The Wilcoxon Rank-Sum Test The Wilcoxon rank-sum test is a nonparametric alternative to the twosample t-test which is based solely on the order in which the observations from the two samples fall. We

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

### Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### Difference of Means and ANOVA Problems

Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

### 3. Nonparametric methods

3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

### Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. John Nerbonne 1/35

Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://wwwletrugnl/nerbonne/teach/statistiek-i/ John Nerbonne 1/35 t-tests To test an average or pair of averages when σ is known, we

### Hypothesis tests, confidence intervals, and bootstrapping

Hypothesis tests, confidence intervals, and bootstrapping Business Statistics 41000 Fall 2015 1 Topics 1. Hypothesis tests Testing a mean: H0 : µ = µ 0 Testing a proportion: H0 : p = p 0 Testing a difference

### Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

### LAB 4 ASSIGNMENT CONFIDENCE INTERVALS AND HYPOTHESIS TESTING. Using Data to Make Decisions

LAB 4 ASSIGNMENT CONFIDENCE INTERVALS AND HYPOTHESIS TESTING This lab assignment will give you the opportunity to explore the concept of a confidence interval and hypothesis testing in the context of a

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### STATISTICS 151 SECTION 1 FINAL EXAM MAY

STATISTICS 151 SECTION 1 FINAL EXAM MAY 2 2009 This is an open book exam. Course text, personal notes and calculator are permitted. You have 3 hours to complete the test. Personal computers and cellphones

### 1 Confidence intervals

Math 143 Inference for Means 1 Statistical inference is inferring information about the distribution of a population from information about a sample. We re generally talking about one of two things: 1.

### Chapter 8 Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter problem: Does the MicroSort method of gender selection increase the likelihood that a baby will be girl? MicroSort: a gender-selection method developed by Genetics

### Math 62 Statistics Sample Exam Questions

Math 62 Statistics Sample Exam Questions 1. (10) Explain the difference between the distribution of a population and the sampling distribution of a statistic, such as the mean, of a sample randomly selected

### Sections 4.5-4.7: Two-Sample Problems. Paired t-test (Section 4.6)

Sections 4.5-4.7: Two-Sample Problems Paired t-test (Section 4.6) Examples of Paired Differences studies: Similar subjects are paired off and one of two treatments is given to each subject in the pair.

### Unit 26 Estimation with Confidence Intervals

Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### AP Statistics 2004 Scoring Guidelines

AP Statistics 2004 Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use must be sought

### STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico. Fall 2013

STAT 145 (Notes) Al Nosedal anosedal@unm.edu Department of Mathematics and Statistics University of New Mexico Fall 2013 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN. Conditions for Inference about mean

### Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

### Unit 21 Student s t Distribution in Hypotheses Testing

Unit 21 Student s t Distribution in Hypotheses Testing Objectives: To understand the difference between the standard normal distribution and the Student's t distributions To understand the difference between

### Statistics 641 - EXAM II - 1999 through 2003

Statistics 641 - EXAM II - 1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

### 7.1 Inference for comparing means of two populations

Objectives 7.1 Inference for comparing means of two populations Matched pair t confidence interval Matched pair t hypothesis test http://onlinestatbook.com/2/tests_of_means/correlated.html Overview of

Chapter 8 Hypothesis Tests Chapter Table of Contents Introduction...157 One-Sample t-test...158 Paired t-test...164 Two-Sample Test for Proportions...169 Two-Sample Test for Variances...172 Discussion

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance