13 Two-Sample T Tests

Similar documents
Descriptive Statistics

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Two-sample hypothesis testing, II /16/2004

3.4 Statistical inference for 2 populations based on two samples

Lecture Notes Module 1

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Section 13, Part 1 ANOVA. Analysis Of Variance

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Association Between Variables

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Mind on Statistics. Chapter 13

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Linear Models in STATA and ANOVA

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Independent samples t-test. Dr. Tom Pierce Radford University

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Name: Date: Use the following to answer questions 3-4:

Tests for Two Proportions

Study Guide for the Final Exam

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

One-Way Analysis of Variance

Analysis of Variance ANOVA

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Introduction to Hypothesis Testing

Chapter 7 Section 7.1: Inference for the Mean of a Population

Two-sample inference: Continuous data

Statistiek I. Proportions aka Sign Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen.

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics 2014 Scoring Guidelines

Recall this chart that showed how most of our course would be organized:

UNDERSTANDING THE TWO-WAY ANOVA

WHERE DOES THE 10% CONDITION COME FROM?

Math 251, Review Questions for Test 3 Rough Answers

Step 6: Writing Your Hypotheses Written and Compiled by Amanda J. Rockinson-Szapkiw

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

Unit 26: Small Sample Inference for One Mean

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 27: Comparing Two Means

Non-Parametric Tests (I)

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Chapter 23 Inferences About Means

HYPOTHESIS TESTING: POWER OF THE TEST

Unit 26 Estimation with Confidence Intervals

CHAPTER 14 NONPARAMETRIC TESTS

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Independent t- Test (Comparing Two Means)

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Statistics Review PSY379

Appendix 2 Statistical Hypothesis Testing 1

WISE Power Tutorial All Exercises

Hypothesis Testing: Two Means, Paired Data, Two Proportions

Statistical tests for SPSS

In the past, the increase in the price of gasoline could be attributed to major national or global

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Characteristics of Binomial Distributions

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Simple Linear Regression Inference

Simple Regression Theory II 2010 Samuel L. Baker

Permutation Tests for Comparing Two Populations

Inference for two Population Means

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Comparing Means in Two Populations

2 Precision-based sample size calculations

Non-Inferiority Tests for Two Proportions

CALCULATIONS & STATISTICS

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Experimental Designs (revisited)

Point Biserial Correlation Tests

Part 2: Analysis of Relationship Between Two Variables

Lesson 17: Margin of Error When Estimating a Population Proportion

5.1 Identifying the Target Parameter

Mathematics (Project Maths Phase 1)

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Crosstabulation & Chi Square

The relationship between mathematics preparation and conceptual learning gains in physics: a possible hidden variable in diagnostic pretest scores

Technical Information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Chapter Eight: Quantitative Methods

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Mind on Statistics. Chapter 12

Transcription:

www.ck12.org CHAPTER 13 Two-Sample T Tests Chapter Outline 13.1 TESTING A HYPOTHESIS FOR DEPENDENT AND INDEPENDENT SAMPLES 270

www.ck12.org Chapter 13. Two-Sample T Tests 13.1 Testing a Hypothesis for Dependent and Independent Samples Learning Objectives Identify situations that contain dependent or independent samples. Calculate the standard deviation for two independent samples. Calculate the test statistic to test hypotheses about dependent data pairs. Calculate the test statistic to test hypotheses about independent data pairs. Introduction In the previous lessons we learned about hypothesis testing for a single sample mean (comparing a sample mean to a hypothesized population mean). In this chapter we will apply the principals of hypothesis testing to situations involving two samples. There are many situations in everyday life where we would perform statistical analysis involving two samples. For example, suppose that we wanted to test a hypothesis about the effect of two medications on curing an illness. Or, we may want to test the difference between the means of males and females on the SAT. In both of these cases, we would analyze both samples and the hypothesis would address the difference between two sample means. In this lesson, we will identify situations with different types of samples, learn to calculate the test statistic, calculate the estimate for population variance for both samples and calculate the test statistic to test hypotheses about the difference of proportions or means between samples. Independent Samples Let s recall what we assumed when we conducted a hypothesis test on a single sample. We knew that we needed to select a random sample from the population, measure that sample statistic and then make an inference about the population based on that sample. When we work with two independent samples, our null hypothesis assumes that if the samples are selected at random, the samples will vary only by chance and the difference will not be statistically significant. In short, when we have independent samples we assume that the scores of one sample do not affect the other. Dependent Samples Dependent samples are a bit different. Two samples of data are dependent when each score in one sample is paired with a specific score in the other sample. In short, these types of samples are related to each other. Dependent samples can occur in two scenarios. In one, a group may be measured twice such as in a pretest-posttest situation (scores on a test before and after the lesson). The other scenario is one in which an observation in one sample is matched with an observation in the second sample such as when researchers measure the attitudes or behaviors of twins. Testing Hypotheses with Independent Samples The set up for the independent sample t-test is very similar to the single sample t-test. Now however, instead of one sample mean, we have two. Let s look at the hypotheses for the t-test: 271

13.1. Testing a Hypothesis for Dependent and Independent Samples www.ck12.org H 0 : µ 1 = µ 2 H A : µ 1 µ 2 Notice our hypothesis statements have two population means, denoting the fact that we will be testing whether the means of two separate populations are equal to one another. An equivalent way of writing the hypotheses is as follows: H 0 : µ 1 µ 2 = 0 H A : µ 1 µ 2 0 Both methods of writing the hypothesis statements are valid. Let s see how the new hypothesis statements look in our t-statistic formula: Where: t = ( x 1 x 2 ) (µ 1 µ 2 ) SE ( x1 x 2 ) x 1 x 2 is the difference between the sample means µ 1 µ 2 is the difference between the hypothesized population means SE ( x1 x 2 ) is the standard error of the difference between the sample means The standard error of the difference between the sample means is calculated by:se ( x1 x 2 ) = s 2 1 n 1 + s2 2 n 2 This standard error is called an unpooled standard error, because the standard deviation of each sample is considered. Finally, just like the single sample t-test, we ll need to know the degrees of freedom so that we can find the correct critical value. Here s the formula for the degrees of freedom when using an unpooled standard error independent samples t-test: d f = ( s 2 ) 2 1 n + s2 2 1 n2 ( ) 1 s 2 2 ( ) 1 n 1 1 n + 1 s 2 2 2 1 n 2 1 n 2 Nice, right? Don t worry, we will not be using this formula. Technology automatically calculates this formula for us. When we solve the upooled independent samples t-test by hand, the conservative approach is to use the lowest n of the two groups minus one: d f = n lowest 1 The Assumptions of the Independent Samples t-test Just like the single sample t-test, the independent samples t-test has some assumptions that we must consider in order for the test to be valid: A random sample of each population is used. The random samples are each made up of independent observations Each sample is independent of one another The population distribution of each population must be nearly normal, or the size of the sample is large. Notice the new assumption now that we have two means that we are examining, we need to make sure those two means are independent of one another. What dose that independence mean? It means that if an observation is assigned to one group, it cannot also be recorded in the other group. Oftentimes, independent groups are things like: males vs. females, Astros vs. Rangers fans, tall vs. short, etc. 272

www.ck12.org Chapter 13. Two-Sample T Tests Example: Independent t-test The head of the English department is interested in the difference in writing scores between freshman English students who are taught by different teachers. The incoming freshmen are randomly assigned to one of two English teachers and are given a standardized writing test after the first semester. We take a sample of eight students from one class and nine from the other. Is there a difference in achievement on the writing test between the two classes? Here s the data from the two classes: TABLE 13.1: Class 1 Class 2 35 52 51 87 66 76 42 62 37 81 46 71 60 55 55 67 53 Mean 49.44 68.88 Standard Deviation 10.38 12.30 Hypothesis Step 1: Clearly state the Null and Alternative Hypothesis. We will be testing to see if the mean score of the two classes are equal to one another: H 0 : µ 1 = µ 2 H A : µ 1 µ 2 Hypothesis Step 2: Identify the appropriate significance level and confirm the test assumptions. We ll use the standard significance test of 0.05. We were told that students were randomly assigned, and we ll assume that students did not switch classes (for independence), and we ll assume the student score are independent from one another. We ll assume the underlying population of students in each class is nearly normal in the distribution of scores. Hypothesis Step 3: Analyze the data and generate the test statistic. Now we ll use our t-test to get to the analysis. First, our standard error of the difference between the sample means: 10.38 2 SE ( x1 x 2 ) = + 12.302 9 8 SE ( x1 x 2 ) = 11.97 + 18.91 SE ( x1 x 2 ) = 5.557 Now, we will use that SE in the t-test equation: t = ( x 1 x 2 ) (µ 1 µ 2 ) SE ( x1 x 2 ) = (49.44 68.88) (0) 5.557 = 3.498 273

13.1. Testing a Hypothesis for Dependent and Independent Samples www.ck12.org We know that our smallest group has just eight students, so our degrees of freedom is (8-1)=7. The critical value of the t-distribution for 7 degrees of freedom is ± 2.365. Hypothesis Step 4: Interpret your results. Because our calculated t-value is outside the t-critical value (our value falls in the critical region of the t- distribution), we reject our Null hypothesis. We conclude that the populations of students in the two classes significantly differ in their standardized test scores at the end of the semester. As the two classes were randomly assigned, we can plausibly conclude that the difference in the scores was due to the class assignment which class the students were in (and whatever teaching technique was used). Hypotheses with Dependent Samples As we mentioned earlier, dependent sample are a bit different from the independent samples t-test. Another name for the dependent samples t-test is the paired samples t-test. That name should give you a hint as to how the test is different. In some way, the two variables we will be testing will be paired or related to one another. Now, just because we used the word related don t think correlation. This is still a t-test, and we ll be testing questions about the means of variables. Let s see how the dependent samples t-test is different from the indepdent samples. First, the hypothesis statement. In the dependent samples t-test, our hypothesis statement looks like: H 0 : δ = 0 H A : δ 0 That symbol is the Greek letter delta which is the representation of difference. So the hypothesis of the dependent samples t-test is that the difference between two variables is zero. Now, that sure sounds a lot like the hypothesis of an independent samples t-test when we test the difference between two population means. The difference is subtle but important. In the independent samples t-test we were testing the difference between two means we calculated the mean for each group and then compared them. In the dependent samples t-test we are looking at the difference between two variables within a single observation. Let s put this in context of our previous example of standardized scores. We were told that the students were tested at the end of the semester that s just one time of testing. But what if all the students were tested when they came into the class (sort of a basic knowledge test) and then again at the end of the semester. Assuming the test was the same (or very close) both times the student saw it, then any change in the scores from beginning to end, should represent the knowledge gain over the course of the semester. Because all students have two scores one at beginning and one at the end of the semester the dependent samples t-test allows us to take the difference between the two scores for every student. This difference score is only one column of data. What we are really interested in the dependent samples t-test that difference variable Let s look at the t-test formula for a better understanding: t = d δ SE d Where: d is the average of the difference between paired variables SEd is the standard error of the difference variable Notice that d in the numerator of the formula. That s the average of new variable of differences. Again, thinking about our example: If there was no knowledge change over the semester, the average of the differences in scores for each student should be zero that s why our Null hypothesis statement states that delta (the average difference in scores for the entire population of students) is equal to zero. 274

www.ck12.org Chapter 13. Two-Sample T Tests The standard error in the formula is the standard error for the variable representing the difference, which is made up of the standard deviation of the differences. This the standard deviation of the differences formula should look a lot like a simple standard deviation: ( d d ) 2 s d = n 1 And the formula for the standard error is: SEd = s d n Our degrees of freedom are similar to what they were earlier. But this time since the test is only concerned about a new variable of differences for each subject, we can use the (n-1) degrees of freedom formula, where n represents the number of pairs of data. If there were eight students in the class and each was measured twice, we would still have eight difference scores one for each student. The Assumptions of the Dependent Samples t-test The assumptions of the dependent samples t-test are very much like the single sample t-test assumptions after all, the test is only concerned with a single variable (the difference). The sample of differences (hence the sample of paired observations) is random. The paired observations are independent of one another. The distribution of population differences must be nearly normal, or the size of the paired observation sample is large. Let s look at an example of the dependent samples t-test in action to put it all together: A math teacher wants to determine the effectiveness of her statistics lesson. She gives a simple skills test to nine students before the start of class (a pre-test) and the same skills test to the same students at the end of class (a post-test). Here s the data (with some calculations): TABLE 13.2: Student Pre-test Post-test Difference 1 78 80 2 2 67 69 2 3 56 70 14 4 78 79 1 5 96 96 0 6 82 84 2 7 84 88 4 8 90 92 2 9 87 92 5 Mean 3.56 s 4.19 Hypothesis Step 1: Clearly state the Null and Alternative Hypothesis. The statistics instructor is interested in the improvement over the semester, for each of her students. Since there are two measures for each student, and those measures are paired, we ll need to use the dependent 275

13.1. Testing a Hypothesis for Dependent and Independent Samples www.ck12.org samples t-test. We ll assume that the normal state of affaires would be that there s no change due to chance (so our delta is equal to zero). The Null and Alternative would be: H 0 : δ = 0 H A : δ 0 Hypothesis Step 2: Identify the appropriate significance level and confirm the test assumptions. We ll assume the standard significance level of 0.05. We ll assume that the students in her class are random, that they the students are independent of one another, and that the distribution of all possible difference scores in the population are nearly normally distributed. Hypothesis Step 3: Analyze the data. We have the mean and standard deviation for the data not for each time variable (pre and post), but for the difference between the two variables for each student (the column on the right). This column of differences is what we ll use in the test. First, let s calculate the Standard Error: SEd = s d n SEd = 4.19 = 1.40 9 Now, we ll use that in our dependent samples t-test: t = d δ SEd t = 3.56 0 1.40 = 2.54 With this test, we have nine students, so we have (9-1) = 8 degrees of freedom. The t-critical value for 8 degrees of freedom is ± 2.306. Hypothesis Step 4: Interpret your results. As our calculated t-statistic is greater than our t-critical value (our value lies in the critical region), we reject our Null hypothesis and conclude that there was in fact a change in student performance from the Pre-test to the Post-test. Lesson Summary In addition to testing single samples associated with a mean, we can also perform hypothesis tests with two samples. We can test two independent samples (which are samples that do not affect one another) or dependent samples which assume that the samples are related to each other. When testing a hypothesis about two independent samples, we follow a similar process as when testing one random sample. However, when computing the test statistic, we need to calculate the estimated standard error of the difference between sample means. We carry out the test on the means of two independent samples in a similar way as the testing of one random sample. However, we use the following formula to calculate the test statistic: t = ( x 1 x 2 ) (µ 1 µ 2 ) SE.( x 1 x 2 ) with the standard error defined above. We can also test the likelihood that two dependent samples are related. dependent samples, we use the formula: To calculate the test statistic for two 276 d t = δ SEd

www.ck12.org Chapter 13. Two-Sample T Tests Review Questions 1. In hypothesis testing, we have scenarios that have both dependent and independent samples. Give an example of an experiment with (1) dependent samples and (2) independent samples. 2. True or False: When we test the difference between the means of males and females on the SAT, we are using independent samples. 3. A study is conducted on the effectiveness of a drug on the hyperactivity of laboratory rats. Two random samples of rats are used for the study and one group is given Drug A and the other group is given Drug B and the number of times that they push a lever is recorded. The following results for this test were calculated: TABLE 13.3: Drug A Drug B X 75.6 72.8 n 18 24 s 2 12.25 10.24 s 3.5 3.2 3. a. Does this scenario involve dependent or independent samples? Explain. b. What would the hypotheses be for this scenario? c. Calculate the estimated standard error for this scenario. d. What is the test statistic and at an alpha level of.05 what conclusions would you make about the null hypothesis? 4. A survey is conducted on attitudes towards drinking. A random sample of eight married couples is selected, and the husbands and wives respond to an attitude-toward-drinking scale. The scores are as follows: TABLE 13.4: Husbands Wives 16 15 20 18 10 13 15 10 8 12 19 16 14 11 15 12 1. (a) What would be the hypotheses for this scenario? (b) Calculate the estimated standard deviation for this scenario. (c) Compute the standard error of the difference for these samples. (d) What is the test statistic and at an alpha level of.05 what conclusions would you make about the null hypothesis? 277