General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.



Similar documents
Name: Date: Use the following to answer questions 3-4:

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Unit 26: Small Sample Inference for One Mean

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media

Unit 27: Comparing Two Means

Math 108 Exam 3 Solutions Spring 00

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Confidence Intervals for the Difference Between Two Means

3.4 Statistical inference for 2 populations based on two samples

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Mind on Statistics. Chapter 13

Independent t- Test (Comparing Two Means)

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Two-sample hypothesis testing, II /16/2004

Chapter 23 Inferences About Means

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

2 Precision-based sample size calculations

Unit 26 Estimation with Confidence Intervals

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Two Related Samples t Test

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Difference of Means and ANOVA Problems

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Chapter 7 Section 1 Homework Set A

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Chapter 7: Simple linear regression Learning Objectives

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

HYPOTHESIS TESTING: POWER OF THE TEST

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Regression Analysis: A Complete Example

Chapter 9: Two-Sample Inference

Factors affecting online sales

Statistics 2014 Scoring Guidelines

CHAPTER 14 NONPARAMETRIC TESTS

Inference for two Population Means

P(every one of the seven intervals covers the true mean yield at its location) = 3.

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

A) B) C) D)

Tutorial 5: Hypothesis Testing

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MTH 140 Statistics Videos

STAT 350 Practice Final Exam Solution (Spring 2015)

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Recall this chart that showed how most of our course would be organized:


Statistics 151 Practice Midterm 1 Mike Kowalski

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

6.2 Normal distribution. Standard Normal Distribution:

Chapter Four. Data Analyses and Presentation of the Findings

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Two-sample inference: Continuous data

1) The table lists the smoking habits of a group of college students. Answer: 0.218

Simple Linear Regression Inference

Chapter 8 Section 1. Homework A

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?

NCSS Statistical Software

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

1.5 Oneway Analysis of Variance

Chapter 7 Section 7.1: Inference for the Mean of a Population

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Chapter 23. Inferences for Regression

2 Sample t-test (unequal sample sizes and unequal variances)

11. Analysis of Case-control Studies Logistic Regression

Final Exam Practice Problem Answers

Section 13, Part 1 ANOVA. Analysis Of Variance

Name: (b) Find the minimum sample size you should use in order for your estimate to be within 0.03 of p when the confidence level is 95%.

Chapter 8 Paired observations

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

An analysis method for a quantitative outcome and two categorical explanatory variables.

Solution to HW - 1. Problem 1. [Points = 3] In September, Chapel Hill s daily high temperature has a mean

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Significance, Meaning and Confidence Intervals

12: Analysis of Variance. Introduction

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Lesson 17: Margin of Error When Estimating a Population Proportion

= $96 = $24. (b) The degrees of freedom are. s n For the mean monthly rent, the 95% confidence interval for µ is

Lecture Notes Module 1

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Chapter 7 - Practice Problems 1

7. Comparing Means Using t-tests.

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Transcription:

General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. 4. For a hypothesis test, t = x 1 x 2 SE ; convert to a P-value using table of t statistics 5. For a confidence interval, calculate critical value t corresponding to desired confidence level (e.g. in our example we had df = 13, confidence level 95%, led to t = 2.16). Then the confidence interval is x 2 x 1 ± t SE. 1

Example (Question 10.28, page 493) Following are the numbers of newspapers read by a sample of women and of men Women: 5,3,6,3,7,1,1,3,0,4,7,2,2,7,3,0,5,0,4,4,5,14,3,1,2,1,7,2,5,3,7 Men: 0,3,7,4,3,2,1,12,1,6,2,2,7,7,5,3,14,3,7,6,5,5,2,3,5,5,2,3,3 (a) Construct and interpret a plot comparing responses of males and females (b) Construct and interpret a 95% confidence intervals comparing populations means (c) Show the five steps of a significance test comparing populations means (d) State and check the assumptions 2

MEN WOMEN 0 2 4 6 8 10 12 14 Newspapers Read Box plot for number of newspapers read by women and by men. 3

(a) See boxplots; male and female distributions look very similar (b) With suffix 1 indicating women, 2 for men: x 1 = 3.774, x 2 = 4.414, s 1 = 2.929, s 2 = 3.100, n 1 = 31, n 2 = 29, SE = s 2 1 n + s2 2 1 n = 0.78. We have df = 57.1 according to Welch- 2 Satterthwaite formula, df = 28 by simpler formula. Based on df = 28, the critical value of t for a 95% confidence interval is t = 2.048, so the 95% confidence interval for µ 2 µ 1 is 4.414 3.774 ± 2.048 0.78 = ( 0.957, 2.237). (c) The t statistic for a hypothesis test is x 2 x 1 SE = 0.82; this is well below the critical value for a test at significance level.05 (the critical value is t = 2.048, as in part 2) so we DO NOT REJECT the null hypothesis that the means for men and women are equal. 4

(d) The assumptions require independence of the two samples (probably more or less correct); randomness of the two samples (depends on how the samples were obtained); and approximately normal distributions for the samples themselves (probably true to a reasonable approximation). 5

Paired Comparison Tests Consider the following dataset, based on the midterm and final exam scores of a recent course of mine (not STOR 151!): Student 1 2 3 4 5 6 7 8 9 10 Midterm 86 100 90 90 94 73 76 76 95 87 Final 84 95 77 83 70 76 54 81 90 84 Difference 2 5 13 7 24 3 22 5 5 3 Mean midterm score=86.7; Mean final score=79.4; Difference 7.3 Is this significant evidence of a difference? 6

We could apply the same test as previously, which would lead to x 1 = 86.7, x 2 = 79.4, s 1 = 9.06, s 2 = 11.37, 9.06 2 SE = 10 + 11.372 = 4.60, 10 86.7 79.4 t = = 1.59, 4.60 df = 9. The two-sided P-value is 0.15 this is greater than 0.05, therefore the result is not significant. However, there is an error in this calculation... 7

The two samples are not independent since they represent scores from the same students. Instead, we apply a matched pairs test: 1. Calculate the difference in scores for each student. 2. Carry out a single-sample test for the null hypothesis that the mean difference is 0. 8

Details The 10 differences 2, 5, 13, 7, 24, 3, 22, 5, 5, 3 have a mean x = 7.3 and a standard deviation s = 9.67. The standard error is 9.67 10 = 3.06. The t statistic is 7.3 3.06 = 2.39. The corresponding two-sided P-value is 0.04. Therefore, the result is statistically significant 9

Message of this example: The standard two-sample test is valid only when the two samples are independent (along with other assumptions: quantitative variables, random sampling, approximately normal distributions) When the observations are directly paired (e.g. two exam scores from the same student, two medical results from the same patient, etc.) it is possible to apply a paired comparison test instead. The main difference is a different method of computing the standard error. This has implications for both hypothesis testing and confidence interval calculations. 10

Example: Problem 10.47, page 511/512. In a study to determine whether exercise reduced blood pressure, a sample of three patients was tested, with the following results: Subject Before After 1 150 130 2 165 140 3 135 120 (a) Explain why the three before observations and the three after observations are dependent samples. (b) Find the sample mean of the before scores, sample mean of the after scores, and the sample mean of d = before after. (c) Find a 95% confidence interval for the mean difference, and interpret it. 11

(a) They are from the same subjects; it is a matched pairs design. (b) 150+165+135 3 = 150; 130+140+120 3 = 130; difference 20. (c) The three differences 20, 25, 15 have mean 20 and standard 5 2 +0+5 2 deviation 2 = 5. The standard error is 5 = 2.887. 3 Based on the t table with df = 2, the critical value of t for a 95% confidence interval is 4.303. Therefore, a 95% confidence interval for the difference is 20 ± 4.303 2.887 = (7.58, 32.42). This does not contain 0, so despite the very small sample size, the difference is statistically significant. 12