Unit 27: Comparing Two Means



Similar documents
Unit 26: Small Sample Inference for One Mean

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

Name: Date: Use the following to answer questions 3-4:

Mind on Statistics. Chapter 13

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Chapter 7 Section 7.1: Inference for the Mean of a Population

Unit 31: One-Way ANOVA

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Chapter 23 Inferences About Means

Recall this chart that showed how most of our course would be organized:

P(every one of the seven intervals covers the true mean yield at its location) = 3.

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Online 12 - Sections 9.1 and 9.2-Doug Ensley

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

= $96 = $24. (b) The degrees of freedom are. s n For the mean monthly rent, the 95% confidence interval for µ is

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

STAT 145 (Notes) Al Nosedal Department of Mathematics and Statistics University of New Mexico. Fall 2013

How To Check For Differences In The One Way Anova

Paired 2 Sample t-test

3.4 Statistical inference for 2 populations based on two samples

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

CHAPTER 14 NONPARAMETRIC TESTS

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Mind on Statistics. Chapter 12

Regression Analysis: A Complete Example

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test)

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

STAT 350 Practice Final Exam Solution (Spring 2015)

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Two-sample hypothesis testing, II /16/2004

Statistics 2014 Scoring Guidelines

Case Study Call Centre Hypothesis Testing

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Testing Hypotheses About Proportions

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

9 Testing the Difference

MTH 140 Statistics Videos

The Variability of P-Values. Summary

Exploratory data analysis (Chapter 2) Fall 2011

a) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?

Tutorial 5: Hypothesis Testing

Section 8-1 Pg. 410 Exercises 12,13

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Walk the Line Written by: Maryann Huey Drake University

Chapter 7 Section 1 Homework Set A

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Chapter 7: Simple linear regression Learning Objectives

2 Sample t-test (unequal sample sizes and unequal variances)

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

Sample Size and Power in Clinical Trials

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Mind on Statistics. Chapter 15

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chapter 23. Inferences for Regression

HYPOTHESIS TESTING WITH SPSS:

AP Stats Chapter 13. Section 13.1: Comparing Two Means. Characteristics of two sample problems: What are the three conditions for comparing two means?

1. How different is the t distribution from the normal?

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Data Analysis Tools. Tools for Summarizing Data

One-Way Analysis of Variance

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Solutions to Homework 3 Statistics 302 Professor Larget

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Solutions to Homework 6 Statistics 302 Professor Larget

Mind on Statistics. Chapter 10

ISyE 2028 Basic Statistical Methods - Fall 2015 Bonus Project: Big Data Analytics Final Report: Time spent on social media

Need for Sampling. Very large populations Destructive testing Continuous production process

Math 108 Exam 3 Solutions Spring 00

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Descriptive Statistics

AP STATISTICS (Warm-Up Exercises)

NCSS Statistical Software

Testing a Hypothesis about Two Independent Means

Simple Linear Regression

Introduction to Hypothesis Testing

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Simple Linear Regression Inference

Final Exam Practice Problem Answers

MATH 140 HYBRID INTRODUCTORY STATISTICS COURSE SYLLABUS

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Chapter 9: Two-Sample Inference

12: Analysis of Variance. Introduction

Chapter 2 Simple Comparative Experiments Solutions

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

Chapter 8 Section 1. Homework A

Transcription:

Unit 27: Comparing Two Means Prerequisites Students should have experience with one-sample t-procedures before they begin this unit. That material is covered in Unit 26, Small Sample Inference for One Mean. Additional Topic Coverage Additional coverage of two-sample t-procedures can be found in The Basic Practice of Statistics, Chapter 19, Two-Sample Problems. Activity Description In this activity, students will check whether the mean number of chips per cookie in Nabisco s Chips Ahoy regular and reduced fat chocolate chip cookies differ. In Unit 25 s activity, students collected data on the number of chips per cookie from Chips Ahoy regular chocolate chip cookies. They can use those data in this activity. In addition, if they have not already done so, they will need to collect data on the number of chips per cookie from one or more bags of Chips Ahoy reduced fat chocolate chip cookies. Materials One or two bags each of Nabisco s Chips Ahoy chocolate chip cookies, regular and reduced fat; paper plates or paper towels. If students have not collected the data as part of Unit 25 s activity, then they should be assigned to work in groups of 2 to 4 students to collect the data for this activity. If you need to have a group of 3, students in the group can trade off being chip counters. See Unit 25 s Activity Description for suggestions on collecting these data. The sample data, on which sample solutions to the activity are based, are given in Table T27.1. or the sample data, two Unit 27: Comparing Two Means aculty Guide Page 1

independent counts were taken on the number of chips in each cookie, and then the results were averaged. Original Chip Count (averaged from two independent counts) 18.5 17.0 14.0 14.5 15.0 13.5 16.5 15.5 19.0 16.0 20.5 23.5 17.5 18.5 21.5 18.5 22.5 15.0 16.0 17.5 12.0 14.5 12.5 12.0 20.5 21.5 22.5 24.0 18.5 16.5 22.5 22.0 18.5 18.5 21.5 20.0 17.5 16.5 17.5 17.5 19.5 21.5 24.5 18.0 23.0 20.5 19.5 25.0 19.0 20.0 22.0 19.0 21.5 18.0 14.5 17.0 21.0 10.5 18.0 18.0 20.0 13.5 23.5 16.5 19.5 Reduced Chip Count (averaged from two independent counts) 11.0 7.5 5.5 8.5 9.5 15.5 20.0 14.0 18.5 21.5 18.5 18.0 19.5 16.0 21.0 19.0 17.0 21.5 13.5 14.5 14.0 18.5 11.5 13.0 19.0 20.0 16.0 18.0 14.0 20.0 9.0 13.0 11.5 14.5 13.5 11.5 13.0 18.0 19.0 21.5 16.5 12.0 14.0 14.0 11.0 14.5 15.0 21.0 21.5 17.5 15.5 16.5 14.5 16.0 16.5 10.5 15.5 16.0 15.0 9.0 11.5 8.5 15.0 13.5 17.0 15.0 9.5 5.5 12.5 17.0 19.5 20.5 16.0 20.0 19.0 15.5 19.5 Table T27.1. Sample data. Students will need a copy of the class data for questions 2 5. If you decide not to collect the data (a task students really enjoy), have the class use the sample data in Table T27.1, which was collected in two statistics classes (each class got a bag of each type of cookie). Unit 27: Comparing Two Means aculty Guide Page 2

The Video Solutions 1. Sample answer: Take a sample from all licensed drivers in some state and look at the number of tickets the males and females in this group received for moving violations. We could then compare the mean number of tickets for women to the mean number of tickets for men. 2. The Hadza are a group of traditional hunter-gatherers who live in a way that is very similar to our ancestors. The men hunt with bows and arrows and the women forage for berries and root vegetables. 3. The original assumption was that the Hadza would burn more calories than the Westerners do, due to their more active lifestyle. 4. A two-sample t-test was used. 5. It turned out that no significant difference was found between the mean daily total energy expenditures of the Hadza and the Westerners. So, the researchers original assumption that the Hadza burned more calories than Westerners due to a more active lifestyle was not supported by the data. 6. He placed the blame on people eating too much rather than on a sedentary lifestyle. Unit 27: Comparing Two Means aculty Guide Page 3

Unit Activity Solutions 1. See question 3. 2. a. Sample answer: Since chocolate chips have fat in them, we think that Nabisco may put fewer chips in its reduced fat cookies as a way of reducing the fat content. b. Sample answer: Given our answer in (a), we decided to use a one-sided alternative. Let µ 1 = the mean number of chips per cookie in regular Chips Ahoy chocolate chip cookies and let µ 2 = the mean number of chips per cookie in reduced fat Chips Ahoy chocolate chip cookies. Below are the null and alternative hypotheses: H 0 : µ 1 µ 2 = 0 H a : µ 1 µ 2 > 0 3. Sample data: Original Chip Count (averaged from two independent counts) 18.5 17.0 14.0 14.5 15.0 13.5 16.5 15.5 19.0 16.0 20.5 23.5 17.5 18.5 21.5 18.5 22.5 15.0 16.0 17.5 12.0 14.5 12.5 12.0 20.5 21.5 22.5 24.0 18.5 16.5 22.5 22.0 18.5 18.5 21.5 20.0 17.5 16.5 17.5 17.5 19.5 21.5 24.5 18.0 23.0 20.5 19.5 25.0 19.0 20.0 22.0 19.0 21.5 18.0 14.5 17.0 21.0 10.5 18.0 18.0 20.0 13.5 23.5 16.5 19.5 Reduced Chip Count (averaged from two independent counts) 11.0 7.5 5.5 8.5 9.5 15.5 20.0 14.0 18.5 21.5 18.5 18.0 19.5 16.0 21.0 19.0 17.0 21.5 13.5 14.5 14.0 18.5 11.5 13.0 19.0 20.0 16.0 18.0 14.0 20.0 9.0 13.0 11.5 14.5 13.5 11.5 13.0 18.0 19.0 21.5 16.5 12.0 14.0 14.0 11.0 14.5 15.0 21.0 21.5 17.5 15.5 16.5 14.5 16.0 16.5 10.5 15.5 16.0 15.0 9.0 11.5 8.5 15.0 13.5 17.0 15.0 9.5 5.5 12.5 17.0 19.5 20.5 16.0 20.0 19.0 15.5 19.5 Unit 27: Comparing Two Means aculty Guide Page 4

Regular: n 1 = 65, x 1 = 18.462, s 1 = 3.308 Reduced at: n 2 = 77, x 2 = 15.266, s 2 = 3.941 4. Based on comparative boxplots, it appears that Chips Ahoy regular has more chips per cookie than Chips Ahoy reduced fat. Original Cookie Type Reduced at 5 10 15 Number of Chips 20 25 5. a. Sample answer: (18.462 15.266) 0 t = 5.25 3.308 2 3.9412 65 77 b. Sample answer: Using a conservative approach, we assume that df = 65 1 = 64. rom software and using a one-sided alternative, we get a p-value that is essentially 0 (at least to 4 decimals). c. Sample answer: Reject the null hypothesis. Conclude that the difference in the mean number of chips per cookie between the two types of cookies is positive. Hence, the mean number of chips per cookie in Chips Ahoy regular cookies is greater than the mean number of chips per cookie in Chips Ahoy reduced fat cookies. 6. Sample answer: (18.462 15.266) ± (1.998) (1.981, 4.411) (3.308) 2 65 (3.941)2 77 3.196 ±1.215, or Unit 27: Comparing Two Means aculty Guide Page 5

Exercise Solutions 1. a. Let µ 1 be the mean job performance rating for pregnant employees and µ 2 be the mean job performance rating for non-pregnant female employees. H 0 : µ 1 µ 2 = 0 H a : µ 1 µ 2 0 (2.38 2.69) 0 b. t = 2.10 1.10 2 71 0.582 71 Using a two-sided alternative, and letting df = 71-1 = 70 (the conservative approach), we determine p = 2(0.01967) 0.039. (If we use the statistical package Minitab, we get df = 106 and p = 0.038. The conservative approach yields a slightly higher p-value.) We conclude that there is a statistical difference in the mean job performance ratings for the two groups. c. Using the conservative approach, we set df = 70 in order to determine the t-critical value: t* 1.994. 1.10 0.58 (2.38 2.69) ± (1.994) 0.31± 0.294, or (-0.604, -0.016) 71 71 Since both endpoints of the confidence interval are negative, the mean job performance appraisal rating for pregnant employees is lower than for non-pregnant female employees. In this case, lower ratings are associated with better job performance. 2. a. A one-sample paired t-test should be used. The during and after data involve the same group of women rather than samples from two independent populations. b. t = 0.27 0 1.138 ; 2.00 71 based on a t-distribution with df = 70 and a two-sided alternative, p = 2(0.1295) 0.259. The mean difference in job performance appraisal ratings during pregnancy and after returning from pregnancy leave do not differ significantly. Unit 27: Comparing Two Means aculty Guide Page 6

3. a. Sample answer: It is somewhat difficult to tell if the distribution of SAT Math scores is higher for males than for females. rom the dotplots, the distribution of SAT Math scores is more variable for males than for females. The highest SAT Math scores are from males but the lowest are also from males. Nevertheless, the midpoint of the distribution for the male students appears higher than the midpoint of the distribution for the female students. Gender emale Male 400 440 480 520 560 Math SAT 600 640 680 b. Consider the normal quantile plots below. Given the dots fall within 95% confidence interval bands on the normal quantile plots, it is reasonable to assume the data from both samples come from approximately normal distributions. Normal Quantile Plots Normal - 95% CI 99 Math SAT emales 99 400 600 Math SAT Males 800 95 90 95 90 Percent 80 70 60 50 40 30 20 80 70 60 50 40 30 20 10 5 10 5 1 400 500 600 1 c. x = 493.5, s = 44.64 ; x M = 544.0, s M = 80.6 d. t (544.0 493.5) 0 = 2.45 ; using df = 19, we get the following p-value: p 0.012 80.6 44.64 20 20 Therefore, we can conclude that the mean SAT Math scores are significantly higher for firstyear male students entering this university than for first-year female students. Unit 27: Comparing Two Means aculty Guide Page 7

4. a. H 0 : µ G µ B = 0 versus H a : µ G µ B 0. (Could also be expressed as H 0 : µ G = µ B versus H a : µ G µ B.) b. t (494 409) 0 = 2.023 (172) (148) 31 27 c. Use df = 27 1, or 26; p = 2(0.02673) 0.053. Since p > 0.5, the results are not significant at the 0.05 level. d. 2 172 148 = 31 27 df 55.9948 ; use df = 55. 1 172 1 148 30 31 26 27 p = 2(0.02397) 0.048. Since p < 0.05, the results are significant at the 0.05 level. Notice that the p-value in (c) is only slightly larger using the conservative approach than it is here. Unit 27: Comparing Two Means aculty Guide Page 8

Review Questions Solutions 1. The men in the sample earned more, on average, than did the women. The difference in the sample means was large enough that it couldn t be attributed to chance variation. If we assume that men and women in the entire student population have the same average earnings, the probability of observing a difference as large as we observed is only p = 0.038, making it pretty unlikely (less than a 4% chance). Because this probability is so small, we have good evidence that the average earnings of all male and female students (not just those who happen to be in the samples) differ. There was also some difference between the average earnings of African- American and white students in the sample. However, a difference at least this large would happen almost half the time (p = 0.476) just by chance if African Americans and whites in the entire student population had exactly the same average earnings. 2. a. Sample answer: No mention was made that the samples were randomly selected from each area. The researchers wanted to be sure that the differences in pulmonary function could be attributed to the pollution levels in the two areas and not due to a lurking variable such as age, height, weight or BMI that was different for the men in the two groups of participants in the study. b. t (4.49 4.32) 0 = 2.116 ; using df = 59, we get p = 2(0.01929) 0.039 0.43 0.45 60 60 We conclude that there is a statistically significant difference in mean forced vital capacity (VC) in healthy, non-smoking, young men in Area 1 and Area 2. (Because the sample size was large, even a relatively small difference in sample means turned out to be statistically different.) c. t (17.17 16.28) 0 = 1.411 ; using df = 59, we get p = 2(0.08175) 0.164 4.26 2.39 60 60 We conclude that there is no significant difference in the mean respiratory rate in healthy, non-smoking, young men in Areas 1 and 2. Unit 27: Comparing Two Means aculty Guide Page 9

d. We construct a 95% confidence interval for the difference in mean VC between the two areas. We use df = 59 (conservative approach) to find the t-critical value: t* = 2.001. 0.43 0.45 (4.49 4.32) ± (2.001) 0.17 ± 0.16 60 60 or (0.01, 0.33). 3. a. Both distributions appear reasonably symmetric. Neither data set has outliers. The whiskers on the boxplots appear slightly longer than half the width of the boxes. These are characteristics that you would expect from normal data. So, based on the boxplots it appears reasonable to assume that both data sets are approximately normal. Although the SAT Writing scores for the male students are more spread out than for the female students, it looks as if the SAT Writing scores for the female students tend to be higher than for the male students. emale Gender Male 350 400 450 500 550 SAT Writing Scores 600 650 b. emale students: n = 25, x = 541.20, s = 48.59 Male students: n M = 35, x = 507.10, s M = 73.70 c. t (541.20 507.1) 0 = 2.16 (48.59) (73.7) 25 35 Adopting a conservative approach, we use df = 24 to determine a p-value: We get p = 2(0.02049) 0.041 < 0.05. We conclude that there is a significant difference between the mean SAT Writing scores for first-year women at this university compared to first-year men. 48.59 73.7 d. (541.20 507.10) ± 2.064 34.10 ± 32.61, or (1.49, 66.71) 25 35 Unit 27: Comparing Two Means aculty Guide Page 10

4. a. H 0 : µ B µ G = 0 versus H a : µ B µ G > 0. (Could also be expressed as H 0 : µ B = µ G versus H 0 : µ B = µ G.) b. (92.3 80.8) 0 t = 1.05 ; using df = 26, p = 0.152 42.0 41.4 27 31 There is insufficient evidence to reject the null hypothesis in favor of the alternative. Hence, we are unable to conclude from these data that 4-year-old boys consume more mouthfuls of food during a meal than 4-year-old girls. Unit 27: Comparing Two Means aculty Guide Page 11