Testing a claim about a population mean



Similar documents
Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

HYPOTHESIS TESTING: POWER OF THE TEST

Hypothesis Testing --- One Mean

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Name: Date: Use the following to answer questions 3-4:

Hypothesis testing - Steps

Normal Approximation. Contents. 1 Normal Approximation. 1.1 Introduction. Anthony Tanbakuchi Department of Mathematics Pima Community College

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

6: Introduction to Hypothesis Testing

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test)

Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 9 Introduction to Hypothesis Testing

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Two-Group Hypothesis Tests: Excel 2013 T-TEST Command

HYPOTHESIS TESTING WITH SPSS:

22. HYPOTHESIS TESTING

Stats for Strategy Fall 2012 First-Discussion Handout: Stats Using Calculators and MINITAB

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Simple Linear Regression Inference

Two Related Samples t Test

Independent t- Test (Comparing Two Means)

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Chapter 2 Probability Topics SPSS T tests

3.4 Statistical inference for 2 populations based on two samples

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Principles of Hypothesis Testing for Public Health

Understanding Confidence Intervals and Hypothesis Testing Using Excel Data Table Simulation

5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives

How To Test For Significance On A Data Set

Biostatistics: Types of Data Analysis

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Testing Hypotheses About Proportions

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Chapter Study Guide. Chapter 11 Confidence Intervals and Hypothesis Testing for Means

1-3 id id no. of respondents respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank

Chapter 7: One-Sample Inference

Chapter 26: Tests of Significance

Using Stata for One Sample Tests

Introduction to Hypothesis Testing OPRE 6301

sensr - Part 2 Similarity testing and replicated data (and sensr) p 1 2 p d 1. Analysing similarity test data.

Hypothesis Testing. Steps for a hypothesis test:

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

Chapter 2. Hypothesis testing in one population

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Non-Inferiority Tests for One Mean

Lecture Notes Module 1

Permutation Tests for Comparing Two Populations

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Chapter 4 Statistical Inference in Quality Control and Improvement. Statistical Quality Control (D. C. Montgomery)

Study Guide for the Final Exam

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Point Biserial Correlation Tests

Hypothesis Testing. Hypothesis Testing

Tutorial 5: Hypothesis Testing

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%

Chapter 4: Statistical Hypothesis Testing

Chapter 7 Section 7.1: Inference for the Mean of a Population

CHAPTER 14 NONPARAMETRIC TESTS

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Module 4 (Effect of Alcohol on Worms): Data Analysis

Descriptive Statistics

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Inclusion and Exclusion Criteria

Chapter 8: Hypothesis Testing for One Population Mean, Variance, and Proportion

t-test Statistics Overview of Statistical Tests Assumptions

NCSS Statistical Software

Final Exam Practice Problem Answers

2 Precision-based sample size calculations

Regression Analysis: A Complete Example

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

Section 13, Part 1 ANOVA. Analysis Of Variance

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

WISE Power Tutorial All Exercises

Unit 26 Estimation with Confidence Intervals

Stats Review Chapters 9-10

Testing for differences I exercises with SPSS

The power of a test is the of. by using a particular and a. value of the that is an to the value

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

In the past, the increase in the price of gasoline could be attributed to major national or global

Introduction. Statistics Toolbox

Transcription:

Introductory Statistics Lectures Testing a claim about a population mean One sample hypothesis test of the mean Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the author 2009 (Compile date: Tue May 19 14:50:42 2009) Contents 1 Testing a claim about a population mean 1 1.1 Introduction....... 1 Review.......... 2 1.2 Testing a claims when σ is known......... 2 Use............ 2 Computation...... 3 A complete example.. 3 1.3 Testing a claim when σ is unknown....... 5 A complete example.. 5 Determining required sample size... 7 OC Curves for t-test.. 8 1.4 Summary........ 12 1.5 Additional Examples.. 12 1 Testing a claim about a population mean 1.1 Introduction Example 1. The Carolina Tobacco Company advertised that its best selling non-filtered cigarettes contain at most 40 mg of nicotine, but Consumer Advocate magazine ran tests of 10 randomly selected cigarettes and found the amounts (in mg): 47.3, 39.3, 40.3, 38.3, 46.3, 43.3, 42.3, 49.3, 40.3, 46.3 It s a serious matter to charge that the company advertising is wrong, so the magazine editor wants to be extremely careful not to make a false claim. Can the editor prove that the mean nicotine level is greater than 40 mg? Question 1. What would H 0 and H a be? Question 2. What α should be used? 1

2 of 13 1.2 Testing a claims when σ is known REVIEW What is a hypothesis test? A hypothesis test calculates the probability of observing our sample data assuming the null hypotheses H 0 is true. If the probability is low enough (p-value α), we reject H 0 and have evidence to support our alternative hypothesis H a. Eight simple steps 0. Write down what is known. 1. Determine which type of hypothesis test to use. 2. Check the test s requirements. 3. Formulate the hypothesis: H 0, H a 4. Determine the significance level α. 5. Find the p-value. 6. Make the decision. 7. State the final conclusion. K-T-R-H-S-P-D-C: Know The Right Hypothesis So People Don t Complain What is the p-value? The p-value represents the probability of observing our sample data assuming H 0 is true. We use the sampling distribution to determine if sampling error could explain our observed sample statistic s deviation from H 0 s claim about the parameter. If we decide to reject H 0, the p-value is the actual Type I error for the study data. 1 The p-value is calculated using the test statistic and it s corresponding distribution. Common form of a test statistic (sample statistic) (null hypothesis of parameter) test statistic = (standard deviation of sample statistic) What are the types of errors? Type I error α / p-value occurs when we reject H 0 but H 0 is actually true. The p-value is the Type I error. (ex. Convicting a innocent person when H 0 is innocence.) Type II error β occurs when we fail to reject H 0 but H 0 is actually false. (ex. Letting a criminal go free when H 0 is innocence.) 1.2 Testing a claims when σ is known Often used to help answer: USE 1 The α we use to make our decision is just the maximum Type I error we will accept. It is the significance level, not the actual Type I error. (1)

Testing a claim about a population mean 3 of 13 1. Is the mean of a population equal to µ 0? 2. Is the mean of a population different than µ 0? COMPUTATION One sample hypothesis test, σ known. Definition 1.1 requirements (1) simple random sample, (2) σ known, (3) C.L.T. applies. null hypothesis H 0 : µ = µ 0 alternative hypothesis (1) H a : µ µ 0, (2) H a : µ < µ 0, (3) H a : µ > µ 0 test statistic : described by the z distribution z = x µ 0 σ/ n (2) Question 3. What does the test statistic z represent? Finding the p-value manually Since there is no R function to do hypothesis tests when σ is known, we need to manually calculate the p-value. To find a p-value for any hypothesis test: 1. Calculate the test statistic. 2. Using the distribution of the test statistic: If H a contains <: p-value is the area in the lower tail bounded by the test statistic. If H a contains >: p-value is the area in the upper tail bounded by the test statistic. If H a contains : p-value is double the tail area bounded by the test statistic. If the test statistic is negative use lower tail area. If the test statistic is positive use upper tail area. Question 4. What distribution are we using when finding a p-value? Question 5. What is the distribution of the test statistic z? Back to our original example: A COMPLETE EXAMPLE

4 of 13 1.2 Testing a claims when σ is known Example 2. The Carolina Tobacco Company advertised that its best selling non-filtered cigarettes contain at most 40 mg of nicotine, but Consumer Advocate magazine ran tests of 10 randomly selected cigarettes and found the amounts (in mg): 47.3, 39.3, 40.3, 38.3, 46.3, 43.3, 42.3, 49.3, 40.3, 46.3 It s a serious matter to charge that the company advertising is wrong, so the magazine editor chooses a significance level of α = 0.01. Can the editor prove that the mean nicotine level is greater than 40 mg? Assume nicotine levels are normally distributed and σ = 3.8. Solving the problem: step-by-step Step 0 known information: R: x = c ( 4 7. 3, 3 9. 3, 4 0. 3, 3 8. 3, 4 6. 3, 4 3. 3, 4 2. 3, + 4 9. 3, 4 0. 3, 4 6. 3 ) R: sigma = 3. 8 Step 1 Test: single sample mean test with sigma known. Step 2 Requirements: (1) simple random sample, (2) σ known, (3) CLT. All ok! Step 3 Hypothesis: H 0 : µ = 40mg, H a : µ > 40mg R: mu0 = 40 Step 4 Significance: α = 0.01 Step 5 p-value: must do manually find (1) test statistic z = x µ 0 σ/ n R: x. bar = mean ( x ) R: x. bar [ 1 ] 4 3. 3 R: n = l e n g t h ( x ) R: n [ 1 ] 10 R: z = ( x. bar mu0) / ( sigma / s q r t ( n ) ) R: z [ 1 ] 2. 7462 (2) find p-value: this is one-tailed with H a :>, find area in upper tail: R: p. v a l = 1 pnorm ( z ) R: p. v a l [ 1 ] 0.0030146 Step 6 Decision: since p-val α reject H 0. (0.00301 0.01) Step 7 Conclusion: The sample data supports the claim that the mean level of nicotine is greater than 40 mg. The probability of observing our data by chance if the mean level was 40 mg was true is only 0.00301.

Testing a claim about a population mean 5 of 13 Question 6. What is the probability that we made a Type I Error (rejected the null hypothesis when it is actually true)? 1.3 Testing a claim when σ is unknown One sample hypothesis test, σ not known. Definition 1.2 requirements (1) simple random sample, (2) σ not known, (3) C.L.T. applies. null hypothesis H 0 : µ = µ 0 alternative hypothesis (1) H a : µ µ 0, (2) H a : µ < µ 0, (3) H a : µ > µ 0 test statistic : described by the t-distribution t = x µ 0 s/, df = n 1 (3) n If you re not given the raw data, you must use the test statistic. Question 7. What distribution do you use to calculate the p-value? Question 8. A study was conducted. From the 35 random samples, x = 50.8, s = 2.8. Find the test statistic and p-value if H a : µ > 50. (Check: t = 1.69, p-value=0.0501) If you have the raw data, then use R: One sample hypothesis test (σ unknown): t.test(x, mu=0, alternative="two.sided") x vector of sample data mu represents the null hypothesis value of µ 0 (default 0). alternative H a :"two.sided", <:"less", >:"greater" R Command Back to our original example A COMPLETE EXAMPLE

6 of 13 1.3 Testing a claim when σ is unknown Example 3. The Carolina Tobacco Company advertised that its best selling non-filtered cigarettes contain at most 40 mg of nicotine, but Consumer Advocate magazine ran tests of 10 randomly selected cigarettes and found the amounts (in mg): 47.3, 39.3, 40.3, 38.3, 46.3, 43.3, 42.3, 49.3, 40.3, 46.3 It s a serious matter to charge that the company advertising is wrong, so the magazine editor chooses a significance level of α = 0.01. Can the editor prove that the mean nicotine level is greater than 40 mg? Now σ is not known! Assume nicotine levels are normally distributed. Solving the problem: step-by-step Step 0 known information: R: x = c ( 4 7. 3, 3 9. 3, 4 0. 3, 3 8. 3, 4 6. 3, 4 3. 3, 4 2. 3, + 4 9. 3, 4 0. 3, 4 6. 3 ) Step 1 Test: single sample mean test with sigma NOT known. Step 2 Requirements: (1) simple random sample, (2)σ not known, (3) CLT all ok! Step 3 Hypothesis: H 0 : µ = 40mg, H a : µ > 40mg R: mu0 = 40 Step 4 Significance: α = 0.01 Step 5 p-value: can use R this time R: r e s = t. t e s t ( x, mu = mu0, a l t e r n a t i v e = g r e a t e r ) R: r e s One Sample t t e s t data : x t = 2. 7 4 5 8, df = 9, p value = 0.01132 a l t e r n a t i v e h y p o t h e s i s : t r u e mean i s g r e a t e r than 40 95 p e r c e n t c o n f i d e n c e i n t e r v a l : 41.097 I n f sample e s t i m a t e s : mean o f x 4 3. 3 Step 6 Decision: since p-val α fail to reject H 0. (0.0113 0.01) Step 7 Conclusion: (different than last time!) The sample data does not contradict the claim that the mean level of nicotine is 40 mg at the 0.01 significance level. The likelihood of observing our sample data by chance assuming the mean level is 40 mg was 0.0113. Question 9. Why did we have a different conclusion this time?

Testing a claim about a population mean 7 of 13 DETERMINING REQUIRED SAMPLE SIZE Before you conduct a hypothesis test (post-hoc), you must determine the necessary sample size to detect the smallest effect you are interested in. One sample t-test power Given the following parameters, you can determine the required n: α significance level, typically 0.05. Power=1 β desired power, typically 0.8 or 0.9. h = µ µ 0 smallest difference in actual mean µ from the null hypothesis mean µ 0 you want to detect. (Effect size.) σ population standard deviation (or best estimate). t-test required sample size: power.t.test(delta=h, sd =σ, sig.level=α, power=1 β, type ="one.sample", alternative="two.sided") delta minimum effect size of interest sd estimate of population standard deviation power desired power alternative set to either one.sided or two.sided Example 4. Consumer Advocate magazine believes that the nicotine content in the Carolina Tobacco Company s best selling non-filtered cigarettes contain more than the advertised 40 mg of nicotine per cigarette. A reporter wants to design a study to to determine if the claim is false, how many cigarettes should be randomly sampled?. The editor won t run the article unless the cigarets actually have 43 mg or more of nicotine. It s a serious matter to charge that the company advertising is wrong, so the magazine editor chooses a significance level of α = 0.01. Prior study data indicates that σ = 3.8 mg for these cigarettes. What is the appropriate sample size? Known Info: H 0 : µ = 40, H a : µ > 40 (one sided), minimum effect interested in detecting h = 43 40 = 3 mg, σ = 3.8 mg, α = 0.01, choose Power= 0.8. R: power. t. t e s t ( d e l t a = 3, sd = 3. 8, s i g. l e v e l = 0. 0 1, + power = 0. 8, type = one. sample, a l t e r n a t i v e = one. s i d e d ) One sample t t e s t power c a l c u l a t i o n R Command n = 1 8. 897 d e l t a = 3 sd = 3. 8

8 of 13 1.3 Testing a claim when σ is unknown One Sample t Test, alpha=0.05, Two Tailed Power 1.0 0.9 0.8 0.7 0.6 0.5 0.4 n = 1000 n = 250 n = 100 n = 50 n = 30 n = 20 n = 10 0.3 n = 5 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 h σ (NOTE: CLT must apply.) s i g. l e v e l = 0. 0 1 power = 0. 8 a l t e r n a t i v e = one. s i d e d Thus, required sample size is n = 19. Since our original sample only used n = 10, our power was much less than 0.8. OC CURVES FOR T -TEST Definition 1.3 Operating Characteristic Curves. OC curves for various hypothesis tests provide a simple method for quickly estimating the required sample size (ad-hoc) necessary to achieve a desired power for a given hypothesis test. They are also used to estimate the actual Type II error (pos-hoc) once a study has been conducted. Now we will repeat our previous example using OC curves.

Testing a claim about a population mean 9 of 13 One Sample t Test, alpha=0.01, Two Tailed Power 1.0 0.9 0.8 0.7 0.6 0.5 n = 1000 n = 250 n = 100 n = 50 n = 30 n = 20 0.4 n = 10 0.3 n = 5 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 h σ (NOTE: CLT must apply.) Example 5. Consumer Advocate magazine believes that the nicotine content in the Carolina Tobacco Company s best selling non-filtered cigarettes contain more than the advertised 40 mg of nicotine per cigarette. A reporter wants to design a study to to determine if the claim is false, how many cigarettes should be randomly sampled?. The editor won t run the article unless the cigarets actually have 43 mg or more of nicotine. It s a serious matter to charge that the company advertising is wrong, so the magazine editor chooses a significance level of α = 0.01. Prior study data indicates that σ = 3.8 mg for these cigarettes. What is the appropriate sample size? Known Info: H 0 : µ = 40, H a : µ > 40 (one sided), minimum effect interested in detecting h = 43 40 = 3 mg, σ = 3.8 mg, α = 0.01, choose Power= 0.8. Question 10. Use the following OC curves to estimate the required sample size.

10 of 13 1.3 Testing a claim when σ is unknown One Sample t Test, alpha=0.05, One Tailed Power 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 n = 1000 n = 250 n = 100 n = 50 n = 30 n = 20 n = 10 n = 5 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 h σ (NOTE: CLT must apply.) Question 11. The study was conducted with a sample size of only 10, the sample standard deviation for the data was 3.8 mg. Use the OC curves to estimate the actual power and Type II error.

Testing a claim about a population mean 11 of 13 One Sample t Test, alpha=0.01, One Tailed Power 1.0 0.9 0.8 0.7 0.6 0.5 n = 1000 n = 250 n = 100 n = 50 n = 30 n = 20 0.4 n = 10 0.3 n = 5 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 h σ (NOTE: CLT must apply.)

12 of 13 1.4 Summary 1.4 Summary Hypothesis test for µ requirements (1) simple random sample, (2) C.L.T. applies. null hypothesis H 0 : µ = µ 0 alternative hypothesis (1) H a : µ µ 0, (2) H a : µ < µ 0, (3) H a : µ > µ 0 test when σ known: z = x µ 0 σ/ n See page 3 for steps to find p-value manually. test when σ unknown: t.test(x, mu=µ 0, alternative="two.sided") x vector of sample data mu represents the null hypothesis value of µ 0 (default 0). alternative H a :"two.sided", <:"less", >:"greater" If your not given the raw data, you must use the test statistic on page 5. 1.5 Additional Examples Example 6. A cereal lobbyist claims that the mean sugar content in cereal (grams of sugar per gram of cereal) is less than -.35 g for all cereals. You believe it is higher. If the mean content is at least 0.45 g you want to counter their claim. Question 12. What is H 0 and H a? Question 13. What sample size should you use to test your claim (Previous data indicates σ = 0.15 g, desired power is 0.8)? Example 7. Different cereals are randomly selected, and the sugar content (grams of sugar per gram of cereal) is obtained for each cereal, with the results given below. Use a 0.05 significance level to test the claim of a cereal lobbyist that the mean for all cereals is less than 0.35 g. 0.03, 0.24, 0.30, 0.47, 0.43, 0.07, 0.47, 0.13, 0.44, 0.39, 0.48, 0.17, 0.13, 0.09, 0.45, 0.43 Question 14. Run the hypothesis test assuming σ = 0.15. 2 2 CHECK: z = 1.47, p-value = 0.0712

Testing a claim about a population mean 13 of 13 Question 15. For the previous question, would would the p-value be if H a : µ 0.35 g? Question 16. Run the hypothesis test assuming σ is unknown. 3 3 CHECK: p-value = 0.105