1 Statistical Tests of Hypotheses

Similar documents
Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Hypothesis Testing --- One Mean

Testing a claim about a population mean

HYPOTHESIS TESTING WITH SPSS:

Lesson 9 Hypothesis Testing

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Hypothesis Testing: Two Means, Paired Data, Two Proportions

22. HYPOTHESIS TESTING

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

1 Hypothesis Testing. H 0 : population parameter = hypothesized value:

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Study Guide for the Final Exam

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Chapter 2. Hypothesis testing in one population

1 Nonparametric Statistics

II. DISTRIBUTIONS distribution normal distribution. standard scores

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

3.4 Statistical inference for 2 populations based on two samples

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Hypothesis Testing. Hypothesis Testing

Mind on Statistics. Chapter 12

Introduction to Hypothesis Testing

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct

Two-sample inference: Continuous data

"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1

Statistics 2014 Scoring Guidelines

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

DDBA 8438: Introduction to Hypothesis Testing Video Podcast Transcript

Sample Size and Power in Clinical Trials

Permutation Tests for Comparing Two Populations

Unit 26 Estimation with Confidence Intervals

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

HYPOTHESIS TESTING: POWER OF THE TEST

Name: Date: Use the following to answer questions 3-4:

5.1 Identifying the Target Parameter

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Topic #6: Hypothesis. Usage

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

Independent samples t-test. Dr. Tom Pierce Radford University

Chapter 7 Section 7.1: Inference for the Mean of a Population

Descriptive Statistics

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Regression Analysis: A Complete Example

Testing Hypotheses About Proportions

Chapter 5 Analysis of variance SPSS Analysis of variance

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Hypothesis testing - Steps

Planning sample size for randomized evaluations

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Unit 26: Small Sample Inference for One Mean

Principles of Hypothesis Testing for Public Health

Chapter Study Guide. Chapter 11 Confidence Intervals and Hypothesis Testing for Means

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Tutorial 5: Hypothesis Testing

Chapter 7: One-Sample Inference

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Week 4: Standard Error and Confidence Intervals

Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 9 Introduction to Hypothesis Testing

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Normal distributions in SPSS

Independent t- Test (Comparing Two Means)

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Hypothesis Testing. Reminder of Inferential Statistics. Hypothesis Testing: Introduction

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chi-square test Fisher s Exact test

Section 12 Part 2. Chi-square test

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Chapter 7 TEST OF HYPOTHESIS

Chapter 23 Inferences About Means

Chapter 2 Probability Topics SPSS T tests

Non-Inferiority Tests for Two Means using Differences

Chapter 4: Statistical Hypothesis Testing

Chapter 26: Tests of Significance

Inclusion and Exclusion Criteria

1.5 Oneway Analysis of Variance

SAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.

Lecture Notes Module 1

Projects Involving Statistics (& SPSS)

Tests for Two Proportions

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

12: Analysis of Variance. Introduction

Inference for two Population Means

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

Transcription:

1 Statistical Tests of Hypotheses Previously we considered the problem of estimating a population with the help of sample data, now we will be checking if claims about the population mean are true, or plausible to a given degree, Since this is statistics and decisions about the population are based on samples, we might make errors when making decisions. You will learn how to control the probabilities to make errors. A hypothesis test is a method for using sample data to decide between two competing claims or hypotheses about a population characteristic. Example: µ 0.5 vs. µ > 0.5 µ = 100 vs. µ 100 µ 175 vs. µ < 175 Definition: The null hypothesis H 0 is a claim about a population characteristic. ( We will try to disprove this hypothesis with the help of sample data) The alternative hypothesis H a is the competing claim and logical compliment of H 0. (When we can disprove H 0, then H a must be correct). In testing H 0 vs. H a : H 0 will be rejected only if the evidence from the sample strongly suggests that H 0 is false. Otherwise H 0 will not be rejected, and we will state that we could not find evidence against the claim. So there are two possible conclusions: reject H 0 (accept H a ) do not reject H 0 Note that these decisions are not symmetric, there is no way you can say you accept H 0. Remark: Hypotheses should be the logical compliment of each other (Warning! This is different from the text book). Common choices of hypotheses are Two-tailed Test H 0 : population characteristic = specific value versus H a : population characteristic specific value Upper-tailed Test H 0 : population characteristic specific value versus H a : population characteristic > specific value 1

Lower-tailed Test H 0 : population characteristic specific value versus H a : population characteristic < specific value In the text book they always choose H 0 : population characteristic = specific value, which they argue is equivalent to the other null hypotheses. The decision would be the same but not the underlying logic. Examples: H 0 : µ = 0.25 versus H a : µ 0.25 H 0 : µ 100 versus H a : µ < 100 We can not test H 0 : µ 100 versus H a : µ > 150 Be careful when choosing hypotheses, because a statistical test can only support the alternative hypothesis, by rejecting H 0. Is H 0 not being rejected doesn t mean strong support for H 0, but lack of strong evidence against H 0. Example: A company is advertising that the average lifetime of their light bulbs is 1000 hours. You might question this, and want to show that in fact the lifetime is shorter. You would test H 0 : µ 1000 versus H a : µ < 1000. Rejection of H 0 would then support your claim. However, nonrejection of H 0 doesn t necessarily provide strong support for the advertised claim. The way the decisions are made, the scientist will choose H a to contain the claim he wants to prove. How to make the decision (reject H 0, or do not reject H 0 ) The decision to reject, or not to reject H 0 is based on information contained in a sample drawn from the population of interest. This information will be given in form of the test statistic (a number that measures, if the sample data is in accordance with H 0 ), or the P-value (the probability for observing this value of the test statistic (or more extreme), if H 0 is true) Assuming that H 0 is true the P-value measures how likely it is to observe such data, as those found in the sample (or more extreme). If the P-value is small, this indicates that the assumption, that H 0 is true, is most likely wrong. Is the P-value not small, this indicates that the sample does not provide evidence against H 0. Use the test statistic or the P-value to make a decision. 2

Example: Suppose µ is the mean time patients stay in a certain hospital (this is the mean of ALL patients). The administration wants to know if patients stay in average less than 5 days in the hospital, without looking up all files. They randomly choose 100 files and find the mean time those patients stayed in the hospital: x = 4.530, and s = 3.678. They use these data to test H 0 : µ 5 versus H a : µ < 5. The sample mean seems to support the alternative hypothesis, but is this enough evidence to reject H 0? To find out, we calculate the test statistic, that will relate the sample value x with the claimed value from the null hypothesis 5. z = x µ 0 s/ n = 4.5 5 3.678/ 100 = 1.28 If H 0 is true this test statistic is approximately standard normal distributed (Central Limit Theorem). So that the value from a random sample can be judged by the standard normal distribution. Lets calculate the probability to observe such a small or even smaller value for z, if H 0 is in fact true (this is then the P-value): P value = P (z 1.28, when H 0 is true) = 0.101 (Table II) There is a 10% chance of observing this value of the test statistic z if H 0 is true. Overall not that unlikely. We decide better not to reject H 0. If the P-value would have been 0, we would have come to the conclusion, that is impossible to observe this data, if H 0 is true and we would have rejected the null-hypothesis for that reason. 1.1 Errors in Hypothesis Testing As there are in criminal trials, there are two different types of errors you can make in statistical testing: In a trial the jury might convict an innocent person, and the other error is to set a guilty person free. 3

Definition: type I error the error of rejecting H 0 even though H 0 is true type II error the error of failing to reject H 0 even though H 0 is false Test Truth H 0 is true H 0 is false reject H 0 type I error OK do not reject H 0 OK type II error The only way to guarantee that neither type of error will occur is to make such decisions on the basis of a census of the entire population. The risk of error is introduced when we try to make an inference on a sample. Definition: The probability of a type I error is denoted by α and is called the level of significance of the test. The probability of a type II error is denoted by β. We would like to ensure with the choice of the method, telling us how to make a decision, that both error probabilities are small. But a mathematical analysis shows that how ever we are making the decision between H 0 and H a the error probabilities behave like a seesaw. When we force one to be small the other goes up. Due to this relationship between the error probabilities, one had to choose to control one and let the other go. It was decided to make sure with the choice for a hypothesis that the P(error of type I) will be be small. Remark: After assessing the consequences of type I and type II errors identify the largest α that is tolerable for the problem. Don t use a too small level of significance, because the smaller α the greater β. Decision Rule: A decision as to whether H 0 should be rejected results now from comparing the P-value to the chosen α. H 0 should be rejected if P-value α. H 0 should not be rejected if P-value > α. Example: A drug is proposed to lengthen the survival time after a specific cancer treatment. To show the efficacy of the new drug a study has to be designed to test the following hypotheses for µ the mean survival time under the new treatment. H 0 : µ mean survival time without new treatment versus H a : µ > mean survival time without new treatment 4

An error of type I would mean to conclude the drug is lengthening the survival time, even though this is not the case. An error of type II would mean to conclude the drug not efficient even though it is. The scientist doing the study, wants to make sure, that this drug is only used if it is really efficient, so she has to limit the probability for the error of type I. she chooses α = 0.01. 1.2 A Large Sample Test for a Population Mean, when σ is known The Test 1. Hypotheses: two tailed: H 0 : µ = µ 0 versus H a : µ µ 0 lower tailed: H 0 : µ µ 0 versus H a : µ < µ 0 upper tailed: H 0 : µ µ 0 versus H a : µ > µ 0 Choose α. 2. Assumption: The data is a large random sample or the sample data come from a normal population and σ is known. 3. Test statistic: z 0 = x µ 0 σ/ n estimated by z 0 x µ 0 s/ n 4. P-value and Rejection Region: Test type P-value Rejection Region Upper tail P (z > z 0 ) z 0 > z α Lower tail P (z < z 0 ) z 0 < z α Two tail 2 P (z > abs(z 0 )) abs(z 0 ) > z α/2 5. Decision: Reject H 0, if and only if p value α or equivalently the value of the test statistic falls into the rejection region. Context: Put the result into context. Example: Assume you have a sample with n = 50, x = 871 and s = 21. Test at a significance level of α = 0.05 the hypotheses: 1. H 0 : µ = 880 versus H a : µ 880 two-tailed with µ 0 = 880 2. We know σ, and we find that the sample size is large. 3. z 0 x µ 0 s/ n = 871 880 21/ 50 = 3.03 5

4. Using α = 0.05 you find the rejection region to equal abs(z 0 ) > z α/2 = 1.96. Since abs(z 0 ) > 1.96, you reject H 0, and conclude that at significance level α = 0.05 µ is not equal to 880. Or equivalently: Find the p-value= 2 P (z > abs(z 0 )) = 2 P (z > 3.03) = 2 (1 0.9988) = 0.0024. 5. Since p-value= 0.0024 < 0.05 = α, we reject the null hypothesis. 6. At significance level of 5% the data provide sufficient evidence that µ 880. Definition: The p-value of a statistical test is the probability to observe the value of the test statistic if in fact H 0 is true. Decision Rule: 1. Find the Rejection Region: If the value of the test statistic falls into this region, reject H 0. or 2. Find the p-value: If p value α holds, reject H 0. The assumption, that we know σ is very strong, since we already assume that we do not know µ. How come we do not know the mean but the standard deviation for the population of interest? For this reason we need a different tool, based on the t-distribution. 1.3 A test for a mean µ, when σ is unknown The test introduces in the section above is based on the z-score, which uses the population standard deviation σ. In most situations σ is unknown and has to be replaced by the sample standard deviation s. Resulting in a procedure that then is only approximate (does not give the true error probability). Reminder Student s t distribution Consider the t-score t = x µ s/ n is t-distributed withdf = n 1, if the sample is large or the population follows a normal distribution. The distribution of the t-score only depends on one parameter, which is called the degrees of freedom (df). Student showed that the t-score is t distributed with n 1 degrees of freedom (df = n 1). The appendix provides a table (Table VI) with values from this distribution for different choices for the df. t-test for a Population Mean µ 6

1. Hypotheses: Test type Upper tail H 0 : µ µ 0 versus H a : µ > µ 0 Lower tail H 0 : µ µ 0 versus H a : µ < µ 0 Two tail H 0 : µ = µ 0 versus H a : µ µ 0 Choose α. 2. Assumption: The sample is a random sample and the population has a normal distribution or the sample is large. 3. Test statistic: t 0 = x µ 0 s/ n with df = n 1 degrees of freedom. 4. P-value and Rejection Region: 5. Decision: Test type P-value Rejection Region Upper tail P (t > t 0 ) t 0 > t α Lower tail P (t < t 0 ) t 0 < t α Two tail 2 P (t > abs(t 0 )) abs(t 0 ) > t α/2 If p-value α then reject H 0 If p-value> α then do not reject H 0 6. Context (1 α) t-confidence Interval for a Population Mean µ x ± t n 1 α/2 s n where t n 1 α/2 is the (1 α/2) percentile of a t-distribution with df = n 1. All that changes is, that you will have to use the critical value of the t-distribution (table IV) and you may use the sample standard deviation instead of pretending you know σ. Example:In recent decades, the mean weight of human males, aged 18 to less than 75, has been 78.1 kg with a standard deviation of 13.5 kg. In a study wether weights are changing, a researcher samples 40 males in that age group and obtains a mean of 82.3 kg with a standard deviation of 15.7 kg. At significance level of 5% can the researcher conclude that the mean weight has increased? 7

1. H 0 : µ 78.1 versus H a : µ > 78.1, where µ is the mean weight of males aged 18 to 75. α = 0.05. 2. The sample size is large enough, and we will assume that the participants were randomly chosen. 3. t 0 = 82.3 78.1 15.7/ 40 = 1.69, df = 39 4. This is an upper tail test, so the p-value is the upper tail probability. Use df = 39 in the table. Then 1.69 falls between 1.685 and 2.023, giving that: 0.025 <p-value< 0.05 5. Since the p-value is smaller than α = 0.05, reject H 0 6. At significance level of 5% that data provide sufficient evidence that the mean weight of males aged between 18 and 75 increased lately. 8