HYPOTHESIS TESTING AND TYPE I AND TYPE II ERROR


A hypothesis is a conjecture (an inference) about one or more population parameters.

The Null Hypothesis (H0) is a statement of no difference or no relationship and is the logical counterpart to the alternative hypothesis.

The Alternative Hypothesis (H1 or Ha) claims that the difference in results between conditions is due to the independent variable. It is also known as the research hypothesis and covers every possible outcome not covered by the null hypothesis.

Hypothesis Testing involves making inferences about the nature (or reality) of the population on the basis of observations of a sample drawn from that population.

A Type I Error (also known as alpha, α) is a decision to reject the null hypothesis when the null hypothesis is true.

A Type II Error (also known as beta, β) is a decision to retain (or fail to reject) the null hypothesis when the null hypothesis is false.

POSSIBLE OUTCOMES (CONCLUSIONS) IN HYPOTHESIS TESTING

                              STATE OF REALITY
DECISION MADE     H0 IS TRUE                        H0 IS FALSE
RETAIN H0         Correct Decision (CI, 1 − α)      Type II Error (β)
REJECT H0         Type I Error (α)                  Correct Decision (Power, 1 − β)

When we retain (or fail to reject) the null hypothesis, we are concluding that there is no difference between the treatment groups or no relationship between the variables. In retaining H0 we can only make a correct decision (when H0 is in reality true) or a Type II error (when H0 is in reality false).

When we reject the null hypothesis, we are concluding that there is a difference between the treatment groups or that there is a relationship between the variables. In rejecting H0 we can only make a Type I error (when H0 is in reality true) or a correct decision (when H0 is in reality false).

Type I and Type II errors are inversely related: the smaller the risk of one, the greater the risk of the other. By minimizing the probability of making these errors, we maximize the probability of concluding correctly, regardless of whether the null hypothesis is true or false. By controlling the alpha level we can minimize the probability of making a Type I error.
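These definitions can be made concrete with a small simulation. The sketch below is illustrative only; the population mean of 100, standard deviation of 15, sample size, and seed are assumptions, not values from this handout. It generates many samples for which H0 is actually true, so a test run at α = .05 should commit a Type I error (reject H0) in roughly 5% of the samples.

```python
# Minimal Monte Carlo sketch (illustrative numbers, not from the text):
# when H0 is true, a test at alpha = .05 rejects about 5% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)      # arbitrary seed for reproducibility
alpha = 0.05
n, n_sims = 30, 10_000
rejections = 0

for _ in range(n_sims):
    # H0 is true here: the sample really does come from a population with mean 100
    sample = rng.normal(loc=100, scale=15, size=n)
    result = stats.ttest_1samp(sample, popmean=100)
    if result.pvalue < alpha:        # decision rule: reject H0 when p < alpha
        rejections += 1

print(f"Empirical Type I error rate: {rejections / n_sims:.3f}")   # close to 0.05
```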

                              STATE OF REALITY
DECISION MADE     H0 IS TRUE                        H0 IS FALSE
RETAIN H0         Correct Decision (CI, 1 − α)      Type II Error (β)
REJECT H0         Type I Error (α)                  Correct Decision (Power, 1 − β)

If in reality the null hypothesis (H0) is TRUE, there is NO significant difference between (among) the groups. If in reality the null hypothesis (H0) is FALSE, there IS a significant difference between (among) the groups.

If our decision is to RETAIN the null hypothesis (H0), we are concluding that there is NO significant difference between (among) the groups. If our decision is to REJECT the null hypothesis (H0), we are concluding that there IS a significant difference between (among) the groups.

WHAT WE CONCLUDE VERSUS WHAT IS TRUE IN REALITY

WHAT WE CONCLUDE
If we retain the null hypothesis (H0), we reject the alternative hypothesis (Ha) and we say: there is no relationship; there is no difference, no gain; there is no effect; our theory is wrong.
If we reject the null hypothesis (H0), we accept the alternative hypothesis (Ha) and we say: there is a relationship; there is a difference, a gain; there is an effect; our theory is correct.

IN REALITY: H0 (Null Hypothesis) is True; Ha (Alternative Hypothesis) is False
There is no relationship; there is no difference, no gain, no effect; our theory is wrong.

  1 − α (e.g., .95)  CONFIDENCE LEVEL/INTERVAL
  The odds of saying there is no relationship, difference, or gain when in fact there is none. The odds of correctly not confirming our theory. 95 times out of 100, when there is no effect, we will say there is none.

  α (e.g., .05)  TYPE I ERROR (SIGNIFICANCE LEVEL)
  The odds of saying there is a relationship, difference, or gain when in fact there is not. The odds of confirming our theory incorrectly. 5 times out of 100, when there is no effect, we will say there is one. We should keep this small when we cannot afford/risk wrongly concluding that our program works.

IN REALITY: H0 (Null Hypothesis) is False; Ha (Alternative Hypothesis) is True
There is a relationship; there is a difference, gain, or effect; our theory is correct.

  β (e.g., .20)  TYPE II ERROR
  The odds of saying there is no relationship, difference, or gain when in fact there is one. The odds of not confirming our theory when it is true. 20 times out of 100, when there is an effect, we will say there is not one.

  1 − β (e.g., .80)  POWER
  The odds of saying there is a relationship, difference, or gain when in fact there is one. The odds of confirming our theory correctly. 80 times out of 100, when there is an effect, we will say there is one. We generally want this to be as large as possible.
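To complement the Type I error simulation above, the sketch below estimates β and power empirically. The numbers are again assumptions for illustration (a true population mean of 106 tested against a null value of 100, σ = 15, n = 30): when H0 is false, the proportion of samples in which the test rejects H0 estimates power (1 − β), and the proportion in which it retains H0 estimates the Type II error rate (β).

```python
# Simulation sketch with assumed numbers: H0 (mu = 100) is false because the
# true mean is 106, so rejections are correct decisions and retentions are
# Type II errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 10_000
true_mean, null_mean, sigma = 106, 100, 15

p_values = np.array([
    stats.ttest_1samp(rng.normal(true_mean, sigma, n), popmean=null_mean).pvalue
    for _ in range(n_sims)
])

power = np.mean(p_values < alpha)    # proportion of (correct) rejections
beta = 1 - power                     # proportion of Type II errors
print(f"Estimated power (1 - beta): {power:.2f}")
print(f"Estimated Type II error rate (beta): {beta:.2f}")
```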

All statistical conclusions involve constructing two mutually exclusive hypotheses, called the null (H0) and alternative (Ha or H1) hypotheses. Together, the hypotheses describe all possible outcomes with respect to the inference: one must be right and one must be wrong. The central decision involves determining which hypothesis to accept (retain or support) and which to reject (fail to retain or support). The null hypothesis is so termed because it usually refers to the no-relationship, no-difference, no-gain, or no-effect case. Usually in research we expect that our treatment and/or program will make a difference. So, typically, our theory is described in the alternative hypothesis: that there is a relationship, difference, gain, or effect.

Researchers want to have the highest degree of accuracy (power) with any study they conduct. As such, they want to increase the power of the study. As you increase power, you increase the chances of finding an effect if it is there. However, if you increase power by raising the alpha level, you are at the same time increasing the chances of making a Type I error. There is a direct relationship between power and alpha (the level of significance). Since we usually want high power and a low Type I error rate, you should be able to appreciate the dilemma researchers face.

We often talk about alpha (α) and beta (β) using the language of "higher" and "lower." For example, we might talk about the advantages of a higher or lower α level in a particular study. You have to be careful about interpreting the meaning of these terms. When we talk about higher α levels, we mean that we are increasing the chance of a Type I error. Therefore, a lower α level actually means that you are conducting a more rigorous/stringent test. That is, it takes more to convince us that something is going on (i.e., that there is a real effect).

Consider the following associations among alpha (α), beta (β), and power, illustrated in the sketch after this list:
- The lower the α, the lower the power; the higher the α, the higher the power.
- The lower the α, the less likely it is that you will make a Type I error (i.e., rejecting the null hypothesis when in reality it is true).
- The lower the α (in numeric value), the more rigorous or stringent the test.
- An α of .01 (compared to .05 or .10) means the researcher is being relatively careful and is only willing to risk being wrong 1 out of 100 times in rejecting the null hypothesis when in reality it is true (i.e., saying there is an effect when there really is not one).
- An α of .01 (compared to .05 or .10) limits one's chances of concluding that there is an effect. This means that both the statistical power and the chances of making a Type I error are lower.
- An α of .01 means that you have a 99% chance (1 − α) of saying there is no difference when in reality there is no difference (i.e., making a correct decision).
- Increasing α (e.g., from .01 to .05 or .10) increases the chances of making a Type I error (i.e., saying there is a difference when in reality there is not), decreases the chances of making a Type II error (i.e., saying there is no difference when in reality there is), and decreases the rigor of the test.
- Increasing α (e.g., from .01 to .05 or .10) increases power because the researcher will be rejecting the null hypothesis more often (i.e., accepting the alternative hypothesis) and, consequently, when the alternative hypothesis is true, there is a greater chance of accepting it (i.e., power).
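These relationships can be checked numerically. The following sketch uses illustrative values only (a one-sample, one-tailed z test with σ = 15, n = 30, and an assumed true difference of 5 between the population mean and the null value); it shows that lowering α lowers power and raising α raises it.

```python
# Sketch of the alpha-power relationship for a one-sample, one-tailed z test.
# sigma, n, and the assumed true effect are illustrative, not from the text.
from scipy.stats import norm

sigma, n = 15, 30
effect = 5                                    # assumed true difference in means
se = sigma / n ** 0.5                         # standard error of the mean

for alpha in (0.01, 0.05, 0.10):
    z_crit = norm.ppf(1 - alpha)              # one-tailed critical value
    # Power = P(reject H0 | H0 false) = P(Z > z_crit - effect / se)
    power = norm.sf(z_crit - effect / se)
    print(f"alpha = {alpha:.2f}  ->  z_crit = {z_crit:.3f}, power = {power:.3f}")
```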

REJECTION AND RETENTION REGIONS

NON-DIRECTIONAL (TWO-TAILED) TEST
The region of retention (retain H0, p > α) lies in the middle of the sampling distribution, with a rejection region (reject H0) in each tail.

DIRECTIONAL (ONE-TAILED) TEST, NEGATIVE END
The region of retention (retain H0, p > α) covers the upper portion of the distribution, with the entire rejection region (reject H0) in the lower (negative) tail.

DIRECTIONAL (ONE-TAILED) TEST, POSITIVE END
The region of retention (retain H0, p > α) covers the lower portion of the distribution, with the entire rejection region (reject H0) in the upper (positive) tail.
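As a numeric companion to the three cases above, the snippet below (assuming α = .05 and the standard normal distribution; these choices are illustrative) computes where each rejection region begins.

```python
# Rejection-region boundaries at alpha = .05 for the standard normal (z) distribution.
from scipy.stats import norm

alpha = 0.05

# Non-directional (two-tailed): alpha is split between the two tails.
z_two = norm.ppf(1 - alpha / 2)
print(f"Two-tailed:            reject H0 if |z| > {z_two:.3f}")    # 1.960

# Directional, negative end: all of alpha is in the lower tail.
z_low = norm.ppf(alpha)
print(f"One-tailed (negative): reject H0 if z < {z_low:.3f}")      # -1.645

# Directional, positive end: all of alpha is in the upper tail.
z_high = norm.ppf(1 - alpha)
print(f"One-tailed (positive): reject H0 if z > {z_high:.3f}")     # 1.645
```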

THE FOUR STEPS IN TESTING A NULL HYPOTHESIS

1. STATE THE HYPOTHESES
   State the null hypothesis and the alternative hypothesis(es).

2. SET THE CRITERION FOR REJECTING H0
   Set the alpha level, which in turn identifies the critical values.

3. COMPUTE THE TEST STATISTIC
   If parameter information (σ) is known, calculate z.
   If parameter information is not known, calculate t.

4. DECIDE WHETHER TO RETAIN OR REJECT H0
   Using the obtained probability value (compared to the alpha level):
   If p > α, retain the null hypothesis.
   If p ≤ α, reject the null hypothesis.
   Using the obtained statistic value (compared to the critical value):
   If z_obt (or t_obt) < z_crit (or t_crit), retain the null hypothesis.
   If z_obt (or t_obt) > z_crit (or t_crit), reject the null hypothesis.

CRITICAL VALUES OF THE z TEST STATISTIC

Level of Significance    Level of Significance    Critical Value of
(Two-Tailed Test)        (One-Tailed Test)        the Test Statistic
        .20                      .10                   1.282
        .10                      .05                   1.645
        .05                      .025                  1.960
        .02                      .01                   2.326
        .01                      .005                  2.576
        .001                     .0005                 3.291

NOTE that the + and/or − sign is dependent on the type of test being used.

When parameter information (σ) is unknown we use the t statistic. Critical values for Student's t distribution are found in the table below. Note, however, that when the sample size (N) is large (when df = ∞) we can use the normal distribution table (Table A, Areas under the Normal Curve) to obtain critical values; at df = ∞ the t distribution is identical to the z distribution.
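Before turning to the t table, here is a brief worked sketch of the four steps for a case where σ is known. All numbers are made up for illustration (population mean 100 under H0, σ = 15, a sample of n = 36 with mean 105, and a non-directional test at α = .05).

```python
# Worked sketch of the four steps with assumed (illustrative) numbers.
from scipy.stats import norm

# Step 1: state the hypotheses.  H0: mu = 100  versus  H1: mu != 100 (two-tailed).
mu0, sigma = 100, 15

# Step 2: set the criterion for rejecting H0.
alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)               # 1.960 for a two-tailed test

# Step 3: compute the test statistic (sigma is known, so use z).
n, sample_mean = 36, 105                       # assumed sample results
z_obt = (sample_mean - mu0) / (sigma / n ** 0.5)
p_value = 2 * norm.sf(abs(z_obt))              # two-tailed p-value

# Step 4: decide whether to retain or reject H0.
decision = "reject H0" if p_value <= alpha else "retain H0"
print(f"z_obt = {z_obt:.2f}, z_crit = ±{z_crit:.2f}, p = {p_value:.4f} -> {decision}")
```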

CRITICAL VALUES OF STUDENT'S t DISTRIBUTION

        Level of Significance for a One-Tailed Test
  df      .10      .05      .025     .01      .005     .0005
        Level of Significance for a Two-Tailed Test
          .20      .10      .05      .02      .01      .001
   1    3.078    6.314   12.706   31.821   63.657   636.619
   2    1.886    2.920    4.303    6.965    9.925    31.598
   3    1.638    2.353    3.182    4.541    5.841    12.941
   4    1.533    2.132    2.776    3.747    4.604     8.610
   5    1.476    2.015    2.571    3.365    4.032     6.859
   6    1.440    1.943    2.447    3.143    3.707     5.959
   7    1.415    1.895    2.365    2.998    3.499     5.405
   8    1.397    1.860    2.306    2.896    3.355     5.041
   9    1.383    1.833    2.262    2.821    3.250     4.781
  10    1.372    1.812    2.228    2.764    3.169     4.587
  11    1.363    1.796    2.201    2.718    3.106     4.437
  12    1.356    1.782    2.179    2.681    3.055     4.318
  13    1.350    1.771    2.160    2.650    3.012     4.221
  14    1.345    1.761    2.145    2.624    2.977     4.140
  15    1.341    1.753    2.131    2.602    2.947     4.073
  16    1.337    1.746    2.120    2.583    2.921     4.015
  17    1.333    1.740    2.110    2.567    2.898     3.965
  18    1.330    1.734    2.101    2.552    2.878     3.922
  19    1.328    1.729    2.093    2.539    2.861     3.883
  20    1.325    1.725    2.086    2.528    2.845     3.850
  21    1.323    1.721    2.080    2.518    2.831     3.819
  22    1.321    1.717    2.074    2.508    2.819     3.792
  23    1.319    1.714    2.069    2.500    2.807     3.767
  24    1.318    1.711    2.064    2.492    2.797     3.745
  25    1.316    1.708    2.060    2.485    2.787     3.725
  26    1.315    1.706    2.056    2.479    2.779     3.707
  27    1.314    1.703    2.052    2.473    2.771     3.690
  28    1.313    1.701    2.048    2.467    2.763     3.674
  29    1.311    1.699    2.045    2.462    2.756     3.659
  30    1.310    1.697    2.042    2.457    2.750     3.646
  40    1.303    1.684    2.021    2.423    2.704     3.551
  60    1.296    1.671    2.000    2.390    2.660     3.460
 120    1.289    1.658    1.980    2.358    2.617     3.373
   ∞    1.282    1.645    1.960    2.326    2.576     3.291

The values listed in the table are the critical values of t for the specified degrees of freedom (left column) and alpha level (column heading). For two-tailed alpha levels, t_crit is both + and −. To be significant, |t_obt| must exceed t_crit.
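For readers who prefer to compute these values directly, the short sketch below (using SciPy and assuming a two-tailed α of .05 as an example) reproduces a few rows of the table and shows that the df = ∞ row matches the z critical value.

```python
# Reproducing selected t critical values from the table (two-tailed alpha = .05),
# and showing that the df = infinity row matches the standard normal (z) value.
from scipy.stats import norm, t

alpha = 0.05                                   # two-tailed level of significance

for df in (10, 30, 120):
    t_crit = t.ppf(1 - alpha / 2, df)          # upper-tail cutoff; lower cutoff is -t_crit
    print(f"df = {df:>3}: t_crit = {t_crit:.3f}")

# As df grows without bound, the t distribution converges to the standard normal:
print(f"df = inf: z_crit = {norm.ppf(1 - alpha / 2):.3f}")   # 1.960
```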