6.3 One-Sample Mean (Sigma Unknown)

In the previous section we examined hypotheses comparing a sample mean with a population mean when both the population mean and the population standard deviation were known. In practice, hypotheses involving population parameters whose values are already known are rare. Usually we must use sample statistics to estimate the parameters and to test hypothesis statements about them.

When the population standard deviation is known, the Central Limit Theorem tells us that the standard deviation of the sampling distribution of the mean (the standard error of the mean) is

standard error = σ_M = σ / √n, where σ is the population standard deviation.

When the population variance is not known, the sample variance (the standard deviation squared) may be used to estimate it. Because the sample variance is computed with n − 1 degrees of freedom (dividing by n − 1 rather than n), it is an unbiased estimate of the population variance and gets closer to the true value as the sample size grows. Recall that the sample standard deviation is the square root of the sample variance:

sample standard deviation = s = √(s²) = √(SS / (n − 1)) = √(SS / df), where SS is the sum of squared deviations.

The standard error of the sampling distribution, estimated with the sample standard deviation, is therefore

estimated standard error = s_M = s / √n, where s is the sample standard deviation.
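To make these formulas concrete, here is a minimal Python sketch (the function name, variable names, and sample scores are illustrative, not from the text) that computes the sample standard deviation and the estimated standard error from raw data:

```python
import math

def sample_sd_and_se(scores):
    """Return (s, s_M): the sample standard deviation (n - 1 in the
    denominator) and the estimated standard error of the mean."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations, SS
    s = math.sqrt(ss / (n - 1))                 # sample standard deviation
    s_m = s / math.sqrt(n)                      # estimated standard error
    return s, s_m

# Illustrative data (not from the text)
scores = [12, 15, 11, 14, 13, 16, 12, 15]
s, s_m = sample_sd_and_se(scores)
print(f"s = {s:.4f}, estimated standard error = {s_m:.4f}")
```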

The estimated standard error (s_M) uses the sample standard deviation, with n − 1 degrees of freedom, to estimate the true standard error of the mean (σ_M) when the population standard deviation, σ, is not known.

The t Distribution

The standard normal distribution is used as the underlying sampling distribution when the population standard deviation is known; when it is unknown, we use the t distribution introduced here. The t distribution is similar in shape to a normal distribution and approaches the normal curve as the sample size becomes very large.

The degrees of freedom (df) are the number of independent scores in a sample that remain free to vary after one or more population parameters have been estimated; that is, the number of observations minus the number of restrictions. The degrees of freedom are therefore tied to the sample size.

The sampling distribution of the t statistic is called Student's t distribution. The t distribution has the following properties:

1. There is an infinite family of Student t distributions, one for each number of degrees of freedom, df. For the one-sample case, the degrees of freedom equal the sample size minus 1; that is, df = n − 1.
2. The Student's t distribution is similar to the standard normal distribution in shape: it has a mean of t = 0, and the total area under the curve (the probability density) is 1.
3. As the sample size gets large, the curve gets closer to the standard normal curve. For sample sizes of about 30 or more, the z statistic may be used as an approximation.
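The convergence toward the normal curve can be checked numerically. The short sketch below (assuming the SciPy library is available; the df values chosen are arbitrary) prints the two-tailed 5% critical value of the t distribution for several degrees of freedom alongside the corresponding normal value:

```python
from scipy import stats

# Two-tailed critical values at alpha = 0.05 for increasing df,
# compared with the standard normal critical value (about 1.96).
for df in (5, 10, 24, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df)       # upper 2.5% point of t with df degrees of freedom
    print(f"df = {df:4d}: t critical = {t_crit:.3f}")
print(f"normal:    z critical = {stats.norm.ppf(0.975):.3f}")
```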

There are many t distributions, one for each number of degrees of freedom. Figure 6.3.1 shows a comparison between a t distribution and the normal distribution.

Figure 6.3.1 Normal and t-distribution curves

Hypothesis Testing: One-Sample Case for the Mean

One-Sample Case: If you wish to compare the mean of a sample to a test value (a constant), you use the t distribution. For this test you need the sample mean, M; the sample standard deviation, s; the sample size, n; and the value you want the sample mean compared to. The sample mean is often compared against a hypothesized or known population mean, µ.

Hypotheses: Your null hypothesis, H0 (the test statement), is that the mean equals the test value. Your alternative hypothesis, Ha, is that the mean differs from the test value, E, or from a hypothesized population mean, µ.

H0: µ = E, or µ − E = 0, where E is the test value

As before, the alternative hypothesis can be stated in one of three ways:

Ha: µ ≠ E (non-directional), or
Ha: µ > E (directional), or
Ha: µ < E (directional).

The t Statistic: The t statistic is similar to the z statistic, except that the estimated standard error, s_M, is used in place of σ_M. The t statistic is simply

t = (M − µ) / s_M, where s_M = s / √n, M is the sample mean, and µ is the hypothesized mean or test value.

Just as we used the normal distribution table to find proportions or probabilities associated with the z statistic, we use the t distribution tables in the appendix (Tables A2 and A3) to find proportions or probabilities for the t statistic. Table A2 is a two-tailed t probability distribution table at alpha levels of 0.05 (5%) and 0.01 (1%) for various degrees of freedom. Table A3 is a one-tailed t probability distribution table for various alpha levels and degrees of freedom. When the degrees of freedom are large, the normal probability distribution table can be used.

Let us now examine the hypothesis testing procedure involved in making inferences about a hypothesized population mean from sample estimates. The following three problems illustrate the procedure for the different alternative hypothesis statements.
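The whole calculation (t statistic, p-value, and critical value) can also be scripted instead of read from Tables A2 and A3. The helper below is a sketch written directly from the formulas above and works from summary statistics only; the function name, argument order, and tail labels are my own, not part of the text, and SciPy is assumed to be available:

```python
import math
from scipy import stats

def one_sample_t(mean, sd, n, mu0, alpha=0.05, tail="two"):
    """One-sample t test from summary statistics.
    tail: 'two', 'greater' (Ha: mu > mu0), or 'less' (Ha: mu < mu0)."""
    se = sd / math.sqrt(n)                 # estimated standard error s_M
    t = (mean - mu0) / se                  # t statistic
    df = n - 1
    if tail == "two":
        p = 2 * stats.t.sf(abs(t), df)
        t_crit = stats.t.ppf(1 - alpha / 2, df)
    elif tail == "greater":
        p = stats.t.sf(t, df)
        t_crit = stats.t.ppf(1 - alpha, df)
    else:                                  # tail == "less"
        p = stats.t.cdf(t, df)
        t_crit = -stats.t.ppf(1 - alpha, df)
    return t, df, p, t_crit
```

The three worked problems below follow this same recipe by hand; each is accompanied by a short numerical check.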

Problem statement 1: The output of a manufacturing process is monitored by taking a sample of n = 25 vials to determine the level of an important ingredient. If the amount of the ingredient is not equal to exactly 0.05 gram per liter, the process will be stopped and the tanks purged. Otherwise, the process will continue. The mean from the sample is 0.064 with a standard deviation of 0.017. At a significance level of alpha = 0.01, should the process be stopped and the tanks purged?

Step 1: State the hypotheses
H0: µ = 0.05 and Ha: µ ≠ 0.05 (non-directional)

Step 2: Select the significance level
α = 0.01 (1%, two-tailed with 0.5% in each tail)

Step 3: Compute the test statistic
standard error = s_M = s / √n = 0.017 / √25 = 0.0034
t = (M − µ) / s_M = (0.064 − 0.05) / 0.0034 = 4.1176

Step 4: Determine the critical region
The critical region for the t distribution with df = 24 (= n − 1) and α = 0.01 (two-tailed, Table A2) is t ≤ −2.797 or t ≥ 2.797 (i.e., reject H0 if the t statistic falls in either of these regions).

Step 5: Make a decision
We reject the null hypothesis for any of the following reasons:
1. The test statistic t = 4.1176 is within the critical region: t = 4.12 > t_CV = 2.797, or
2. The p-value for the test is less than 0.001 (from any statistics program), which is < 0.01, or
3. The hypothesized µ = 0.05 is outside the sample CI99 of 0.055 to 0.074: CI99 = 0.064 ± 2.797(0.0034) (the 99% interval matches the α = 0.01 test), or
4. The CI99 for the difference does not contain zero (H0: µ − E = 0): CI99 = 0.014 ± 2.797(0.0034) = 0.0045 to 0.0235, where the mean difference = D = 0.064 − 0.05 = 0.014.

The effect size is d = 0.82 (large effect, d ≥ 0.8):
effect size = d = (M − µ) / s = (0.064 − 0.05) / 0.017 = 0.8235

So the manufacturing process has changed, and the tanks should be purged. There is a significant difference between the sample mean and the hypothesized population mean.
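As a quick numerical check of Problem 1 (a sketch assuming SciPy; the values are the ones given in the problem), the two-tailed p-value, critical value, and 99% confidence interval can be recomputed as follows:

```python
import math
from scipy import stats

# Problem 1: n = 25, M = 0.064, s = 0.017, mu0 = 0.05, alpha = 0.01, two-tailed
se = 0.017 / math.sqrt(25)                          # estimated standard error = 0.0034
t = (0.064 - 0.05) / se                             # about 4.12
p = 2 * stats.t.sf(abs(t), 24)                      # two-tailed p-value, well below 0.01
t_crit = stats.t.ppf(1 - 0.01 / 2, 24)              # about 2.797
ci99 = (0.064 - t_crit * se, 0.064 + t_crit * se)   # about (0.055, 0.074)
print(f"t = {t:.4f}, p = {p:.5f}, critical = ±{t_crit:.3f}, CI99 = {ci99}")
```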

Problem statement 2: Test the hypothesis, at a significance level of α = 0.05, that the average enrollment for the math department (across all its classes) has increased from an average enrollment of 160. A recent sample of 25 math classes shows an average enrollment of 170 with a standard deviation of 30.

Step 1: State the hypotheses
H0: µ = 160 and Ha: µ > 160 (directional)

Step 2: Select the significance level
α = 0.05 (5%, upper one-tailed)

Step 3: Compute the test statistic
standard error = s_M = s / √n = 30 / √25 = 6.0
t = (M − µ) / s_M = (170 − 160) / 6 = 1.6667

Step 4: Determine the critical region
The critical region for the t distribution with df = 24 (= n − 1) and α = 0.05 (one-tailed, Table A3) is t ≥ 1.711 (i.e., reject H0 if the t statistic is at or above the critical value of 1.711).

Step 5: Make a decision
We do not reject the null hypothesis for any of the following reasons:
1. The test statistic t = 1.67 is not in the critical region (t = 1.67 < t_CV = 1.71), or
2. The p-value for the test is 0.054 (from any statistics program), which is > 0.05, or
3. The hypothesized µ = 160 is within the sample CI95 of 157.62 to 182.38: CI95 = 170 ± 2.064(6) [t_CV = 2.064: df = 24, α = 0.05, two-tailed]*, or
4. The CI95 for the difference does contain zero (H0: µ − E = 0): CI95 = 10 ± 2.064(6) = −2.38 to 22.38, where the mean difference = D = 170 − 160 = 10.

The effect size is d = 0.33 (medium effect, 0.2 < d < 0.6):
effect size = d = (M − µ) / s = (170 − 160) / 30 = 0.3333

The math department's average enrollment has not increased significantly from 160.

* Note: the two-tailed critical value is used to compute the confidence interval.
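The same check for Problem 2 differs only in using the upper tail (a sketch assuming SciPy; values as given in the problem):

```python
import math
from scipy import stats

# Problem 2: n = 25, M = 170, s = 30, mu0 = 160, alpha = 0.05, upper one-tailed
se = 30 / math.sqrt(25)                    # 6.0
t = (170 - 160) / se                       # about 1.667
p = stats.t.sf(t, 24)                      # upper-tail p-value, about 0.054
t_crit = stats.t.ppf(1 - 0.05, 24)         # about 1.711
print(f"t = {t:.4f}, one-tailed p = {p:.4f}, critical = {t_crit:.3f}")
```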

Problem statement 3: A sample of 8 observations has a mean of 5.8 and a standard deviation of 2.034. Is this mean less than the existing mean of 7.0? Test this hypothesis using a 5% level of significance.

Step 1: State the hypotheses
H0: µ = 7.0 and Ha: µ < 7.0 (directional)

Step 2: Select the significance level
α = 0.05 (5%, lower one-tailed)

Step 3: Compute the test statistic
standard error = s_M = s / √n = 2.034 / √8 = 0.7191
t = (M − µ) / s_M = (5.8 − 7.0) / 0.7191 = −1.6687

Step 4: Determine the critical region
The critical region for the t distribution with df = 7 (= n − 1) and α = 0.05 (one-tailed, Table A3) is t ≤ −1.895 (i.e., reject H0 if the t statistic is at or below the critical value of −1.895).

Step 5: Make a decision
We do not reject the null hypothesis for any of the following reasons:
1. The test statistic t = −1.6687 is not in the critical region (|t| = 1.67 < t_CV = 1.895), or
2. The p-value for the test is 0.07 (from any statistics program), which is > 0.05, or
3. The hypothesized µ = 7.0 is within the sample CI95 of 4.1 to 7.5: CI95 = 5.8 ± 2.3646(0.7191) [t_CV = 2.3646: df = 7, α = 0.05, two-tailed], or
4. The CI95 for the difference does contain zero (H0: µ − E = 0): CI95 = −1.2 ± 2.3646(0.7191) = −2.9 to 0.50, where the mean difference = D = 5.8 − 7.0 = −1.2.

The effect size is d = 0.59 (medium effect, 0.2 < d < 0.6; use the absolute value of the result):
effect size = d = |M − µ| / s = |5.8 − 7.0| / 2.034 = 0.59

The average is not significantly less than 7.0.
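Problem 3 uses the lower tail; the corresponding check (again a sketch assuming SciPy, with the problem's values) also recomputes the effect size:

```python
import math
from scipy import stats

# Problem 3: n = 8, M = 5.8, s = 2.034, mu0 = 7.0, alpha = 0.05, lower one-tailed
se = 2.034 / math.sqrt(8)                  # about 0.7191
t = (5.8 - 7.0) / se                       # about -1.67
p = stats.t.cdf(t, 7)                      # lower-tail p-value, about 0.07
t_crit = -stats.t.ppf(1 - 0.05, 7)         # about -1.895
d = abs(5.8 - 7.0) / 2.034                 # effect size, about 0.59
print(f"t = {t:.4f}, one-tailed p = {p:.4f}, critical = {t_crit:.3f}, d = {d:.2f}")
```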

Acceptance and Rejection Criteria:

Reject the null hypothesis for any of the following reasons:
1. The significance (p-value) is less than alpha (e.g., p < 0.05).
2. The absolute value of the calculated t statistic is greater than the critical t value (table lookup).
3. The confidence interval for the mean difference does not contain zero.

Do not reject (accept) the null hypothesis for any of the following reasons:
1. The significance (p-value) is greater than alpha (e.g., p > 0.05).
2. The absolute value of the calculated t statistic is less than the critical t value (table lookup).
3. The confidence interval for the mean difference contains zero.

The SPSS output for the one-sample t-test provides sufficient information for you to make a decision about your null hypothesis. The following are two examples of the one-sample t-test. The SPSS procedure to calculate the one-sample t-test for a single variable and compare it to a constant value is as follows: Analyze -> Compare Means -> One-Sample T Test -> enter the Test Value, and under Options select a Confidence Interval of 95% (for an alpha, or significance level, of 0.05). Figure 6.3.2 shows the SPSS procedure for computing the one-sample t-test for a mean using sample data against a hypothesized population mean.

Example 1: Is the mean of the pass9th variable 70? Since we wish to compare the mean of a single variable to a constant value of 70, we can use the one-sample t-test with an alpha (α) of 0.05 (5%). When we perform this analysis using SPSS, we get the basic statistics summary in Table 6.3.1.
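As an aside for readers without SPSS, the same one-sample test can be run on raw data in Python; the sketch below uses scipy.stats.ttest_1samp with made-up scores (not the actual pass9th data) purely to show the call, and the SPSS output itself is discussed next.

```python
from scipy import stats

# Illustrative raw scores (not the pass9th data) compared against a test value of 70.
scores = [62, 71, 55, 68, 74, 60, 66, 59, 70, 63]
result = stats.ttest_1samp(scores, popmean=70)
print(f"t = {result.statistic:.3f}, two-tailed p = {result.pvalue:.4f}")
```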

Figure 6.3.2. SPSS one-sample t-test procedure

Table 6.3.1 SPSS One-Sample t-test of pass9th Variable: Basic Statistics

                     N     Mean     Std. Deviation   Std. Error Mean
Passed 9th Grade     94    65.86    13.609           1.404

The null hypothesis is H0: the true mean of the pass9th variable = 70. SPSS calculates the test statistic from the mean difference (M − 70); the t value is negative because the mean difference (65.86 − 70) is negative, so we use its absolute value, t = 2.948, to compare against the critical value from the tables in Appendix A. Table 6.3.2 shows the SPSS results of the one-sample t test with a test value of 70.

Table 6.3.2 SPSS One-Sample t-test of pass9th Variable: t-test Analysis (Test Value = 70)

                     t        df    Sig. (2-tailed)   Mean Difference   95% Confidence Interval of the Difference
                                                                        Lower       Upper
Passed 9th Grade     -2.948   93    .004              -4.138            -6.93       -1.35

For this non-directional test the critical value is t_CV = 1.9858 (df = 93, α = 0.05, two-tailed); this is also the critical value used to compute the 95% confidence interval. Since the t statistic from Table 6.3.2 is 2.948 in absolute value (−2.948 becomes 2.948), which is greater than t_CV = 1.9858, we reject the null hypothesis in favor of the alternative hypothesis that there is a significant difference between the test value of 70 and the sample mean of the pass9th variable. We could also say that the mean of the pass9th variable, 65.86, is significantly less than 70 (for that directional test the one-tailed critical value is 1.661; Table A3: df = 90+, α = 0.05).

We reject the null hypothesis for any one of the following reasons:
1. The significance (two-tailed p-value) of 0.004 < 0.05.
2. The absolute value of the t statistic, 2.948, is greater than the critical t value of 1.9858.
3. The confidence interval of the mean difference (−6.93 to −1.35) does not contain zero: CI of the difference = −4.138 ± 1.9858(1.404).
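The values in Tables 6.3.1 and 6.3.2 can be reproduced from the summary statistics alone. The sketch below (assuming SciPy is available) recomputes the t statistic, the two-tailed p-value, and the 95% confidence interval of the mean difference:

```python
import math
from scipy import stats

# Reproduce the Table 6.3.2 values from the Table 6.3.1 summary statistics.
n, mean, sd, test_value = 94, 65.86, 13.609, 70
se = sd / math.sqrt(n)                        # about 1.404
t = (mean - test_value) / se                  # about -2.948
p = 2 * stats.t.sf(abs(t), n - 1)             # two-tailed p-value, about 0.004
t_crit = stats.t.ppf(0.975, n - 1)            # about 1.9858
diff = mean - test_value                      # mean difference, about -4.14
lower, upper = diff - t_crit * se, diff + t_crit * se
print(f"t = {t:.3f}, p = {p:.4f}, 95% CI of the difference: {lower:.2f} to {upper:.2f}")
```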