Chapter 2. Hypothesis testing in one population




Contents
- Introduction; the null and alternative hypotheses
- The hypothesis testing process
- Type I and Type II errors; power
- Test statistic, level of significance and rejection/acceptance regions in upper-, lower- and two-tail tests
- Test of hypothesis: procedure
- p-value
- Two-tail tests and confidence intervals
- Examples with various parameters
- Power and sample size calculations

Learning goals
At the end of this chapter you should be able to:
- Perform a test of hypothesis in a one-population setting
- Formulate the null and alternative hypotheses
- Understand Type I and Type II errors; define the significance level and the power
- Choose a suitable test statistic and identify the corresponding rejection region in upper-, lower- and two-tail tests
- Use the p-value to perform a test
- Know the connection between a two-tail test and a confidence interval
- Calculate the power of a test and identify the sample size needed to achieve a desired power

References
- Newbold, P., Statistics for Business and Economics, Chapter 9 (9.1-9.5)
- Ross, S., Introduction to Statistics, Chapter 9

Test of hypothesis: introduction
A test of hypothesis is a procedure that:
- is based on a data sample, and
- allows us to make a decision about the validity of some conjecture, or hypothesis, about the population X, typically the value of a population parameter θ (θ can be any of the parameters we have covered so far: µ, p, σ², etc.)
This hypothesis, called the null hypothesis (H0):
- Can be thought of as the hypothesis being supported before the test is carried out
- Will be believed unless sufficient contrary sample evidence is produced
- Is put in jeopardy, or tested, once sample information is collected

The null hypothesis: examples
1. A manufacturer who produces boxes of cereal claims that, on average, their contents weigh at least 20 ounces. To check this claim, the contents of a random sample of boxes are weighed and inference is made.
Population: X = weight of a box of cereal (in oz)
Null hypothesis, H0: µ ≥ 20 (here µ0 = 20)
Does the sample data produce evidence against H0?
2. A company receiving a large shipment of parts accepts the delivery only if no more than 50% of the parts are defective. The decision is based on a check of a random sample of these parts.
Population: X = 1 if a part is defective and 0 otherwise; X ~ Bernoulli(p), p = proportion of defective parts in the entire shipment
Null hypothesis, H0: p ≤ 0.5 (here p0 = 0.5)
Does the sample data produce evidence against H0?

Null hypothesis, H0
- States the assumption to be tested
- We begin by assuming that the null hypothesis is true (similar to the notion of innocent until proven guilty)
- Refers to the status quo
- Always contains an '=', '≤' or '≥' sign (a closed set)
- May or may not be rejected
Simple hypothesis (specifies a single value): H0: µ = 5, H0: p = 0.6, H0: σ² = 9. In general: H0: θ = θ0. Parameter space under this null: Θ0 = {θ0}
Composite hypothesis (specifies a range of values): H0: µ ≤ 5, H0: p ≥ 0.6. In general: H0: θ ≤ θ0 or H0: θ ≥ θ0. Parameter space under this null: Θ0 = (−∞, θ0] or Θ0 = [θ0, ∞)

Alternative hypothesis, H1
If the null hypothesis is not true, then some alternative must be true, and in carrying out a hypothesis test the investigator formulates an alternative hypothesis against which the null hypothesis is tested. The alternative hypothesis H1:
- Is the opposite of the null hypothesis
- Challenges the status quo
- Never contains an '=', '≤' or '≥' sign
- May or may not be supported
- Is generally the hypothesis that the researcher is trying to support
One-sided hypothesis: (upper-tail) H1: µ > 5, (lower-tail) H1: p < 0.6. In general: H1: θ > θ0 or H1: θ < θ0. Parameter space under this alternative: Θ1 = (θ0, ∞) or Θ1 = (−∞, θ0)
Two-sided hypothesis (two-tail): H1: σ² ≠ 9. In general: H1: θ ≠ θ0. Parameter space under this alternative: Θ1 = (−∞, θ0) ∪ (θ0, ∞)

The alternative hypothesis: examples
1. A manufacturer who produces boxes of cereal claims that, on average, their contents weigh at least 20 ounces. To check this claim, the contents of a random sample of boxes are weighed and inference is made.
Population: X = weight of a box of cereal (in oz)
Null hypothesis, H0: µ ≥ 20, versus alternative hypothesis, H1: µ < 20
Does the sample data produce evidence against H0 in favour of H1?
2. A company receiving a large shipment of parts accepts the delivery only if no more than 50% of the parts are defective. The decision is based on a check of a random sample of these parts.
Population: X = 1 if a part is defective and 0 otherwise; X ~ Bernoulli(p), p = proportion of defective parts in the entire shipment
Null hypothesis, H0: p ≤ 0.5, versus alternative hypothesis, H1: p > 0.5
Does the sample data produce evidence against H0 in favour of H1?

Hypothesis testing process
Population: X = height of a UC3M student (in m)
Claim: on average, students are shorter than 1.6 m
Hypotheses: H0: µ ≤ 1.6 versus H1: µ > 1.6
Sample: suppose the sample mean height is 1.65 m, that is, x̄ = 1.65
Is it likely to observe a sample mean x̄ = 1.65 if the population mean is µ ≤ 1.6? If it is not likely, reject the null hypothesis in favour of the alternative.

Hypothesis testing process
Having specified the null and alternative hypotheses and collected the sample information, a decision concerning the null hypothesis (reject or fail to reject H0) must be made. The decision rule is based on the value of a distance between the sample data we have collected and the values that would have a high probability under the null hypothesis. This distance is calculated as the value of a so-called test statistic (closely related to the pivotal quantities we talked about in Chapter 1). We will discuss specific cases later on. However, whatever decision is made, there is some chance of reaching an erroneous conclusion about the population parameter, because all that we have available is a sample, and thus we cannot know for sure whether the null hypothesis is true or not. There are two possible states of nature, and thus two errors can be committed: Type I and Type II errors.

Type I and Type II errors, power
Type I error: rejecting a true null hypothesis. A Type I error is considered a serious type of error. The probability of a Type I error is α, called the significance level: α = P(reject H0 | H0 is true)
Type II error: failing to reject a false null hypothesis. The probability of a Type II error is β: β = P(fail to reject H0 | H1 is true)
Power: the probability of rejecting a null hypothesis that is false: power = 1 − β = P(reject H0 | H1 is true)

                       Actual situation
Decision               H0 true              H0 false
Do not reject H0       No error (1 − α)     Type II error (β)
Reject H0              Type I error (α)     No error (1 − β = power)

Type I and Type II errors, power
- Type I and Type II errors cannot happen at the same time: a Type I error can occur only if H0 is true, and a Type II error can occur only if H0 is false
- If the Type I error probability α increases, then the Type II error probability β decreases
All else being equal:
- β increases when the difference between the hypothesized parameter value and its true value decreases
- β increases when α decreases
- β increases when σ increases
- β increases when n decreases; that is, the power of the test increases as the sample size increases
For θ ∈ Θ1: power(θ) = 1 − β(θ). For θ ∈ Θ0: power(θ) ≤ α.

Test statistic, level of significance and rejection region
Test statistic, T:
- Allows us to decide whether the sample data are likely or unlikely to occur, assuming the null hypothesis is true. It is the pivotal quantity from Chapter 1, calculated under the null hypothesis.
- The decision in the test of hypothesis is based on the observed value of the test statistic, t.
The idea is that, if the data provide evidence against the null hypothesis, the observed test statistic should be extreme, that is, very unusual; it should be typical otherwise. To distinguish between extreme and typical we use:
- the sampling distribution of the test statistic, and
- the significance level α,
to define the so-called rejection (or critical) region and the acceptance region.

Test statistic, level of significance and rejection region
Rejection region (RR) and acceptance region (AR) in size-α tests, where T_α denotes the upper α quantile of the test statistic's null distribution:
- Upper-tail test, H1: θ > θ0: RR_α = {t : t > T_α}, AR_α = {t : t ≤ T_α} (rejection region to the right of the critical value)
- Lower-tail test, H1: θ < θ0: RR_α = {t : t < T_{1−α}}, AR_α = {t : t ≥ T_{1−α}} (rejection region to the left of the critical value)
- Two-tail test, H1: θ ≠ θ0: RR_α = {t : t < T_{1−α/2} or t > T_{α/2}}, AR_α = {t : T_{1−α/2} ≤ t ≤ T_{α/2}} (rejection region split between the two tails, with area α/2 in each)

Test statistics
Let X1, ..., Xn be a s.r.s. from a population X with mean µ and variance σ²; let α be a significance level, z_α the upper α quantile of N(0,1), µ0 the population mean under H0, and so on.

- Mean, normal data, known variance: Z = (X̄ − µ0)/(σ/√n) ~ N(0,1); RR_α in a two-tail test: {z : z < z_{1−α/2} or z > z_{α/2}}
- Mean, non-normal data, large sample: Z = (X̄ − µ0)/(σ̂/√n) ≈ N(0,1); RR_α in a two-tail test: {z : z < z_{1−α/2} or z > z_{α/2}}
- Proportion, Bernoulli data, large sample: Z = (p̂ − p0)/√(p0(1 − p0)/n) ≈ N(0,1); RR_α in a two-tail test: {z : z < z_{1−α/2} or z > z_{α/2}}
- Mean, normal data, unknown variance: T = (X̄ − µ0)/(s/√n) ~ t_{n−1}; RR_α in a two-tail test: {t : t < t_{n−1;1−α/2} or t > t_{n−1;α/2}}
- Variance, normal data: χ² = (n − 1)s²/σ0² ~ χ²_{n−1}; RR_α in a two-tail test: {χ² : χ² < χ²_{n−1;1−α/2} or χ² > χ²_{n−1;α/2}}
- St. dev., normal data: χ² = (n − 1)s²/σ0² ~ χ²_{n−1}; RR_α in a two-tail test: {χ² : χ² < χ²_{n−1;1−α/2} or χ² > χ²_{n−1;α/2}}

(Note that z_{1−α/2} = −z_{α/2}, and similarly for the t quantiles.)
Question: how would you define RR_α in upper- and lower-tail tests?

Test of hypothesis: procedure
1. State the null and alternative hypotheses.
2. Calculate the observed value of the test statistic (see the formula sheet).
3. For a given significance level α, define the rejection region RR_α. Reject the null hypothesis H0 if the test statistic falls in RR_α, and fail to reject H0 otherwise.
4. Write down the conclusions in a sentence.

Upper-tail test for the mean, variance known: example
Example: 9.1 (Newbold) When a process producing ball bearings is operating correctly, the weights of the ball bearings have a normal distribution with mean 5 ounces and standard deviation 0.1 ounces. The process has been adjusted and the plant manager suspects that this has raised the mean weight of the ball bearings, while leaving the standard deviation unchanged. A random sample of sixteen bearings is selected and their mean weight is found to be 5.038 ounces. Is the manager right? Carry out a suitable test at a 5% level of significance.
Population: X = weight of a ball bearing (in oz), X ~ N(µ, σ² = 0.1²)
SRS: n = 16. Sample: x̄ = 5.038
Objective: test H0: µ = 5 against H1: µ > 5 (upper-tail test), with µ0 = 5
Test statistic: Z = (X̄ − µ0)/(σ/√n) ~ N(0,1)
Observed test statistic: with σ = 0.1, µ0 = 5, n = 16 and x̄ = 5.038,
z = (x̄ − µ0)/(σ/√n) = (5.038 − 5)/(0.1/√16) = 1.52

Upper-tail test for the mean, variance known: example
Example: 9.1 (cont.) Rejection (or critical) region: RR_0.05 = {z : z > z_0.05} = {z : z > 1.645}
Since z = 1.52 ∉ RR_0.05, we fail to reject H0 at the 5% significance level.
(The figure shows the N(0,1) density, with the critical value 1.645 separating the acceptance region from the rejection region and z = 1.52 falling inside the acceptance region.)
Conclusion: the sample data did not provide sufficient evidence to reject the claim that the average weight of the bearings is 5 oz.
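
The arithmetic of Example 9.1 can be checked with a short script; this is a sketch using only Python's standard library (statistics.NormalDist), and the variable names are my own:

```python
from math import sqrt
from statistics import NormalDist

# Example 9.1: upper-tail z-test, H0: mu = 5 vs H1: mu > 5
mu0, sigma, n, xbar, alpha = 5.0, 0.1, 16, 5.038, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))      # observed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper alpha quantile, z_0.05

print(round(z, 2))                        # 1.52
print(round(z_crit, 3))                   # 1.645
print("reject H0" if z > z_crit else "fail to reject H0")
```

As on the slide, z = 1.52 does not exceed the critical value 1.645, so H0 is not rejected.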

Definition of p-value
- It is the probability of obtaining a test statistic at least as extreme (as small or as large) as the one observed, given that H0 is true
- Also called the observed level of significance
- It is the smallest value of α for which H0 can be rejected
- It can be used in step 3 of the testing procedure with the following rule: if p-value < α, reject H0; if p-value ≥ α, fail to reject H0
Roughly: a small p-value is evidence against H0; a large p-value is evidence in favour of H0

p-value
p-value when t is the observed value of the test statistic T:
- Upper-tail test, H1: θ > θ0: p-value = P(T ≥ t) (the area to the right of the test statistic)
- Lower-tail test, H1: θ < θ0: p-value = P(T ≤ t) (the area to the left of the test statistic)
- Two-tail test, H1: θ ≠ θ0: p-value = P(T ≤ −|t|) + P(T ≥ |t|) (the left and right tail areas combined)

p-value: example
Example: 9.1 (cont.)
Population: X = weight of a ball bearing (in oz), X ~ N(µ, σ² = 0.1²)
SRS: n = 16. Sample: x̄ = 5.038
Objective: test H0: µ = 5 against H1: µ > 5 (upper-tail test)
Test statistic: Z = (X̄ − µ0)/(σ/√n) ~ N(0,1). Observed test statistic: z = 1.52
p-value = P(Z ≥ z) = P(Z ≥ 1.52) = 0.0643, where Z ~ N(0,1) (the area under the N(0,1) density to the right of z = 1.52)
Since p-value = 0.0643 ≥ α = 0.05, we fail to reject H0 (but we would reject at any α greater than 0.0643, e.g. α = 0.1).
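
The p-value above is a one-line computation with the standard normal cdf; a stdlib-only sketch:

```python
from statistics import NormalDist

# Example 9.1 continued: p-value of the upper-tail z-test
z = 1.52                          # observed test statistic
p_value = 1 - NormalDist().cdf(z) # P(Z >= 1.52) for Z ~ N(0,1)

print(round(p_value, 4))          # 0.0643
```

Since 0.0643 ≥ 0.05, the decision agrees with the critical-value approach.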

The p-value and the probability of the null hypothesis
The p-value:
- is not the probability of H0, nor the Type I error probability α;
- but it can be used as a test statistic to be compared with α (i.e. reject H0 if p-value < α).
We are interested in answering: how probable is the null given the data? Remember that we defined the p-value as the probability of the data (or values even more extreme) given the null, so we cannot answer exactly. But under fairly general conditions, and assuming that before observing any data Pr(H0) = Pr(H1) = 1/2, then for p-values p such that p < 0.36:

Pr(H0 | Observed Data) ≥ −e·p·ln(p) / (1 − e·p·ln(p))

(Sellke, Bayarri and Berger, The American Statistician, 2001)

The p-value and the probability of the null hypothesis
This table helps to calibrate a desired p-value as a function of the probability of the null hypothesis:

p-value    Pr(H0 | Observed Data)
0.1        0.39
0.05       0.29
0.01       0.11
0.001      0.02
0.00860    0.1
0.00341    0.05
0.00004    0.01
0.00001    0.001

For a p-value equal to 0.05, the null has a probability of at least 29% of being true, while if we want the probability of the null being true to be at most 5%, the p-value should be no larger than 0.0034.
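
Entries of this table can be reproduced from the bound above; a sketch assuming Pr(H0) = Pr(H1) = 1/2 and p < 1/e (the helper name prob_h0_lower_bound is mine):

```python
from math import e, log

def prob_h0_lower_bound(p):
    """Sellke-Bayarri-Berger lower bound on Pr(H0 | data),
    assuming Pr(H0) = Pr(H1) = 1/2 and valid for p < 1/e."""
    b = -e * p * log(p)   # bound on the Bayes factor in favour of H0
    return b / (1 + b)

print(round(prob_h0_lower_bound(0.05), 2))  # 0.29
print(round(prob_h0_lower_bound(0.01), 2))  # 0.11
```

For example, a p-value of 0.05 still leaves the null with at least a 29% chance of being true under these assumptions.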

Confidence intervals and two-tail tests: duality
A two-tail test of hypothesis at a significance level α can be carried out using a (two-tail) 100(1 − α)% confidence interval in the following way:
1. State the null and two-sided alternative: H0: θ = θ0 against H1: θ ≠ θ0
2. Find a 100(1 − α)% confidence interval for θ
3. If θ0 does not belong to this interval, reject the null. If θ0 belongs to this interval, fail to reject the null.
4. Write down the conclusions in a sentence.

Two-tail test for the mean, variance known: example
Example: 9.2 (Newbold) A drill is used to make holes in sheet metal. When the drill is functioning properly, the diameters of these holes have a normal distribution with mean 2 in and standard deviation 0.06 in. To check that the drill is functioning properly, the diameters of a random sample of nine holes are measured. Their mean diameter was 1.95 in. Perform a two-tail test at a 5% significance level using the CI approach.
Population: X = diameter of a hole (in inches), X ~ N(µ, σ² = 0.06²)
SRS: n = 9. Sample: x̄ = 1.95
Objective: test H0: µ = 2 against H1: µ ≠ 2 (two-tail test), with µ0 = 2
100(1 − α)% = 95% confidence interval for µ:
CI_0.95(µ) = (x̄ − 1.96·σ/√n, x̄ + 1.96·σ/√n) = (1.95 − 1.96·0.06/√9, 1.95 + 1.96·0.06/√9) = (1.9108, 1.9892)
Since µ0 = 2 ∉ CI_0.95(µ), we reject H0 at the 5% significance level.
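
The CI-based test of Example 9.2 can be sketched with the standard library (variable names are my own):

```python
from math import sqrt
from statistics import NormalDist

# Example 9.2: two-tail test via a 95% confidence interval
mu0, sigma, n, xbar, alpha = 2.0, 0.06, 9, 1.95, 0.05

z_half = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile, ~1.96
half_width = z_half * sigma / sqrt(n)
ci = (xbar - half_width, xbar + half_width)

print(tuple(round(v, 4) for v in ci))         # (1.9108, 1.9892)
print("reject H0" if not (ci[0] <= mu0 <= ci[1]) else "fail to reject H0")
```

The hypothesized value 2 falls outside the interval, so H0 is rejected, matching the slide.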

Two-tail test for the proportion: example
Example: 9.6 (Newbold) In a random sample of 199 audit partners in U.S. accounting firms, 104 partners indicated some measure of agreement with the statement: "Cash flow from operations is a valid measure of profitability." Test at the 10% level, against a two-sided alternative, the null hypothesis that one-half of the members of this population would agree with the preceding statement.
Population: X = 1 if a member agrees with the statement and 0 otherwise, X ~ Bernoulli(p)
SRS: n = 199 (large n). Sample: p̂ = 104/199 = 0.523
Objective: test H0: p = 0.5 against H1: p ≠ 0.5 (two-tail test), with p0 = 0.5
Test statistic: Z = (p̂ − p0)/√(p0(1 − p0)/n) ≈ N(0,1)
Observed test statistic: with p0 = 0.5, n = 199 and p̂ = 0.523,
z = (p̂ − p0)/√(p0(1 − p0)/n) = (0.523 − 0.5)/√(0.5(1 − 0.5)/199) = 0.65

Two-tail test for the proportion: example
Example: 9.6 (cont.) Rejection (or critical) region:
RR_0.10 = {z : z > z_0.05} ∪ {z : z < −z_0.05} = {z : z > 1.645} ∪ {z : z < −1.645}
Since z = 0.65 ∉ RR_0.10, we fail to reject H0 at the 10% significance level.
(The figure shows the N(0,1) density with rejection regions beyond ±1.645 and z = 0.65 in the acceptance region.)
Conclusion: the sample data do not contain sufficiently strong evidence against the hypothesis that one-half of all audit partners agree that cash flow from operations is a valid measure of profitability.
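
Example 9.6 can be checked numerically; in this sketch p̂ is rounded to three decimals, matching the slide's arithmetic (with the exact fraction 104/199 the statistic comes out slightly smaller, with the same conclusion):

```python
from math import sqrt
from statistics import NormalDist

# Example 9.6: two-tail test for a proportion, H0: p = 0.5 vs H1: p != 0.5
p0, n, successes, alpha = 0.5, 199, 104, 0.10

p_hat = round(successes / n, 3)               # 0.523, rounded as on the slide
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # observed test statistic
z_half = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile, ~1.645

print(round(z, 2))                            # 0.65
print("reject H0" if abs(z) > z_half else "fail to reject H0")
```

|z| = 0.65 is well inside (−1.645, 1.645), so the test fails to reject H0.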

Lower-tail test for the mean, variance unknown: example
Example: 9.4 (Newbold, modified) A retail chain knows that, on average, sales in its stores are 20% higher in December than in November. For a random sample of six stores the percentage sales increases were found to be: 19.2, 18.4, 19.8, 20.2, 20.4, 19.0. Assuming a normal population, test at a 10% significance level (using a p-value approach) the null hypothesis that the true mean percentage sales increase is at least 20, against a one-sided alternative.
Population: X = a store's increase in sales from November to December (in %), X ~ N(µ, σ²), σ² unknown
SRS: n = 6 (small n). Sample: x̄ = 117/6 = 19.5, s² = (2284.44 − 6·19.5²)/(6 − 1) = 0.588
Objective: test H0: µ ≥ 20 against H1: µ < 20 (lower-tail test), with µ0 = 20
Test statistic: T = (X̄ − µ0)/(s/√n) ~ t_{n−1}
Observed test statistic: with µ0 = 20, n = 6, x̄ = 19.5 and s = √0.588 = 0.767,
t = (x̄ − µ0)/(s/√n) = (19.5 − 20)/(0.767/√6) = −1.597

Lower-tail test for the mean, variance unknown: example
Example: 9.4 (cont.)
p-value = P(T ≤ −1.597) ∈ (0.05, 0.1), because −t_{5;0.05} = −2.015 < −1.597 < −1.476 = −t_{5;0.10}
Hence, given that p-value < α = 0.1, we reject the null hypothesis at this level.
(The figure shows the t_{n−1} density with the p-value as the area to the left of t = −1.597, between the quantiles −2.015 and −1.476.)
Conclusion: the sample data gave enough evidence to reject the claim that the average increase in sales was at least 20%.
p-value interpretation: if the null hypothesis were true, the probability of obtaining sample data at least this extreme would be less than 10%, which is quite unlikely, so we reject the null hypothesis.
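
The sample statistics and the observed t value of Example 9.4 can be reproduced with the standard library; bracketing the p-value still relies on the t table quoted above:

```python
from math import sqrt
from statistics import mean, stdev

# Example 9.4: lower-tail t-test, H0: mu >= 20 vs H1: mu < 20
data = [19.2, 18.4, 19.8, 20.2, 20.4, 19.0]  # % sales increases
mu0 = 20.0

n = len(data)
xbar = mean(data)                 # 19.5
s = stdev(data)                   # sample (quasi) standard deviation, ~0.767
t = (xbar - mu0) / (s / sqrt(n))  # observed test statistic

print(round(xbar, 1))             # 19.5
print(round(t, 3))                # -1.597
# From the t_5 table: -2.015 < t < -1.476, so 0.05 < p-value < 0.10
```

Since the bracketed p-value is below α = 0.1, the test rejects H0, as on the slide.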

Lower-tail test for the mean, variance unknown: example
Example: 9.4 (cont.) in Excel: go to the menu Data, submenu Data Analysis, and choose the function "two-sample t-test with unequal variances". Column A holds the data, Column B holds n repetitions of µ0 = 20, and the highlighted (yellow) cells give the observed t statistic, the p-value and t_{n−1;α}.

Upper-tail test for the variance: example
Example: 9.5 (Newbold) In order to meet the standards in consignments of a chemical product, it is important that the variance of their percentage impurity levels does not exceed 4. A random sample of twenty consignments had a sample quasi-variance of 5.62 for impurity level percentages.
a) Perform a suitable test of hypothesis (α = 0.1).
b) Find the power of the test. What is the power at σ1² = 7?
c) What sample size would guarantee a power of 0.9 at σ1² = 7?
Population: X = impurity level of a consignment of the chemical (in %), X ~ N(µ, σ²)
SRS: n = 20. Sample: s² = 5.62
Objective: test H0: σ² ≤ 4 against H1: σ² > 4 (upper-tail test), with σ0² = 4
Test statistic: χ² = (n − 1)s²/σ0² ~ χ²_{n−1}
Observed test statistic: with σ0² = 4, n = 20 and s² = 5.62,
χ² = (n − 1)s²/σ0² = (20 − 1)·5.62/4 = 26.695

Upper-tail test for the variance: example
Example: 9.5 a) (cont.)
p-value = P(χ² ≥ 26.695) ∈ (0.1, 0.25), because χ²_{19;0.25} = 22.7 < 26.695 < 27.2 = χ²_{19;0.1}
Hence, given that the p-value exceeds α = 0.1, we cannot reject the null hypothesis at this level.
(The figure shows the χ²_{n−1} density with the p-value as the area to the right of χ² = 26.695, between the quantiles 22.7 and 27.2.)
Conclusion: the sample data did not provide enough evidence to reject the claim that the variance of the percentage impurity levels in consignments of this chemical is at most 4.
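
The observed chi-square statistic of Example 9.5 a) is a one-line computation; bracketing its p-value still uses the table values quoted above:

```python
# Example 9.5 a): upper-tail test for the variance, H0: sigma^2 <= 4
sigma2_0, n, s2 = 4.0, 20, 5.62

chi2 = (n - 1) * s2 / sigma2_0  # observed test statistic

print(round(chi2, 3))           # 26.695
# From the chi-square table (19 d.f.): 22.7 < 26.695 < 27.2,
# so 0.1 < p-value < 0.25 and H0 is not rejected at alpha = 0.1
```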

Upper-tail test for the variance: power
Example: 9.5 b) Recall that power = P(reject H0 | H1 is true). When do we reject H0?
RR_0.1 = {(n − 1)s²/σ0² > χ²_{n−1;0.1}} = {(n − 1)s² > 27.2 · 4 = 108.8}
Hence the power is:
power(σ1²) = P(reject H0 | σ² = σ1²) = P((n − 1)s² > 108.8 | σ² = σ1²)
= P((n − 1)s²/σ1² > 108.8/σ1²) = P(χ² > 108.8/σ1²) = 1 − F_{χ²}(108.8/σ1²),
where F_{χ²} is the cdf of χ²_{n−1}. Hence, power(7) = P(χ² > 108.8/7) = 0.6874.
(The figure plots power(σ²) = 1 − β(σ²) against σ², rising from about α on Θ0 towards 1 on Θ1, with σ0² at the boundary.)

Upper-tail test for the variance: sample size calculations
Example: 9.5 c) From our previous calculations, we know that
power(σ1²) = P((n − 1)s²/σ1² > χ²_{n−1;0.1} · σ0²/σ1²), where (n − 1)s²/σ1² ~ χ²_{n−1}
Our objective is to find the smallest n such that:
power(7) = P((n − 1)s²/σ1² > χ²_{n−1;0.1} · 4/7) ≥ 0.9, where 4/7 = 0.571
This condition says we need a χ²_{n−1} distribution whose upper 0.9 quantile satisfies χ²_{n−1;0.9} ≥ 0.571·χ²_{n−1;0.1}. From the chi-square table, χ²_{43;0.9}/χ²_{43;0.1} = 0.573 > 0.571, so n − 1 = 43.
Thus, if we collect 44 observations, we should be able to detect the alternative value σ1² = 7 with at least a 90% chance.
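
The power value quoted in part b) can be verified numerically. The standard library has no chi-square cdf, so this sketch builds one from the series expansion of the regularized lower incomplete gamma function; the helper name chi2_cdf is mine:

```python
from math import exp, lgamma, log

def chi2_cdf(x, k):
    """P(chi^2_k <= x), via the series for the regularized lower
    incomplete gamma function P(k/2, x/2)."""
    s, z = k / 2.0, x / 2.0
    if z <= 0:
        return 0.0
    term = total = 1.0 / s
    m = 0
    while term > 1e-15 * total:   # sum until terms are negligible
        m += 1
        term *= z / (s + m)
        total += term
    return exp(s * log(z) - z - lgamma(s)) * total

# Example 9.5 b): power of the size-0.1 test at sigma1^2 = 7, where
# RR = {(n-1)s^2 > 108.8} and (n-1)s^2 / sigma1^2 ~ chi^2_19
power_at_7 = 1 - chi2_cdf(108.8 / 7, 19)

print(round(power_at_7, 4))       # ~0.687, matching the slide's 0.6874
```

The series converges quickly for the arguments used here; a full implementation would switch to a continued fraction for large x/k.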

Another power example: lower-tail test for the mean, normal population, known σ²
H0: µ ≥ µ0 versus H1: µ < µ0 at α = 0.05. Say that µ0 = 5, n = 16, σ = 0.1.
We reject H0 if (x̄ − µ0)/(σ/√n) < −z_α = −1.645, that is, when x̄ ≤ 4.96. Hence
power(µ1) = P(Z < (4.96 − µ1)/(0.1/√16))
(The figure plots power(µ) = 1 − β(µ) against µ over Θ1 and Θ0, with power equal to α at µ0 = 5; a second panel compares the power curves for n = 16, 9 and 4.)
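
The power function above can be sketched with the standard library (names are my own); note that at the boundary µ1 = µ0 the power equals α, as it should:

```python
from math import sqrt
from statistics import NormalDist

# Lower-tail z-test: H0: mu >= 5 vs H1: mu < 5, alpha = 0.05,
# sigma = 0.1, n = 16; reject when xbar < 5 - 1.645 * 0.1/4 ~ 4.96
mu0, sigma, n, alpha = 5.0, 0.1, 16, 0.05

z_alpha = NormalDist().inv_cdf(1 - alpha)   # 1.645
xbar_cut = mu0 - z_alpha * sigma / sqrt(n)  # rejection cutoff, ~4.959

def power(mu1):
    """P(reject H0) when the true mean is mu1."""
    return NormalDist().cdf((xbar_cut - mu1) / (sigma / sqrt(n)))

print(round(power(5.0), 2))  # 0.05 (= alpha at the boundary mu1 = mu0)
print(power(4.9))            # close to 1 far below mu0
```

Evaluating power over a grid of µ values reproduces the power curve described above.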

Another power example: lower-tail test for the mean, normal population, known σ²
Note that the power function (power = 1 − P(Type II error)) has the following features (everything else being equal):
- The farther the true mean µ1 is from the hypothesized µ0, the greater the power
- The smaller α is, the smaller the power; that is, reducing the probability of a Type I error increases the probability of a Type II error
- The larger the population variance, the lower the power (we are less likely to detect small departures from µ0 when there is greater variability in the population)
- The larger the sample size, the greater the power of the test (the more information from the population, the greater the chance of detecting any departure from the null hypothesis)