Chapter 5: Hypothesis Testing and Statistical Inference. Hypothesis Testing. Hypothesis Testing Procedure 8/14/2007

2007 Pearson Education

Hypothesis Testing
Hypothesis testing involves drawing inferences about two contrasting propositions (hypotheses) relating to the value of a population parameter, one of which is assumed to be true in the absence of contradictory data.
We seek evidence to determine if the hypothesis can be rejected; if not, we can only assume it to be true but have not statistically proven it true.

Hypothesis Testing Procedure
1. Formulate the hypothesis.
2. Select a level of significance, which defines the risk of drawing an incorrect conclusion that a true hypothesis is false.
3. Determine a decision rule.
4. Collect data and calculate a test statistic.
5. Apply the decision rule and draw a conclusion.

Hypothesis Formulation
Null hypothesis, $H_0$: a statement that is accepted as correct.
Alternative hypothesis, $H_1$: a proposition that must be true if $H_0$ is false.
Formulating the correct set of hypotheses depends on the burden of proof: what you wish to prove statistically should be $H_1$.
Tests involving a single population parameter are called one-sample tests; tests involving two populations are called two-sample tests.

Types of Hypothesis Tests
One Sample Tests
$H_0$: population parameter $\ge$ constant vs. $H_1$: population parameter $<$ constant
$H_0$: population parameter $\le$ constant vs. $H_1$: population parameter $>$ constant
$H_0$: population parameter $=$ constant vs. $H_1$: population parameter $\ne$ constant
Two Sample Tests
$H_0$: population parameter (1) - population parameter (2) $\ge 0$ vs. $H_1$: population parameter (1) - population parameter (2) $< 0$
$H_0$: population parameter (1) - population parameter (2) $\le 0$ vs. $H_1$: population parameter (1) - population parameter (2) $> 0$
$H_0$: population parameter (1) - population parameter (2) $= 0$ vs. $H_1$: population parameter (1) - population parameter (2) $\ne 0$

Four Outcomes
1. The null hypothesis is actually true, and the test correctly fails to reject it.
2. The null hypothesis is actually false, and the hypothesis test correctly reaches this conclusion.
3. The null hypothesis is actually true, but the hypothesis test incorrectly rejects it (Type I error).
4. The null hypothesis is actually false, but the hypothesis test incorrectly fails to reject it (Type II error).

Quantifying Outcomes
Probability of Type I error (rejecting $H_0$ when it is true) = $\alpha$ = level of significance
Probability of correctly failing to reject $H_0$ = $1 - \alpha$ = confidence coefficient
Probability of Type II error (failing to reject $H_0$ when it is false) = $\beta$
Probability of correctly rejecting $H_0$ when it is false = $1 - \beta$ = power of the test

Decision Rules
Compute a test statistic from sample data and compare it to the hypothesized sampling distribution of the test statistic.
Divide the sampling distribution into a rejection region and a non-rejection region.
If the test statistic falls in the rejection region, reject $H_0$ (concluding that $H_1$ is true); otherwise, fail to reject $H_0$.

[Figure: Rejection Regions]

Hypothesis Tests and Spreadsheet Support
Type of Test | Excel/PHStat Procedure
One sample test for mean, $\sigma$ known | PHStat: One Sample Test - Z-test for the Mean, Sigma Known
One sample test for mean, $\sigma$ unknown | PHStat: One Sample Test - t-test for the Mean, Sigma Unknown
One sample test for proportion | PHStat: One Sample Test - Z-test for the Proportion
Two sample test for means, $\sigma$ known | Excel z-test: Two-Sample for Means; PHStat: Two Sample Tests - Z-Test for Differences in Two Means
Two sample test for means, $\sigma$ unknown, unequal | Excel t-test: Two-Sample Assuming Unequal Variances
Two sample test for means, $\sigma$ unknown, assumed equal | Excel t-test: Two-Sample Assuming Equal Variances; PHStat: Two Sample Tests - t-test for Differences in Two Means
Paired two sample test for means | Excel t-test: Paired Two-Sample for Means
Two sample test for proportions | PHStat: Two Sample Tests - Z-Test for Differences in Two Proportions
Equality of variances | Excel F-test Two-Sample for Variances; PHStat: Two Sample Tests - F-Test for Differences in Two Variances

One Sample Tests for Means, Standard Deviation Unknown
Example hypothesis $H_0: \mu \ge \mu_0$ versus $H_1: \mu < \mu_0$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Reject $H_0$ if $t < -t_{n-1,\alpha}$

Example
For the Customer Support Survey.xls data, test the hypotheses
$H_0$: mean response time $\ge$ 30 minutes
$H_1$: mean response time $<$ 30 minutes
Sample mean = 21.91; sample standard deviation = 19.49; $n$ = 44 observations
Reject $H_0$ because $t = -2.75 < -t_{43,0.05} = -1.68$

PHStat Tool: t-test for Mean
PHStat menu > One Sample Tests > t-test for the Mean, Sigma Unknown
Enter null hypothesis and alpha
Enter sample statistics or data range
Choose type of test
[Figure: Results]
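The arithmetic behind this example can be checked with a short sketch. It uses the example's summary statistics (mean 21.91, standard deviation 19.49, n = 44) and takes the critical value 1.68 as quoted rather than computing it; the function name `one_sample_t` is only illustrative.

```python
import math

def one_sample_t(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

# Summary statistics from the response-time example.
t_stat = one_sample_t(sample_mean=21.91, mu0=30.0, sample_sd=19.49, n=44)

# Critical value t_{43, 0.05} as quoted in the example (assumed, not computed here).
t_critical = 1.68

# Lower-tail decision rule: reject H0 if t < -t_critical.
reject_h0 = t_stat < -t_critical
print(round(t_stat, 2), reject_h0)
```

The statistic falls in the lower-tail rejection region, which agrees with the conclusion stated in the example.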

Using p-values
p-value = probability of obtaining a test statistic value equal to or more extreme than that obtained from the sample data when $H_0$ is true
[Figure: p-value regions relative to the test statistic for a lower one-tailed test and a two-tailed test]

One Sample Tests for Proportions
Example hypothesis $H_0: \pi \ge \pi_0$ versus $H_1: \pi < \pi_0$
Test statistic: $z = \dfrac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}$
Reject if $z < -z_{\alpha}$

Example
For the Customer Support Survey.xls data, test the hypothesis that the proportion of overall quality responses in the top two boxes is at least 0.75
$H_0: \pi \ge 0.75$
$H_1: \pi < 0.75$
Sample proportion = 0.68; $n$ = 44
For a level of significance of 0.05, the critical value of z is -1.645; therefore, we cannot reject the null hypothesis
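A minimal sketch of the proportion test and its p-value, using the example's figures (sample proportion 0.68, n = 44, hypothesized proportion 0.75). It relies only on Python's standard-library `statistics.NormalDist`; the function name is an illustrative choice.

```python
import math
from statistics import NormalDist

def one_sample_proportion_z(p_hat, pi0, n):
    """z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)."""
    return (p_hat - pi0) / math.sqrt(pi0 * (1.0 - pi0) / n)

alpha = 0.05
z_stat = one_sample_proportion_z(p_hat=0.68, pi0=0.75, n=44)

# Lower-tail critical value -z_alpha (about -1.645 for alpha = 0.05).
z_critical = NormalDist().inv_cdf(alpha)

# Lower-tail p-value: probability of a z at least this small when H0 is true.
p_value = NormalDist().cdf(z_stat)

reject_h0 = z_stat < z_critical
print(round(z_stat, 3), round(p_value, 3), reject_h0)
```

Comparing the p-value with alpha gives the same decision as comparing z with the critical value: here the p-value exceeds 0.05, so the null hypothesis is not rejected.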

PHStat Tool: One Sample z-Test for Proportions
PHStat > One Sample Tests > z-tests for the Proportion
Enter null hypothesis, significance level, number of successes, and sample size
Enter type of test
[Figure: Results]

Type II Errors and the Power of a Test
The probability of a Type II error, $\beta$, and the power of the test ($1 - \beta$) cannot be chosen by the experimenter.
The power of the test depends on the true value of the population mean, the level of confidence used, and the sample size.
A power curve shows ($1 - \beta$) as a function of the true value of the population mean.
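To make the idea of a power curve concrete, here is a rough sketch for a lower-tail z-test of $H_0: \mu \ge \mu_0$. It assumes the population standard deviation is known, which the slides do not state for this case; the values 30, 19.49, and 44 are borrowed from the response-time example purely for illustration.

```python
import math
from statistics import NormalDist

def power_lower_tail_z(mu_true, mu0, sigma, n, alpha):
    """Power (1 - beta) of the lower-tail z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    # Distance between hypothesized and true mean, in standard-error units.
    shift = (mu0 - mu_true) / (sigma / math.sqrt(n))
    # P(reject H0) = P(Z < shift - z_alpha) under the true mean.
    return std_normal.cdf(shift - z_alpha)

# Points on the power curve (sigma treated as known: an assumption).
curve = [(mu, power_lower_tail_z(mu, 30.0, 19.49, 44, 0.05))
         for mu in (30, 28, 26, 24, 22, 20)]
for mu, power in curve:
    print(mu, round(power, 3))
```

When the true mean equals the hypothesized value, the rejection probability is just alpha; power rises as the true mean moves further below the hypothesized value, and it would also rise with a larger sample size.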

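[Figure: Example Power Curve]

Two Sample Tests for Means, Standard Deviation Known
Example hypothesis $H_0: \mu_1 - \mu_2 \ge 0$ versus $H_1: \mu_1 - \mu_2 < 0$
Test statistic: $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$
Reject if $z < -z_{\alpha}$

Two Sample Tests for Means, Sigma Unknown and Equal
Example hypothesis $H_0: \mu_1 - \mu_2 \le 0$ versus $H_1: \mu_1 - \mu_2 > 0$
Test statistic: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
Reject if $t > t_{n_1 + n_2 - 2,\alpha}$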
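As a sketch of the equal-variance case, the function below pools the two sample variances and returns the t statistic with its $n_1 + n_2 - 2$ degrees of freedom. The summary statistics are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2

# Hypothetical summary statistics (not from the slides).
t_stat, df = pooled_t(mean1=52.0, sd1=4.0, n1=10, mean2=48.0, sd2=4.0, n2=10)
print(round(t_stat, 3), df)
```

For an upper-tail test, the statistic would then be compared with the tabled critical t value for `df` degrees of freedom at the chosen level of significance.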

Two Sample Tests for Means, Sigma Unknown and Unequal
Example hypothesis $H_0: \mu_1 - \mu_2 = 0$ versus $H_1: \mu_1 - \mu_2 \ne 0$
Test statistic: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$ with $df = \dfrac{\left( s_1^2 / n_1 + s_2^2 / n_2 \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}$
Reject if $t > t_{df,\alpha/2}$ or $t < -t_{df,\alpha/2}$

Excel Data Analysis Tool: Two Sample t-tests
Tools > Data Analysis > t-test: Two Sample Assuming Unequal Variances, or t-test: Two Sample Assuming Equal Variances
Enter range of data, hypothesized mean difference, and level of significance
Tool allows you to test $H_0: \mu_1 - \mu_2 = d$
Output is provided for upper-tail test only
For lower-tail test, change the sign on t Critical one-tail, and subtract P(T<=t) one-tail from 1.0 for correct p-value

PHStat Tool: Two Sample t-tests
PHStat > Two Sample Tests > t-test for Differences in Two Means
Test assumes equal variances
Must compute and enter the sample mean, sample standard deviation, and sample size
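The degrees-of-freedom formula for the unequal-variance case is easy to get wrong by hand, so a small sketch may help. The inputs are hypothetical summary statistics, and the function name `unequal_variance_t` is illustrative.

```python
import math

def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and approximate df when variances are not assumed equal."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2   # s_i^2 / n_i for each sample
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical summary statistics (not from the slides).
t_stat, df = unequal_variance_t(mean1=52.0, sd1=3.0, n1=12, mean2=48.0, sd2=6.0, n2=8)
print(round(t_stat, 3), round(df, 2))
```

The resulting df is generally not an integer and never exceeds $n_1 + n_2 - 2$; when looking up a critical value in a table it is commonly rounded down.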

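[Figure: Comparison of Excel and PHStat Results, Lower-Tail Test]

Two Sample Test for Means With Paired Samples
Example hypothesis $H_0$: average difference $= 0$ versus $H_1$: average difference $\ne 0$
Test statistic: $t = \dfrac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$
Reject if $t > t_{n-1,\alpha/2}$ or $t < -t_{n-1,\alpha/2}$

Two Sample Tests for Proportions
Example hypothesis $H_0: \pi_1 - \pi_2 = 0$ versus $H_1: \pi_1 - \pi_2 \ne 0$
Test statistic: $z = \dfrac{p_1 - p_2}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$ where $\bar{p} = \dfrac{\text{number of successes in both samples}}{n_1 + n_2}$
Reject if $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$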
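Both statistics on this slide can be sketched in a few lines. The paired measurements and success counts below are hypothetical, and the helper names are illustrative rather than part of Excel or PHStat.

```python
import math
from statistics import mean, stdev

def paired_t(first, second, mu_d=0.0):
    """Paired-sample t statistic on the differences second - first, with df = n - 1."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return (mean(diffs) - mu_d) / (stdev(diffs) / math.sqrt(n)), n - 1

def two_proportion_z(successes1, n1, successes2, n2):
    """z statistic for the difference in two proportions, using the pooled proportion."""
    p1, p2 = successes1 / n1, successes2 / n2
    p_bar = (successes1 + successes2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(p_bar * (1.0 - p_bar) * (1.0 / n1 + 1.0 / n2))

# Hypothetical paired data: the same four subjects measured twice.
t_stat, df = paired_t(first=[10, 12, 9, 11], second=[12, 15, 10, 13])

# Hypothetical counts: 30 of 50 successes versus 20 of 50.
z_stat = two_proportion_z(30, 50, 20, 50)
print(round(t_stat, 3), df, round(z_stat, 3))
```

Note that the paired test works on the per-subject differences, so its degrees of freedom depend on the number of pairs, not on the total number of observations.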

Hypothesis Tests and Confidence Intervals
If a $100(1 - \alpha)\%$ confidence interval contains the hypothesized value, then we would not reject the null hypothesis based on this value with a level of significance $\alpha$.
Example hypothesis $H_0: \mu \ge \mu_0$ versus $H_1: \mu < \mu_0$
If a $100(1 - 2\alpha)\%$ confidence interval does not contain $\mu_0$ and lies entirely below it, then we can reject $H_0$

F-Test for Differences in Two Variances
Hypothesis $H_0: \sigma_1^2 - \sigma_2^2 = 0$ versus $H_1: \sigma_1^2 - \sigma_2^2 \ne 0$
Test statistic: $F = \dfrac{s_1^2}{s_2^2}$
Assume $s_1^2 > s_2^2$
Reject if $F > F_{\alpha/2,\, n_1 - 1,\, n_2 - 1}$ (see Appendix A.4)
Assumes both samples drawn from normal distributions

Excel Data Analysis Tool: F-Test for Equality of Variances
Tools > Data Analysis > F-test for Equality of Variances
Specify data ranges
Use $\alpha/2$ for the significance level!
If the variance of Variable 1 is greater than the variance of Variable 2, the output will specify the upper tail; otherwise, you obtain the lower tail information.
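The link between a one-tailed test and a confidence interval can be illustrated with the earlier response-time example (mean 21.91, standard deviation 19.49, n = 44, hypothesized mean 30). This sketch assumes the quoted critical value $t_{43,0.05} = 1.68$, which is also the multiplier for a 90% two-sided interval, that is, $100(1 - 2\alpha)\%$ with $\alpha = 0.05$.

```python
import math

sample_mean, sample_sd, n = 21.91, 19.49, 44
mu0 = 30.0

# t_{43, 0.05} as quoted in the response-time example (assumed, not computed here).
t_critical = 1.68

# 90% two-sided confidence interval for the mean.
half_width = t_critical * sample_sd / math.sqrt(n)
lower, upper = sample_mean - half_width, sample_mean + half_width

# For H1: mu < mu0, reject H0 when the whole interval lies below mu0.
reject_h0 = upper < mu0
print(round(lower, 2), round(upper, 2), reject_h0)
```

The interval's upper limit falls below 30, which matches the rejection obtained from the t statistic in that example.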

PHStat Tool: F-Test for Differences in Variances
PHStat menu > Two Sample Tests > F-test for Differences in Two Variances
Compute and enter sample standard deviations
Enter the significance level, not $\alpha/2$ as in Excel
[Figure: Excel and PHStat Results]

Analysis of Variance (ANOVA)
Compare the means of m different groups (factors) to determine if all are equal
$H_0: \mu_1 = \mu_2 = \dots = \mu_m$
$H_1$: at least one mean is different from the others

ANOVA Theory
$n_j$ = number of observations in sample j; $n$ = total number of observations
SST = total variation in the data: $SST = \sum_{j=1}^{m} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$
SSB = variation between groups: $SSB = \sum_{j=1}^{m} n_j (\bar{X}_j - \bar{X})^2$
SSW = variation within groups: $SSW = \sum_{j=1}^{m} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$
SST = SSB + SSW

ANOVA Test Statistic
MSB = SSB/(m - 1)
MSW = SSW/(n - m)
Test statistic: F = MSB/MSW
Has an F-distribution with m - 1 and n - m degrees of freedom
Reject $H_0$ if $F > F_{\alpha,\, m-1,\, n-m}$

Excel Data Analysis Tool for ANOVA
Tools > Data Analysis > ANOVA: Single Factor
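The decomposition SST = SSB + SSW and the F statistic can be verified numerically. The sketch below uses three small hypothetical groups; the function name and the returned dictionary layout are illustrative choices.

```python
def one_way_anova(groups):
    """Sums of squares, F statistic, and degrees of freedom for single-factor ANOVA."""
    all_values = [x for group in groups for x in group]
    n, m = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(group) / len(group) for group in groups]

    sst = sum((x - grand_mean) ** 2 for x in all_values)
    ssb = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
    ssw = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)

    msb, msw = ssb / (m - 1), ssw / (n - m)
    return {"SST": sst, "SSB": ssb, "SSW": ssw, "F": msb / msw, "df": (m - 1, n - m)}

# Hypothetical data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
result = one_way_anova(groups)
print(result)
```

The F value would then be compared with the tabled critical value for the two degrees of freedom returned in `df`.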

[Figure: ANOVA Results]

ANOVA Assumptions
The m groups or factor levels being studied represent populations whose outcome measures
Are randomly and independently obtained
Are normally distributed
Have equal variances
Violation of these assumptions can affect the true level of significance and power of the test.

Nonparametric Tests
Used when assumptions (usually normality) are violated.
Examples:
Wilcoxon rank sum test for testing the difference between two medians
Kruskal-Wallis rank test for determining whether multiple populations have equal medians
Both supported by PHStat

Tukey-Kramer Multiple Comparison Procedure
ANOVA cannot identify which means may differ from the rest
PHStat menu > Multiple Sample Tests > Tukey-Kramer Multiple Comparison Procedure
Enter Q Statistic from Table A.5

Chi-Square Test for Independence
Test whether two categorical variables are independent
$H_0$: the two categorical variables are independent
$H_1$: the two categorical variables are dependent

Example
Is gender independent of holding a CPA in an accounting firm?

Chi-Square Test for Independence
Test statistic: $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e}$
where $f_o$ = observed frequency and $f_e$ = expected frequency if $H_0$ is true, summed over the cells of the contingency table
Reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)}$
PHStat tool available in Multiple Sample Tests menu

Example
Expected | No CPA | CPA | Total
Female | 6.74 | 7.26 | 14
Male | 6.26 | 6.74 | 13
Total | 13 | 14 | 27
Critical value with $\alpha$ = 0.05 and (2 - 1)(2 - 1) = 1 df is 3.84; therefore, we cannot reject the null hypothesis that the two categorical variables are independent.

[Figure: PHStat Procedure Results]
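The expected frequencies in the example follow directly from the row and column totals (14 and 13; 13 and 14; 27 overall), as this sketch shows. The observed counts used to demonstrate the statistic are hypothetical, since the slide's observed table is not reproduced in the text; only the totals and the 3.84 critical value come from the example.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / n."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all cells of the table."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Totals from the example: rows Female/Male, columns No CPA/CPA.
expected = expected_frequencies(row_totals=[14, 13], col_totals=[13, 14])
print([[round(e, 2) for e in row] for row in expected])

# HYPOTHETICAL observed counts consistent with those totals (not the slide's data).
hypothetical_observed = [[8, 6], [5, 8]]
stat = chi_square(hypothetical_observed, expected)

# Critical value for alpha = 0.05 and 1 df, as quoted in the example.
reject_h0 = stat > 3.84
print(round(stat, 3), reject_h0)
```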

Design of Experiments
A test or series of tests that enables the experimenter to compare two or more methods to determine which is better, or determine levels of controllable factors to optimize the yield of a process or minimize the variability of a response variable.

Factorial Experiments
All combinations of levels of each factor are considered. With m factors at k levels, there are $k^m$ experiments.
Example: Suppose that temperature and reaction time are thought to be important factors in the percent yield of a chemical process. Currently, the process operates at a temperature of 100 degrees and a 60 minute reaction time. In an effort to reduce costs and improve yield, the plant manager wants to determine if changing the temperature and reaction time will have any significant effect on the percent yield, and if so, to identify the best levels of these factors to optimize the yield.

Designed Experiment
Analyze the effect of two levels of each factor (for instance, temperature at 100 and 125 degrees, and time at 60 and 90 minutes)
The different combinations of levels of each factor are commonly called treatments.
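Enumerating the treatments of a full factorial design is a one-liner with the standard library. This sketch uses generic "low"/"high" labels for the two factors in the example; the dictionary layout is an illustrative choice.

```python
from itertools import product

# Two factors, each at two levels, as in the temperature/time example.
factors = {"temperature": ["low", "high"], "time": ["low", "high"]}

# Every combination of one level per factor is a treatment.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for treatment in treatments:
    print(treatment)

# With m factors at k levels each, the count is k ** m.
k, m = 2, len(factors)
print(len(treatments), k ** m)
```

Adding a third two-level factor would double the number of treatments to eight, which is why full factorial designs grow quickly.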

Treatment Combinations and Experimental Results
Experiment | Temperature | Time | Yield (%)
A | Low | Low | 84
B | High | Low | 90.5
C | Low | High | 88.5
D | High | High | 81

Main Effects
Measures the difference in the response that results from different factor levels

Calculations
Temperature effect = (Average yield at high level) - (Average yield at low level) = (B + D)/2 - (A + C)/2 = (90.5 + 81)/2 - (84 + 88.5)/2 = 85.75 - 86.25 = -0.5 percent.
Reaction effect = (Average yield at high level) - (Average yield at low level) = (C + D)/2 - (A + B)/2 = (88.5 + 81)/2 - (84 + 90.5)/2 = 84.75 - 87.25 = -2.5 percent.
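The main-effect arithmetic can be reproduced with a few lines. The yields are those used in the calculations (A = 84, B = 90.5, C = 88.5, D = 81); the helper `average_yield` is an illustrative name.

```python
# Yields by experiment: B and D are at high temperature, C and D at high time.
yields = {"A": 84.0, "B": 90.5, "C": 88.5, "D": 81.0}

def average_yield(*experiments):
    """Mean yield over the named experiments."""
    return sum(yields[e] for e in experiments) / len(experiments)

# Main effect = average at the high level minus average at the low level.
temperature_effect = average_yield("B", "D") - average_yield("A", "C")
time_effect = average_yield("C", "D") - average_yield("A", "B")

print(temperature_effect, time_effect)
```

Both effects are negative and small relative to the spread of the individual yields, which hints that looking at main effects alone may hide how the factors act together.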

Interactions
When the effect of changing one factor depends on the level of other factors.
When interactions are present, we cannot estimate response changes by simply adding main effects; the effect of one factor must be interpreted relative to levels of the other factor.

Interaction Calculations
Take the average difference in response when the factors are both at the high or low levels and subtract the average difference in response when the factors are at opposite levels.
Temperature x Time Interaction = (Average yield, both factors at same level) - (Average yield, both factors at opposite levels) = (A + D)/2 - (B + C)/2 = (84 + 81)/2 - (90.5 + 88.5)/2 = -7.0 percent

[Figure: Graphical Illustration of Interactions]
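A short sketch shows both the interaction calculation and why it matters: the effect of raising temperature changes sign depending on the reaction time. The yields are the ones used in the calculation (A = 84, B = 90.5, C = 88.5, D = 81); variable names are illustrative.

```python
# A: low temp/low time, B: high temp/low time, C: low temp/high time, D: high/high.
yields = {"A": 84.0, "B": 90.5, "C": 88.5, "D": 81.0}

# Interaction = average at matching levels minus average at opposite levels.
same_level = (yields["A"] + yields["D"]) / 2
opposite_level = (yields["B"] + yields["C"]) / 2
interaction = same_level - opposite_level

# Effect of raising temperature, separately at each time level.
temp_effect_at_low_time = yields["B"] - yields["A"]
temp_effect_at_high_time = yields["D"] - yields["C"]

print(interaction, temp_effect_at_low_time, temp_effect_at_high_time)
```

Raising the temperature helps at the short reaction time but hurts at the long one; the interaction equals half the difference between those two conditional effects.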

Two-Way ANOVA
Method for analyzing variation in a 2-factor experiment
SST = SSA + SSB + SSAB + SSW
where
SST = total sum of squares
SSA = sum of squares due to factor A
SSB = sum of squares due to factor B
SSAB = sum of squares due to interaction
SSW = sum of squares due to random variation (error)

Mean Squares
MSA = SSA/(r - 1)
MSB = SSB/(c - 1)
MSAB = SSAB/[(r - 1)(c - 1)]
MSW = SSW/[rc(k - 1)]
where r and c are the numbers of levels of factors A and B, and k = number of replications of each treatment combination.

Hypothesis Tests
Compute F statistics by dividing each mean square by MSW.
F = MSA/MSW tests the null hypothesis that means for each treatment level of factor A are the same against the alternative hypothesis that not all means are equal.
F = MSB/MSW tests the null hypothesis that means for each treatment level of factor B are the same against the alternative hypothesis that not all means are equal.
F = MSAB/MSW tests the null hypothesis that the interaction between factors A and B is zero against the alternative hypothesis that the interaction is not zero.
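Given the sums of squares, the three F statistics follow mechanically from the mean-square formulas. The sketch below takes r and c to be the numbers of levels of factors A and B; the sums of squares passed in are hypothetical values chosen for easy arithmetic.

```python
def two_way_anova_f(ssa, ssb, ssab, ssw, r, c, k):
    """F statistics for factor A, factor B, and their interaction.

    r, c: numbers of levels of factors A and B (assumed meaning);
    k: replications per treatment combination.
    """
    msa = ssa / (r - 1)
    msb = ssb / (c - 1)
    msab = ssab / ((r - 1) * (c - 1))
    msw = ssw / (r * c * (k - 1))
    return {"F_A": msa / msw, "F_B": msb / msw, "F_AB": msab / msw}

# Hypothetical sums of squares for a 2 x 3 design with 3 replications.
f_stats = two_way_anova_f(ssa=20.0, ssb=30.0, ssab=8.0, ssw=24.0, r=2, c=3, k=3)
print(f_stats)
```

Each F value is judged against an F distribution whose numerator degrees of freedom match the corresponding mean square and whose denominator degrees of freedom are rc(k - 1).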

[Figure: Excel Anova: Two-Factor with Replication Results]
Examine p-values for significance