



MEAN SEPARATION TESTS (LSD and Tukey's Procedure)

If $H_0: \mu_1 = \mu_2 = \dots = \mu_n$ is rejected, we need a method to determine which means are significantly different from the others.

We'll look at three separation tests during the semester:
1. F-protected least significant difference (F-protected LSD)
2. Tukey's Procedure
3. Orthogonal linear contrasts (covered at the end of the semester)

F-protected Least Significant Difference

The LSD we will use is called an F-protected LSD because it is calculated and used only when $H_0$ is rejected. Sometimes when one fails to reject $H_0$ and an LSD is calculated anyway, the LSD will wrongly suggest that there are significant differences between treatment means. To prevent this conflict, we calculate the LSD only when $H_0$ is rejected.

$LSD = t_{\alpha/2} \cdot s_{\bar{Y}_1 - \bar{Y}_2}$, where the df for t = Error df.

If $r_1 = r_2 = \dots = r_n$, then $s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\dfrac{2\,\text{Error MS}}{r}}$

If $r_i \ne r_{i'}$, then $s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s^2 \left( \dfrac{1}{r_i} + \dfrac{1}{r_{i'}} \right)}$, where $s^2$ is the Error MS.

If the difference between two treatment means is greater than the LSD, then those treatment means are significantly different at the $100(1 - \alpha)$% level of confidence.

Example

Given an experiment analyzed as a CRD that has 7 treatments and 4 replicates with the following analysis

SOV         Df   SS          MS        F
Treatment    6   5,587,174   931,196   9.83 **
Error       21   1,990,238    94,773
Total       27   7,577,412

and the following treatment means

Treatment   Mean
A           2,127
B           2,678
C           2,552
D           2,128
E           1,796
F           1,681
G           1,316

Calculate the LSD and show which means are significantly different at the 95% level of confidence.

Step 1. Calculate the LSD

$LSD = t_{\alpha/2} \cdot s_{\bar{Y}_1 - \bar{Y}_2} = 2.080 \sqrt{\dfrac{2\,\text{Error MS}}{r}} = 2.080 \sqrt{\dfrac{2(94{,}773)}{4}} = 452.8 \approx 453$
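The arithmetic in Step 1 can be checked with a few lines of Python. This is only a sketch: the function name `lsd_equal_reps` is made up for illustration, and the critical t value 2.080 is taken from the notes rather than computed.

```python
import math

def lsd_equal_reps(t_crit, error_ms, r):
    """LSD = t * sqrt(2 * Error MS / r) for equally replicated treatments."""
    return t_crit * math.sqrt(2.0 * error_ms / r)

# Values from the example: t(0.05/2; 21 df) = 2.080, Error MS = 94,773, r = 4.
lsd = lsd_equal_reps(t_crit=2.080, error_ms=94773, r=4)
print(round(lsd, 1))  # 452.8, which the notes round to 453
```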

Step 2. Rank treatment means from low to high

Treatment   Mean
G           1,316
F           1,681
E           1,796
A           2,127
D           2,128
C           2,552
B           2,678

Step 3. Calculate differences between treatment means to determine which ones are significantly different from each other.

If the difference between two treatment means is greater than the LSD, then those treatment means are significantly different at the 95% level of confidence.

Treatment F vs. Treatment G   1681 - 1316 = 365 ns
Treatment E vs. Treatment G   1796 - 1316 = 480 *

Since the mean of Treatment E is significantly greater than that of Treatment G, every mean greater than that of Treatment E is also significantly different from Treatment G. Thus there is no need to keep comparing the mean of Treatment G with means greater than that of Treatment E.

Treatment E vs. Treatment F   1796 - 1681 = 115 ns
Treatment A vs. Treatment F   2127 - 1681 = 446 ns
Treatment D vs. Treatment F   2128 - 1681 = 447 ns
Treatment C vs. Treatment F   2552 - 1681 = 871 *
  (Therefore Treatment B must also be different from Treatment F.)

Treatment A vs. Treatment E   2127 - 1796 = 331 ns
Treatment D vs. Treatment E   2128 - 1796 = 332 ns
Treatment C vs. Treatment E   2552 - 1796 = 756 *
  (Therefore Treatment B must also be different from Treatment E.)

Treatment D vs. Treatment A   2128 - 2127 = 1 ns
Treatment C vs. Treatment A   2552 - 2127 = 425 ns
Treatment B vs. Treatment A   2678 - 2127 = 551 *

Treatment C vs. Treatment D   2552 - 2128 = 424 ns
Treatment B vs. Treatment D   2678 - 2128 = 550 *

Treatment B vs. Treatment C   2678 - 2552 = 126 ns

Step 4. Place lowercase letters behind treatment means to show which treatments are significantly different.

Step 4.1. Write the treatment letters horizontally.

G F E A D C B

Step 4.2. Underline treatments that are not significantly different. From the comparisons in Step 3, the lines span:

G F
F E A D
E A D
A D C
D C
C B

Step 4.3. Ignore those lines that fall within the boundary of another line. Here E A D falls within F E A D, and D C falls within A D C.

Step 4.4. Label each remaining line, beginning with the top one, with lowercase letters beginning with a.

a: G F
b: F E A D
c: A D C
d: C B

Step 4.5. Add lowercase letters behind the respective means.

Treatment   Mean
G           1,316 a
F           1,681 ab
E           1,796 b
A           2,127 bc
D           2,128 bc
C           2,552 cd
B           2,678 d
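Steps 2 through 4.5 above can be sketched in Python as follows. The function `letter_groups` is a hypothetical helper written for this illustration, not part of any statistics package. It also assumes a single threshold for every pair (equal replication), so it does not cover the case where each pair has its own LSD.

```python
from string import ascii_lowercase

def letter_groups(means, threshold):
    """Assign grouping letters by the underline method (one common threshold)."""
    ranked = sorted(means.items(), key=lambda item: item[1])  # Step 2: low to high
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    lines = []  # each line is (start index, end index) of a non-significant run
    for i in range(len(values)):
        j = i
        # Step 3 / 4.2: extend the line while the difference is NOT greater
        # than the threshold (i.e. not significant).
        while j + 1 < len(values) and values[j + 1] - values[i] <= threshold:
            j += 1
        # Step 4.3: skip a line that falls within the previously kept line.
        if not lines or j > lines[-1][1]:
            lines.append((i, j))

    # Steps 4.4 and 4.5: label the lines a, b, c, ... and attach the letters.
    letters = {name: "" for name in names}
    for letter, (i, j) in zip(ascii_lowercase, lines):
        for name in names[i:j + 1]:
            letters[name] += letter
    return letters

means = {"A": 2127, "B": 2678, "C": 2552, "D": 2128,
         "E": 1796, "F": 1681, "G": 1316}
groups = letter_groups(means, threshold=453)  # LSD from Step 1
for name in sorted(means, key=means.get):
    print(name, means[name], groups[name])
```

Because the means are ranked, a later line can only be nested inside the most recently kept line, which is why comparing against `lines[-1]` is enough.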

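F-protected LSD when $r_j \ne r_{j'}$

$LSD = t_{.05/2;\ \text{error df}} \sqrt{s^2 \left( \dfrac{1}{r_j} + \dfrac{1}{r_{j'}} \right)}$

Given:

SOV         Df   SS      MS      F
Treatment    3   0.978   0.326   6.392 **
Error       13   0.660   0.051
Total       16   1.638

And

Treatment   n   Mean
A           5   2.0
B           3   1.7
C           5   2.4
D           4   2.1

How many LSDs do we need to calculate?

Step 1. Calculate the LSDs.

LSD #1) Treatment A or C vs. Treatment B: $2.160 \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{3} \right)} = 0.356 \approx 0.4$

LSD #2) Treatment A or C vs. Treatment D: $2.160 \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{4} \right)} = 0.327 \approx 0.3$

LSD #3) Treatment A vs. C: $2.160 \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{5} \right)} = 0.309 \approx 0.3$

LSD #4) Treatment B vs. D: $2.160 \sqrt{0.051 \left( \dfrac{1}{3} + \dfrac{1}{4} \right)} = 0.373 \approx 0.4$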
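The four LSDs above follow from one formula applied to each distinct pair of replicate counts. A minimal sketch, assuming the critical t value 2.160 (13 error df) given in the notes; the function name `lsd_unequal_reps` is illustrative only.

```python
import math

def lsd_unequal_reps(t_crit, error_ms, r_i, r_j):
    """LSD = t * sqrt(Error MS * (1/r_i + 1/r_j)) for unequal replication."""
    return t_crit * math.sqrt(error_ms * (1.0 / r_i + 1.0 / r_j))

reps = {"A": 5, "B": 3, "C": 5, "D": 4}
t_crit, error_ms = 2.160, 0.051

# One representative pair for each distinct combination of replicate counts.
pairs = [("A", "B"), ("A", "D"), ("A", "C"), ("B", "D")]
lsds = {pair: lsd_unequal_reps(t_crit, error_ms, reps[pair[0]], reps[pair[1]])
        for pair in pairs}

for (first, second), value in lsds.items():
    print(f"{first} vs. {second}: {value:.3f}")
```

Only four values are needed because A and C share the same number of replicates, so any comparison involving A uses the same LSD as the matching comparison involving C.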

Step 2. Write down the means in order from low to high.

Treatment   n   Mean
B           3   1.7
A           5   2.0
D           4   2.1
C           5   2.4

Step 3. Calculate differences between treatment means to determine which ones are significantly different from each other.

If the difference between two treatment means is greater than the LSD, then those treatment means are significantly different at the 95% level of confidence.

Treatment A vs. Treatment B (LSD #1)   2.0 - 1.7 = 0.3 ns
Treatment D vs. Treatment B (LSD #4)   2.1 - 1.7 = 0.4 ns
Treatment C vs. Treatment B (LSD #1)   2.4 - 1.7 = 0.7 *
Treatment D vs. Treatment A (LSD #2)   2.1 - 2.0 = 0.1 ns
Treatment C vs. Treatment A (LSD #3)   2.4 - 2.0 = 0.4 *
Treatment C vs. Treatment D (LSD #2)   2.4 - 2.1 = 0.3 ns

Step 4. Place lowercase letters behind treatment means to show which treatments are significantly different.

Step 4.1. Write the treatment letters horizontally.

B A D C

Step 4.2. Underline treatments that are not significantly different. From the comparisons in Step 3, the lines span:

B A D
A D
D C

Step 4.3. Ignore those lines that fall within the boundary of another line. Here A D falls within B A D.

Step 4.4. Label each remaining line, beginning with the top one, with lowercase letters beginning with a.

a: B A D
b: D C

Step 4.5. Add lowercase letters behind the respective means.

Treatment   n   Mean
B           3   1.7 a
A           5   2.0 a
D           4   2.1 ab
C           5   2.4 b

F-protected LSD with Sampling when $r_j s_k \ne r_{j'} s_{k'}$ or $r_j s_k = r_{j'} s_{k'}$

If $r_j s_k \ne r_{j'} s_{k'}$: $LSD = t_{.05/2;\ \text{error df}} \sqrt{s^2 \left( \dfrac{1}{r_j s_k} + \dfrac{1}{r_{j'} s_{k'}} \right)}$

If $r_j s_k = r_{j'} s_{k'}$: $LSD = t_{.05/2;\ \text{error df}} \sqrt{\dfrac{2 s^2}{rs}}$

Tukey's Procedure

This test takes into consideration the number of means involved in the comparison. Tukey's procedure uses the distribution of the studentized range statistic:

$q = \dfrac{\bar{y}_{max} - \bar{y}_{min}}{\sqrt{\text{MS Error} / r}}$

where $\bar{y}_{max}$ and $\bar{y}_{min}$ are the largest and smallest treatment means, respectively, out of a group of p treatment means.

Appendix Table VII, pages 621 and 622, contains values of $q_\alpha(p, f)$, the upper α percentage points of q, where f is the number of degrees of freedom associated with the Mean Square Error.

As the number of means involved in a comparison increases, the studentized range statistic increases. The basis behind Tukey's Procedure is that, in general, as the number of means involved in a test increases, the probability that they will all be alike decreases (i.e., the probability of detecting differences increases). Tukey's Procedure accounts for this by increasing the studentized range statistic as the number of treatments (p) increases, such that the probability that the means will be alike remains the same.

If $r_i = r_{i'}$, Tukey's statistic is $T_\alpha = q_\alpha(p, f) \sqrt{\dfrac{\text{MS Error}}{r}}$

Two treatment means are considered significantly different if the difference between their means is greater than $T_\alpha$.

Example (using the same data previously used for the LSD example)

Given an experiment analyzed as a CRD that has 7 treatments and 4 replicates with the following analysis

SOV         Df   SS          MS        F
Treatment    6   5,587,174   931,196   9.83 **
Error       21   1,990,238    94,773
Total       27   7,577,412

and the following treatment means

Treatment   Mean
A           2,127
B           2,678
C           2,552
D           2,128
E           1,796
F           1,681
G           1,316

Use Tukey's procedure to show which means are significantly different at the 95% level of confidence.

Step 1. Calculate Tukey's statistic.

$T_{0.05} = q_{0.05}(7, 21) \sqrt{\dfrac{\text{MS Error}}{r}} = 4.60 \sqrt{\dfrac{94{,}773}{4}} = 4.60 \sqrt{23{,}693.25} = 708.06 \approx 708$

Step 2. Rank treatment means from low to high

Treatment   Mean
G           1,316
F           1,681
E           1,796
A           2,127
D           2,128
C           2,552
B           2,678

Step 3. Calculate differences between treatment means to determine which ones are significantly different from each other.

If the difference between two treatment means is greater than $T_\alpha$, then those treatment means are significantly different at the 95% level of confidence.

Treatment F vs. Treatment G   1681 - 1316 = 365 ns
Treatment E vs. Treatment G   1796 - 1316 = 480 ns
Treatment A vs. Treatment G   2127 - 1316 = 811 *

Since the mean of Treatment A is significantly greater than that of Treatment G, every mean greater than that of Treatment A is also significantly different from Treatment G. Thus there is no need to keep comparing the mean of Treatment G with means greater than that of Treatment A.

Treatment E vs. Treatment F   1796 - 1681 = 115 ns
Treatment A vs. Treatment F   2127 - 1681 = 446 ns
Treatment D vs. Treatment F   2128 - 1681 = 447 ns
Treatment C vs. Treatment F   2552 - 1681 = 871 *
  (Therefore Treatment B must also be different from Treatment F.)

Treatment A vs. Treatment E   2127 - 1796 = 331 ns
Treatment D vs. Treatment E   2128 - 1796 = 332 ns
Treatment C vs. Treatment E   2552 - 1796 = 756 *
  (Therefore Treatment B must also be different from Treatment E.)

Treatment D vs. Treatment A   2128 - 2127 = 1 ns
Treatment C vs. Treatment A   2552 - 2127 = 425 ns
Treatment B vs. Treatment A   2678 - 2127 = 551 ns

Step 4. Place lowercase letters behind treatment means to show which treatments are significantly different.

Step 4.1. Write the treatment letters horizontally.

G F E A D C B

Step 4.2. Underline treatments that are not significantly different. From the comparisons in Step 3, the lines span:

G F E
F E A D
E A D
A D C B
D C B
C B

Step 4.3. Ignore those lines that fall within the boundary of another line. Here E A D falls within F E A D, and both D C B and C B fall within A D C B.

Step 4.4. Label each remaining line, beginning with the top one, with lowercase letters beginning with a.

a: G F E
b: F E A D
c: A D C B

Step 4.5. Add lowercase letters behind the respective means.

Treatment   Mean
G           1,316 a
F           1,681 ab
E           1,796 ab
A           2,127 bc
D           2,128 bc
C           2,552 c
B           2,678 c

Tukey-Kramer Procedure

Used for unbalanced data (i.e., $r_i \ne r_{i'}$).

$T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}} \sqrt{\text{Error MS} \left( \dfrac{1}{r_i} + \dfrac{1}{r_{i'}} \right)}$

Example

Given:

SOV         Df   SS      MS      F
Treatment    3   0.978   0.326   6.392 **
Error       13   0.660   0.051
Total       16   1.638

And

Treatment   n   Mean
A           5   2.0
B           3   1.7
C           5   2.4
D           4   2.1

How many $T_\alpha$ values do we need to calculate?

Step 1. Calculate the $T_\alpha$ values, where

$T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}} \sqrt{\text{Error MS} \left( \dfrac{1}{r_i} + \dfrac{1}{r_{i'}} \right)}$

and $q_\alpha(p, f) = q_{0.05}(4, 13) = 4.15$.

$T_\alpha$ #1) Treatment A or C vs. Treatment B: $\dfrac{4.15}{\sqrt{2}} \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{3} \right)} = 0.484 \approx 0.5$

$T_\alpha$ #2) Treatment A or C vs. Treatment D: $\dfrac{4.15}{\sqrt{2}} \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{4} \right)} = 0.445 \approx 0.4$

$T_\alpha$ #3) Treatment A vs. C: $\dfrac{4.15}{\sqrt{2}} \sqrt{0.051 \left( \dfrac{1}{5} + \dfrac{1}{5} \right)} = 0.419 \approx 0.4$

$T_\alpha$ #4) Treatment B vs. D: $\dfrac{4.15}{\sqrt{2}} \sqrt{0.051 \left( \dfrac{1}{3} + \dfrac{1}{4} \right)} = 0.506 \approx 0.5$

Step 2. Write down the means in order from low to high.

Treatment   n   Mean
B           3   1.7
A           5   2.0
D           4   2.1
C           5   2.4

Step 3. Calculate differences between treatment means to determine which ones are significantly different from each other.

If the difference between two treatment means is greater than the $T_\alpha$ value, then those treatment means are significantly different at the 95% level of confidence.

Treatment A vs. Treatment B ($T_\alpha$ value #1)   2.0 - 1.7 = 0.3 ns
Treatment D vs. Treatment B ($T_\alpha$ value #4)   2.1 - 1.7 = 0.4 ns
Treatment C vs. Treatment B ($T_\alpha$ value #1)   2.4 - 1.7 = 0.7 *
Treatment D vs. Treatment A ($T_\alpha$ value #2)   2.1 - 2.0 = 0.1 ns
Treatment C vs. Treatment A ($T_\alpha$ value #3)   2.4 - 2.0 = 0.4 ns
Treatment C vs. Treatment D ($T_\alpha$ value #2)   2.4 - 2.1 = 0.3 ns

Step 4. Place lowercase letters behind treatment means to show which treatments are significantly different.

Step 4.1. Write the treatment letters horizontally.

B A D C

Step 4.2. Underline treatments that are not significantly different. From the comparisons in Step 3, the lines span:

B A D
A D C
D C

Step 4.3. Ignore those lines that fall within the boundary of another line. Here D C falls within A D C.

Step 4.4. Label each remaining line, beginning with the top one, with lowercase letters beginning with a.

a: B A D
b: A D C

Step 4.5. Add lowercase letters behind the respective means.

Treatment   n   Mean
B           3   1.7 a
A           5   2.0 ab
D           4   2.1 ab
C           5   2.4 b

Tukey's Procedure with Sampling

$T_\alpha = q_\alpha(p, f) \cdot s_{\bar{Y}}$, where $s_{\bar{Y}} = \sqrt{\dfrac{s^2}{rs}}$

Tukey-Kramer Procedure with Sampling

$T_\alpha = \dfrac{q_\alpha(p, f)}{\sqrt{2}} \sqrt{s^2 \left( \dfrac{1}{r_j s_k} + \dfrac{1}{r_{j'} s_{k'}} \right)}$
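As a numerical check on the Tukey-Kramer example without sampling (Error MS = 0.051, q = 4.15 for p = 4 and f = 13), the sketch below computes the threshold for each distinct pair of replicate counts. The function names are hypothetical, and the critical q value is taken from the notes rather than computed.

```python
import math

def tukey_kramer(q_crit, error_ms, r_i, r_j):
    """T = (q / sqrt(2)) * sqrt(Error MS * (1/r_i + 1/r_j))."""
    return (q_crit / math.sqrt(2.0)) * math.sqrt(error_ms * (1.0 / r_i + 1.0 / r_j))

def tukey_equal_reps(q_crit, error_ms, r):
    """T = q * sqrt(Error MS / r), the balanced-data form."""
    return q_crit * math.sqrt(error_ms / r)

reps = {"A": 5, "B": 3, "C": 5, "D": 4}
q_crit, error_ms = 4.15, 0.051

pairs = [("A", "B"), ("A", "D"), ("A", "C"), ("B", "D")]
thresholds = {pair: tukey_kramer(q_crit, error_ms, reps[pair[0]], reps[pair[1]])
              for pair in pairs}

for (first, second), value in thresholds.items():
    # Rounded to one decimal, as the notes do before comparing means.
    print(f"{first} vs. {second}: {round(value, 1)}")

# With equal replication the Tukey-Kramer form reduces to Tukey's statistic.
balanced = tukey_equal_reps(q_crit, error_ms, 5)
```

The last line illustrates why one formula can serve both cases: when $r_i = r_{i'} = r$, the factor $\sqrt{(1/r + 1/r)/2}$ is just $\sqrt{1/r}$.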

Output of the Proc Anova Command

The ANOVA Procedure

Class Level Information

Class   Levels   Values
trt          3   a b c

Number of Observations Read   12
Number of Observations Used   12

Dependent Variable: yield

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2      300.5000000   150.2500000      3.56   0.0725
Error              9      379.5000000    42.1666667
Corrected Total   11      680.0000000

R-Square   Coeff Var   Root MSE   yield Mean
0.441912    17.55023   6.493587     37.00000

Source   DF      Anova SS   Mean Square   F Value   Pr > F
trt       2   300.5000000   150.2500000      3.56   0.0725
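The derived quantities in this output follow directly from the sums of squares, degrees of freedom, and overall mean. A short sketch of that arithmetic (variable names are illustrative; the p-value is not recomputed here because it requires the F distribution):

```python
import math

# Values read from the ANOVA table above.
ss_model, df_model = 300.5, 2
ss_error, df_error = 379.5, 9
yield_mean = 37.0

ms_model = ss_model / df_model               # Mean Square for trt
ms_error = ss_error / df_error               # Error Mean Square
f_value = ms_model / ms_error                # F Value
r_square = ss_model / (ss_model + ss_error)  # Model SS / Corrected Total SS
root_mse = math.sqrt(ms_error)               # Root MSE
coeff_var = 100.0 * root_mse / yield_mean    # Coeff Var, in percent

print(f"MS error  = {ms_error:.7f}")
print(f"F         = {f_value:.2f}")
print(f"R-Square  = {r_square:.6f}")
print(f"Root MSE  = {root_mse:.6f}")
print(f"Coeff Var = {coeff_var:.5f}")
```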


t Tests (LSD) for yield

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                            0.05
Error Degrees of Freedom            9
Error Mean Square            42.16667
Critical Value of t           2.26216
Least Significant Difference   10.387

Means with the same letter are not significantly different.

t Grouping     Mean   N   trt
     A       43.000   4   c
     A
B    A       37.250   4   b
B
B            30.750   4   a
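The Least Significant Difference reported here can be reproduced from the other values in the output with the equal-replication formula from the notes. The sketch below also flags whether the F test was significant, since the F-protected rule described in the notes would use the LSD only in that case; the variable names are illustrative.

```python
import math

alpha = 0.05
p_value_f = 0.0725   # Pr > F for trt, from the ANOVA table
t_crit = 2.26216     # Critical Value of t (9 error df)
error_ms = 42.16667  # Error Mean Square
r = 4                # observations per treatment
means = {"a": 30.75, "b": 37.25, "c": 43.0}

lsd = t_crit * math.sqrt(2.0 * error_ms / r)
print(f"LSD = {lsd:.3f}")

# Pairs whose difference exceeds the LSD are declared different by the t test.
significant = []
for high, low in [("b", "a"), ("c", "a"), ("c", "b")]:
    diff = means[high] - means[low]
    if diff > lsd:
        significant.append((high, low))
    print(f"{high} vs. {low}: {diff:.2f}")

# Under the F-protected rule, the LSD is used only if H0 was rejected.
f_test_rejects_h0 = p_value_f < alpha
print("F test significant:", f_test_rejects_h0)
```

Here only c vs. a exceeds the LSD, which matches the A / AB / B grouping, even though the overall F test is not significant at the 0.05 level.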

Tukey's Studentized Range (HSD) Test for yield

Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                                   0.05
Error Degrees of Freedom                   9
Error Mean Square                   42.16667
Critical Value of Studentized Range  3.94840
Minimum Significant Difference         12.82

Means with the same letter are not significantly different.

Tukey Grouping     Mean   N   trt
      A          43.000   4   c
      A
      A          37.250   4   b
      A
      A          30.750   4   a
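The Minimum Significant Difference in this output corresponds to Tukey's statistic for equal replication, $q_\alpha(p, f)\sqrt{\text{MS Error}/r}$. A minimal sketch using the values printed above (variable names are illustrative):

```python
import math

q_crit = 3.94840     # Critical Value of Studentized Range (p = 3, f = 9)
error_ms = 42.16667  # Error Mean Square
r = 4                # observations per treatment
means = {"a": 30.75, "b": 37.25, "c": 43.0}

hsd = q_crit * math.sqrt(error_ms / r)
print(f"Minimum Significant Difference = {hsd:.2f}")

# The largest difference is between the highest and lowest means; if it does
# not exceed the threshold, no pair of means can.
largest_diff = max(means.values()) - min(means.values())
print(f"Largest difference = {largest_diff:.2f}")
any_pair_differs = largest_diff > hsd
print("Any pair significantly different:", any_pair_differs)
```

Since the largest difference (c vs. a) falls below the threshold, all three treatments share the letter A.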