Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)




Asheber Abebe
Discrete and Statistical Sciences, Auburn University

Contents

1 Completely Randomized Design
  1.1 Introduction
  1.2 The Fixed Effects Model
      1.2.1 Decomposition of the Total Sum of Squares
      1.2.2 Statistical Analysis
      1.2.3 Comparison of Individual Treatment Means
  1.3 The Random Effects Model
  1.4 More About the One-Way Model
      1.4.1 Model Adequacy Checking
      1.4.2 Some Remedial Measures

2 Randomized Blocks
  2.1 The Randomized Complete Block Design
      2.1.1 Introduction
      2.1.2 Decomposition of the Total Sum of Squares
      2.1.3 Statistical Analysis
      2.1.4 Relative Efficiency of the RCBD
      2.1.5 Comparison of Treatment Means
      2.1.6 Model Adequacy Checking
      2.1.7 Missing Values
  2.2 The Latin Square Design
      2.2.1 Statistical Analysis
      2.2.2 Missing Values
      2.2.3 Relative Efficiency
      2.2.4 Replicated Latin Square
  2.3 The Graeco-Latin Square Design
  2.4 Incomplete Block Designs
      2.4.1 Balanced Incomplete Block Designs (BIBDs)
      2.4.2 Youden Squares
      2.4.3 Other Incomplete Designs

3 Factorial Designs
  3.1 Introduction
  3.2 The Two-Factor Factorial Design
      3.2.1 The Fixed Effects Model
      3.2.2 Random and Mixed Models
  3.3 Blocking in Factorial Designs
  3.4 The General Factorial Design

4 2^k and 3^k Factorial Designs
  4.1 Introduction
  4.2 The 2^k Factorial Design
      4.2.1 The 2^2 Design
      4.2.2 The 2^3 Design
      4.2.3 The General 2^k Design
      4.2.4 The Unreplicated 2^k Design
  4.3 The 3^k Design

5 Repeated Measurement Designs
  5.1 Introduction
      5.1.1 The Mixed RCBD
  5.2 One-Way Repeated Measurement Designs
      5.2.1 The Huynh-Feldt Sphericity (S) Structure
      5.2.2 The One-Way RM Design: (S) Structure
      5.2.3 One-Way RM Design: General
  5.3 Two-Way Repeated Measurement Designs

6 More on Repeated Measurement Designs
  6.1 Trend Analyses in One- and Two-Way RM Designs
      6.1.1 Regression Components of the Between Treatment SS (SS_B)
      6.1.2 RM Designs
  6.2 The Split-Plot Design
  6.3 Crossover Designs
  6.4 Two-Way Repeated Measurement Designs with Repeated Measures on Both Factors

7 Introduction to the Analysis of Covariance
  7.1 Simple Linear Regression
      7.1.1 Estimation: The Method of Least Squares
      7.1.2 Partitioning the Total SS
      7.1.3 Tests of Hypotheses
  7.2 Single Factor Designs with One Covariate
  7.3 ANCOVA in Randomized Complete Block Designs
  7.4 ANCOVA in Two-Factor Designs
  7.5 The Johnson-Neyman Technique: Heterogeneous Slopes
      7.5.1 Two Groups, One Covariate
      7.5.2 Multiple Groups, One Covariate

8 Nested Designs
  8.1 Nesting in the Design Structure
  8.2 Nesting in the Treatment Structure

Chapter 1: Completely Randomized Design

1.1 Introduction

Suppose we have an experiment which compares k treatments, or k levels of a single factor, and n experimental units to be included in the experiment. We can assign the first treatment to n_1 units randomly selected from among the n, assign the second treatment to n_2 units randomly selected from the remaining n − n_1 units, and so on, until the kth treatment is assigned to the final n_k units. Such an experimental design is called a completely randomized design (CRD).

We shall describe the observations using the linear statistical model

    y_ij = µ + τ_i + ɛ_ij,   i = 1, ..., k,  j = 1, ..., n_i,    (1.1)

where y_ij is the jth observation on treatment i, µ is a parameter common to all treatments (the overall mean), τ_i is a parameter unique to the ith treatment (the ith treatment effect), and ɛ_ij is a random error component. In this model the random errors are assumed to be normally and independently distributed with mean zero and variance σ², which is assumed constant for all treatments. The model is called the one-way classification analysis of variance (one-way ANOVA).

The typical data layout for a one-way ANOVA is shown below:

    Treatment:   1         2        ...   k
                 y_11      y_21     ...   y_k1
                 y_12      y_22     ...   y_k2
                 ...       ...            ...
                 y_1n_1    y_2n_2   ...   y_kn_k

The model in Equation (1.1) describes two different situations:

1. Fixed Effects Model: The k treatments could have been specifically chosen by the experimenter. The goal here is to test hypotheses about the treatment means and estimate the model parameters (µ, τ_i, and σ²). Conclusions reached here apply only to the treatments considered and cannot be extended to treatments that were not in the study.

2. Random Effects Model: The k treatments could be a random sample from a larger population of treatments. Conclusions here extend to all the treatments in the population. The τ_i are random variables; thus we are not interested in the particular ones in the model, and we instead test hypotheses about the variability of the τ_i.

Here are a few examples, taken from Peterson: Design and Analysis of Experiments:

1. Fixed: A scientist develops three new fungicides. His interest is in these fungicides only.
   Random: A scientist is interested in the way a fungicide works. He selects, at random, three fungicides from a group of similar fungicides to study the action.

2. Fixed: Measure the rate of production of five particular machines.
   Random: Choose five machines to represent machines as a class.

3. Fixed: Conduct an experiment to obtain information about four specific soil types.
   Random: Select, at random, four soil types to represent all soil types.

1.2 The Fixed Effects Model

In this section we consider the ANOVA for the fixed effects model. The treatment effects, τ_i, are expressed as deviations from the overall mean, so that

    Σ_{i=1}^{k} τ_i = 0.

Denote by µ_i the mean of the ith treatment; µ_i = E(y_ij) = µ + τ_i, i = 1, ..., k. We are interested in testing the equality of the k treatment means:

    H_0: µ_1 = µ_2 = ... = µ_k
    H_A: µ_i ≠ µ_j for at least one pair (i, j).

An equivalent set of hypotheses is

    H_0: τ_1 = τ_2 = ... = τ_k = 0
    H_A: τ_i ≠ 0 for at least one i.

1.2.1 Decomposition of the Total Sum of Squares

In the following let n = Σ_{i=1}^{k} n_i. Further, let

    ȳ_i = (1/n_i) Σ_{j=1}^{n_i} y_ij,    ȳ = (1/n) Σ_{i=1}^{k} Σ_{j=1}^{n_i} y_ij.

The total sum of squares (corrected), given by

    SS_T = Σ_{i=1}^{k} Σ_{j=1}^{n_i} (y_ij − ȳ)²,

measures the total variability in the data. The total sum of squares, SS_T, may be decomposed as

    Σ_{i=1}^{k} Σ_{j=1}^{n_i} (y_ij − ȳ)² = Σ_{i=1}^{k} n_i (ȳ_i − ȳ)² + Σ_{i=1}^{k} Σ_{j=1}^{n_i} (y_ij − ȳ_i)².

The proof is left as an exercise. We will write

    SS_T = SS_B + SS_W,

where SS_B = Σ_{i=1}^{k} n_i (ȳ_i − ȳ)² is called the between-treatments sum of squares and SS_W = Σ_{i=1}^{k} Σ_{j=1}^{n_i} (y_ij − ȳ_i)² is called the within-treatments sum of squares. One can easily show that the estimate of the common variance σ² is SS_W/(n − k). Mean squares are obtained by dividing the sums of squares by their respective degrees of freedom:

    MS_B = SS_B/(k − 1),    MS_W = SS_W/(n − k).

1.2.2 Statistical Analysis

Testing

Since we assumed that the random errors are independent, normal random variables, it follows by Cochran's Theorem that, if the null hypothesis is true, then

    F_0 = MS_B / MS_W

follows an F distribution with k − 1 and n − k degrees of freedom. Thus an α-level test of H_0 rejects H_0 if

    F_0 > F_{k−1,n−k}(α).

The following ANOVA table summarizes the test procedure:

    Source            df       SS      MS      F_0
    Between           k − 1    SS_B    MS_B    F_0 = MS_B/MS_W
    Within (Error)    n − k    SS_W    MS_W
    Total             n − 1    SS_T

Estimation

Once again consider the one-way classification model given by Equation (1.1). We now wish to estimate the model parameters (µ, τ_i, σ²). The most popular method of estimation is the method of least squares (LS), which determines the estimators of µ and τ_i by minimizing the sum of squares of the errors,

    L = Σ_{i=1}^{k} Σ_{j=1}^{n_i} ɛ²_ij = Σ_{i=1}^{k} Σ_{j=1}^{n_i} (y_ij − µ − τ_i)².

Minimization of L via partial differentiation provides the estimates ˆµ = ȳ and ˆτ_i = ȳ_i − ȳ, for i = 1, ..., k. By rewriting the observations as

    y_ij = ȳ + (ȳ_i − ȳ) + (y_ij − ȳ_i),

one can easily observe that it is quite reasonable to estimate the random error terms by

    e_ij = y_ij − ȳ_i.

These are the model residuals.

Alternatively, the estimator of y_ij based on model (1.1) is ŷ_ij = ˆµ + ˆτ_i, which simplifies to ŷ_ij = ȳ_i. Thus the residuals are y_ij − ŷ_ij = y_ij − ȳ_i. An estimator of the ith treatment mean, µ_i, would be ˆµ_i = ˆµ + ˆτ_i = ȳ_i.

Using MS_W as an estimator of σ², we may provide a 100(1 − α)% confidence interval for the treatment mean µ_i,

    ȳ_i ± t_{n−k}(α/2) √(MS_W/n_i).

A 100(1 − α)% confidence interval for the difference of any two treatment means, µ_i − µ_j, would be

    ȳ_i − ȳ_j ± t_{n−k}(α/2) √(MS_W (1/n_i + 1/n_j)).

We now consider an example from Montgomery: Design and Analysis of Experiments.

Example. The tensile strength of a synthetic fiber used to make cloth for men's shirts is of interest to a manufacturer. It is suspected that the strength is affected by the percentage of cotton in the fiber. Five levels of cotton percentage are considered: 15%, 20%, 25%, 30% and 35%. For each percentage of cotton in the fiber, strength measurements (time to break when subjected to a stress) are made on five pieces of fiber. The data are:

    15%   20%   25%   30%   35%
     7    12    14    19     7
     7    17    18    25    10
    15    12    18    22    11
    11    18    19    19    15
     9    18    19    23    11

The corresponding ANOVA table is

    Source            df    SS        MS        F_0
    Between            4    475.76    118.94    F_0 = 14.76
    Within (Error)    20    161.20      8.06
    Total             24    636.96

Performing the test at α = .01, one can easily conclude that the percentage of cotton has a significant effect on fiber strength, since F_0 = 14.76 is greater than the tabulated F_{4,20}(.01) = 4.43.

The estimate of the overall mean is ˆµ = ȳ = 15.04. Point estimates of the treatment effects are

    ˆτ_1 = ȳ_1 − ȳ =  9.80 − 15.04 = −5.24
    ˆτ_2 = ȳ_2 − ȳ = 15.40 − 15.04 =  0.36
    ˆτ_3 = ȳ_3 − ȳ = 17.60 − 15.04 =  2.56
    ˆτ_4 = ȳ_4 − ȳ = 21.60 − 15.04 =  6.56
    ˆτ_5 = ȳ_5 − ȳ = 10.80 − 15.04 = −4.24

A 95% confidence interval on the mean of treatment 4 is

    21.60 ± (2.086) √(8.06/5),

which gives the interval 18.95 ≤ µ_4 ≤ 24.25.
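The ANOVA table and the confidence interval above can be reproduced with a few lines of Python. This sketch is ours, not part of the original notes; the only table value it borrows is t_20(.025) = 2.086 quoted above.

```python
# Sketch (ours): reproduce the tensile-strength ANOVA table and the
# 95% CI for the 30%-cotton mean, using only plain arithmetic.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
k = len(data)
n = sum(len(g) for g in data.values())
grand = sum(sum(g) for g in data.values()) / n
ss_b = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in data.values())
ss_w = sum((y - sum(g) / len(g)) ** 2 for g in data.values() for y in g)
# check the decomposition SS_T = SS_B + SS_W
ss_t = sum((y - grand) ** 2 for g in data.values() for y in g)
assert abs(ss_t - (ss_b + ss_w)) < 1e-9
ms_b, ms_w = ss_b / (k - 1), ss_w / (n - k)
f0 = ms_b / ms_w
print(round(ss_b, 2), round(ss_w, 2), round(f0, 2))  # 475.76 161.2 14.76

# 95% CI for the mean at 30% cotton (n_i = 5); t_20(0.025) = 2.086
t_crit = 2.086
m4 = sum(data[30]) / 5
half = t_crit * (ms_w / 5) ** 0.5
print(round(m4 - half, 2), round(m4 + half, 2))  # 18.95 24.25
```

The printed values agree with the ANOVA table and the interval 18.95 ≤ µ_4 ≤ 24.25 above.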

1.2.3 Comparison of Individual Treatment Means

Suppose we are interested in a certain linear combination of the treatment means, say

    L = Σ_{i=1}^{k} l_i µ_i,

where l_i, i = 1, ..., k, are known real numbers, not all zero. The natural estimate of L is

    ˆL = Σ_{i=1}^{k} l_i ˆµ_i = Σ_{i=1}^{k} l_i ȳ_i.

Under the one-way classification model (1.1), we have:

1. ˆL follows a N(L, σ² Σ_{i=1}^{k} l²_i/n_i) distribution.

2. (ˆL − L) / √(MS_W Σ_{i=1}^{k} l²_i/n_i) follows a t_{n−k} distribution.

3. A 100(1 − α)% confidence interval for L is ˆL ± t_{n−k}(α/2) √(MS_W Σ_{i=1}^{k} l²_i/n_i).

4. An α-level test of

    H_0: L = 0   vs   H_A: L ≠ 0

   rejects H_0 if

    |ˆL| / √(MS_W Σ_{i=1}^{k} l²_i/n_i) > t_{n−k}(α/2).

A linear combination of all the treatment means,

    φ = Σ_{i=1}^{k} c_i µ_i,

is known as a contrast of µ_1, ..., µ_k if Σ_{i=1}^{k} c_i = 0. Its sample estimate is

    ˆφ = Σ_{i=1}^{k} c_i ȳ_i.

An example of a contrast is µ_1 − µ_2. Consider r contrasts of µ_1, ..., µ_k, called planned comparisons,

    φ_i = Σ_{s=1}^{k} c_is µ_s   with   Σ_{s=1}^{k} c_is = 0,   for i = 1, ..., r,

and suppose the experiment consists of the tests

    H_0: φ_1 = 0      ...      H_0: φ_r = 0
    H_A: φ_1 ≠ 0               H_A: φ_r ≠ 0.
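Item 3 can be illustrated with a short Python sketch (ours, not from the notes) for the single contrast L = µ_1 − µ_2, using the tensile-strength summary values from the example above (ȳ_1 = 9.8, ȳ_2 = 15.4, MS_W = 8.06, n_i = 5, t_20(.025) = 2.086):

```python
# Sketch (ours): 95% CI for the contrast L = mu_1 - mu_2 in the
# tensile-strength example, via item 3 above.
means = {1: 9.8, 2: 15.4}            # treatment sample means
l = {1: 1.0, 2: -1.0}                # contrast coefficients (sum to 0)
ms_w, n_i, t_crit = 8.06, 5, 2.086   # t_20(0.025) from the t table
L_hat = sum(l[i] * means[i] for i in l)
se = (ms_w * sum(c * c for c in l.values()) / n_i) ** 0.5
ci = (L_hat - t_crit * se, L_hat + t_crit * se)
print(round(ci[0], 2), round(ci[1], 2))  # -9.35 -1.85
```

Since the interval excludes zero, µ_1 and µ_2 differ at the 5% level.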

Example. The most common example is the set of all (k choose 2) pairwise tests

    H_0: µ_i = µ_j   vs   H_A: µ_i ≠ µ_j,   for 1 ≤ i < j ≤ k.

An experimentwise error occurs if at least one of the null hypotheses is declared significant when H_0: µ_1 = ... = µ_k is known to be true.

The Least Significant Difference (LSD) Method

Suppose that, following an ANOVA F test in which the null hypothesis is rejected, we wish to test H_0: µ_i = µ_j for all i ≠ j. This could be done using the t statistic

    t_0 = (ȳ_i − ȳ_j) / √(MS_W (1/n_i + 1/n_j))

and comparing it to t_{n−k}(α/2). An equivalent test declares µ_i and µ_j to be significantly different if |ȳ_i − ȳ_j| > LSD, where

    LSD = t_{n−k}(α/2) √(MS_W (1/n_i + 1/n_j)).

The following gives a summary of the steps.

Stage 1: Test H_0: µ_1 = ... = µ_k with F_0 = MS_B/MS_W.
    If F_0 < F_{k−1,n−k}(α), declare H_0: µ_1 = ... = µ_k true and stop.
    If F_0 > F_{k−1,n−k}(α), go to Stage 2.

Stage 2: Test

    H_0: µ_i = µ_j   vs   H_A: µ_i ≠ µ_j

for all (k choose 2) pairs, with

    t_ij = (ȳ_i − ȳ_j) / √(MS_W (1/n_i + 1/n_j)).

    If |t_ij| < t_{n−k}(α/2), accept H_0: µ_i = µ_j.
    If |t_ij| > t_{n−k}(α/2), reject H_0: µ_i = µ_j.

Example. Consider the fabric strength example we considered above. The ANOVA F test rejected H_0: µ_1 = ... = µ_5. The LSD at α = .05 is

    LSD = t_20(.025) √(MS_W (1/5 + 1/5)) = 2.086 √(2(8.06)/5) = 3.75.

Thus any pair of treatment averages that differ by more than 3.75 would imply that the corresponding pair of population means are significantly different. The (5 choose 2) = 10 pairwise differences among the treatment means are

    ȳ_1 − ȳ_2 =  9.8 − 15.4 =  −5.6
    ȳ_1 − ȳ_3 =  9.8 − 17.6 =  −7.8
    ȳ_1 − ȳ_4 =  9.8 − 21.6 = −11.8
    ȳ_1 − ȳ_5 =  9.8 − 10.8 =  −1.0
    ȳ_2 − ȳ_3 = 15.4 − 17.6 =  −2.2
    ȳ_2 − ȳ_4 = 15.4 − 21.6 =  −6.2
    ȳ_2 − ȳ_5 = 15.4 − 10.8 =   4.6
    ȳ_3 − ȳ_4 = 17.6 − 21.6 =  −4.0
    ȳ_3 − ȳ_5 = 17.6 − 10.8 =   6.8
    ȳ_4 − ȳ_5 = 21.6 − 10.8 =  10.8

Using underlining, the result may be summarized as (means joined by a common line do not differ significantly):

    ȳ_1    ȳ_5    ȳ_2    ȳ_3    ȳ_4
    9.8    10.8   15.4   17.6   21.6
    -----------   -----------

As k gets large, the experimentwise error rate becomes large. Sometimes we also find that the LSD fails to find any significant pairwise differences even though the F test declares significance. This is due to the fact that the ANOVA F test considers all possible comparisons, not just pairwise comparisons.

Scheffé's Method for Comparing All Contrasts

Often we are interested in comparing different combinations of the treatment means. Scheffé (1953) has proposed a method for comparing all possible contrasts between treatment means. The Scheffé method controls the experimentwise error rate at level α. Consider the r contrasts

    φ_i = Σ_{s=1}^{k} c_is µ_s   with   Σ_{s=1}^{k} c_is = 0,   for i = 1, ..., r,

and the tests

    H_0: φ_i = 0   vs   H_A: φ_i ≠ 0,   i = 1, ..., r.

The Scheffé method declares φ_i to be significant if |ˆφ_i| > S_{α,i}, where

    ˆφ_i = Σ_{s=1}^{k} c_is ȳ_s

and

    S_{α,i} = √((k − 1) F_{k−1,n−k}(α)) √(MS_W Σ_{s=1}^{k} c²_is/n_s).

Example. As an example, consider the fabric strength data, and suppose that we are interested in the contrasts

    φ_1 = µ_1 + µ_3 − µ_4 − µ_5   and   φ_2 = µ_1 − µ_4.

The sample estimates of these contrasts are

    ˆφ_1 = ȳ_1 + ȳ_3 − ȳ_4 − ȳ_5 = −5.00   and   ˆφ_2 = ȳ_1 − ȳ_4 = −11.80.

We compute the Scheffé 1% critical values as

    S_{.01,1} = √((k − 1) F_{k−1,n−k}(.01)) √(MS_W Σ_s c²_1s/n_s)
              = √(4(4.43)) √(8.06(1 + 1 + 1 + 1)/5) = 10.69

and

    S_{.01,2} = √((k − 1) F_{k−1,n−k}(.01)) √(MS_W Σ_s c²_2s/n_s)
              = √(4(4.43)) √(8.06(1 + 1)/5) = 7.58.

Since |ˆφ_1| < S_{.01,1}, we conclude that the contrast φ_1 = µ_1 + µ_3 − µ_4 − µ_5 is not significantly different from zero. However, since |ˆφ_2| > S_{.01,2}, we conclude that φ_2 = µ_1 − µ_4 is significantly different from zero; that is, the mean strengths of treatments 1 and 4 differ significantly.

The Tukey-Kramer Method

The Tukey-Kramer procedure declares two means, µ_i and µ_j, to be significantly different if the absolute value of their sample difference exceeds

    T_α = q_{k,n−k}(α) √((MS_W/2)(1/n_i + 1/n_j)),

where q_{k,n−k}(α) is the upper-α percentile of the studentized range distribution with k groups and n − k degrees of freedom.

Example. Reconsider the fabric strength example. From the studentized range distribution table, we find that q_{5,20}(.05) = 4.23. Thus a pair of means, µ_i and µ_j, would be declared significantly different if |ȳ_i − ȳ_j| exceeds

    T_.05 = 4.23 √((8.06/2)(1/5 + 1/5)) = 5.37.

Using this value, we find that the following pairs of means do not significantly differ:

    µ_1 and µ_5,   µ_5 and µ_2,   µ_2 and µ_3,   µ_3 and µ_4.

Notice that this result differs from the one reported by the LSD method.
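The cutoffs used in the LSD, Scheffé and Tukey-Kramer examples above can be verified numerically. The sketch below is ours; it borrows only the table values quoted in the notes, t_20(.025) = 2.086, F_{4,20}(.01) = 4.43 and q_{5,20}(.05) = 4.23.

```python
# Sketch (ours): LSD, Scheffé and Tukey-Kramer cutoffs for the
# fabric-strength example (MS_W = 8.06, k = 5, n_i = 5).
ms_w, k, n_i = 8.06, 5, 5
means = {1: 9.8, 2: 15.4, 3: 17.6, 4: 21.6, 5: 10.8}

lsd = 2.086 * (ms_w * (1 / n_i + 1 / n_i)) ** 0.5          # ~ 3.75
t_alpha = 4.23 * (ms_w / 2 * (1 / n_i + 1 / n_i)) ** 0.5   # ~ 5.37

# Scheffé for phi_1 = mu_1 + mu_3 - mu_4 - mu_5
c1 = (1, 0, 1, -1, -1)
phi1 = sum(c * means[i + 1] for i, c in enumerate(c1))     # -5.0
s1 = (4 * 4.43 * ms_w * sum(c * c for c in c1) / n_i) ** 0.5  # ~ 10.69
assert abs(phi1) < s1        # phi_1 not significant at the 1% level

# Tukey-Kramer is more conservative than the LSD here:
assert abs(means[4] - means[3]) > lsd      # 4.0 is significant by LSD
assert abs(means[4] - means[3]) < t_alpha  # but not by Tukey-Kramer
```

The last two assertions show concretely why the Tukey-Kramer conclusion differs from the LSD one for the pair (µ_3, µ_4).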

The Bonferroni Procedure

We start with the Bonferroni inequality. Let A_1, A_2, ..., A_k be k arbitrary events with P(A_i) ≥ 1 − α/k. Then

    P(A_1 ∩ A_2 ∩ ... ∩ A_k) ≥ 1 − α.

The proof of this result is left as an exercise. We may use this inequality to make simultaneous inferences about linear combinations of treatment means in a one-way fixed effects ANOVA setup. Let L_1, L_2, ..., L_r be r linear combinations of µ_1, ..., µ_k, where L_i = Σ_{j=1}^{k} l_ij µ_j and ˆL_i = Σ_{j=1}^{k} l_ij ȳ_j for i = 1, ..., r. A 100(1 − α)% simultaneous confidence interval for L_1, ..., L_r is

    ˆL_i ± t_{n−k}(α/(2r)) √(MS_W Σ_{j=1}^{k} l²_ij/n_j),   for i = 1, ..., r.

A Bonferroni α-level test of

    H_0: µ_1 = µ_2 = ... = µ_k

is performed by testing

    H_0: µ_i = µ_j   vs   H_A: µ_i ≠ µ_j

with

    t_ij = (ȳ_i − ȳ_j) / √(MS_W (1/n_i + 1/n_j)),   rejecting when |t_ij| > t_{n−k}(α / (2 (k choose 2))),

for 1 ≤ i < j ≤ k. There is no need to perform an overall F test.

Example. Consider the tensile strength example considered above. We wish to test H_0: µ_1 = ... = µ_5 at the .05 level of significance. This is done using

    t_20(.05/(2 · 10)) = t_20(.0025) = 3.153.

So the test rejects H_0: µ_i = µ_j in favor of H_A: µ_i ≠ µ_j if |ȳ_i − ȳ_j| exceeds

    3.153 √(MS_W (2/5)) = 5.66.

Exercise: Use underlining to summarize the results of the Bonferroni testing procedure.

Dunnett's Method for Comparing Treatments to a Control

Assume µ_1 is a control mean and µ_2, ..., µ_k are k − 1 treatment means. Our purpose here is to find a set of 100(1 − α)% simultaneous confidence intervals for the k − 1 pairwise differences comparing treatments to the control, µ_i − µ_1, for i = 2, ..., k. Dunnett's method rejects the null hypothesis H_0: µ_i = µ_1 at level α if

    |ȳ_i − ȳ_1| > d_{k−1,n−k}(α) √(MS_W (1/n_i + 1/n_1)),

for i = 2, ..., k. The value d_{k−1,n−k}(α) is read from a table.
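Before the Dunnett example, the Bonferroni cutoff of 5.66 computed above can be checked, and the underlining exercise answered, with a short Python sketch (ours, using t_20(.0025) = 3.153 from the notes):

```python
# Sketch (ours): Bonferroni cutoff for all 10 pairwise comparisons at
# alpha = .05 in the tensile-strength example (MS_W = 8.06, n_i = 5).
ms_w, t_crit = 8.06, 3.153
cutoff = t_crit * (ms_w * (1 / 5 + 1 / 5)) ** 0.5   # ~ 5.66
means = {1: 9.8, 2: 15.4, 3: 17.6, 4: 21.6, 5: 10.8}
sig = [(i, j) for i in means for j in means
       if i < j and abs(means[i] - means[j]) > cutoff]
print(sorted(sig))  # [(1, 3), (1, 4), (2, 4), (3, 5), (4, 5)]
```

Note that the pair (1, 2), with |ȳ_1 − ȳ_2| = 5.6, is no longer significant under Bonferroni, although it was under the LSD.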

Example. Consider the tensile strength example above, and let treatment 5 be the control. The Dunnett critical value is d_{4,20}(.05) = 2.65. Thus the critical difference is

    d_{4,20}(.05) √(MS_W (2/5)) = 4.76.

So the test rejects H_0: µ_i = µ_5 if |ȳ_i − ȳ_5| > 4.76. Only the differences |ȳ_3 − ȳ_5| = 6.8 and |ȳ_4 − ȳ_5| = 10.8 indicate any significant difference. Thus we conclude µ_3 ≠ µ_5 and µ_4 ≠ µ_5.

1.3 The Random Effects Model

The treatments in an experiment may be a random sample from a larger population of treatments. Our purpose is to estimate (and test hypotheses about) the variability among the treatments in the population. Such a model is known as a random effects model. The mathematical representation of the model is the same as for the fixed effects model,

    y_ij = µ + τ_i + ɛ_ij,   i = 1, ..., k,  j = 1, ..., n_i,

except for the assumptions underlying the model.

Assumptions

1. The treatment effects, τ_i, are a random sample from a population that is normally distributed with mean 0 and variance σ²_τ, i.e. τ_i ~ N(0, σ²_τ).

2. The ɛ_ij are random errors which follow the normal distribution with mean 0 and common variance σ².

If the τ_i are independent of the ɛ_ij, the variance of an observation will be

    Var(y_ij) = σ² + σ²_τ.

The two variances, σ² and σ²_τ, are known as variance components. The usual partition of the total sum of squares still holds:

    SS_T = SS_B + SS_W.

Since we are interested in the bigger population of treatments, the hypothesis of interest is

    H_0: σ²_τ = 0   versus   H_A: σ²_τ > 0.

If the hypothesis H_0: σ²_τ = 0 is rejected in favor of H_A: σ²_τ > 0, then we claim that there is a significant difference among all the treatments. Testing is performed using the same F statistic that we used for the fixed effects model:

    F_0 = MS_B / MS_W.

An α-level test rejects H_0 if F_0 > F_{k−1,n−k}(α). The estimators of the variance components are

    ˆσ² = MS_W

and

    ˆσ²_τ = (MS_B − MS_W) / n_0,

where

    n_0 = (1/(k − 1)) [ Σ_{i=1}^{k} n_i − (Σ_{i=1}^{k} n²_i) / (Σ_{i=1}^{k} n_i) ].

We are usually interested in the proportion of the variance of an observation, Var(y_ij), that is the result of the differences among the treatments:

    σ²_τ / (σ² + σ²_τ).

A 100(1 − α)% confidence interval for σ²_τ/(σ² + σ²_τ) is

    ( L/(1 + L), U/(1 + U) ),

where

    L = (1/n_0) [ (MS_B/MS_W) (1/F_{k−1,n−k}(α/2)) − 1 ]

and

    U = (1/n_0) [ (MS_B/MS_W) (1/F_{k−1,n−k}(1 − α/2)) − 1 ].

The following example is taken from Montgomery: Design and Analysis of Experiments.

Example. A textile company weaves a fabric on a large number of looms. They would like the looms to be homogeneous so that they obtain a fabric of uniform strength. The process engineer suspects that, in addition to the usual variation in strength within samples of fabric from the same loom, there may also be significant variations in strength between looms. To investigate this, he selects four looms at random and makes four strength determinations on the fabric manufactured on each loom. The data are given in the following table:

              Observations
    Loom    1    2    3    4
    1      98   97   99   96
    2      91   90   93   92
    3      96   95   97   95
    4      95   96   99   98

The corresponding ANOVA table is

    Source              df    SS        MS       F_0
    Between (Looms)      3     89.19    29.73    15.68
    Within (Error)      12     22.75     1.90
    Total               15    111.94

Since F_0 > F_{3,12}(.05), we conclude that the looms in the plant differ significantly. The variance components are estimated by

    ˆσ² = 1.90   and   ˆσ²_τ = (29.73 − 1.90)/4 = 6.96.

Thus, the variance of any observation on strength is estimated by ˆσ² + ˆσ²_τ = 8.86. Most of this variability (about 6.96/8.86 = 79%) is attributable to the differences among looms. The engineer must now try to isolate the causes of the difference in loom performance (faulty set-up, poorly trained operators, ...).

Let us now find a 95% confidence interval for σ²_τ/(σ² + σ²_τ). From the properties of the F distribution we have that F_{a,b}(α) = 1/F_{b,a}(1 − α). From the F table we see that F_{3,12}(.025) = 4.47 and F_{3,12}(.975) = 1/F_{12,3}(.025) = 1/5.22 = 0.192. Thus

    L = (1/4) [ (29.73/1.90)(1/4.47) − 1 ] = 0.625

and

    U = (1/4) [ (29.73/1.90)(1/0.192) − 1 ] = 20.124,

which gives the 95% confidence interval

    (0.625/1.625, 20.124/21.124) = (0.39, 0.95).

We conclude that the variability among looms accounts for between 39 and 95 percent of the variance in the observed strength of the fabric produced.

Using SAS

The following SAS code may be used to analyze the tensile strength example considered in the fixed effects CRD case:

OPTIONS LS=80 PS=66 NODATE;
DATA MONT;
  INPUT TS GROUP @@;
  CARDS;
7 1 7 1 15 1 11 1 9 1
12 2 17 2 12 2 18 2 18 2
14 3 18 3 18 3 19 3 19 3
19 4 25 4 22 4 19 4 23 4
7 5 10 5 11 5 15 5 11 5
;
/* print the data */
PROC PRINT DATA=MONT;
PROC GLM;
  CLASS GROUP;
  MODEL TS=GROUP;
  MEANS GROUP / CLDIFF BON TUKEY SCHEFFE LSD DUNNETT('5');
  CONTRAST 'PHI1' GROUP 1 0 1 -1 -1;
  ESTIMATE 'PHI1' GROUP 1 0 1 -1 -1;
  CONTRAST 'PHI2' GROUP 1 0 0 -1 0;
  ESTIMATE 'PHI2' GROUP 1 0 0 -1 0;

A random effects model may be analyzed using the RANDOM statement to specify the random factor:

PROC GLM DATA=A1;
  CLASS OFFICER;
  MODEL RATING=OFFICER;
  RANDOM OFFICER;
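Before turning to the SAS output, the loom variance-component arithmetic above can be re-checked with a Python sketch (ours; pure arithmetic from the data table in the example):

```python
# Sketch (ours): variance-component estimates for the loom example
# (balanced design, k = 4 looms, n_i = 4 observations per loom).
looms = [[98, 97, 99, 96], [91, 90, 93, 92],
         [96, 95, 97, 95], [95, 96, 99, 98]]
k, n_i = len(looms), 4
n = k * n_i
grand = sum(sum(g) for g in looms) / n
ms_b = sum(n_i * (sum(g) / n_i - grand) ** 2 for g in looms) / (k - 1)
ms_w = sum((y - sum(g) / n_i) ** 2 for g in looms for y in g) / (n - k)
var_e = ms_w                       # sigma^2 hat,     ~ 1.90
var_tau = (ms_b - ms_w) / n_i      # sigma_tau^2 hat, ~ 6.96
share = var_tau / (var_e + var_tau)
print(round(var_tau, 2), round(share, 2))  # 6.96 0.79
```

The printed share reproduces the "about 79% attributable to looms" figure above.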

SAS Output

                                 The SAS System

                           Obs    TS    GROUP
                             1     7      1
                             2     7      1
                             3    15      1
                             4    11      1
                             5     9      1
                             6    12      2
                             7    17      2
                             8    12      2
                             9    18      2
                            10    18      2
                            11    14      3
                            12    18      3
                            13    18      3
                            14    19      3
                            15    19      3
                            16    19      4
                            17    25      4
                            18    22      4
                            19    19      4
                            20    23      4
                            21     7      5
                            22    10      5
                            23    11      5
                            24    15      5
                            25    11      5

                                The GLM Procedure

                             Class Level Information

                        Class         Levels    Values
                        GROUP              5    1 2 3 4 5

Dependent Variable: TS

                        Number of observations    25

                                      Sum of
Source            DF         Squares     Mean Square    F Value    Pr > F
Model              4     475.7600000     118.9400000      14.76    <.0001
Error             20     161.2000000       8.0600000

Corrected Total   24     636.9600000

        R-Square     Coeff Var      Root MSE       TS Mean
        0.746923      18.87642      2.839014      15.04000

Source            DF       Type I SS     Mean Square    F Value    Pr > F
GROUP              4     475.7600000     118.9400000      14.76    <.0001

Source            DF     Type III SS     Mean Square    F Value    Pr > F
GROUP              4     475.7600000     118.9400000      14.76    <.0001

                              t Tests (LSD) for TS

NOTE: This test controls the Type I comparisonwise error rate, not the
      experimentwise error rate.

               Alpha                              0.05
               Error Degrees of Freedom             20
               Error Mean Square                  8.06
               Critical Value of t             2.08596
               Least Significant Difference     3.7455

        Comparisons significant at the 0.05 level are indicated by ***.

                        Difference
            GROUP          Between        95% Confidence
          Comparison         Means            Limits
           4 - 3             4.000       0.255     7.745   ***
           4 - 2             6.200       2.455     9.945   ***
           4 - 5            10.800       7.055    14.545   ***
           4 - 1            11.800       8.055    15.545   ***
           3 - 4            -4.000      -7.745    -0.255   ***
           3 - 2             2.200      -1.545     5.945
           3 - 5             6.800       3.055    10.545   ***
           3 - 1             7.800       4.055    11.545   ***
           2 - 4            -6.200      -9.945    -2.455   ***
           2 - 3            -2.200      -5.945     1.545
           2 - 5             4.600       0.855     8.345   ***
           2 - 1             5.600       1.855     9.345   ***
           5 - 4           -10.800     -14.545    -7.055   ***
           5 - 3            -6.800     -10.545    -3.055   ***
           5 - 2            -4.600      -8.345    -0.855   ***
           5 - 1             1.000      -2.745     4.745

           1 - 4           -11.800     -15.545    -8.055   ***
           1 - 3            -7.800     -11.545    -4.055   ***
           1 - 2            -5.600      -9.345    -1.855   ***
           1 - 5            -1.000      -4.745     2.745

                  Tukey's Studentized Range (HSD) Test for TS

NOTE: This test controls the Type I experimentwise error rate.

               Alpha                                    0.05
               Error Degrees of Freedom                   20
               Error Mean Square                        8.06
               Critical Value of Studentized Range   4.23186
               Minimum Significant Difference          5.373

        Comparisons significant at the 0.05 level are indicated by ***.

                        Difference
            GROUP          Between       Simultaneous 95%
          Comparison         Means       Confidence Limits
           4 - 3             4.000      -1.373     9.373
           4 - 2             6.200       0.827    11.573   ***
           4 - 5            10.800       5.427    16.173   ***
           4 - 1            11.800       6.427    17.173   ***
           3 - 4            -4.000      -9.373     1.373
           3 - 2             2.200      -3.173     7.573
           3 - 5             6.800       1.427    12.173   ***
           3 - 1             7.800       2.427    13.173   ***
           2 - 4            -6.200     -11.573    -0.827   ***
           2 - 3            -2.200      -7.573     3.173
           2 - 5             4.600      -0.773     9.973
           2 - 1             5.600       0.227    10.973   ***
           5 - 4           -10.800     -16.173    -5.427   ***
           5 - 3            -6.800     -12.173    -1.427   ***
           5 - 2            -4.600      -9.973     0.773
           5 - 1             1.000      -4.373     6.373
           1 - 4           -11.800     -17.173    -6.427   ***
           1 - 3            -7.800     -13.173    -2.427   ***
           1 - 2            -5.600     -10.973    -0.227   ***
           1 - 5            -1.000      -6.373     4.373

                        Bonferroni (Dunn) t Tests for TS

NOTE: This test controls the Type I experimentwise error rate, but it generally

      has a higher Type II error rate than Tukey's for all pairwise comparisons.

               Alpha                              0.05
               Error Degrees of Freedom             20
               Error Mean Square                  8.06
               Critical Value of t             3.15340
               Minimum Significant Difference   5.6621

        Comparisons significant at the 0.05 level are indicated by ***.

                        Difference
            GROUP          Between       Simultaneous 95%
          Comparison         Means       Confidence Limits
           4 - 3             4.000      -1.662     9.662
           4 - 2             6.200       0.538    11.862   ***
           4 - 5            10.800       5.138    16.462   ***
           4 - 1            11.800       6.138    17.462   ***
           3 - 4            -4.000      -9.662     1.662
           3 - 2             2.200      -3.462     7.862
           3 - 5             6.800       1.138    12.462   ***
           3 - 1             7.800       2.138    13.462   ***
           2 - 4            -6.200     -11.862    -0.538   ***
           2 - 3            -2.200      -7.862     3.462
           2 - 5             4.600      -1.062    10.262
           2 - 1             5.600      -0.062    11.262
           5 - 4           -10.800     -16.462    -5.138   ***
           5 - 3            -6.800     -12.462    -1.138   ***
           5 - 2            -4.600     -10.262     1.062
           5 - 1             1.000      -4.662     6.662
           1 - 4           -11.800     -17.462    -6.138   ***
           1 - 3            -7.800     -13.462    -2.138   ***
           1 - 2            -5.600     -11.262     0.062
           1 - 5            -1.000      -6.662     4.662

                            Scheffe's Test for TS

NOTE: This test controls the Type I experimentwise error rate, but it generally
      has a higher Type II error rate than Tukey's for all pairwise comparisons.

               Alpha                              0.05
               Error Degrees of Freedom             20
               Error Mean Square                  8.06
               Critical Value of F             2.86608
               Minimum Significant Difference   6.0796

        Comparisons significant at the 0.05 level are indicated by ***.

                        Difference
            GROUP          Between       Simultaneous 95%
          Comparison         Means       Confidence Limits
           4 - 3             4.000      -2.080    10.080
           4 - 2             6.200       0.120    12.280   ***
           4 - 5            10.800       4.720    16.880   ***
           4 - 1            11.800       5.720    17.880   ***
           3 - 4            -4.000     -10.080     2.080
           3 - 2             2.200      -3.880     8.280
           3 - 5             6.800       0.720    12.880   ***
           3 - 1             7.800       1.720    13.880   ***
           2 - 4            -6.200     -12.280    -0.120   ***
           2 - 3            -2.200      -8.280     3.880
           2 - 5             4.600      -1.480    10.680
           2 - 1             5.600      -0.480    11.680
           5 - 4           -10.800     -16.880    -4.720   ***
           5 - 3            -6.800     -12.880    -0.720   ***
           5 - 2            -4.600     -10.680     1.480
           5 - 1             1.000      -5.080     7.080
           1 - 4           -11.800     -17.880    -5.720   ***
           1 - 3            -7.800     -13.880    -1.720   ***
           1 - 2            -5.600     -11.680     0.480
           1 - 5            -1.000      -7.080     5.080

                          Dunnett's t Tests for TS

NOTE: This test controls the Type I experimentwise error for comparisons of all
      treatments against a control.

               Alpha                              0.05
               Error Degrees of Freedom             20
               Error Mean Square                  8.06
               Critical Value of Dunnett's t    2.65112
               Minimum Significant Difference   4.7602

        Comparisons significant at the 0.05 level are indicated by ***.

                        Difference
            GROUP          Between       Simultaneous 95%
          Comparison         Means       Confidence Limits
           4 - 5            10.800       6.040    15.560   ***
           3 - 5             6.800       2.040    11.560   ***
           2 - 5             4.600      -0.160     9.360
           1 - 5            -1.000      -5.760     3.760
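The Dunnett listing above is easy to reproduce by hand; this Python sketch (ours) uses the table value d_{4,20}(.05) = 2.65 quoted in the notes:

```python
# Sketch (ours): Dunnett comparisons against the control (treatment 5)
# in the tensile-strength example (MS_W = 8.06, n_i = 5).
ms_w, d_crit = 8.06, 2.65
cutoff = d_crit * (ms_w * (1 / 5 + 1 / 5)) ** 0.5   # ~ 4.76
means = {1: 9.8, 2: 15.4, 3: 17.6, 4: 21.6}
control = 10.8                                      # treatment 5 mean
sig = [i for i in means if abs(means[i] - control) > cutoff]
print(sig)  # [3, 4]
```

Only treatments 3 and 4 exceed the cutoff, matching the starred rows (4 - 5 and 3 - 5) in the listing.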

Dependent Variable: TS

                        The GLM Procedure

  Contrast    DF    Contrast SS     Mean Square    F Value   Pr > F
  PHI1         1    31.25000000     31.25000000       3.88   0.0630
  PHI2         1   348.10000000    348.10000000      43.19   <.0001

                                  Standard
  Parameter       Estimate           Error      t Value   Pr > |t|
  PHI1         -5.00000000      2.53929124        -1.97     0.0630
  PHI2        -11.80000000      1.79555005        -6.57     <.0001

1.4 More About the One-Way Model

1.4.1 Model Adequacy Checking

Consider the one-way CRD model

    y_ij = µ + τ_i + ɛ_ij,   i = 1, ..., k,  j = 1, ..., n_i,

where it is assumed that ɛ_ij ~ iid N(0, σ²). In the random effects model, we additionally assume that τ_i ~ iid N(0, σ_τ²), independently of ɛ_ij. Diagnostics depend on the residuals,

    e_ij = y_ij − ŷ_ij = y_ij − ȳ_i· .

The Normality Assumption

The simplest check for normality involves plotting the empirical quantiles of the residuals against the expected quantiles if the residuals were to follow a normal distribution. This is known as the normal QQ-plot. Other formal tests for normality (Kolmogorov-Smirnov, Shapiro-Wilk, Anderson-Darling, Cramer-von Mises) may also be performed to assess the normality of the residuals.

Example

The following SAS code and partial output check the normality assumption for the tensile strength example considered earlier. The results from the QQ-plot as well as the formal tests (α = 0.05) indicate that the residuals are fairly normal.

SAS Code

OPTIONS LS=80 PS=66 NODATE;
DATA MONT;
INPUT TS GROUP @@;
CARDS;
7 1 7 1 15 1 11 1 9 1
12 2 17 2 12 2 18 2 18 2
14 3 18 3 18 3 19 3 19 3
19 4 25 4 22 4 19 4 23 4
7 5 10 5 11 5 15 5 11 5
;
TITLE1 'STRENGTH VS PERCENTAGE';
SYMBOL1 V=CIRCLE I=NONE;
PROC GPLOT DATA=MONT;
PLOT TS*GROUP/FRAME;

PROC GLM;
CLASS GROUP;
MODEL TS=GROUP;
OUTPUT OUT=DIAG R=RES P=PRED;
PROC SORT DATA=DIAG;
BY PRED;
TITLE1 'RESIDUAL PLOT';
SYMBOL1 V=CIRCLE I=SM50;
PROC GPLOT DATA=DIAG;
PLOT RES*PRED/FRAME;
PROC UNIVARIATE DATA=DIAG NORMAL;
VAR RES;
TITLE1 'QQ-PLOT OF RESIDUALS';
QQPLOT RES/NORMAL (L=1 MU=EST SIGMA=EST);

Partial Output

                     The UNIVARIATE Procedure
                         Variable: RES

                            Moments
  N                        25    Sum Weights              25
  Mean                      0    Sum Observations          0
  Std Deviation    2.59165327    Variance         6.71666667
  Skewness         0.11239681    Kurtosis         -0.8683604
  Uncorrected SS        161.2    Corrected SS          161.2
  Coeff Variation           .    Std Error Mean   0.51833065

                  Basic Statistical Measures

    Location               Variability
  Mean     0.00000    Std Deviation         2.59165
  Median   0.40000    Variance              6.71667
  Mode    -3.40000    Range                 9.00000
                      Interquartile Range   4.00000

  NOTE: The mode displayed is the smallest of 7 modes with a count of 2.
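The residual moments above follow directly from the definition e_ij = y_ij − ȳ_i·. As a cross-check outside SAS, here is a small Python sketch (not part of the original notes) that reproduces the corrected sum of squares and the variance reported by PROC UNIVARIATE.

```python
data = {
    1: [7, 7, 15, 11, 9],
    2: [12, 17, 12, 18, 18],
    3: [14, 18, 18, 19, 19],
    4: [19, 25, 22, 19, 23],
    5: [7, 10, 11, 15, 11],
}

# Residuals from the one-way model: e_ij = y_ij - (group mean)
res = [y - sum(g) / len(g) for g in data.values() for y in g]

n = len(res)                    # 25
mean = sum(res) / n             # 0 by construction
css = sum(e * e for e in res)   # corrected SS = SSE = 161.2
var = css / (n - 1)             # sample variance = 6.7167
print(round(mean, 6), round(css, 1), round(var, 4))
```

Since the residuals sum to zero within each group, the uncorrected and corrected sums of squares coincide, exactly as in the output.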

                  Tests for Location: Mu0=0

  Test           -Statistic-      -----p Value------
  Student's t    t       0        Pr > |t|    1.0000
  Sign           M     2.5        Pr >= |M|   0.4244
  Signed Rank    S     0.5        Pr >= |S|   0.9896

                    Tests for Normality

  Test                  --Statistic---    -----p Value------
  Shapiro-Wilk          W     0.943868    Pr < W      0.1818
  Kolmogorov-Smirnov    D     0.162123    Pr > D      0.0885
  Cramer-von Mises      W-Sq  0.080455    Pr > W-Sq   0.2026
  Anderson-Darling      A-Sq  0.518572    Pr > A-Sq   0.1775
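The idea behind the QQ-plot can be sketched in a few lines: pair the sorted residuals with theoretical normal quantiles and check that the points fall near a straight line. The sketch below (Python, not part of the original notes) uses Blom plotting positions (i − 0.375)/(n + 0.25), which is an assumed convention; other choices differ only slightly. Near-linearity is summarized by the correlation of the two coordinates.

```python
from statistics import NormalDist

data = {
    1: [7, 7, 15, 11, 9],
    2: [12, 17, 12, 18, 18],
    3: [14, 18, 18, 19, 19],
    4: [19, 25, 22, 19, 23],
    5: [7, 10, 11, 15, 11],
}

# One-way model residuals, sorted for the QQ-plot
res = sorted(y - sum(g) / len(g) for g in data.values() for y in g)
n = len(res)

# Theoretical normal quantiles at Blom plotting positions (assumed convention)
theo = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def corr(x, y):
    """Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Near-normal residuals give QQ points close to a straight line,
# i.e. a correlation close to 1
r = corr(theo, res)
print(round(r, 3))
```

For these residuals the correlation is close to 1, which agrees with the formal tests above failing to reject normality.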

Constant Variance Assumption

Once again there are graphical and formal tests for checking the constant variance assumption. The graphical tool we shall utilize in this class is the plot of residuals versus predicted values. The hypothesis of interest is

    H_0 : σ_1² = σ_2² = ... = σ_k²

versus

    H_A : σ_i² ≠ σ_j² for at least one pair i ≠ j.

One procedure for testing the above hypothesis is Bartlett's test. The test statistic is

    B_0 = 2.3026 q / c,

where

    q = (n − k) log10 MS_W − Σ_{i=1}^k (n_i − 1) log10 S_i²

and

    c = 1 + [1 / (3(k − 1))] ( Σ_{i=1}^k 1/(n_i − 1) − 1/(n − k) ).

We reject H_0 if B_0 > χ²_{k−1}(α), where χ²_{k−1}(α) is read from the chi-square table. Bartlett's test is too sensitive to deviations from normality, so it should not be used if the normality assumption is not satisfied. A test which is more robust to deviations from normality is Levene's test. Levene's test proceeds by computing d_ij = |y_ij − m_i|, where m_i is the median of the observations in group i, and then running the usual ANOVA F-test on the transformed observations, d_ij, instead of the original observations, y_ij.

Example

Once again we consider the tensile strength example. The plot of residuals versus predicted values (see above) indicates no serious departure from the constant variance assumption. The following modification to the PROC GLM code given above generates both Bartlett's and Levene's tests. The tests provide no evidence of a failure of the constant variance assumption.

Partial SAS Code

PROC GLM;
CLASS GROUP;
MODEL TS=GROUP;
MEANS GROUP/HOVTEST=BARTLETT HOVTEST=LEVENE;

Partial SAS Output

                        The GLM Procedure

          Levene's Test for Homogeneity of TS Variance
          ANOVA of Squared Deviations from Group Means

                   Sum of        Mean
  Source    DF    Squares      Square    F Value    Pr > F
  GROUP      4    91.6224     22.9056       0.45    0.7704
  Error     20     1015.4     50.7720

          Bartlett's Test for Homogeneity of TS Variance

  Source    DF    Chi-Square    Pr > ChiSq
  GROUP      4        0.9331        0.9198
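Bartlett's statistic, as defined by the formulas above, is easy to compute by hand or in a few lines of code. The following Python sketch (not part of the original notes) applies those formulas to the tensile strength data and agrees with the SAS chi-square value 0.9331 up to rounding of the 2.3026 constant.

```python
import math

groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]

k = len(groups)
ns = [len(g) for g in groups]
n = sum(ns)

def sample_var(g):
    m = sum(g) / len(g)
    return sum((x - m) ** 2 for x in g) / (len(g) - 1)

s2 = [sample_var(g) for g in groups]
# Pooled within-group mean square (the error mean square), 8.06
msw = sum((ni - 1) * v for ni, v in zip(ns, s2)) / (n - k)

q = (n - k) * math.log10(msw) - sum(
    (ni - 1) * math.log10(v) for ni, v in zip(ns, s2)
)
c = 1 + (sum(1 / (ni - 1) for ni in ns) - 1 / (n - k)) / (3 * (k - 1))
b0 = 2.3026 * q / c   # about 0.93, in agreement with the SAS output
print(round(msw, 2), round(b0, 2))
```

Comparing b0 with χ²_4(0.05) = 9.49 leads to the same conclusion as the output: no evidence against equal variances.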

1.4.2 Some Remedial Measures

The Kruskal-Wallis Test

When the assumption of normality is suspect, we may wish to use nonparametric alternatives to the F-test. The Kruskal-Wallis test is one such procedure, based on the rank transformation. To perform the Kruskal-Wallis test, we first rank all the observations, y_ij, in increasing order, assigning average ranks to tied observations. Say the ranks are R_ij. The Kruskal-Wallis test statistic is

    KW_0 = (1/S²) [ Σ_{i=1}^k R_i·²/n_i − n(n + 1)²/4 ],

where R_i· is the sum of the ranks of group i, and

    S² = [1/(n − 1)] [ Σ_{i=1}^k Σ_{j=1}^{n_i} R_ij² − n(n + 1)²/4 ].

The test rejects H_0 : µ_1 = ... = µ_k if

    KW_0 > χ²_{k−1}(α).

Example

For the tensile strength data the ranks, R_ij, of the observations are given in the following table:

            15      20      25      30      35
           2.0     9.5    11.0    20.5     2.0
           2.0    14.0    16.5    25.0     5.0
          12.5     9.5    16.5    23.0     7.0
           7.0    16.5    20.5    20.5    12.5
           4.0    16.5    20.5    24.0     7.0
  R_i·    27.5    66.0    85.0   113.0    33.5

We find that S² = 53.54 and KW_0 = 19.06. From the chi-square table we get χ²_4(0.01) = 13.28. Thus we reject the null hypothesis and conclude that the treatments differ.

The SAS procedure NPAR1WAY may be used to obtain the Kruskal-Wallis test:

OPTIONS LS=80 PS=66 NODATE;
DATA MONT;
INPUT TS GROUP @@;
CARDS;
7 1 7 1 15 1 11 1 9 1
12 2 17 2 12 2 18 2 18 2
14 3 18 3 18 3 19 3 19 3
19 4 25 4 22 4 19 4 23 4
7 5 10 5 11 5 15 5 11 5
;
PROC NPAR1WAY WILCOXON;
CLASS GROUP;
VAR TS;
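The rank transformation and the KW_0 statistic defined above can be sketched directly. The following Python fragment (not part of the original notes) assigns average ranks to ties, applies the two formulas, and reproduces the chi-square value 19.0637 printed by PROC NPAR1WAY.

```python
from collections import defaultdict

groups = {
    1: [7, 7, 15, 11, 9],
    2: [12, 17, 12, 18, 18],
    3: [14, 18, 18, 19, 19],
    4: [19, 25, 22, 19, 23],
    5: [7, 10, 11, 15, 11],
}

# Pool all observations and assign average ranks to tied values
pooled = sorted(v for g in groups.values() for v in g)
positions = defaultdict(list)
for pos, v in enumerate(pooled, start=1):
    positions[v].append(pos)
avg_rank = {v: sum(p) / len(p) for v, p in positions.items()}

n = len(pooled)
c = n * (n + 1) ** 2 / 4               # the n(n+1)^2/4 correction term
ranks = {i: [avg_rank[v] for v in g] for i, g in groups.items()}

# S^2 and KW_0 exactly as defined in the text
s2 = (sum(r * r for rs in ranks.values() for r in rs) - c) / (n - 1)
kw = (sum(sum(rs) ** 2 / len(rs) for rs in ranks.values()) - c) / s2
print(round(s2, 2), round(kw, 4))
```

The rank sums it produces (27.5, 66, 85, 113, 33.5) match the table above, and KW_0 exceeds χ²_4(0.01) = 13.28, so the treatments differ.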

                     The NPAR1WAY Procedure

           Wilcoxon Scores (Rank Sums) for Variable TS
                  Classified by Variable GROUP

                   Sum of    Expected     Std Dev       Mean
  GROUP      N     Scores    Under H0    Under H0      Score
  -----------------------------------------------------------
  1          5      27.50        65.0   14.634434       5.50
  2          5      66.00        65.0   14.634434      13.20
  3          5      85.00        65.0   14.634434      17.00
  4          5     113.00        65.0   14.634434      22.60
  5          5      33.50        65.0   14.634434       6.70

           Average scores were used for ties.

                    Kruskal-Wallis Test

             Chi-Square          19.0637
             DF                        4
             Pr > Chi-Square      0.0008

Variance Stabilizing Transformations

There are several variance stabilizing transformations one might consider in the case of heterogeneity of variance (heteroscedasticity). The common transformations are

    √y,  log(y),  1/y,  arcsin(√y),  1/√y.

A simple method of choosing the appropriate transformation is to plot log S_i versus log ȳ_i·, or to regress log S_i on log ȳ_i·. We then choose the transformation depending on the slope of the relationship. The following table may be used as a guide:

    Slope    Transformation
      0      No transformation
     1/2     Square root
      1      Log
     3/2     Reciprocal square root
      2      Reciprocal

A slightly more involved technique for choosing a variance stabilizing transformation is the Box-Cox transformation. It uses the maximum likelihood method to simultaneously estimate the transformation parameter as well as the overall mean and the treatment effects.
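The slope-based rule above is easy to try on the tensile strength data. The Python sketch below (an illustration, not part of the original notes) regresses log S_i on log ȳ_i; for these data the slope comes out close to zero, consistent with no transformation being needed.

```python
import math

groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

# Log of group means (x) against log of group standard deviations (y);
# the base of the log does not affect the slope
x = [math.log(mean(g)) for g in groups]
y = [math.log(sd(g)) for g in groups]

mx, my = mean(x), mean(y)
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
    (a - mx) ** 2 for a in x
)
print(round(slope, 2))  # near 0, so the table suggests no transformation
```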

Chapter 2

Randomized Blocks, Latin Squares, and Related Designs

2.1 The Randomized Complete Block Design

2.1.1 Introduction

In a completely randomized design (CRD), treatments are assigned to the experimental units in a completely random manner. The random error component arises because of all the variables which affect the dependent variable except the one controlled variable, the treatment. Naturally, the experimenter wants to reduce the errors which account for differences among observations within each treatment. One of the ways in which this could be achieved is through blocking. This is done by identifying supplemental variables that are used to group experimental subjects that are homogeneous with respect to that variable. This creates differences among the blocks and makes observations within a block similar.

The simplest design that would accomplish this is known as a randomized complete block design (RCBD). Each block is divided into k subblocks of equal size. Within each block the k treatments are assigned at random to the subblocks. The design is complete in the sense that each block contains all k treatments.

The following layout shows a RCBD with k treatments and b blocks. There is one observation per treatment in each block, and the treatments are run in a random order within each block.

             Treatment 1   Treatment 2   ...   Treatment k
  Block 1       y_11          y_21       ...      y_k1
  Block 2       y_12          y_22       ...      y_k2
  Block 3       y_13          y_23       ...      y_k3
  ...
  Block b       y_1b          y_2b       ...      y_kb

The statistical model for the RCBD is

    y_ij = µ + τ_i + β_j + ɛ_ij,   i = 1, ..., k,  j = 1, ..., b,   (2.1)

where µ is the overall mean, τ_i is the ith treatment effect, β_j is the effect of the jth block, and ɛ_ij is the random error term associated with the ijth observation.
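The within-block randomization described above can be sketched in a few lines. The fragment below (Python, purely illustrative and not part of the original notes) draws an independent random run order of the k treatments inside each block, with a fixed seed so the layout is reproducible.

```python
import random

k, b = 4, 3                        # e.g. k = 4 treatments and b = 3 blocks
treatments = list(range(1, k + 1))

random.seed(42)                    # fixed seed: illustrative, reproducible layout
# Each block gets its own random permutation of all k treatments
layout = {block: random.sample(treatments, k) for block in range(1, b + 1)}
for block, order in layout.items():
    print(f"block {block}: run treatments in order {order}")
```

Every block contains each treatment exactly once, which is precisely the "complete" property of the RCBD.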

We make the following assumptions concerning the RCBD model:

    Σ_{i=1}^k τ_i = 0,   Σ_{j=1}^b β_j = 0,   and   ɛ_ij ~ iid N(0, σ²).

We are mainly interested in testing the hypotheses

    H_0 : µ_1 = µ_2 = ... = µ_k
    H_A : µ_i ≠ µ_j for at least one pair i ≠ j.

Here the ith treatment mean is defined as

    µ_i = (1/b) Σ_{j=1}^b (µ + τ_i + β_j) = µ + τ_i.

Thus the above hypotheses may be written equivalently as

    H_0 : τ_1 = τ_2 = ... = τ_k = 0
    H_A : τ_i ≠ 0 for at least one i.

2.1.2 Decomposition of the Total Sum of Squares

Let n = kb be the total number of observations. Define

    ȳ_i· = (1/b) Σ_{j=1}^b y_ij,   i = 1, ..., k,
    ȳ_·j = (1/k) Σ_{i=1}^k y_ij,   j = 1, ..., b,
    ȳ_·· = (1/n) Σ_{i=1}^k Σ_{j=1}^b y_ij = (1/k) Σ_{i=1}^k ȳ_i· = (1/b) Σ_{j=1}^b ȳ_·j.

One may show that

    Σ_{i=1}^k Σ_{j=1}^b (y_ij − ȳ_··)² = b Σ_{i=1}^k (ȳ_i· − ȳ_··)²
                                       + k Σ_{j=1}^b (ȳ_·j − ȳ_··)²
                                       + Σ_{i=1}^k Σ_{j=1}^b (y_ij − ȳ_i· − ȳ_·j + ȳ_··)².

Thus the total sum of squares is partitioned into the sum of squares due to the treatments, the sum of squares due to the blocking, and the sum of squares due to error. Symbolically,

    SS_T = SS_Treatments + SS_Blocks + SS_E.

The degrees of freedom are partitioned accordingly as

    (n − 1) = (k − 1) + (b − 1) + (k − 1)(b − 1).
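The decomposition above is an algebraic identity, so it holds exactly for any complete two-way table. The Python sketch below (with made-up numbers, not data from the notes) verifies it numerically to machine precision.

```python
# A small made-up k = 3 by b = 4 response table (rows: treatments, cols: blocks)
y = [
    [12.0, 14.0, 11.0, 13.0],
    [15.0, 18.0, 14.0, 17.0],
    [10.0, 12.0,  9.0, 12.0],
]
k, b = len(y), len(y[0])
n = k * b

grand = sum(map(sum, y)) / n
ti = [sum(row) / b for row in y]                              # treatment means
bj = [sum(y[i][j] for i in range(k)) / k for j in range(b)]   # block means

ss_t = sum((y[i][j] - grand) ** 2 for i in range(k) for j in range(b))
ss_trt = b * sum((m - grand) ** 2 for m in ti)
ss_blk = k * sum((m - grand) ** 2 for m in bj)
ss_e = sum(
    (y[i][j] - ti[i] - bj[j] + grand) ** 2 for i in range(k) for j in range(b)
)

# SS_T equals SS_Treatments + SS_Blocks + SS_E exactly (up to floating point)
print(round(ss_t, 6), round(ss_trt + ss_blk + ss_e, 6))
```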

2.1.3 Statistical Analysis

Testing

The test for equality of treatment means is done using the test statistic

    F_0 = MS_Treatments / MS_E,

where

    MS_Treatments = SS_Treatments / (k − 1)   and   MS_E = SS_E / [(k − 1)(b − 1)].

An α level test rejects H_0 if F_0 > F_{k−1,(k−1)(b−1)}(α). The ANOVA table for the RCBD is

  Source        df                SS              MS              F-statistic
  Treatments    k − 1             SS_Treatments   MS_Treatments   F_0 = MS_Treatments/MS_E
  Blocks        b − 1             SS_Blocks       MS_Blocks       F_B = MS_Blocks/MS_E
  Error         (k − 1)(b − 1)    SS_E            MS_E
  Total         n − 1             SS_T

Since there is no randomization of treatments across blocks, the use of F_B = MS_Blocks/MS_E as a test for block effects is questionable. However, a large value of F_B would indicate that the blocking variable is probably having the intended effect of reducing noise.

Estimation

Estimation of the model parameters is performed using the least squares procedure, as in the case of the completely randomized design. The estimators of µ, τ_i, and β_j are obtained via minimization of the sum of squares of the errors,

    L = Σ_{i=1}^k Σ_{j=1}^b ɛ_ij² = Σ_{i=1}^k Σ_{j=1}^b (y_ij − µ − τ_i − β_j)².

The solution is

    µ̂ = ȳ_··,
    τ̂_i = ȳ_i· − ȳ_··,   i = 1, ..., k,
    β̂_j = ȳ_·j − ȳ_··,   j = 1, ..., b.

From the model in (2.1), we can see that the estimated values of y_ij are

    ŷ_ij = µ̂ + τ̂_i + β̂_j = ȳ_·· + (ȳ_i· − ȳ_··) + (ȳ_·j − ȳ_··) = ȳ_i· + ȳ_·j − ȳ_··.

Example

An experiment was designed to study the performance of four different detergents for cleaning clothes. The following cleanliness readings (higher = cleaner) were obtained using a special device for three different types of common stains. Is there a significant difference among the detergents?

               Stain 1   Stain 2   Stain 3   Total
  Detergent 1     45        43        51      139
  Detergent 2     47        46        52      145
  Detergent 3     48        50        55      153
  Detergent 4     42        37        49      128
  Total          182       176       207      565

Using the formulae for SS given above, one may compute

    SS_T = 265,   SS_Treatments = 111,   SS_Blocks = 135,
    SS_E = 265 − 111 − 135 = 19.

Thus

    F_0 = (111/3) / (19/6) = 11.6,

which has a p-value < 0.01. Thus we claim that there is a significant difference among the four detergents. The following SAS code gives the ANOVA table:

OPTIONS LS=80 PS=66 NODATE;
DATA WASH;
INPUT STAIN SOAP Y @@;
CARDS;
1 1 45 1 2 47 1 3 48 1 4 42
2 1 43 2 2 46 2 3 50 2 4 37
3 1 51 3 2 52 3 3 55 3 4 49
;
PROC GLM;
CLASS STAIN SOAP;
MODEL Y = SOAP STAIN;

The corresponding output is

  Dependent Variable: Y
                        The GLM Procedure
                              Sum of
  Source            DF       Squares      Mean Square   F Value   Pr > F
  Model              5   246.0833333       49.2166667     15.68   0.0022
  Error              6    18.8333333        3.1388889
  Corrected Total   11   264.9166667

  R-Square    Coeff Var    Root MSE     Y Mean
  0.928908     3.762883    1.771691   47.08333

  Source    DF    Type I SS      Mean Square   F Value   Pr > F
  SOAP       3    110.9166667     36.9722222     11.78   0.0063
  STAIN      2    135.1666667     67.5833333     21.53   0.0018

  Source    DF    Type III SS    Mean Square   F Value   Pr > F
  SOAP       3    110.9166667     36.9722222     11.78   0.0063
  STAIN      2    135.1666667     67.5833333     21.53   0.0018

The SAS Type I analysis gives the correct F = 11.78 with a p-value of 0.0063. An incorrect analysis of the data using a one-way ANOVA setup (ignoring the blocking factor) is

PROC GLM;
CLASS SOAP;
MODEL Y = SOAP;

The corresponding output is

  Dependent Variable: Y
                        The GLM Procedure
                              Sum of
  Source            DF       Squares      Mean Square   F Value   Pr > F
  Model              3   110.9166667       36.9722222      1.92   0.2048
  Error              8   154.0000000       19.2500000
  Corrected Total   11   264.9166667

Notice that H_0 is not rejected, indicating no significant difference among the detergents.

2.1.4 Relative Efficiency of the RCBD

The example in the previous section shows that the RCBD and the CRD may lead to different conclusions. A natural question to ask is: how much more efficient is the RCBD compared to a CRD? One way to define this relative efficiency is

    R = [(df_b + 1)(df_r + 3)] / [(df_b + 3)(df_r + 1)] × σ_r²/σ_b²,

where σ_r² and σ_b² are the error variances of the CRD and RCBD, respectively, and df_r and df_b are the corresponding error degrees of freedom. R is the factor by which the number of replications must be increased if a CRD is used to achieve the same precision as a RCBD. Using the ANOVA table from the RCBD, we may estimate σ_r² and σ_b² as

    σ̂_b² = MS_E,
    σ̂_r² = [(b − 1) MS_Blocks + b(k − 1) MS_E] / (kb − 1).

Example

Consider the detergent example considered in the previous section. From the ANOVA table for the RCBD we see that

    MS_E = 3.139,   df_b = (k − 1)(b − 1) = 6,   df_r = kb − k = 8.

Thus

    σ̂_b² = MS_E = 3.139,
    σ̂_r² = [(2)(67.58) + (3)(3)(3.139)] / (12 − 1) = 14.86.

The relative efficiency of the RCBD to the CRD is estimated to be

    R̂ = [(df_b + 1)(df_r + 3)] / [(df_b + 3)(df_r + 1)] × σ̂_r²/σ̂_b²
      = [(6 + 1)(8 + 3)(14.86)] / [(6 + 3)(8 + 1)(3.139)] = 4.5.

This means that a CRD would need about 4.5 times as many replications to obtain the same precision as obtained by blocking on stain types.
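The hand computations and the relative-efficiency estimate above can be cross-checked outside SAS. The Python sketch below (not part of the original notes) recomputes the RCBD ANOVA for the detergent data and then the estimate R̂, reproducing F ≈ 11.78 and R̂ ≈ 4.5.

```python
# Detergent data: rows = detergents (treatments), columns = stains (blocks)
y = [
    [45, 43, 51],
    [47, 46, 52],
    [48, 50, 55],
    [42, 37, 49],
]
k, b = len(y), len(y[0])
grand = sum(map(sum, y)) / (k * b)

trt_means = [sum(row) / b for row in y]
blk_means = [sum(y[i][j] for i in range(k)) / k for j in range(b)]

ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
ss_blk = k * sum((m - grand) ** 2 for m in blk_means)
ss_t = sum((y[i][j] - grand) ** 2 for i in range(k) for j in range(b))
ss_e = ss_t - ss_trt - ss_blk

ms_blk = ss_blk / (b - 1)
ms_e = ss_e / ((k - 1) * (b - 1))
f0 = (ss_trt / (k - 1)) / ms_e        # about 11.78, as in the SAS output

# Estimated relative efficiency of the RCBD versus a CRD
df_b = (k - 1) * (b - 1)              # 6
df_r = k * b - k                      # 8
sig_r2 = ((b - 1) * ms_blk + b * (k - 1) * ms_e) / (k * b - 1)
r_hat = ((df_b + 1) * (df_r + 3)) / ((df_b + 3) * (df_r + 1)) * sig_r2 / ms_e
print(round(f0, 2), round(r_hat, 2))
```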

Another natural question is: what is the cost of blocking if the blocking variable is not really important, i.e., if blocking was not necessary? The answer to this question lies in the differing degrees of freedom we use for the error term. Notice that we are using (k − 1)(b − 1) degrees of freedom in the RCBD as opposed to kb − k in the case of a CRD. Thus we lose b − 1 degrees of freedom unnecessarily. This makes the test on the treatment means less sensitive, i.e., differences among the means may remain undetected.

2.1.5 Comparison of Treatment Means

As in the case of the CRD, we are interested in multiple comparisons to find out which treatment means differ. We may use any of the multiple comparison procedures discussed in Chapter 1. The only difference here is that we use the number of blocks, b, in place of the common sample size; thus in all the equations we replace n_i by b.

Example

Once again consider the detergent example of the previous section. Suppose we wish to make pairwise comparisons of the treatment means via the Tukey-Kramer procedure. The Tukey-Kramer procedure declares two treatment means, µ_i and µ_j, to be significantly different if the absolute value of their sample difference exceeds

    T_α = q_{k,(k−1)(b−1)}(α) √[(MS_E/2)(2/b)] = q_{k,(k−1)(b−1)}(α) √(MS_E/b),

where q_{k,(k−1)(b−1)}(α) is the α percentile value of the studentized range distribution with k groups and (k − 1)(b − 1) degrees of freedom. The sample treatment means are

    ȳ_1· = 46.33,   ȳ_2· = 48.33,   ȳ_3· = 51.00,   ȳ_4· = 42.67.

We also have

    T_.05 = q_{4,6}(.05) √(3.139/3) = (4.90)(1.023) = 5.0127.

Thus, using underlining,

    ȳ_4·    ȳ_1·    ȳ_2·    ȳ_3·
    42.67   46.33   48.33   51.00
    -------------
            ---------------------

2.1.6 Model Adequacy Checking

Additivity

The initial assumption we made when considering the model y_ij = µ + τ_i + β_j + ɛ_ij is that the model is additive. If the first treatment increases the expected response by 2 and the first block increases it by 4, then, according to our model, the expected increase of the response in block 1 under treatment 1 is 6. This setup rules out the possibility
of interactions between blocks and treatments. In reality, the way the treatment affects the outcome may be different from block to block. A quick graphical check for nonadditivity is to plot the residuals, e_ij = y_ij − ŷ_ij, versus the fitted values, ŷ_ij. Any nonlinear pattern indicates nonadditivity. A formal test is Tukey's one degree of freedom test for nonadditivity. We start out by fitting the model

    y_ij = µ + τ_i + β_j + γ τ_i β_j + ɛ_ij.

Then testing the hypothesis

    H_0 : γ = 0

is equivalent to testing for the presence of nonadditivity. We use the regression approach to testing, fitting full and reduced models. Here is the procedure:

Fit the model

    y_ij = µ + τ_i + β_j + ɛ_ij.

Let e_ij and ŷ_ij be the residual and the fitted value, respectively, corresponding to observation ij resulting from fitting the model above.

Let z_ij = ŷ_ij² and fit

    z_ij = µ + τ_i + β_j + ɛ_ij.

Let r_ij = z_ij − ẑ_ij be the residuals from this model.

Regress e_ij on r_ij, i.e., fit the model

    e_ij = α + γ r_ij + ɛ_ij.

Let γ̂ be the estimated slope. The sum of squares due to nonadditivity is

    SS_N = γ̂² Σ_{i=1}^k Σ_{j=1}^b r_ij².

The test statistic for nonadditivity is

    F_0 = (SS_N/1) / [ (SS_E − SS_N) / ((k − 1)(b − 1) − 1) ].

Example

The impurity in a chemical product is believed to be affected by pressure. We will use temperature as a blocking variable. The data are given below.

                    Pressure
  Temp     25    30    35    40    45
  100       5     4     6     3     5
  125       3     1     4     2     3
  150       1     1     3     1     2

The following SAS code is used:

Options ls=80 ps=66 nodate;
title "Tukey's 1 DF Nonadditivity Test";
Data Chemical;
Input Temp @;
Do Pres = 25,30,35,40,45;
Input Im @;
output;
end;
cards;
100 5 4 6 3 5
125 3 1 4 2 3
150 1 1 3 1 2
;
proc print;
run; quit;
proc glm;
class Temp Pres;
model Im = Temp Pres;

output out=out1 predicted=pred;
run; quit;
/* Form a new variable called Psquare */
Data Tukey;
set out1;
Psquare = Pred*Pred;
run; quit;
proc glm;
class Temp Pres;
model Im = Temp Pres Psquare;
run; quit;

The following is the corresponding output:

  Tukey's 1 DF Nonadditivity Test
                        The GLM Procedure
  Dependent Variable: Im
                              Sum of
  Source            DF       Squares      Mean Square   F Value   Pr > F
  Model              7   35.03185550       5.00455079     18.42   0.0005
  Error              7    1.90147783       0.27163969
  Corrected Total   14   36.93333333

  R-Square    Coeff Var    Root MSE    Im Mean
  0.948516     17.76786    0.521191   2.933333

  Source     DF    Type I SS      Mean Square   F Value   Pr > F
  Temp        2    23.33333333    11.66666667     42.95   0.0001
  Pres        4    11.60000000     2.90000000     10.68   0.0042
  Psquare     1     0.09852217     0.09852217      0.36   0.5660

  Source     DF    Type III SS    Mean Square   F Value   Pr > F
  Temp        2     1.25864083     0.62932041      2.32   0.1690
  Pres        4     1.09624963     0.27406241      1.01   0.4634
  Psquare     1     0.09852217     0.09852217      0.36   0.5660

Thus F_0 = 0.36 with 1 and 7 degrees of freedom. It has a p-value of 0.5660. Thus we have no evidence to declare nonadditivity.

Normality

The diagnostic tools for the normality of the error terms are the same as those used in the case of the CRD. The graphical tools are the QQ-plot and the histogram of the residuals. Formal tests like the Kolmogorov-Smirnov test may also be used to assess the normality of the errors.

Example

Consider the detergent example above. The following SAS code gives the normality diagnostics:

OPTIONS LS=80 PS=66 NODATE;
DATA WASH;
INPUT STAIN SOAP Y @@;
CARDS;
1 1 45 1 2 47 1 3 48 1 4 42
2 1 43 2 2 46 2 3 50 2 4 37
3 1 51 3 2 52 3 3 55 3 4 49
;
PROC GLM;
CLASS STAIN SOAP;
MODEL Y = SOAP STAIN;
MEANS SOAP/ TUKEY LINES;
OUTPUT OUT=DIAG R=RES P=PRED;
PROC UNIVARIATE NOPRINT;
QQPLOT RES / NORMAL (L=1 MU=0 SIGMA=EST);
HIST RES / NORMAL (L=1 MU=0 SIGMA=EST);
PROC GPLOT;
PLOT RES*SOAP;
PLOT RES*STAIN;
PLOT RES*PRED;

The associated output is (figures given first):

        Tukey's Studentized Range (HSD) Test for Y

  NOTE: This test controls the Type I experimentwise error rate, but it
  generally has a higher Type II error rate than REGWQ.

            Alpha                                   0.05
            Error Degrees of Freedom                   6
            Error Mean Square                   3.138889
            Critical Value of Studentized Range  4.89559
            Minimum Significant Difference        5.0076

     Means with the same letter are not significantly different.

  Tukey Grouping       Mean     N    SOAP
             A       51.000     3    3
             A
             A       48.333     3    2
             A
         B   A       46.333     3    1
         B
         B           42.667     3    4

                     RCBD Diagnostics
                 The UNIVARIATE Procedure
               Fitted Distribution for RES

           Parameters for Normal Distribution
           Parameter    Symbol    Estimate
           Mean         Mu               0
           Std Dev      Sigma     1.252775

         Goodness-of-Fit Tests for Normal Distribution

  Test                 ---Statistic----    -----p Value-----
  Cramer-von Mises     W-Sq  0.01685612    Pr > W-Sq  >0.250
  Anderson-Darling     A-Sq  0.13386116    Pr > A-Sq  >0.250

The QQ-plot and the formal tests do not indicate the presence of nonnormality of the errors.

Constant Variance

The tests for constant variance are the same as those used in the case of the CRD. One may use formal tests, such as Levene's test, or perform graphical checks to see if the assumption of constant variance is satisfied. The plots we need to examine in this case are residuals versus blocks, residuals versus treatments, and residuals versus predicted values. The plots below (produced by the SAS code above) suggest that there may be nonconstant variance: the spread of the residuals seems to differ from detergent to detergent. We may need to transform the values and rerun the analysis.

2.1.7 Missing Values

In a randomized complete block design, each treatment appears once in every block. A missing observation would therefore mean a loss of the completeness of the design. One way to proceed would be to use a multiple regression analysis. Another way would be to estimate the missing value. If only one value is missing, say y_ij, then we substitute the value

    y_ij = [k T_i + b T_j − T] / [(k − 1)(b − 1)],

where T_i is the total for treatment i, T_j is the total for block j, and T is the grand total, all computed from the remaining observations. We then substitute y_ij and carry out the ANOVA as usual. There will, however, be a loss of one degree of freedom from both the total and error sums of squares. Since the substituted value adds no practical information to the design, it should not be used in computations of means, for instance when performing multiple comparisons.

When more than one value is missing, they may be estimated via an iterative process. We first guess the values of all except one. We then estimate the one missing value using the procedure above. We then estimate the second one using the one estimated value and the remaining guessed values, and proceed to estimate the rest in a similar fashion. We repeat this process until convergence, i.e., until the difference between consecutive estimates is small. If several observations are missing from a single block or a single treatment group, we usually eliminate the block or treatment in question. The analysis is then performed as if the block or treatment were nonexistent.

Example

Consider the detergent comparison example, and suppose y_{4,2} = 37 is missing. Note that the totals (without 37) are T_4 = 91, T_2 = 139, and T = 528. The estimate is

    y_{4,2} = [4(91) + 3(139) − 528] / 6 = 42.17.
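The single-missing-value formula above is simple enough to check in a few lines. The Python sketch below (not part of the original notes) recomputes the estimate for the detergent data with y_{4,2} treated as missing.

```python
# Detergent data with y_{4,2} (detergent 4, stain 2) treated as missing
y = {
    (1, 1): 45, (1, 2): 43, (1, 3): 51,
    (2, 1): 47, (2, 2): 46, (2, 3): 52,
    (3, 1): 48, (3, 2): 50, (3, 3): 55,
    (4, 1): 42, (4, 3): 49,            # (4, 2) is missing
}
k, b = 4, 3
i, j = 4, 2                            # position of the missing cell

t_i = sum(v for (a, c), v in y.items() if a == i)  # treatment total, 91
t_j = sum(v for (a, c), v in y.items() if c == j)  # block total, 139
t = sum(y.values())                                # grand total, 528

# y'_{ij} = (k*T_i + b*T_j - T) / ((k - 1)(b - 1))
estimate = (k * t_i + b * t_j - t) / ((k - 1) * (b - 1))
print(round(estimate, 2))              # 42.17
```

The estimate 42.17 would then be substituted into the table and the ANOVA rerun, with one degree of freedom removed from the total and error sums of squares.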