RBD for fungicide treatments. The GLM Procedure Class Level Information Class Levels Values block fungicide 3 Control New Old

Transcription

1 CROP 590 Lab 3 on the Randomized Complete Block Design: Annotations for SAS output

RBD for fungicide treatments

Class Level Information
Class      Levels  Values
block      (levels and values not preserved)
fungicide  3       Control New Old

Number of Observations Read 12
Number of Observations Used 12

The data set is complete, so all 12 observations are used.

Dependent Variable: rootwt

[ANOVA table: Source (Model, Error, Corrected Total) by DF, Sum of Squares, Mean Square, F Value, Pr > F; numeric values not preserved]

The Model line tests the combined effects of including both blocks and fungicide in the model. The model SS is the sum of the block and fungicide SS. The Mean Square on the Error line is the Mean Square Error (MSE).

[Fit statistics: R-Square, Coeff Var, Root MSE, rootwt Mean; numeric values not preserved]

[Type I SS and Type III SS tables: Source (block, fungicide) by DF, SS, Mean Square, F Value, Pr > F; numeric values not preserved]

We will be using Type III SS for the most part in this class. It is the same as Type I SS when the data are balanced (no missing plots). The default in R is to use Type I SS.

Blocking may have been useful because the MS for block is bigger than the MS for error, but the test is not significant at the 0.05 level. Differences among fungicide treatments are highly significant.
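The program that produced this output is not shown in these notes. A minimal sketch of a PROC GLM step that would give tables of this kind follows. The data set name one is borrowed from the GLMPOWER step later in this lab, and the exact statements and options used in class are assumptions.

```sas
/* Sketch only: the original program is not shown in these notes.        */
/* Assumes a data set named one with variables block, fungicide, rootwt. */
title 'RBD for fungicide treatments';
proc glm data=one;
  class block fungicide;            /* both factors are classification variables */
  model rootwt = block fungicide;   /* Type I and Type III SS are printed by default */
  means fungicide / lsd;            /* Fisher's LSD mean separation */
  lsmeans fungicide / stderr;       /* least squares means with standard errors */
run;
quit;
```

Listing block before fungicide in the MODEL statement matters only for the Type I (sequential) SS; the Type III results do not depend on the order.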

2 RBD for fungicide treatments

The spread (variation) among replications (blocks) for each treatment group appears to be of a similar magnitude. This suggests that our assumption of homogeneous variance is reasonable. Next week we will learn how to make formal tests to see if the ANOVA assumptions are met.

3 RBD for fungicide treatments

t Tests (LSD) for rootwt

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                          0.05
Error Degrees of Freedom       6
Error Mean Square              (value not preserved)
Critical Value of t            (value not preserved)
Least Significant Difference   (value not preserved)

The critical t that is used to calculate the LSD is based on the df associated with the error term (MSE). When the difference between the means for a pair of treatments exceeds the LSD, they are significantly different at the 0.05 probability level.

Means with the same letter are not significantly different.

t Grouping  Mean  N  fungicide
A           (not preserved)  New
B           (not preserved)  Old
C           (not preserved)  Control

All of the means have been assigned different letters, so they are all significantly different from each other.

Least Squares Means
fungicide  rootwt LSMEAN  Standard Error  Pr > |t|
Control    (not preserved)                <.0001
New        (not preserved)                <.0001
Old        (not preserved)                <.0001

The t tests for individual means just tell us that the means are significantly different from zero (which is not very surprising or interesting for the trait rootwt).

We requested lsmeans because it has the stderr option (standard error of a mean). Because the data are balanced (no missing plots), the lsmeans are the same as the arithmetic means and the standard errors are the same for all means. The lsmeans are sorted in alphanumeric order based on the treatment names. For the LSD test output, the means were sorted from high to low based on their values.
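For reference, the LSD rule described above can be written out for the balanced case. The symbols are added here for illustration and do not appear in the SAS output: r is the number of blocks (12 observations / 3 treatments = 4), MSE is the Error Mean Square, and df_e is the error degrees of freedom (6 here). This is the standard textbook form, offered as a sketch rather than as the exact computation SAS performs.

```latex
\mathrm{LSD}_{0.05} = t_{0.025,\,df_e}\,\sqrt{\frac{2\,\mathrm{MSE}}{r}},
\qquad
\left|\bar{y}_{i} - \bar{y}_{j}\right| > \mathrm{LSD}_{0.05}
\;\Longrightarrow\; \text{means } i \text{ and } j \text{ differ at the 0.05 level}
```

Under the same assumptions, the standard error reported for each lsmean is \sqrt{\mathrm{MSE}/r}, which is why it is identical for all three treatments when no plots are missing.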

4 RBD with missing plots

Class Level Information
Class      Levels  Values
block      (levels and values not preserved)
fungicide  3       Control New Old

Number of Observations Read 12
Number of Observations Used 11

Now there is a missing plot, so only 11 observations are used.

Dependent Variable: rootwt

[ANOVA table: Source (Model, Error, Corrected Total) by DF, Sum of Squares, Mean Square, F Value, Pr > F; numeric values not preserved]

[Fit statistics: R-Square, Coeff Var, Root MSE, rootwt Mean; numeric values not preserved]

There are only 5 df for error now rather than 6. The MSE is similar to (but not exactly the same as) before.

[Type I SS and Type III SS tables: Source (block, fungicide) by DF, SS, Mean Square, F Value, Pr > F; numeric values not preserved]

There are still highly significant differences among the fungicide treatments. The Type I and Type III MS are no longer the same for blocks because of the missing plot. There is no difference between Type I and Type III for fungicide because it is the last term entered in the model.

Type I SS are sequential, so the order of terms in the model makes a difference. Type III SS are partial: each term is always tested as if it were the last term in the model. Consequently, for Type III SS the order in which terms appear in the model statement makes no difference.
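The sequential versus partial distinction can be stated in reduction-in-sum-of-squares notation. The R(· | ·) notation is added here for illustration and does not appear in the SAS output. It assumes the term order block, fungicide shown in the tables above and a model with no interaction term.

```latex
\begin{aligned}
\text{Type I:}\quad
  & SS(\text{block}) = R(\text{block}\mid\mu),
  && SS(\text{fungicide}) = R(\text{fungicide}\mid\mu,\text{block}) \\
\text{Type III:}\quad
  & SS(\text{block}) = R(\text{block}\mid\mu,\text{fungicide}),
  && SS(\text{fungicide}) = R(\text{fungicide}\mid\mu,\text{block})
\end{aligned}
```

The fungicide entries agree because fungicide is adjusted for block under both types. The block entries differ only when the data are unbalanced, since with balanced data block and fungicide are orthogonal and adjusting for fungicide changes nothing.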

5 RBD with missing plots

Least Squares Means
fungicide  rootwt LSMEAN  Standard Error  Pr > |t|  LSMEAN Number
[Rows for Control, New, and Old; numeric values not preserved]

Now the lsmean for the Old fungicide has a higher standard error because it was missing from Block 1. The lsmean estimate for the Old fungicide is also adjusted for the missing observation in Block 1. The t tests here only tell us that the means are significantly different from zero.

Least Squares Means for Effect fungicide
t for H0: LSMean(i)=LSMean(j) / Pr > |t|
Dependent Variable: rootwt
[i/j matrix of t values and probabilities; numeric values not preserved]

This table shows t values and probabilities for t tests comparing all possible pairs of means. The i/j values correspond to the different fungicide treatments, i.e., Control=1, New=2, and Old=3. All probability values are <0.05, so all of the mean comparisons are significant. Adjustments have been made for the fact that the Old fungicide only had 3 reps. The graph below depicts the confidence intervals for each of the comparisons.

[Graph of confidence intervals for the pairwise comparisons; image not preserved]

6 RBD with missing plots

Obs  block  fungicide  rootwt  Yhat  Resid
[12 rows, listed as Control, Old, New within each block; numeric values not preserved]

This table shows the contents of the output data set that we created, called new. It includes predicted values (Yhat) and residuals (errors not accounted for by the model). Note that for any observation, the residual is the difference between the observed value of rootwt and the predicted value. Predicted values are determined from the average effects for blocks and treatments. There is no residual associated with the missing plot, so it doesn't add anything to the Error SS.

For treatments with no missing plots, the arithmetic average of the observations across all blocks equals the average of the predicted values. For the Old fungicide treatment, the lsmean could be calculated by including its predicted value for block 1 in an average with the observations for that treatment in blocks 2, 3, and 4.

There are philosophical differences of opinion on the merits of calculating lsmeans (many programmers in the R community don't like them). As a plant breeder I find them useful, because I need to evaluate many genotypes and must consider many traits at the same time in order to make the best selections. With lsmeans, I can use all of the information that I have available and be reasonably confident that I am making fair comparisons among genotypes.
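The output data set described above could be produced along the following lines. This is a sketch, not the program used in class: the input data set name two is taken from the PROC MIXED output later in this lab and is assumed to be the same data analyzed here, while the names new, Yhat, and Resid come from the table on this slide.

```sas
/* Sketch only: the original program is not shown in these notes.     */
/* Assumes the data set with the missing plot is named two.           */
proc glm data=two;
  class block fungicide;
  model rootwt = block fungicide;
  lsmeans fungicide / stderr pdiff;   /* lsmeans plus pairwise t tests */
  output out=new p=Yhat r=Resid;      /* save predicted values and residuals */
run;
quit;

proc print data=new;                  /* lists block, fungicide, rootwt, Yhat, Resid */
run;
```

If the missing plot is kept in the data set as a row with rootwt set to missing, the OUTPUT statement still writes a predicted value for that row, and its residual is missing.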

7 RBD with missing plots using mixed models

A mixed model includes both random effects and fixed effects. In this example, the fungicide treatments are fixed effects and the blocks are random. As computers have become more powerful, it has become possible to solve for the parameters in mixed models using restricted maximum likelihood (REML) estimation without the use of an ANOVA. PROC MIXED is a SAS procedure that can solve mixed model equations for variables that have a normal distribution.

The Mixed Procedure

Model Information
Data Set                   WORK.TWO
Dependent Variable         rootwt
Covariance Structure       Variance Components
Estimation Method          REML
Residual Variance Method   Profile
Fixed Effects SE Method    Model-Based
Degrees of Freedom Method  Containment

Class Level Information
Class      Levels  Values
block      (levels and values not preserved)
fungicide  3       Control New Old

Dimensions
Covariance Parameters  2
Columns in X           4
Columns in Z           4
Subjects               1
Max Obs Per Subject    12

Number of Observations
Number of Observations Read      12
Number of Observations Used      11
Number of Observations Not Used  1
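A PROC MIXED step consistent with the Model Information and Dimensions tables above might look like the sketch below. The program itself is not shown in these notes, so the statements and options are assumptions; only the data set name (WORK.TWO) and the variable names come from the output.

```sas
/* Sketch only: the original program is not shown in these notes. */
proc mixed data=two;
  class block fungicide;
  model rootwt = fungicide;      /* fixed effects: intercept + 3 fungicide columns in X */
  random block;                  /* random effects: one column in Z per block */
  lsmeans fungicide / pdiff;     /* lsmeans and differences of lsmeans */
run;
```

REML is the default estimation method in PROC MIXED, so no METHOD= option is needed to match the output shown here.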

8 RBD with missing plots using mixed models

Iteration History
Iteration  Evaluations  -2 Res Log Like  Criterion
[Iteration rows not preserved]

An iterative process is used until stable maximum likelihood estimates of parameters are obtained.

Convergence criteria met.

Covariance Parameter Estimates
Cov Parm  Estimate
block     (value not preserved)
Residual  (value not preserved)

This table shows estimates of variances for random effects in the model. The Residual variance should be similar to what we obtained for the MSE from PROC GLM (it would be identical for balanced data). The value shown for block is the variance component estimate for blocks.

Fit Statistics
-2 Res Log Likelihood     37.2
AIC (smaller is better)   41.2
AICC (smaller is better)  43.6
BIC (smaller is better)   40.0

Type 3 Tests of Fixed Effects
Effect     Num DF  Den DF  F Value  Pr > F
fungicide  (values not preserved)

The test for fixed effects (fungicide treatments) is the same as for the ANOVA. Note that there are no SS or MS, however.

9 RBD with missing plots using mixed models

Least Squares Means
Effect     fungicide  Estimate  Standard Error  DF  t Value  Pr > |t|
fungicide  Control    (not preserved)                        <.0001
fungicide  New        (not preserved)                        <.0001
fungicide  Old        (not preserved)                        <.0001

The lsmean for the Old fungicide is slightly different from what it was in PROC GLM because a slightly different adjustment has been made for the missing plot. Use of mixed models is recommended when there are missing plots.

The estimates of standard errors for means will generally be different in PROC MIXED than in PROC GLM, even when the data are balanced. They will tend to be a little larger because they include variation among the random effects (blocks in this case).

Differences of Least Squares Means
Effect     fungicide  _fungicide  Estimate  Standard Error  DF  t Value  Pr > |t|
fungicide  Control    New         (values not preserved)
fungicide  Control    Old         (values not preserved)
fungicide  New        Old         (values not preserved)

Conclusions from mean comparison tests are similar in PROC MIXED and PROC GLM. With balanced data, results would be the same for PROC MIXED and PROC GLM, because the standard error of a difference is not affected by the variation among blocks. Balanced comparisons can be made among treatments within each block.
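The statements about standard errors can be made concrete for the balanced case. The symbols are introduced here for illustration and are not part of the SAS output: \sigma^2_b is the block variance component, \sigma^2_e is the residual variance, and r is the number of blocks. The sketch assumes the block variance estimate is positive.

```latex
\begin{aligned}
\text{PROC GLM (blocks fixed):}\quad
  & SE(\bar{y}_{\cdot j}) = \sqrt{\hat{\sigma}^2_e / r} \\
\text{PROC MIXED (blocks random):}\quad
  & SE(\bar{y}_{\cdot j}) = \sqrt{(\hat{\sigma}^2_b + \hat{\sigma}^2_e) / r} \\
\text{Either procedure:}\quad
  & SE(\bar{y}_{\cdot j} - \bar{y}_{\cdot j'}) = \sqrt{2\,\hat{\sigma}^2_e / r}
\end{aligned}
```

The block term drops out of the difference because both treatment means are averaged over the same blocks, so the block effects cancel.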

10 Power for fungicide experiment

    proc glmpower data=one;
      class Block Fungicide;
      model RootWt = Block Fungicide;
      power
        stddev = [value not preserved]
        ntotal = 12
        power  = .;
    run;

The GLMPOWER Procedure

Fixed Scenario Elements
Dependent Variable        rootwt
Error Standard Deviation  (value not preserved)
Total Sample Size         12
Alpha                     0.05
Error Degrees of Freedom  6

Computed Power
Index  Source     Test DF  Power
1      block      (values not preserved)
2      fungicide  2        >.999

In our program, power was represented by a missing value symbol (.), so SAS will determine the value for power. Because the differences among treatments were large, power was very high in this experiment.

    proc glmpower data=one;
      class Block Fungicide;
      model RootWt = Block Fungicide;
      power
        stddev = [value not preserved]
        power  = 0.80
        ntotal = .;
    run;

The GLMPOWER Procedure

Fixed Scenario Elements
Dependent Variable        rootwt
Error Standard Deviation  (value not preserved)
Nominal Power             0.8
Alpha                     0.05

Computed N Total
Index  Source     Test DF  Error DF  Actual Power  N Total
1      block      (values not preserved)
2      fungicide  2        6         (values truncated)

For this program, power was set at 0.80 and the number of plots was represented by a (.), so SAS will determine the required number of experimental units to attain the desired power. Results show that 12 plots were sufficient to have a good chance of detecting differences among treatments.