5 Analysis of Variance models, complex linear models and Random effects models

In this chapter we will not show any of the theoretical background of the analysis. The focus is on practising the setup of ANOVA models in GenStat. GenStat comes with a very extensive help system and, in addition, several PDF files which serve as documentation and reference. You should also consult secondary literature on statistical modelling to use these procedures properly.

At the beginning of each chapter you will find a short introduction on how to generate the example field trials in GenStat. GenStat is also good software for creating field trial designs, although it is somewhat limited regarding the number of treatments. For complex designs, specialised software should be used. The design creation in GenStat always gives good advice on how the analysis model for a certain design should be set up. In any case you should consult a biometrician.

5.1 Basic syntax of ANOVA models

Table 5: Notation of ANOVA models

  Operator  Example  Meaning
  +         A+B      Main effects of A and B
  .         A.B      Interaction of A and B only
  *         A*B      A+B+A.B, factorial structure
  /         A/B      A+A.B, without main effect of B, but B nested within A

Table 6: Examples of ANOVA Models

  A*B*C               = A+B+C+A.B+A.C+B.C+A.B.C   (full factorial model)
  (A+B)*(C+D)         = (A+B)+(C+D)+(A+B).(C+D) = A+B+C+D+A.C+A.D+B.C+B.D
  Block/Plot/Subplot  = Block+Block.Plot+Block.Plot.Subplot
  A/(B*C)             = A+A.B+A.C+A.B.C

Recommendation: Syntax of ANOVA Models
You can reuse any model from the Input Log window: copy it into an extra script window, then edit the model to suit your needs. Whenever you are not completely sure about the syntax of the model, you should write it in the long form using only + and . (dot).

5.2 Analysis example: Potato yield Latin square

Please restart the GenStat Server via Restart Server (see 1.3.2).
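The operator notation of section 5.1 can be checked mechanically. The following Python sketch is not GenStat code: it is a small illustration, under the assumption that `/` is expanded as "left side plus (all factors of the left side).right side", and it reproduces the expansions listed in Table 6. The class name `Model` and its methods are hypothetical helpers introduced only for this example.

```python
from itertools import product


class Model:
    """A model formula as a set of terms; each term is a frozenset of factor names."""

    def __init__(self, terms):
        self.terms = frozenset(terms)

    @classmethod
    def factor(cls, name):
        # A single factor is a model with one one-factor term.
        return cls([frozenset([name])])

    def __add__(self, other):
        # A + B: union of the terms of both sides.
        return Model(self.terms | other.terms)

    def dot(self, other):
        # A.B: every term on the left combined with every term on the right.
        return Model(a | b for a, b in product(self.terms, other.terms))

    def __mul__(self, other):
        # A*B = A + B + A.B
        return self + other + self.dot(other)

    def __truediv__(self, other):
        # A/B = A + (all factors occurring in A).B  (assumed nesting rule)
        all_factors = frozenset().union(*self.terms)
        return self + Model([all_factors]).dot(other)

    def __str__(self):
        # Order terms by number of factors, then alphabetically.
        ordered = sorted(self.terms, key=lambda t: (len(t), sorted(t)))
        return "+".join(".".join(sorted(t)) for t in ordered)


A, B, C, D = (Model.factor(name) for name in "ABCD")
Block, Plot, Subplot = (Model.factor(n) for n in ("Block", "Plot", "Subplot"))

print(A * B * C)
print((A + B) * (C + D))
print(Block / Plot / Subplot)
print(A / (B * C))
```

Writing a model in this expanded form corresponds to the long form with + and . recommended above.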
5.2.1 Create Design

To create a design in GenStat, use the menu Stats -> Design -> Generate Standard Design. In the dialog box Generate a Standard Design, choose the base design Latin Square from the pull-down menu. Also enter names for the Rows, Columns and Treatment factors as well as the Number of Levels:

Graph 102: Create a Latin Square Design

Click the Run button and the design will be created in the form of a spreadsheet. What is special about this spreadsheet is that the information about the analysis model is saved within it.

You can check the power of the design after you have clicked the Run button. Go back to the design creation dialog and click the Check for Power button, which is now visible. You need to supply some basic information: the hypothesised mean difference (Size of difference to detect) and a measure of the variability (Residual Mean Square). If the power is below 80%, you should rethink the design; adding replications is one way to increase the power.

Insert a new column for the response values (Yield) via Spread -> Insert -> Column after current column and enter some data. Now open the menu Stats -> Analysis of Variance -> General....

You can save the design spreadsheet for later use via the menu File and the Save dialog. You should close the design after checking the results.
5.2.2 Enter or load data

Please restart the GenStat Server via Restart Server (see 1.3.2) and load the file Latin_Square_Data_Potato_Yield.xls using the Excel Import Wizard (see 2.1). You should convert Zeile (row), Spalte (column) and Sorte (variety) to factor variables; Ertrag is the yield. The following data will be loaded:

Table 7: data set Latin_Square_Data_Potato_Yield.xls

  Zeile  Spalte  Sorte  Ertrag
  1      1       C      22
  1      2       B      20
  1      3       A      39
  1      4       D      27
  1      5       E      34
  2      1       E      29
  2      2       D      29
  2      3       C      25
  2      4       A      30
  2      5       B      23
  3      1       A      29
  3      2       E      25
  3      3       D      34
  3      4       B      26
  3      5       C      27
  4      1       B      23
  4      2       A      27
  4      3       E      27
  4      4       C      32
  4      5       D      41
  5      1       D      33
  5      2       C      21
  5      3       B      24
  5      4       E      30
  5      5       A      33

5.2.3 Analysis

Please start the ANOVA via Stats -> Analysis of Variance -> General. In the following dialog you specify the analysis model.

Graph 103: Analysis of Variance : General
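As a cross-check that is independent of GenStat, the sums of squares of a Latin square can be computed directly from the data in Table 7. The following Python sketch is only an illustration of the arithmetic, not the method GenStat uses internally; the helper name `factor_ss` is hypothetical. It relies on the balance of the design (every level of every factor occurs exactly five times) and does not compute the F probabilities, which would need the F distribution.

```python
# Latin square data from Table 7: (Zeile, Spalte, Sorte, Ertrag)
DATA = [
    (1, 1, "C", 22), (1, 2, "B", 20), (1, 3, "A", 39), (1, 4, "D", 27), (1, 5, "E", 34),
    (2, 1, "E", 29), (2, 2, "D", 29), (2, 3, "C", 25), (2, 4, "A", 30), (2, 5, "B", 23),
    (3, 1, "A", 29), (3, 2, "E", 25), (3, 3, "D", 34), (3, 4, "B", 26), (3, 5, "C", 27),
    (4, 1, "B", 23), (4, 2, "A", 27), (4, 3, "E", 27), (4, 4, "C", 32), (4, 5, "D", 41),
    (5, 1, "D", 33), (5, 2, "C", 21), (5, 3, "B", 24), (5, 4, "E", 30), (5, 5, "A", 33),
]


def factor_ss(index, grand_mean):
    """Sum of squares of one factor: sum over levels of n * (level mean - grand mean)^2."""
    groups = {}
    for row in DATA:
        groups.setdefault(row[index], []).append(row[3])
    return sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values())


yields = [row[3] for row in DATA]
grand_mean = sum(yields) / len(yields)
total_ss = sum((y - grand_mean) ** 2 for y in yields)

ss = {
    "Sorte": factor_ss(2, grand_mean),
    "Spalte": factor_ss(1, grand_mean),
    "Zeile": factor_ss(0, grand_mean),
}
# In a balanced Latin square the residual is what remains of the total.
ss["Residual"] = total_ss - sum(ss.values())

levels = 5
df = {name: levels - 1 for name in ("Sorte", "Spalte", "Zeile")}
df["Residual"] = (levels - 1) * (levels - 2)

ms = {name: ss[name] / df[name] for name in ss}
vr = {name: ms[name] / ms["Residual"] for name in ("Sorte", "Spalte", "Zeile")}

for name in ("Sorte", "Spalte", "Zeile", "Residual"):
    ratio = f"{vr[name]:6.2f}" if name in vr else ""
    print(f"{name:<9}{df[name]:>3}{ss[name]:>9.2f}{ms[name]:>8.2f} {ratio}")
print(f"{'Total':<9}{len(DATA) - 1:>3}{total_ss:>9.2f}")
```

The printed values can be compared with the ANOVA table in Output 5.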
The response (Y-Variate) of the model is the yield, which is named Ertrag in this data file. The treatment, or independent variable, is the variety, named Sorte.

Graph 104: Latin square ANOVA model

Built into every Latin square model is the block structure rows*columns, here called Zeile*Spalte. Please see chapter 5.1 for more details on the usage of *, . and / in setting up ANOVA models.

Please specify all other settings as shown in Graph 105 and Graph 106, then click Run. In any case you should activate the graphical output as an additional option.

Graph 105: ANOVA Options
After you have run the analysis once, you can click the Save button in the Analysis of Variance dialog. To check the assumption of normality of the residuals and run the appropriate test, you need to save the residuals first. To run a multiple comparison test or a pairwise comparison of means, you need to save the means first.

Graph 106: ANOVA Save

The following script listings are generated automatically as commands in the Input Log (see script listing 4 and script listing 5). You can save the Input Log to reuse the command sequence later.

script listing 4: Create One Way ANOVA Output

  "General Analysis of Variance."
  BLOCK "No Blocking"
  TREATMENTS Sorte+Spalte+Zeile
  COVARIATE "No Covariate"
  ANOVA [PRINT=aovtable,information,means,%cv; FACT=1; CONTRASTS=7; FPROB=yes; PSE=diff,\
    means] Ertrag
  APLOT [RMETHOD=simple] fitted,normal,halfnormal,histogram
  AGRAPH [METHOD=means]

script listing 5: Saving results of the One Way ANOVA

  DELETE [REDEFINE=yes] Kartoffel_Meantab
  AKEEP [RESIDUAL=Kartoffel_Residuals; FACT=32] Sorte; MEANS=Kartoffel_Meantab
  FSPREADSHEET [SHEET=29548864; METHOD=replace] Kartoffel_Residuals
  FSPREADSHEET Kartoffel_Meantab

GenStat automatically creates diagnostic plots and means plots for the treatment. The diagnostics for this example look very good and support the statement that the data comply with the assumption of normal residuals without any further statistical analysis. In the Normal plot as well as in the Half-Normal plot a few values stand out. These also appear in the list of large residuals in the output.
Graph 107: Example Potato yields Diagnostic plots

Graph 108: Potato yields Means

The result of the numerical analysis is listed in the following output.

Output 5: Result of the One Way ANOVA

  Analysis of variance

  Variate: Ertrag

  Source of variation   d.f.     s.s.    m.s.   v.r.   F pr.
  Sorte                    4   330.00   82.50   5.64   0.009
  Spalte                   4   150.00   37.50   2.56   0.093
  Zeile                    4    20.40    5.10   0.35   0.840
  Residual                12   175.60   14.63
  Total                   24   676.00

  Message: the following units have large residuals.

  *units* 3     6.00
  *units* 4    -6.40

  Tables of means

  Variate: Ertrag

  Grand mean  28.40

  Sorte       A       B       C       D       E
          31.60   23.20   25.40   32.80   29.00

  Spalte      1       2       3       4       5
          27.20   24.40   29.80   29.00   31.60

  Zeile       1       2       3       4       5
          28.40   27.20   28.20   30.00   28.20
  Standard errors of means

  Table     Sorte   Spalte   Zeile
  rep.          5        5       5
  d.f.         12       12      12
  e.s.e.    1.711    1.711   1.711

  Standard errors of differences of means

  Table     Sorte   Spalte   Zeile
  rep.          5        5       5
  d.f.         12       12      12
  s.e.d.    2.419    2.419   2.419

  Stratum standard errors and coefficients of variation

  Variate: Ertrag

  d.f.    s.e.    cv%
    12   3.825   13.5

5.2.4 Pairwise comparison

GenStat does not offer pairwise LS-means comparisons through the menus. If the commands for the comparison are written manually, however, it works fine. This chapter shows an example.

Before you can apply the script to a specific analysis, you have to look up some information in the previous output and paste it into the script. You have to supply the name of the table of means, the replication, the residual variance (the squared stratum standard error, here 3.825 squared = 14.631) and the residual degrees of freedom. Besides the overly conservative Bonferroni method you can also run Tukey and Sidak tests. VSN knows about this problem and will add this option to Version 10 of GenStat.

script listing 6: Create a pairwise comparison of means

  "Data from table Stratum standard errors and coefficients of variation"
  "VARIANCE = s.e. from the Output has to be squared"
  "DF = d.f. from Output"
  ALLPAIRWISE [METHOD=Bonferroni; DIRECTION=descending; PROBABILITY=0.05]\
    MEANS=Kartoffel_Meantab; REPLICATION=5; VARIANCE=14.631; DF=12

Output 6: Result of the pairwise comparison on the basis of a One Way ANOVA

  All pairwise comparisons are tested.
  Variance = 14.6310 with 12 degrees of freedom

  Bonferroni test
  Experimentwise error rate = 0.0500
  Comparisonwise error rate = 0.0050

  Mean vs Mean       t   significant
  D       A      0.496   No
  D       E      1.571   No
  D       C      3.059   No
  D       B      3.968   Yes
  A       E      1.075   No
  A       C      2.563   No
  A       B      3.472   Yes
  E       C      1.488   No
  E       B      2.398   No
  C       B      0.909   No

  Identifier    Mean
  D            32.80
  A            31.60
  E            29.00
  C            25.40
  B            23.20

5.2.5 Test if data are Normal

The Normality test is started via the Graph menu. Please specify the options as shown in Graph 110.

Graph 109: Menu to Test if data are Normal

Graph 110: Options of the Normality test

The script command is automatically created by GenStat:
script listing 7: Graphical Test of Normality

  DPROBABILITY [PRINT=parameters,tests; DISTRIBUTION=NORMAL; METHOD=quantile; QMETHOD=standardized;\
    BANDS=simultaneous; ALPHA=0.95; PLOT=reference] Kartoffel_Residuals

Output 7: Numerical results of the Normality tests

  Critical values of test statistics (marginal tests)

  Test statistic        15%     10%      5%    2.5%      1%
  Anderson-Darling    0.576   0.656   0.787   0.918   1.092
  Cramer-von Mises    0.091   0.104   0.126   0.148   0.178
  Watson              0.085   0.096   0.116   0.136   0.163

  Marginal tests

  Variate   Anderson-Darling   Cramer-von Mises   Watson
  1                   0.3176             0.0415   0.0415

  ?, *, ** indicate significance at 10%, 5% and 1% levels respectively

All three test statistics lie below their 15% critical values (0.3176 < 0.576, 0.0415 < 0.091 and 0.0415 < 0.085), so none of the tests indicates a departure from normality. The graphical analysis shows that all residuals lie within the confidence bands, which is an indication that the residuals follow a Gaussian (normal) distribution. The graphical result is consistent with the numerical analysis.

Graph 111: Graphical Output of the Normality test
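To see what goes into such a normal quantile plot, the residuals of the Latin square model can be recomputed from the data in Table 7 and paired with standard normal quantiles. The following Python sketch is an illustration only and does not reproduce the DPROBABILITY procedure: the plotting positions (i - 0.5)/n are an assumption (GenStat may use a different convention), and the names `level_means` and `qq_pairs` are hypothetical.

```python
from statistics import NormalDist

# Latin square data from Table 7: (Zeile, Spalte, Sorte, Ertrag)
DATA = [
    (1, 1, "C", 22), (1, 2, "B", 20), (1, 3, "A", 39), (1, 4, "D", 27), (1, 5, "E", 34),
    (2, 1, "E", 29), (2, 2, "D", 29), (2, 3, "C", 25), (2, 4, "A", 30), (2, 5, "B", 23),
    (3, 1, "A", 29), (3, 2, "E", 25), (3, 3, "D", 34), (3, 4, "B", 26), (3, 5, "C", 27),
    (4, 1, "B", 23), (4, 2, "A", 27), (4, 3, "E", 27), (4, 4, "C", 32), (4, 5, "D", 41),
    (5, 1, "D", 33), (5, 2, "C", 21), (5, 3, "B", 24), (5, 4, "E", 30), (5, 5, "A", 33),
]


def level_means(index):
    """Mean yield for each level of the factor stored at position `index`."""
    groups = {}
    for row in DATA:
        groups.setdefault(row[index], []).append(row[3])
    return {level: sum(v) / len(v) for level, v in groups.items()}


grand = sum(row[3] for row in DATA) / len(DATA)
row_m, col_m, var_m = level_means(0), level_means(1), level_means(2)

# Additive model: fitted = row mean + column mean + variety mean - 2 * grand mean
residuals = [
    y - (row_m[r] + col_m[c] + var_m[v] - 2 * grand)
    for r, c, v, y in DATA
]

# Pair the ordered residuals with standard normal quantiles.
# Plotting positions (i - 0.5)/n are an assumed convention.
n = len(residuals)
quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
qq_pairs = list(zip(quantiles, sorted(residuals)))

for q, r in qq_pairs:
    print(f"{q:7.3f} {r:7.2f}")
```

If the residuals are approximately normal, the printed pairs fall close to a straight line. Units 3 and 4, listed as large residuals in Output 5, appear at the two ends of the ordered list.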