5 Analysis of Variance models, complex linear models and Random effects models




In this chapter we will not show any of the theoretical background of the analysis. The focus is on practising the setup of ANOVA models in GenStat. GenStat comes with a very extensive help system and, in addition, several PDF files which serve as documentation and reference. You should also consult secondary literature on statistical modelling to use these procedures properly.

At the beginning of each chapter you will find a short introduction on how to generate the example field trials in GenStat. GenStat is also good software for creating field trials, although it is somewhat limited regarding the number of treatments. For complex designs, specialised software should be used. The design creation in GenStat always gives good advice on how the analysis model for a certain design should be set up. In any case you should consult a biometrician.

5.1 Basic syntax of ANOVA models

Table 5: Notation of ANOVA models

  Operator  Example  Meaning
  +         A+B      Main effects of A and B
  .         A.B      Interaction of A and B only
  *         A*B      A+B+A.B, factorial structure
  /         A/B      A+A.B, without main effect of B, but B nested within A

Table 6: Examples of ANOVA models

  A*B*C               = A+B+C+A.B+A.C+B.C+A.B.C (full factorial model)
  (A+B)*(C+D)         = (A+B)+(C+D)+(A+B).(C+D) = A+B+C+D+A.C+A.D+B.C+B.D
  Block/Plot/Subplot  = Block+Block.Plot+Block.Plot.Subplot
  A/(B*C)             = A+A.B+A.C+A.B.C

Recommendation: Syntax of ANOVA models

You can reuse any model from the Input Log window: copy it into an extra script window, then edit the model to suit your needs. Whenever you are not completely sure about the syntax of the model, you should use the long form, written with + and . only.

5.2 Analysis example: Potato yield Latin square

Please restart the GenStat Server via Restart Server (see 1.3.2).
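As a side note to the operator notation of section 5.1, the expansion rules of Table 5 can be made concrete with a small sketch. This is Python, not GenStat code. It assumes the usual reading of the nesting operator, L/M = L + L'.M, where L' combines all factors that occur in L; this reading reproduces the examples of Table 6. The function names (`factor`, `plus`, `dot`, `star`, `nest`, `show`) are hypothetical helpers chosen for illustration.

```python
# Sketch of the model-formula operators from Table 5 (illustration only).
# A model is a set of terms; a term is a frozenset of factor names.

def factor(name):
    """Model consisting of the single main effect `name`."""
    return {frozenset([name])}

def plus(left, right):
    """L + M: all terms of both models."""
    return left | right

def dot(left, right):
    """L . M: every term of L combined with every term of M."""
    return {a | b for a in left for b in right}

def star(left, right):
    """L * M = L + M + L.M (factorial structure)."""
    return left | right | dot(left, right)

def nest(left, right):
    """L / M = L + L'.M, where L' combines all factors occurring in L (assumed rule)."""
    all_left = frozenset().union(*left)
    return left | {all_left | term for term in right}

def show(model):
    """Render a model in long form, shortest terms first, e.g. 'A + B + A.B'."""
    terms = sorted(model, key=lambda t: (len(t), sorted(t)))
    return " + ".join(".".join(sorted(t)) for t in terms)

A, B, C, D = (factor(name) for name in "ABCD")

print(show(star(A, B)))                    # A + B + A.B
print(show(star(star(A, B), C)))           # full factorial model A*B*C
print(show(star(plus(A, B), plus(C, D))))  # (A+B)*(C+D)
print(show(nest(A, star(B, C))))           # A + A.B + A.C + A.B.C

blocks = nest(nest(factor("Block"), factor("Plot")), factor("Subplot"))
print(show(blocks))                        # Block + Block.Plot + Block.Plot.Subplot
```

Writing a model through such an expansion is the same idea as the recommendation above: when in doubt, spell the model out in long form and compare it term by term with what you intended.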

5.2.1 Create Design

To create a design in GenStat, use the menu Stats -> Design -> Generate Standard Design. Choose the base design Latin Square from the pull-down menu of the dialog box Generate a Standard Design. Also enter names for the row, column and treatment factors as well as the Number of Levels:

Graph 102: Create a Latin Square Design

Click the Run button and the design will be created in the form of a spreadsheet. What is special about this spreadsheet is that the information about the analysis model is saved within it.

You can check the power of the design after you have clicked the Run button. Go back to the design creation dialog and click the Check for Power button, which is now visible. You need some basic information: the hypothesised mean difference (Size of difference to detect) and an estimate of the variability (Residual Mean Square). If the power is below 80% you should rethink the design; adding replications is one way to overcome low power.

Insert a new column for the response values Yield via Spread -> Insert -> Column after current column and enter some data. Now open the menu Stats -> Analysis of Variance -> General.... You can save the design spreadsheet for later use via the menu File and the Save dialog. You should close the design after checking the results.
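To illustrate what a Latin square design guarantees, the following Python sketch builds a randomized square and checks the defining property: every treatment occurs exactly once in each row and each column. It is not GenStat's design generator. The approach (permuting rows, columns and labels of a cyclic square) is an assumption chosen for simplicity and does not sample uniformly from all possible Latin squares; `latin_square` and `is_latin_square` are hypothetical helper names.

```python
import random

def latin_square(treatments, seed=None):
    """Return a randomized Latin square as a list of rows (illustrative method)."""
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic starting square: cell (r, c) holds treatment index (r + c) mod n.
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                       # permute the rows
    col_order = list(range(n))
    rng.shuffle(col_order)                    # permute the columns
    square = [[row[c] for c in col_order] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)                       # assign treatments at random
    return [[labels[k] for k in row] for row in square]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

design = latin_square("ABCDE", seed=1)
for row in design:
    print(" ".join(row))
print(is_latin_square(design))
```

The same check can be applied to any layout entered by hand before the trial is laid out in the field.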

5.2.2 Enter or load data

Please restart the GenStat Server via Restart Server (see 1.3.2) and load the file Latin_Square_Data_Potato_Yield.xls using the Excel Import Wizard (see 2.1). You should convert Zeile (row), Spalte (column) and Sorte (variety) to factor variables; Ertrag is the yield. The following data will be loaded:

Table 7: Data set Latin_Square_Data_Potato_Yield.xls

  Zeile  Spalte  Sorte  Ertrag
      1       1  C          22
      1       2  B          20
      1       3  A          39
      1       4  D          27
      1       5  E          34
      2       1  E          29
      2       2  D          29
      2       3  C          25
      2       4  A          30
      2       5  B          23
      3       1  A          29
      3       2  E          25
      3       3  D          34
      3       4  B          26
      3       5  C          27
      4       1  B          23
      4       2  A          27
      4       3  E          27
      4       4  C          32
      4       5  D          41
      5       1  D          33
      5       2  C          21
      5       3  B          24
      5       4  E          30
      5       5  A          33

5.2.3 Analysis

Start the ANOVA via Stats -> Analysis of Variance -> General. In the following dialog you specify the analysis model.

Graph 103: Analysis of Variance: General

The response (Y-Variate) of the model is the yield, which is named Ertrag in this data file. The treatment, or independent variable, is the variety, named Sorte.

Graph 104: Latin square ANOVA model

Built into every Latin square model is the block structure rows*columns, here called Zeile*Spalte. See chapter 5.1 for more details on the usage of *, . and / in setting up ANOVA models. Specify all other settings as shown in Graph 105 and Graph 106, then click Run. In any case you should activate the graphical output as an additional option.

Graph 105: ANOVA Options

After you have run the analysis once, you can click the Save button in the Analysis of Variance dialog. To check the assumption of normality of the residuals and run the appropriate test, you need to save the residuals first. To run a multiple comparison test or a pairwise comparison of means, you need to save the means first.

Graph 106: ANOVA Save

The following script listings are generated automatically as commands in the Input Log (see script listing 4 and script listing 5). You can save the Input Log to reuse the command sequence later.

script listing 4: Create One Way ANOVA Output

  "General Analysis of Variance."
  BLOCK "No Blocking"
  TREATMENTS Sorte+Spalte+Zeile
  COVARIATE "No Covariate"
  ANOVA [PRINT=aovtable,information,means,%cv; FACT=1; CONTRASTS=7; FPROB=yes; PSE=diff,\
    means] Ertrag
  APLOT [RMETHOD=simple] fitted,normal,halfnormal,histogram
  AGRAPH [METHOD=means]

script listing 5: Saving results of the One Way ANOVA

  DELETE [REDEFINE=yes] Kartoffel_Meantab
  AKEEP [RESIDUAL=Kartoffel_Residuals; FACT=32] Sorte; MEANS=Kartoffel_Meantab
  FSPREADSHEET [SHEET=29548864; METHOD=replace] Kartoffel_Residuals
  FSPREADSHEET Kartoffel_Meantab

GenStat creates diagnostic plots and means plots for the treatment automatically. The diagnostics for this example look very good; they support the assumption of normally distributed residuals without any further statistical analysis. In the Normal plot as well as in the Half Normal plot a few values stand out. These also appear in the list of large residuals in the output.
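The decomposition behind this analysis can be cross-checked by hand from the data in Table 7. The following Python sketch (not GenStat code) computes the sums of squares for variety, column and row, the residual mean square, the variance ratios and the residuals of the additive model. The F probabilities are left out because the Python standard library has no F distribution. The names `LAYOUT`, `YIELD` and `latin_square_anova` are hypothetical and introduced only for this illustration.

```python
# Layout (Sorte) and yields (Ertrag) from Table 7; list index = Zeile - 1,
# position within each row = Spalte - 1.
LAYOUT = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
YIELD = [
    [22, 20, 39, 27, 34],
    [29, 29, 25, 30, 23],
    [29, 25, 34, 26, 27],
    [23, 27, 27, 32, 41],
    [33, 21, 24, 30, 33],
]

def latin_square_anova(layout, y):
    """Sums of squares, variance ratios and residuals for an n x n Latin square."""
    n = len(y)
    cells = [(r, c, layout[r][c], y[r][c]) for r in range(n) for c in range(n)]
    grand = sum(cell[3] for cell in cells) / n**2

    def level_means(index):
        # Mean response per level of the factor stored at position `index`.
        totals = {}
        for cell in cells:
            totals[cell[index]] = totals.get(cell[index], 0.0) + cell[3]
        return {level: total / n for level, total in totals.items()}

    row_m, col_m, trt_m = level_means(0), level_means(1), level_means(2)

    def ss(means):
        return n * sum((m - grand) ** 2 for m in means.values())

    ss_total = sum((cell[3] - grand) ** 2 for cell in cells)
    ss_parts = {"Sorte": ss(trt_m), "Spalte": ss(col_m), "Zeile": ss(row_m)}
    ss_res = ss_total - sum(ss_parts.values())
    df_res = (n - 1) * (n - 2)
    ms_res = ss_res / df_res
    # Variance ratio: mean square of the term divided by the residual mean square.
    vr = {name: (value / (n - 1)) / ms_res for name, value in ss_parts.items()}
    # Residual = observation minus fitted value of the additive model.
    residuals = [v - (row_m[r] + col_m[c] + trt_m[t] - 2 * grand)
                 for r, c, t, v in cells]
    return {"ss": ss_parts, "ss_res": ss_res, "ss_total": ss_total,
            "df_res": df_res, "ms_res": ms_res, "vr": vr,
            "grand": grand, "residuals": residuals}

result = latin_square_anova(LAYOUT, YIELD)
for name, value in result["ss"].items():
    print(f"{name:8s} s.s. {value:7.2f}  v.r. {result['vr'][name]:5.2f}")
print(f"Residual s.s. {result['ss_res']:7.2f}  m.s. {result['ms_res']:6.2f}")
print(f"Total    s.s. {result['ss_total']:7.2f}")
```

The printed values can be compared line by line with the analysis of variance table in the GenStat output; the two largest residuals belong to units 3 and 4, which are the units GenStat flags.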

Graph 107: Example potato yields, diagnostic plots

Graph 108: Potato yields, means

The result of the numerical analysis is listed in the following output.

Output 5: Result of the One Way ANOVA

Analysis of variance

Variate: Ertrag

  Source of variation  d.f.    s.s.   m.s.  v.r.  F pr.
  Sorte                   4  330.00  82.50  5.64  0.009
  Spalte                  4  150.00  37.50  2.56  0.093
  Zeile                   4   20.40   5.10  0.35  0.840
  Residual               12  175.60  14.63
  Total                  24  676.00

Message: the following units have large residuals.

  *units* 3     6.00
  *units* 4    -6.40

Tables of means

Variate: Ertrag

Grand mean 28.40

  Sorte       A      B      C      D      E
          31.60  23.20  25.40  32.80  29.00

  Spalte      1      2      3      4      5
          27.20  24.40  29.80  29.00  31.60

  Zeile       1      2      3      4      5
          28.40  27.20  28.20  30.00  28.20

Standard errors of means

  Table     Sorte  Spalte  Zeile
  rep.          5       5      5
  d.f.         12      12     12
  e.s.e.    1.711   1.711  1.711

Standard errors of differences of means

  Table     Sorte  Spalte  Zeile
  rep.          5       5      5
  d.f.         12      12     12
  s.e.d.    2.419   2.419  2.419

Stratum standard errors and coefficients of variation

Variate: Ertrag

  d.f.   s.e.   cv%
    12  3.825  13.5

5.2.4 Pairwise comparison

GenStat does not offer pairwise LS-means comparisons through the menus. If the code for the comparison is written manually, it works fine; this chapter shows an example. Before you can apply the script to a specific analysis, you have to look up some information in the previous output and paste it into the script: the name of the table of means, the residual variance (the squared standard error) and the degrees of freedom. Besides the overly conservative Bonferroni method you can also run Tukey and Sidak tests. VSN knows about this problem and will add this option to Version 10 of GenStat.

script listing 6: Create a pairwise comparison of means

Take the data from the table Stratum standard errors and coefficients of variation: VARIANCE is the s.e. from the output, squared (3.825 squared gives 14.631); DF is the d.f. from the output.

  ALLPAIRWISE [METHOD=Bonferroni; DIRECTION=descending; PROBABILITY=0.05]\
    MEANS=Kartoffel_Meantab; REPLICATION=5; VARIANCE=14.631; DF=12

Output 6: Result of the pairwise comparison on the basis of a One Way ANOVA

All pairwise comparisons are tested.
Variance = 14.6310 with 12 degrees of freedom

Bonferroni test
Experimentwise error rate = 0.0500
Comparisonwise error rate = 0.0050

  Mean vs Mean      t  significant
  D       A     0.496  No
  D       E     1.571  No
  D       C     3.059  No
  D       B     3.968  Yes
  A       E     1.075  No
  A       C     2.563  No
  A       B     3.472  Yes
  E       C     1.488  No
  E       B     2.398  No
  C       B     0.909  No

  Identifier   Mean
  D           32.80
  A           31.60
  E           29.00
  C           25.40
  B           23.20

With five means there are ten pairwise comparisons, so the comparisonwise error rate is 0.05/10 = 0.005. Only the differences D vs B and A vs B are significant at this level.

5.2.5 Test if data are Normal

The Normality test is started via the Graph menu. Specify the options as shown in Graph 110.

Graph 109: Menu to Test if data are Normal

Graph 110: Options of the Normality test

The script command is created automatically by GenStat:

script listing 7: Graphical Test of Normality

  DPROBABILITY [PRINT=parameters,tests;DISTRIBUTION=NORMAL;METHOD=quantile;QMETHOD=standardized;\
    BANDS=simultaneous;ALPHA=0.95;PLOT=reference] Kartoffel_Residuals

Output 7: Numerical results of the Normality tests

Critical values of test statistics (marginal tests)

  Test statistic      15%    10%     5%   2.5%     1%
  Anderson-Darling  0.576  0.656  0.787  0.918  1.092
  Cramer-von Mises  0.091  0.104  0.126  0.148  0.178
  Watson            0.085  0.096  0.116  0.136  0.163

Marginal tests

  Variate  Anderson-Darling  Cramer-von Mises  Watson
  1                  0.3176            0.0415  0.0415

?, *, ** indicate significance at 10%, 5% and 1% levels respectively

The graphical analysis shows that all residuals lie within the limits of the confidence band, which indicates that the residuals follow a Gaussian (normal) distribution. This is consistent with the numerical analysis: all three test statistics are well below their 15% critical values, so none of the tests rejects normality.

Graph 111: Graphical Output of the Normality test
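For readers who want to see how a statistic of this kind is formed, the following Python sketch computes the Anderson-Darling statistic for the residuals of the Latin square model, using only the standard library. It is not GenStat's implementation. It assumes the textbook formula with mean and standard deviation estimated from the sample (divisor n - 1) and also reports the small-sample modification A2(1 + 0.75/n + 2.25/n^2); GenStat may estimate the parameters differently, so the value need not coincide exactly with the one in the output above. `LAYOUT`, `YIELD`, `latin_square_residuals` and `anderson_darling_normal` are hypothetical names.

```python
import math
from statistics import NormalDist, mean, stdev

# Layout (Sorte) and yields (Ertrag) from Table 7.
LAYOUT = ["CBADE", "EDCAB", "AEDBC", "BAECD", "DCBEA"]
YIELD = [
    [22, 20, 39, 27, 34],
    [29, 29, 25, 30, 23],
    [29, 25, 34, 26, 27],
    [23, 27, 27, 32, 41],
    [33, 21, 24, 30, 33],
]

def latin_square_residuals(layout, y):
    """Residuals of the additive row + column + treatment model."""
    n = len(y)
    grand = sum(map(sum, y)) / n**2
    row_mean = [sum(row) / n for row in y]
    col_mean = [sum(y[r][c] for r in range(n)) / n for c in range(n)]
    trt_mean = {
        t: sum(y[r][c] for r in range(n) for c in range(n) if layout[r][c] == t) / n
        for t in set("".join(layout))
    }
    return [y[r][c] - (row_mean[r] + col_mean[c] + trt_mean[layout[r][c]] - 2 * grand)
            for r in range(n) for c in range(n)]

def anderson_darling_normal(sample):
    """Return (A2, modified A2) for a normality check with estimated parameters."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)          # assumption: divisor n - 1
    z = sorted((v - m) / s for v in sample)
    phi = [NormalDist().cdf(v) for v in z]
    total = sum((2 * i + 1) * (math.log(phi[i]) + math.log(1.0 - phi[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    a2_modified = a2 * (1 + 0.75 / n + 2.25 / n**2)
    return a2, a2_modified

residuals = latin_square_residuals(LAYOUT, YIELD)
a2, a2_modified = anderson_darling_normal(residuals)
print(f"A2 = {a2:.4f}, modified A2 = {a2_modified:.4f}")
print("below the 5% critical value 0.787:", a2_modified < 0.787)
```

Because the residuals are standardized before the statistic is computed, the result does not change if the data are shifted or rescaled.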