St@tmaster 02429/MIXED LINEAR MODELS
PREPARED BY THE STATISTICS GROUPS AT IMM, DTU AND KU-LIFE

Module 10: Mixed model theory II: Tests and confidence intervals

10.1 Notes
  10.1.1 Summary of the first theory module
  10.1.2 Testing fixed effects
  10.1.3 Confidence intervals of fixed effects
  10.1.4 The estimate and the contrast statements
  10.1.5 Test for random effects parameters
  10.1.6 Confidence intervals for random effects parameters

10.1 Notes

The first theory module described how a mixed model is defined and how the model parameters in a mixed model are estimated from observed data. This module describes how the tests for the fixed effects are computed (typically represented in an ANOVA table), and how to construct confidence intervals.

10.1.1 Summary of the first theory module

Recall from the first theory module that any linear normal mixed model can be expressed as:

    y ~ N(Xβ, V)

Here X is the design matrix for the fixed effects part of the model, β is the vector of fixed effects parameters, and V is the covariance matrix. The covariance matrix V is specified via the random effects in the model and the additional R matrix, but that is not important here.

02429/Mixed Linear Models http://www.imm.dtu.dk/courses/02429/ Last modified August 23, 2011
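As a small numerical illustration of how V arises (this is not part of the module's SAS material): with a single random block effect, V = σ²_b ZZ' + σ²I, where Z is the design matrix of the block effect. The layout and variance values below are invented for the sketch:

```python
import numpy as np

# Hypothetical layout: 6 observations, 2 treatments, 3 blocks,
# observations ordered treatment-major (blocks 1,2,3 within each treatment).
sigma2_b = 4.0   # assumed block variance sigma^2_b (illustrative value)
sigma2 = 1.0     # assumed residual variance sigma^2 (illustrative value)

# Fixed effects design matrix X: intercept + treatment indicator
X = np.array([[1, 0], [1, 0], [1, 0],
              [1, 1], [1, 1], [1, 1]], dtype=float)

# Random effects design matrix Z: one column per block
Z = np.kron(np.ones((2, 1)), np.eye(3))

# V is built from the random effects part (Z) and the residual part (sigma^2 I)
V = sigma2_b * Z @ Z.T + sigma2 * np.eye(6)
```

Two observations from the same block share the covariance σ²_b, while observations from different blocks are uncorrelated; the diagonal elements are σ²_b + σ².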
10.1.2 Testing fixed effects

Typically the hypothesis of interest can be expressed as some linear combination of the model parameters:

    L'β = c

where L is a matrix, or a column vector with the same number of rows as there are elements in β, and c is a constant, quite often zero. Consider the following example: In a one-way ANOVA model with three treatments the fixed effects parameter vector would be β = (µ, α₁, α₂, α₃)'. The test for similar effect of treatment 1 and treatment 2 can be expressed as:

    ( 0  1  -1  0 ) (µ, α₁, α₂, α₃)' = 0

where L' = (0, 1, -1, 0), which is the same as α₁ - α₂ = 0. The hypothesis that all three treatments have the same effect can similarly be expressed as:

    ( 0  1  -1   0 )
    ( 0  1   0  -1 ) (µ, α₁, α₂, α₃)' = (0, 0)'

where the L matrix expresses that α₁ - α₂ = 0 and α₁ - α₃ = 0, which is the same as all three being equal.

Not every hypothesis that can be expressed as a linear combination of the parameters is meaningful. Consider again the one-way ANOVA example with parameters β = (µ, α₁, α₂, α₃). The hypothesis α₁ = 0 is not meaningful for this model. This is not obvious right away, but consider the fixed part of the model with arbitrary α₁, and with α₁ = 0:

    E(y) = X(µ, α₁, α₂, α₃)'   and   E(y) = X(µ, 0, α₂, α₃)'

The model with zero in place of α₁ can provide exactly the same predictions in each treatment group as the model with arbitrary α₁. If for instance α₁ = 3 in the first case, then setting µ̃ = µ + 3, α̃₂ = α₂ - 3 and α̃₃ = α₃ - 3 will give the same predictions in the second case. In other words the two models are identical and comparing them with a statistical test is meaningless. To avoid this and similar situations the following definition is given:

Definition: A linear combination of the fixed effects model parameters L'β is said to be estimable if and only if there is a vector λ such that λ'X = L'.
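The definition can be checked numerically: L'β is estimable exactly when L' lies in the row space of X, i.e. when appending L' as an extra row to X does not increase the rank. A small sketch in Python/numpy (the design matrix layout and the helper name is_estimable are made up for illustration, not part of the module):

```python
import numpy as np

# Over-parameterized one-way ANOVA design: intercept + 3 treatment dummies,
# two observations per treatment (hypothetical layout).
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
], dtype=float)

def is_estimable(L, X):
    """L'beta is estimable iff some lambda satisfies lambda'X = L',
    i.e. L' lies in the row space of X (the rank does not grow)."""
    return np.linalg.matrix_rank(np.vstack([X, L])) == np.linalg.matrix_rank(X)

L_diff = np.array([[0, 1, -1, 0]])   # alpha1 - alpha2
L_a1   = np.array([[0, 1,  0, 0]])   # alpha1 alone

print(is_estimable(L_diff, X))  # True: the difference is estimable
print(is_estimable(L_a1, X))    # False: alpha1 = 0 is not a meaningful hypothesis
```

This matches the discussion above: α₁ - α₂ is estimable, while α₁ alone is not, because the intercept column is the sum of the treatment columns.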
In the following it is assumed that the hypothesis in question is estimable. This is not a restriction, as all meaningful hypotheses are estimable. The estimate of the linear combination of model parameters L'β is L'β̂. The estimate β̂ is known from the first theory module, so:

    L'β̂ = L'(X'V⁻¹X)⁻¹X'V⁻¹y

Applying the rule cov(Ax) = A cov(x) A' from the first theory module, and doing a few matrix calculations, shows that the covariance of L'β̂ is L'(X'V⁻¹X)⁻¹L, and the mean is L'β. This all amounts to: If the hypothesis L'β = c is true, then:

    L'β̂ ~ N(L'β, L'(X'V⁻¹X)⁻¹L)
    (L'β̂ - c) ~ N(0, L'(X'V⁻¹X)⁻¹L)

Now the distribution is described, and the so-called Wald test can be constructed:

    W = (L'β̂ - c)' (L'(X'V⁻¹X)⁻¹L)⁻¹ (L'β̂ - c)

The Wald test can be thought of as the squared difference from the hypothesis divided by its variance. W has an approximate χ²_{df₁} distribution with degrees of freedom df₁ equal to the number of parameters eliminated by the hypothesis, which is the same as the rank of L. This asymptotic result is based on the assumption that the variance V is known without error, but V is estimated from the observations, and not known. A better approximation can be achieved by using the Wald F test:

    F = W / df₁

in combination with Satterthwaite's approximation. In this case Satterthwaite's approximation supplies an estimate of the denominator degrees of freedom df₂ (assuming that F is F_{df₁,df₂} distributed). The P value for the hypothesis L'β = c is computed as:

    P = P(F_{df₁,df₂} ≥ F)

If the /ddfm=satterth option is specified in proc mixed, then all the tests in the ANOVA table for the fixed effects are computed this way.

10.1.3 Confidence intervals of fixed effects

Confidence intervals based on the approximate t distribution can be applied to linear combinations of the fixed effects. When a single fixed effect parameter or a
single estimable linear combination of fixed effect parameters is considered, the L matrix has only one column, and the 95% confidence interval becomes:

    L'β̂ ± t_{0.975,df} √(L'(X'V⁻¹X)⁻¹L)

Here the covariance matrix V is not known, but is based on the variance parameter estimates. The only problem remaining is to determine the appropriate degrees of freedom df. Once again Satterthwaite's approximation is recommended. The following section will illustrate how to compute these confidence intervals in SAS.

10.1.4 The estimate and the contrast statements

A linear combination of fixed effects parameters can be specified directly in SAS proc mixed. These linear combinations are specified in terms of the L matrix. Consider for instance a one-way ANOVA model with five treatments and an additional random block effect:

    y_i = µ + α(treatment_i) + b(block_i) + ε_i

where b(block_i) ~ N(0, σ²_b) and ε_i ~ N(0, σ²). The SAS code for this model could look something like:

proc mixed;
  class treatment block;
  model y = treatment/ddfm=satterth;
  random block;
  estimate 'tmt1-tmt2' treatment 1 -1 0 0 0/cl;
run;

The estimate statement has three arguments. The first argument 'tmt1-tmt2' is a user defined label and is only used to recognize the estimate in the comprehensive SAS output. The second argument treatment is the name of a variable (factor or covariate). The third argument specifies one number for each level of the variable. These numbers specify the linear combination: each number is multiplied by the corresponding parameter estimate and the products are added together. The example above corresponds to:

    1·α̂₁ + (-1)·α̂₂ + 0·α̂₃ + 0·α̂₄ + 0·α̂₅ = α̂₁ - α̂₂

which is the comparison of the first two treatments. The added /cl option to the estimate statement prints the confidence interval from the previous section in the output. The estimate statement can also be used to compute linear combinations of parameters from more than one variable.
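Outside SAS, the quantities behind the estimate statement can be sketched directly from the formulas above. The following Python/numpy illustration uses simplifying assumptions: a cell-means coding so that X'V⁻¹X is invertible, V = σ²I treated as known, and the residual degrees of freedom used in place of Satterthwaite's df₂ (the data and all names are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical data: 3 treatments, 4 observations each, cell-means coding.
rng = np.random.default_rng(1)
n_per, sigma2 = 4, 1.0
means = np.array([10.0, 12.0, 11.0])
y = np.repeat(means, n_per) + rng.normal(0.0, np.sqrt(sigma2), 3 * n_per)
X = np.kron(np.eye(3), np.ones((n_per, 1)))   # one indicator column per treatment
V = sigma2 * np.eye(3 * n_per)                # V assumed known for the sketch

Vi = np.linalg.inv(V)
beta_hat = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)   # GLS estimate of beta

L = np.array([1.0, -1.0, 0.0])                # treatment 1 minus treatment 2
est = L @ beta_hat                            # L'beta_hat
var_est = L @ np.linalg.inv(X.T @ Vi @ X) @ L # L'(X'V^-1 X)^-1 L

# Wald test of L'beta = 0, with df1 = rank(L) = 1
W = est**2 / var_est
df1 = 1
F = W / df1
df2 = 3 * n_per - 3            # residual df; proc mixed would use Satterthwaite
p_value = stats.f.sf(F, df1, df2)

# 95% confidence interval based on the t distribution
t = stats.t.ppf(0.975, df2)
ci = (est - t * np.sqrt(var_est), est + t * np.sqrt(var_est))
```

With V = σ²I the GLS estimate reduces to the ordinary treatment means, so the sketch reproduces what the estimate 'tmt1-tmt2' statement reports, up to the choice of denominator degrees of freedom.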
For instance, to estimate the mean value in the first treatment group including the intercept term, the following estimate statement can be used:
estimate 'Mean of tmt1' intercept 1 treatment 1 0 0 0 0/cl;

The estimate statement can only handle the case where the resulting linear combination is a single number (L is a single column). For comparison of several treatments in one test the very similar contrast statement is needed. To test if the first three treatments have the same effect (α₁ = α₂ = α₃) the following statement can be used:

contrast 'tmt1=tmt2=tmt3' treatment 1 -1 0 0 0,
                          treatment 1 0 -1 0 0;

The contrast statement does not compute confidence intervals and estimates of the different linear combinations, so the estimate statement is not dispensable.

10.1.5 Test for random effects parameters

The restricted/residual likelihood ratio test can be used to test the significance of a random effects parameter. The likelihood ratio test is used to compare two models A and B, where B is a submodel of A. Here the model including some variance parameter (model A) and the model without this variance parameter (model B) are to be compared. Using the test consists of two steps: 1) Compute the two negative restricted/residual log-likelihood values (l_re^(A) and l_re^(B)) by running both models. 2) Compute the test statistic:

    G_{A→B} = 2 l_re^(B) - 2 l_re^(A)

Asymptotically G_{A→B} follows a χ²₁ distribution. (One degree of freedom, because one variance parameter is tested when comparing A to B.)

10.1.6 Confidence intervals for random effects parameters

The confidence interval for a given variance parameter is based on the assumption that the estimate σ̂²_b of the variance parameter is approximately (σ²_b/df)·χ²_df distributed. This is true in balanced (and other "nice") cases. A consequence of this is that the confidence interval takes the form:

    df·σ̂²_b / χ²_{0.975;df} < σ²_b < df·σ̂²_b / χ²_{0.025;df}

but with the degrees of freedom df still undetermined. The task is to choose the df such that the corresponding χ² distribution matches the distribution of the estimate.
The (theoretical) variance of σ̂²_b ~ (σ²_b/df)·χ²_df is:

    var((σ²_b/df)·χ²_df) = 2σ⁴_b/df
The actual variance of the parameter estimate can be estimated from the curvature of the negative log-likelihood function l. By matching the estimated actual variance of the estimator to the variance of the desired distribution, and solving the equation:

    var(σ̂²_b) = 2σ⁴_b/df

the following estimate of the degrees of freedom is obtained, after plugging in the estimated variance:

    df = 2σ̂⁴_b / var(σ̂²_b)

This way of approximating the degrees of freedom is a special case of Satterthwaite's approximation, which has been used frequently in this course. To get these confidence intervals computed in proc mixed, the option cl must be added to the mixed procedure, like:

proc mixed cl;
  class treatment block;
  model y = treatment/ddfm=satterth;
  random block;
run;

Notice the cl option in the first line.
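The likelihood ratio test and the degrees-of-freedom matching above can be sketched numerically. In the following Python fragment the two negative restricted log-likelihoods, the variance estimate, and its estimated variance are made-up numbers standing in for the output of actual REML fits:

```python
from scipy import stats

# Restricted likelihood ratio test (section 10.1.5): hypothetical negative
# restricted log-likelihoods from fitting model A (with the parameter)
# and model B (without it).
l_re_A, l_re_B = 120.3, 123.9
G = 2 * l_re_B - 2 * l_re_A          # test statistic G_{A->B}
p = stats.chi2.sf(G, df=1)           # asymptotic chi-square with 1 df

# Satterthwaite-type interval (section 10.1.6): hypothetical estimate of the
# block variance and of the variance of that estimate (from the curvature of
# the negative log-likelihood).
sigma2_b_hat = 2.5
var_sigma2_b_hat = 1.8

# df = 2 * sigma_hat^4 / var(sigma_hat^2), with sigma_hat^4 = (sigma2_b_hat)^2
df = 2 * sigma2_b_hat**2 / var_sigma2_b_hat

# 95% interval: df*s2/chi2_{0.975;df} < sigma2_b < df*s2/chi2_{0.025;df}
lower = df * sigma2_b_hat / stats.chi2.ppf(0.975, df)
upper = df * sigma2_b_hat / stats.chi2.ppf(0.025, df)
```

Note that the interval is not symmetric around σ̂²_b, reflecting the skewness of the χ² distribution, and that both endpoints are positive, as a variance interval should be.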