ESALQ Course on Models for Longitudinal and Incomplete Data

Geert Verbeke
Biostatistical Centre, K.U.Leuven, Belgium
geert.verbeke@med.kuleuven.be
www.kuleuven.ac.be/biostat/

Geert Molenberghs
Center for Statistics, Universiteit Hasselt, Belgium
geert.molenberghs@uhasselt.be
www.censtat.uhasselt.be

ESALQ Course, Diepenbeek, November 2006
Contents

0 Related References ... 1

Part I: Continuous Longitudinal Data ... 6
1 Introduction ... 7
2 A Model for Longitudinal Data ... 18
3 Estimation and Inference in the Marginal Model ... 37
4 Inference for the Random Effects ... 52

Part II: Marginal Models for Non-Gaussian Longitudinal Data ... 64
5 The Toenail Data ... 65
6 The Analgesic Trial ... 68
7 Generalized Linear Models ... 71
8 Parametric Modeling Families ... 83
9 Generalized Estimating Equations ... 87
10 A Family of GEE Methods ... 99

Part III: Generalized Linear Mixed Models for Non-Gaussian Longitudinal Data ... 119
11 Generalized Linear Mixed Models (GLMM) ... 120
12 Fitting GLMMs in SAS ... 138

Part IV: Marginal Versus Random-effects Models ... 142
13 Marginal Versus Random-effects Models ... 143

Part V: Non-linear Models ... 154
14 Non-Linear Mixed Models ... 155

Part VI: Incomplete Data ... 187
15 Setting The Scene ... 188
16 Proper Analysis of Incomplete Data ... 196
17 Weighted Generalized Estimating Equations ... 213
18 Multiple Imputation ... 223

Part VII: Topics in Methods and Sensitivity Analysis for Incomplete Data ... 227
19 An MNAR Selection Model and Local Influence ... 228
20 Interval of Ignorance ... 237
21 Concluding Remarks ... 255

Part VIII: Case Study: Age-related Macular Degeneration ... 256
22 The Dataset ... 257
23 Weighted Generalized Estimating Equations ... 261
24 Multiple Imputation ... 268
Chapter 0 Related References

Aerts, M., Geys, H., Molenberghs, G., and Ryan, L.M. (2002). Topics in Modelling of Clustered Data. London: Chapman & Hall.
Brown, H. and Prescott, R. (1999). Applied Mixed Models in Medicine. New York: John Wiley & Sons.
Crowder, M.J. and Hand, D.J. (1990). Analysis of Repeated Measures. London: Chapman & Hall.
Davidian, M. and Giltinan, D.M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman & Hall.
Davis, C.S. (2002). Statistical Methods for the Analysis of Repeated Measurements. New York: Springer.
Demidenko, E. (2004). Mixed Models: Theory and Applications. New York: John Wiley & Sons.
Diggle, P.J., Heagerty, P.J., Liang, K.-Y., and Zeger, S.L. (2002). Analysis of Longitudinal Data (2nd edition). Oxford: Oxford University Press.
Fahrmeir, L. and Tutz, G. (2002). Multivariate Statistical Modelling Based on Generalized Linear Models (2nd edition). New York: Springer.
Fitzmaurice, G.M., Laird, N.M., and Ware, J.H. (2004). Applied Longitudinal Analysis. New York: John Wiley & Sons.
Goldstein, H. (1979). The Design and Analysis of Longitudinal Studies. London: Academic Press.
Goldstein, H. (1995). Multilevel Statistical Models. London: Edward Arnold.
Hand, D.J. and Crowder, M.J. (1995). Practical Longitudinal Data Analysis. London: Chapman & Hall.
Hedeker, D. and Gibbons, R.D. (2006). Longitudinal Data Analysis. New York: John Wiley & Sons.
Jones, B. and Kenward, M.G. (1989). Design and Analysis of Cross-over Trials. London: Chapman & Hall.
Kshirsagar, A.M. and Smith, W.B. (1995). Growth Curves. New York: Marcel Dekker.
Leyland, A.H. and Goldstein, H. (2001). Multilevel Modelling of Health Statistics. Chichester: John Wiley & Sons.
Lindsey, J.K. (1993). Models for Repeated Measurements. Oxford: Oxford University Press.
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., and Schabenberger, O. (2005). SAS for Mixed Models (2nd edition). Cary: SAS Press.
Longford, N.T. (1993). Random Coefficient Models. Oxford: Oxford University Press.
McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (2nd edition). London: Chapman & Hall.
Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data. New York: Springer.
Pinheiro, J.C. and Bates, D.M. (2000). Mixed-Effects Models in S and S-PLUS. New York: Springer.
Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components. New York: John Wiley & Sons.
Senn, S.J. (1993). Cross-over Trials in Clinical Research. Chichester: John Wiley & Sons.
Verbeke, G. and Molenberghs, G. (1997). Linear Mixed Models in Practice: A SAS-Oriented Approach. Lecture Notes in Statistics 126. New York: Springer.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer Series in Statistics. New York: Springer.
Vonesh, E.F. and Chinchilli, V.M. (1997). Linear and Non-linear Models for the Analysis of Repeated Measurements. Basel: Marcel Dekker.
Weiss, R.E. (2005). Modeling Longitudinal Data. New York: Springer.
Wu, H. and Zhang, J.-T. (2006). Nonparametric Regression Methods for Longitudinal Data Analysis. New York: John Wiley & Sons.
Part I: Continuous Longitudinal Data
Chapter 1 Introduction

- Repeated measures / longitudinal data
- Examples
1.1 Repeated Measures / Longitudinal Data

Repeated measures are obtained when a response is measured repeatedly on a set of units.

- Units:
  - subjects, patients, participants, ...
  - animals, plants, ...
  - clusters: families, towns, branches of a company, ...
- Special case: longitudinal data, where the repeated measurements are taken over time
1.2 Rat Data

- Research question (Dentistry, K.U.Leuven): how does craniofacial growth depend on testosterone production?
- Randomized experiment in which 50 male Wistar rats are randomized to:
  - Control (15 rats)
  - Low dose of Decapeptyl (18 rats)
  - High dose of Decapeptyl (17 rats)
- Treatment starts at the age of 45 days; measurements are taken every 10 days, from day 50 onwards.
- The responses are distances (in pixels) between well-defined points on X-ray pictures of the skull of each rat.
- Measurements were taken with respect to the roof, base, and height of the skull. Here, we consider only one response, reflecting the height of the skull.
- Individual profiles:
Complication: dropout due to anaesthesia (56%):

Number of observations
Age (days)   Control   Low   High   Total
50               15     18     17      50
60               13     17     16      46
70               13     15     15      43
80               10     15     13      38
90                7     12     10      29
100               4     10     10      24
110               4      8     10      22

Remarks:
- Much variability between rats, much less variability within rats
- Fixed number of measurements scheduled per subject, but not all measurements available due to dropout, for a known reason
- Measurements taken at fixed time points
1.3 Prostate Data

References:
- Carter et al. (1992, Cancer Research)
- Carter et al. (1992, Journal of the American Medical Association)
- Morrell et al. (1995, Journal of the American Statistical Association)
- Pearson et al. (1994, Statistics in Medicine)

- Prostate disease is one of the most common and most costly medical problems in the United States
- It is important to look for markers which can detect the disease at an early stage
- Prostate-Specific Antigen (PSA) is an enzyme produced by both normal and cancerous prostate cells
- The PSA level is related to the volume of prostate tissue
- Problem: patients with Benign Prostatic Hyperplasia (BPH) also have an increased PSA level
- The overlap in the PSA distributions for cancer and BPH cases seriously complicates the detection of prostate cancer
- Research question (hypothesis based on clinical practice): can longitudinal PSA profiles be used to detect prostate cancer at an early stage?
A retrospective case-control study based on frozen serum samples:
- 16 control patients
- 20 BPH cases
- 14 local cancer cases
- 4 metastatic cancer cases

Complication: no perfect match for age at diagnosis and years of follow-up is possible. Hence, analyses will have to correct for these age differences between the diagnostic groups.
Individual profiles:
Remarks:
- Much variability between subjects
- Little variability within subjects
- Highly unbalanced data
Chapter 2 A Model for Longitudinal Data

- Introduction
- The 2-stage model formulation
- Example: rat data
- The general linear mixed-effects model
- Hierarchical versus marginal model
- Example: rat data
- Example: bivariate observations
2.1 Introduction

- In practice we often have unbalanced data:
  - unequal numbers of measurements per subject
  - measurements not taken at fixed time points
- Therefore, classical multivariate regression techniques are often not applicable
- Often, subject-specific longitudinal profiles can be well approximated by linear regression functions
- This leads to a 2-stage model formulation:
  - Stage 1: a linear regression model for each subject separately
  - Stage 2: explain the variability in the subject-specific regression coefficients using known covariates
2.2 A 2-stage Model Formulation

2.2.1 Stage 1

- Response Y_ij for the ith subject, measured at time t_ij, i = 1, ..., N, j = 1, ..., n_i
- Response vector Y_i for the ith subject: Y_i = (Y_i1, Y_i2, ..., Y_in_i)'
- Stage 1 model:

  Y_i = Z_i β_i + ε_i

- Z_i is an (n_i × q) matrix of known covariates
- β_i is a q-dimensional vector of subject-specific regression coefficients
- ε_i ~ N(0, Σ_i), often Σ_i = σ² I_{n_i}
- Note that the above model describes the observed variability within subjects
2.2.2 Stage 2

- Between-subject variability can now be studied by relating the β_i to known covariates
- Stage 2 model:

  β_i = K_i β + b_i

- K_i is a (q × p) matrix of known covariates
- β is a p-dimensional vector of unknown regression parameters
- b_i ~ N(0, D)
2.3 Example: The Rat Data

Individual profiles:
- Transformation of the time scale to linearize the profiles:

  t_ij = ln[1 + (Age_ij − 45)/10]

- Note that t = 0 corresponds to the start of the treatment (moment of randomization)
- Stage 1 model:

  Y_ij = β_1i + β_2i t_ij + ε_ij,  j = 1, ..., n_i

- Matrix notation: Y_i = Z_i β_i + ε_i, with

  Z_i = ( 1  t_i1    )
        ( 1  t_i2    )
        ( .  .       )
        ( 1  t_in_i  )
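The transformation can be checked numerically; a small Python sketch (function name hypothetical) evaluates t_ij at the scheduled measurement ages 50, 60, ..., 110:

```python
import math

def t_transform(age):
    # t = ln[1 + (age - 45)/10]; equals 0 at age 45 (start of treatment)
    return math.log(1 + (age - 45) / 10)

scheduled_ages = range(50, 111, 10)
t_values = [round(t_transform(a), 3) for a in scheduled_ages]
print(t_values)  # increasing, starting at ln(1.5)
```

The transformed scale compresses the later ages, which is what linearizes the concave growth profiles.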
- In the second stage, the subject-specific intercepts and time effects are related to the treatment of the rats
- Stage 2 model:

  β_1i = β_0 + b_1i
  β_2i = β_1 L_i + β_2 H_i + β_3 C_i + b_2i

- L_i, H_i, and C_i are indicator variables:
  L_i = 1 if low dose, 0 otherwise
  H_i = 1 if high dose, 0 otherwise
  C_i = 1 if control, 0 otherwise
Parameter interpretation:
- β_0: average response at the start of the treatment (independent of treatment)
- β_1, β_2, and β_3: average time effect for each treatment group
2.4 The General Linear Mixed-effects Model

- A 2-stage analysis can be performed explicitly
- However, this is just another example of the use of summary statistics:
  - Y_i is summarized by β̂_i
  - the summary statistics β̂_i are analysed in the second stage
- The associated drawbacks can be avoided by combining the two stages into one model:

  Y_i = Z_i β_i + ε_i,  β_i = K_i β + b_i
  ⇒ Y_i = Z_i K_i β + Z_i b_i + ε_i = X_i β + Z_i b_i + ε_i,  with X_i = Z_i K_i
General linear mixed-effects model:

  Y_i = X_i β + Z_i b_i + ε_i
  b_i ~ N(0, D), ε_i ~ N(0, Σ_i)
  b_1, ..., b_N, ε_1, ..., ε_N independent

Terminology:
- Fixed effects: β
- Random effects: b_i
- Variance components: the elements in D and Σ_i
2.5 Hierarchical versus Marginal Model

The general linear mixed model is given by:

  Y_i = X_i β + Z_i b_i + ε_i
  b_i ~ N(0, D), ε_i ~ N(0, Σ_i)
  b_1, ..., b_N, ε_1, ..., ε_N independent

It can be rewritten as:

  Y_i | b_i ~ N(X_i β + Z_i b_i, Σ_i),  b_i ~ N(0, D)
- It is therefore also called a hierarchical model:
  - a model for Y_i given b_i
  - a model for b_i
- Marginally, Y_i is distributed as:

  Y_i ~ N(X_i β, Z_i D Z_i' + Σ_i)

- Hence, very specific assumptions are made about the dependence of mean and covariance on the covariates X_i and Z_i:
  - implied mean: X_i β
  - implied covariance: V_i = Z_i D Z_i' + Σ_i
- Note that the hierarchical model implies the marginal one, NOT vice versa
2.6 Example: The Rat Data

- Stage 1 model: Y_ij = β_1i + β_2i t_ij + ε_ij, j = 1, ..., n_i
- Stage 2 model:

  β_1i = β_0 + b_1i
  β_2i = β_1 L_i + β_2 H_i + β_3 C_i + b_2i

- Combined:

  Y_ij = (β_0 + b_1i) + (β_1 L_i + β_2 H_i + β_3 C_i + b_2i) t_ij + ε_ij
       = β_0 + b_1i + (β_1 + b_2i) t_ij + ε_ij, if low dose
       = β_0 + b_1i + (β_2 + b_2i) t_ij + ε_ij, if high dose
       = β_0 + b_1i + (β_3 + b_2i) t_ij + ε_ij, if control
- Implied marginal mean structure:
  - linear average evolution in each group
  - equal average intercepts
  - different average slopes
- Implied marginal covariance structure (Σ_i = σ² I_{n_i}):

  Cov(Y_i(t_1), Y_i(t_2)) = (1, t_1) D (1, t_2)' + σ² δ_{t_1 t_2}
                          = d_22 t_1 t_2 + d_12 (t_1 + t_2) + d_11 + σ² δ_{t_1 t_2},

  where δ_{t_1 t_2} equals 1 if t_1 = t_2 and 0 otherwise
- Note that the model implicitly assumes that the variance function is quadratic over time, with positive curvature d_22
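The closed-form covariance above can be verified against the matrix form Z_i D Z_i' + σ² I_{n_i}; a short numpy sketch with purely illustrative parameter values (not estimates from the rat data):

```python
import numpy as np

# illustrative values, chosen only for the check
d11, d12, d22, sigma2 = 3.0, 0.2, 0.5, 1.5
D = np.array([[d11, d12], [d12, d22]])

times = np.array([0.0, 0.5, 1.0, 2.0])
Z = np.column_stack([np.ones_like(times), times])
V = Z @ D @ Z.T + sigma2 * np.eye(len(times))   # implied marginal covariance

def cov_closed_form(t1, t2):
    # d22*t1*t2 + d12*(t1 + t2) + d11, plus sigma2 on the diagonal
    return d22*t1*t2 + d12*(t1 + t2) + d11 + (sigma2 if t1 == t2 else 0.0)

closed = np.array([[cov_closed_form(t1, t2) for t2 in times] for t1 in times])
print(np.allclose(V, closed))  # the two computations agree entry by entry
```

The diagonal of V is the quadratic variance function d_22 t² + 2 d_12 t + d_11 + σ².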
- A model which assumes that all variability in subject-specific slopes can be ascribed to treatment differences can be obtained by omitting the random slopes b_2i from the above model:

  Y_ij = (β_0 + b_1i) + (β_1 L_i + β_2 H_i + β_3 C_i) t_ij + ε_ij
       = β_0 + b_1i + β_1 t_ij + ε_ij, if low dose
       = β_0 + b_1i + β_2 t_ij + ε_ij, if high dose
       = β_0 + b_1i + β_3 t_ij + ε_ij, if control

- This is the so-called random-intercepts model
- The same marginal mean structure is obtained as under the model with random slopes
- Implied marginal covariance structure (Σ_i = σ² I_{n_i}):

  Cov(Y_i(t_1), Y_i(t_2)) = d_11 + σ² δ_{t_1 t_2}

- Hence, the implied covariance matrix is compound-symmetric:
  - constant variance d_11 + σ²
  - constant correlation ρ_I = d_11/(d_11 + σ²) between any two repeated measurements within the same rat
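The compound-symmetry structure and its intraclass correlation can be sketched numerically (illustrative values for d_11 and σ²):

```python
import numpy as np

d11, sigma2, n = 3.4, 1.4, 5                     # illustrative values only
V = d11 * np.ones((n, n)) + sigma2 * np.eye(n)   # compound-symmetric covariance

variance = V[0, 0]           # constant variance d11 + sigma2
rho = V[0, 1] / variance     # constant correlation d11 / (d11 + sigma2)
print(variance, round(rho, 3))
```

Every off-diagonal entry of V equals d_11, so any two measurements on the same rat share the same correlation, regardless of how far apart in time they are.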
2.7 Example: Bivariate Observations

Balanced data, two measurements per subject (n_i = 2), two models:

Model 1: random intercepts + heterogeneous errors

  V = (1, 1)' d (1, 1) + diag(σ_1², σ_2²)
    = ( d + σ_1²   d        )
      ( d          d + σ_2² )

Model 2: uncorrelated intercepts and slopes + measurement error

  V = ( 1 0 ) ( d_1  0  ) ( 1 1 ) + σ² I_2
      ( 1 1 ) ( 0   d_2 ) ( 0 1 )
    = ( d_1 + σ²   d_1            )
      ( d_1        d_1 + d_2 + σ² )
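That the two hierarchical models can imply the same marginal covariance is easy to confirm numerically: matching the entries gives d_1 = d, σ² = σ_1², and d_2 = σ_2² − σ_1² (assuming σ_2² ≥ σ_1²). A sketch with illustrative numbers:

```python
import numpy as np

# Model 1: random intercept (variance d) + heterogeneous errors s1, s2
d, s1, s2 = 2.0, 1.0, 1.5
V1 = np.array([[d + s1, d],
               [d,      d + s2]])

# Model 2: uncorrelated intercept (d1) and slope (d2) + measurement error sigma2,
# with parameters chosen to match Model 1: d1 = d, sigma2 = s1, d2 = s2 - s1
d1, d2, sigma2 = d, s2 - s1, s1
L = np.array([[1.0, 0.0],
              [1.0, 1.0]])
V2 = L @ np.diag([d1, d2]) @ L.T + sigma2 * np.eye(2)

print(np.allclose(V1, V2))  # identical marginal covariance matrices
```

Two structurally different random-effects specifications, one marginal model: exactly the point made on the next slide.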
- Different hierarchical models can produce the same marginal model
- Hence, a good fit of the marginal model cannot be interpreted as evidence for any of the hierarchical models
- A satisfactory treatment of the hierarchical model is only possible within a Bayesian context
Chapter 3 Estimation and Inference in the Marginal Model

- ML and REML estimation
- Fitting linear mixed models in SAS
- Negative variance components
- Inference
3.1 ML and REML Estimation

- Recall that the general linear mixed model equals

  Y_i = X_i β + Z_i b_i + ε_i,  b_i ~ N(0, D), ε_i ~ N(0, Σ_i), independent

- The implied marginal model equals

  Y_i ~ N(X_i β, Z_i D Z_i' + Σ_i)

- Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects
- Notation:
  - β: vector of fixed effects (as before)
  - α: vector of all variance components in D and Σ_i
  - θ = (β', α')': vector of all parameters in the marginal model
- Marginal likelihood function:

  L_ML(θ) = ∏_{i=1}^{N} (2π)^{−n_i/2} |V_i(α)|^{−1/2} exp{ −(1/2) (Y_i − X_i β)' V_i^{−1}(α) (Y_i − X_i β) }

- If α were known, the MLE of β would equal

  β̂(α) = ( Σ_{i=1}^{N} X_i' W_i X_i )^{−1} Σ_{i=1}^{N} X_i' W_i y_i,

  where W_i equals V_i^{−1}
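The estimator β̂(α) is a generalized least squares estimator; a minimal sketch (helper name gls_beta is hypothetical, data illustrative). A convenient check: when each y_i equals X_i β exactly, GLS recovers β exactly, whatever the V_i:

```python
import numpy as np

def gls_beta(X_list, y_list, V_list):
    """GLS estimator (sum_i X_i' W_i X_i)^{-1} sum_i X_i' W_i y_i, W_i = V_i^{-1}."""
    p = X_list[0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for X, y, V in zip(X_list, y_list, V_list):
        W = np.linalg.inv(V)
        A += X.T @ W @ X
        b += X.T @ W @ y
    return np.linalg.solve(A, b)

# exact-recovery check: if y_i = X_i beta, GLS returns beta exactly
beta = np.array([1.0, -2.0])
X1 = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
X2 = np.array([[1.0, 0.5], [1.0, 1.5]])
V1 = 0.5 * np.eye(3) + 0.3   # exchangeable working covariances (illustrative)
V2 = 0.5 * np.eye(2) + 0.3
est = gls_beta([X1, X2], [X1 @ beta, X2 @ beta], [V1, V2])
print(est)
```

In practice α (and hence W_i) is unknown and replaced by its ML or REML estimate, as the slide states.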
- In most cases, α is not known and needs to be replaced by an estimate α̂
- Two frequently used estimation methods for α:
  - maximum likelihood (ML)
  - restricted maximum likelihood (REML)
3.2 Fitting Linear Mixed Models in SAS

A model for the prostate data:

  ln(PSA_ij + 1) = β_1 Age_i + β_2 C_i + β_3 B_i + β_4 L_i + β_5 M_i
                 + (β_6 Age_i + β_7 C_i + β_8 B_i + β_9 L_i + β_10 M_i) t_ij
                 + (β_11 Age_i + β_12 C_i + β_13 B_i + β_14 L_i + β_15 M_i) t_ij²
                 + b_1i + b_2i t_ij + b_3i t_ij² + ε_ij

SAS program (Σ_i = σ² I_{n_i}):

  proc mixed data=prostate method=reml;
    class id group timeclss;
    model lnpsa = group age time group*time age*time
                  time2 group*time2 age*time2 / solution;
    random intercept time time2 / type=un subject=id;
  run;
ML and REML estimates for the fixed effects:

Effect                 Parameter   MLE (s.e.)      REMLE (s.e.)
Age effect             β_1         0.026 (0.013)   0.027 (0.014)
Intercepts:
  Control              β_2         1.077 (0.919)   1.098 (0.976)
  BPH                  β_3         0.493 (1.026)   0.523 (1.090)
  L/R cancer           β_4         0.314 (0.997)   0.296 (1.059)
  Met. cancer          β_5         1.574 (1.022)   1.549 (1.086)
Age×time effect        β_6         0.010 (0.020)   0.011 (0.021)
Time effects:
  Control              β_7         0.511 (1.359)   0.568 (1.473)
  BPH                  β_8         0.313 (1.511)   0.396 (1.638)
  L/R cancer           β_9         1.072 (1.469)   1.036 (1.593)
  Met. cancer          β_10        1.657 (1.499)   1.605 (1.626)
Age×time² effect       β_11        0.002 (0.008)   0.002 (0.009)
Time² effects:
  Control              β_12        0.106 (0.549)   0.130 (0.610)
  BPH                  β_13        0.119 (0.604)   0.158 (0.672)
  L/R cancer           β_14        0.350 (0.590)   0.342 (0.656)
  Met. cancer          β_15        0.411 (0.598)   0.395 (0.666)
ML and REML estimates for the variance components:

Effect                Parameter      MLE (s.e.)      REMLE (s.e.)
Covariance of b_i:
  var(b_1i)           d_11           0.398 (0.083)   0.452 (0.098)
  var(b_2i)           d_22           0.768 (0.187)   0.915 (0.230)
  var(b_3i)           d_33           0.103 (0.032)   0.131 (0.041)
  cov(b_1i, b_2i)     d_12 = d_21   −0.443 (0.113)  −0.518 (0.136)
  cov(b_2i, b_3i)     d_23 = d_32   −0.273 (0.076)  −0.336 (0.095)
  cov(b_3i, b_1i)     d_13 = d_31    0.133 (0.043)   0.163 (0.053)
Residual variance:
  var(ε_ij)           σ²             0.028 (0.002)   0.028 (0.002)
Log-likelihood                      −1.788          −31.235
Fitted average profiles at the median age at diagnosis:
3.3 Negative Variance Components

- Reconsider the model for the rat data:

  Y_ij = (β_0 + b_1i) + (β_1 L_i + β_2 H_i + β_3 C_i + b_2i) t_ij + ε_ij

- REML estimates obtained from the SAS procedure MIXED:

Effect               Parameter     REMLE (s.e.)
Intercept            β_0           68.606 (0.325)
Time effects:
  Low dose           β_1            7.503 (0.228)
  High dose          β_2            6.877 (0.231)
  Control            β_3            7.319 (0.285)
Covariance of b_i:
  var(b_1i)          d_11           3.369 (1.123)
  var(b_2i)          d_22           0.000 ( )
  cov(b_1i, b_2i)    d_12 = d_21    0.090 (0.381)
Residual variance:
  var(ε_ij)          σ²             1.445 (0.145)
REML log-likelihood                −466.173
- This suggests that the REML likelihood could be further increased by allowing negative estimates for d_22
- In SAS, this can be done by adding the option NOBOUND to the PROC MIXED statement. Results:

                                   d_ii ≥ 0, σ² ≥ 0    d_ii ∈ IR, σ² ∈ IR
Effect               Parameter     REMLE (s.e.)        REMLE (s.e.)
Intercept            β_0           68.606 (0.325)      68.618 (0.313)
Time effects:
  Low dose           β_1            7.503 (0.228)       7.475 (0.198)
  High dose          β_2            6.877 (0.231)       6.890 (0.198)
  Control            β_3            7.319 (0.285)       7.284 (0.254)
Covariance of b_i:
  var(b_1i)          d_11           3.369 (1.123)       2.921 (1.019)
  var(b_2i)          d_22           0.000 ( )          −0.287 (0.169)
  cov(b_1i, b_2i)    d_12 = d_21    0.090 (0.381)       0.462 (0.357)
Residual variance:
  var(ε_ij)          σ²             1.445 (0.145)       1.522 (0.165)
REML log-likelihood                −466.173            −465.193
- Note that the REML log-likelihood value has been further increased and a negative estimate for d_22 is obtained
- Brown & Prescott (1999, p. 237):
- Meaning of a negative variance component?
- Fitted variance function:

  Var(Y_i(t)) = (1, t) D̂ (1, t)' + σ̂²
              = d̂_22 t² + 2 d̂_12 t + d̂_11 + σ̂²
              = −0.287 t² + 0.924 t + 4.443

- The suggested negative curvature in the variance function is supported by the sample variance function:
This again shows that the hierarchical and marginal models are not equivalent:
- The marginal model allows negative variance components, as long as the marginal covariances V_i = Z_i D Z_i' + σ² I_{n_i} are positive definite
- The hierarchical interpretation of the model does not allow negative variance components
3.4 Inference for the Marginal Model

- Estimate for β:

  β̂(α) = ( Σ_{i=1}^{N} X_i' W_i X_i )^{−1} Σ_{i=1}^{N} X_i' W_i y_i

  with α replaced by its ML or REML estimate
- Conditional on α, β̂(α) is multivariate normal with mean β and covariance

  Var(β̂) = ( Σ_i X_i' W_i X_i )^{−1} ( Σ_i X_i' W_i Var(Y_i) W_i X_i ) ( Σ_i X_i' W_i X_i )^{−1}
          = ( Σ_i X_i' W_i X_i )^{−1},

  where the second equality holds when the model is correct, i.e., Var(Y_i) = V_i = W_i^{−1}
- In practice one again replaces α by its ML or REML estimate
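A quick numerical confirmation that the sandwich expression collapses to the model-based one when Var(Y_i) is taken equal to V_i (illustrative matrices; helper name hypothetical):

```python
import numpy as np

def modelbased_and_sandwich(X_list, V_list):
    """Model-based covariance (sum X'WX)^{-1} and the sandwich with var(Y_i) = V_i."""
    p = X_list[0].shape[1]
    A = np.zeros((p, p))
    M = np.zeros((p, p))
    for X, V in zip(X_list, V_list):
        W = np.linalg.inv(V)
        A += X.T @ W @ X
        M += X.T @ W @ V @ W @ X   # middle term, with var(Y_i) set to V_i
    A_inv = np.linalg.inv(A)
    return A_inv, A_inv @ M @ A_inv

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
V = np.array([[1.0, 0.4, 0.4],
              [0.4, 1.0, 0.4],
              [0.4, 0.4, 1.0]])
model_based, sandwich = modelbased_and_sandwich([X, X], [V, V])
print(np.allclose(model_based, sandwich))
```

With a misspecified working V_i (Var(Y_i) ≠ V_i in the middle term) the two expressions differ, which is the motivation for robust standard errors in the GEE chapters later on.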
- Inference for β:
  - Wald tests
  - t- and F-tests
  - LR tests (not valid with REML)
- Inference for α:
  - Wald tests
  - LR tests (even with REML)
- Caution 1: marginal vs. hierarchical model
- Caution 2: boundary problems
Chapter 4 Inference for the Random Effects

- Empirical Bayes inference
- Example: prostate data
- Shrinkage
- Example: prostate data
4.1 Empirical Bayes Inference

- The random effects b_i reflect how the evolution of the ith subject deviates from the expected evolution X_i β
- Estimation of the b_i is helpful for detecting outlying profiles
- This is only meaningful under the hierarchical model interpretation:

  Y_i | b_i ~ N(X_i β + Z_i b_i, Σ_i),  b_i ~ N(0, D)

- Since the b_i are random, it is most natural to use Bayesian methods
- Terminology: the prior distribution of b_i is N(0, D)
- Posterior density:

  f(b_i | y_i) ≡ f(b_i | Y_i = y_i) = f(y_i | b_i) f(b_i) / ∫ f(y_i | b_i) f(b_i) db_i
  ∝ ... ∝ exp{ −(1/2) (b_i − D Z_i' W_i (y_i − X_i β))' Λ_i^{−1} (b_i − D Z_i' W_i (y_i − X_i β)) }

  for some positive definite matrix Λ_i
- Posterior distribution:

  b_i | y_i ~ N( D Z_i' W_i (y_i − X_i β), Λ_i )
- Posterior mean as estimate for b_i:

  b̂_i(θ) = E[b_i | Y_i = y_i] = ∫ b_i f(b_i | y_i) db_i = D Z_i' W_i(α) (y_i − X_i β)

- The parameters in θ are replaced by their ML or REML estimates, obtained from fitting the marginal model
- b̂_i = b̂_i(θ̂) is called the empirical Bayes (EB) estimate of b_i
- Approximate t- and F-tests account for the variability introduced by replacing θ by θ̂, similar to the tests for the fixed effects
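The posterior-mean formula can be sketched directly (helper name eb_estimate is hypothetical; np.linalg.solve is used instead of forming V^{−1} explicitly). A built-in sanity check: a subject whose data lie exactly on the population profile X_i β gets b̂_i = 0:

```python
import numpy as np

def eb_estimate(y, X, Z, beta, D, sigma2):
    # empirical Bayes estimate D Z' V^{-1} (y - X beta), with V = Z D Z' + sigma2 I
    V = Z @ D @ Z.T + sigma2 * np.eye(len(y))
    return D @ Z.T @ np.linalg.solve(V, y - X @ beta)

# illustrative subject: random intercept and slope over 4 time points
t = np.array([0.0, 1.0, 2.0, 3.0])
X = Z = np.column_stack([np.ones_like(t), t])
beta = np.array([1.0, 0.5])
D = np.array([[1.0, 0.2],
              [0.2, 0.4]])

b_perfect = eb_estimate(X @ beta, X, Z, beta, D, sigma2=1.0)
print(b_perfect)  # zero: no deviation from the expected evolution
```

Subjects with large components of b̂_i are the candidates for the outlier inspection described on the next slides.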
4.2 Example: Prostate Data

We reconsider our linear mixed model:

  ln(PSA_ij + 1) = β_1 Age_i + β_2 C_i + β_3 B_i + β_4 L_i + β_5 M_i
                 + (β_6 Age_i + β_7 C_i + β_8 B_i + β_9 L_i + β_10 M_i) t_ij
                 + (β_11 Age_i + β_12 C_i + β_13 B_i + β_14 L_i + β_15 M_i) t_ij²
                 + b_1i + b_2i t_ij + b_3i t_ij² + ε_ij

In SAS, the EB estimates can be obtained by adding the option SOLUTION to the RANDOM statement:

  random intercept time time2 / type=un subject=id solution;
  ods listing exclude solutionr;
  ods output solutionr=out;
- The ODS statements are used to write the EB estimates to a SAS output data set, and to prevent SAS from printing them in the output window
- In practice, histograms and scatterplots of certain components of b̂_i are used to detect model deviations or subjects with exceptional evolutions over time
- The strong negative correlations are in agreement with the correlation matrix corresponding to the fitted D:

  D̂_corr = (  1.000  −0.803   0.658 )
            ( −0.803   1.000  −0.968 )
            (  0.658  −0.968   1.000 )

- Histograms and scatterplots show outliers
- Subjects #22, #28, #39, and #45 have the highest four slopes for time² and the smallest four slopes for time, i.e., the strongest (quadratic) growth
- Subjects #22, #28, and #39 have been further examined and have been shown to be metastatic cancer cases which were misclassified as local cancer cases
- Subject #45 is the metastatic cancer case with the strongest growth
4.3 Shrinkage Estimators b̂_i

Consider the prediction of the evolution of the ith subject:

  Ŷ_i ≡ X_i β̂ + Z_i b̂_i
      = X_i β̂ + Z_i D Z_i' V_i^{−1} (y_i − X_i β̂)
      = (I_{n_i} − Z_i D Z_i' V_i^{−1}) X_i β̂ + Z_i D Z_i' V_i^{−1} y_i
      = Σ_i V_i^{−1} X_i β̂ + (I_{n_i} − Σ_i V_i^{−1}) y_i

Hence, Ŷ_i is a weighted average of the population-averaged profile X_i β̂ and the observed data y_i, with weights Σ_i V_i^{−1} and I_{n_i} − Σ_i V_i^{−1}, respectively.
- Note that X_i β̂ gets much weight if the residual variability is large in comparison to the total variability
- This phenomenon is usually called shrinkage: the observed data are shrunk towards the prior average profile X_i β̂
- This is also reflected in the fact that for any linear combination λ'b_i of the random effects,

  var(λ'b̂_i) ≤ var(λ'b_i)
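The shrinkage inequality can be checked numerically: with θ known, var(b̂_i) = D Z_i' V_i^{−1} Z_i D, and D − var(b̂_i) is positive semi-definite, so λ'var(b̂_i)λ ≤ λ'Dλ for every λ. A sketch with illustrative values:

```python
import numpy as np

# illustrative random-intercept-and-slope model for one subject
t = np.array([0.0, 1.0, 2.0, 3.0])
Z = np.column_stack([np.ones_like(t), t])
D = np.array([[1.0, 0.3],
              [0.3, 0.5]])
sigma2 = 1.0
V = Z @ D @ Z.T + sigma2 * np.eye(len(t))

var_bhat = D @ Z.T @ np.linalg.solve(V, Z @ D)  # variance of the EB estimator

# shrinkage: D - var(bhat) is positive semi-definite
eigvals = np.linalg.eigvalsh(D - var_bhat)
print(eigvals)  # all non-negative (up to rounding)
```

This mirrors the numerical illustration with Var(b̂_i) and D̂ for the prostate data on the slide below: every diagonal entry of var(b̂_i) is at most the corresponding entry of D.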
4.4 Example: Prostate Data

Comparison of predicted, average, and observed profiles for subjects #15 and #28, obtained under the reduced model:
Illustration of the shrinkage effect:

  Var(b̂_i) = (  0.403  −0.440   0.131 )      D̂ = (  0.443  −0.490   0.148 )
              ( −0.440   0.729  −0.253 )           ( −0.490   0.842  −0.300 )
              (  0.131  −0.253   0.092 )           (  0.148  −0.300   0.114 )
Part II: Marginal Models for Non-Gaussian Longitudinal Data
Chapter 5 The Toenail Data

- Toenail Dermatophyte Onychomycosis (TDO): a common toenail infection, difficult to treat, affecting more than 2% of the population
- Classical treatments with antifungal compounds need to be administered until the whole nail has grown out healthy
- New compounds have been developed which reduce the treatment to 3 months
- Randomized, double-blind, parallel-group, multicenter study for the comparison of two such new compounds (A and B) for oral treatment
- Research question: severity relative to treatment of TDO?
- 2 × 189 patients randomized, 36 centers
- 48 weeks of total follow-up (12 months)
- 12 weeks of treatment (3 months)
- Measurements at months 0, 1, 2, 3, 6, 9, and 12
Frequencies at each visit (both treatments):
Chapter 6 The Analgesic Trial

- Single-arm trial with 530 patients recruited (491 selected for analysis)
- Analgesic treatment for pain caused by chronic nonmalignant disease
- Treatment was to be administered for 12 months
- We will focus on the Global Satisfaction Assessment (GSA)
- The GSA scale goes from 1 = very good to 5 = very bad
- GSA was rated by each subject 4 times during the trial, at months 3, 6, 9, and 12
- Research questions:
  - evolution over time
  - relation with baseline covariates: age, sex, duration of the pain, type of pain, disease progression, Pain Control Assessment (PCA), ...
  - investigation of dropout
- Frequencies:

GSA     Month 3        Month 6        Month 9        Month 12
1        55 (14.3%)     38 (12.6%)     40 (17.6%)     30 (13.5%)
2       112 (29.1%)     84 (27.8%)     67 (29.5%)     66 (29.6%)
3       151 (39.2%)    115 (38.1%)     76 (33.5%)     97 (43.5%)
4        52 (13.5%)     51 (16.9%)     33 (14.5%)     27 (12.1%)
5        15 (3.9%)      14 (4.6%)      11 (4.9%)       3 (1.4%)
Total   385            302            227            223
Missingness (O = observed, M = missing):

                       Measurement occasion
Pattern      Month 3  Month 6  Month 9  Month 12   Number     %
Completers      O        O        O        O        163    41.2
Dropouts        O        O        O        M         51    12.91
                O        O        M        M         51    12.91
                O        M        M        M         63    15.95
Non-monotone    O        O        M        O         30     7.59
missingness     O        M        O        O          7     1.77
                O        M        O        M          2     0.51
                O        M        M        O         18     4.56
                M        O        O        O          2     0.51
                M        O        O        M          1     0.25
                M        O        M        O          1     0.25
                M        O        M        M          3     0.76
Chapter 7 Generalized Linear Models

- The model
- Maximum likelihood estimation
- Examples
- McCullagh and Nelder (1989)
7.1 The Generalized Linear Model

- Suppose a sample Y_1, ..., Y_N of independent observations is available
- All Y_i have densities f(y_i | θ_i, φ) which belong to the exponential family:

  f(y | θ_i, φ) = exp{ φ^{−1} [y θ_i − ψ(θ_i)] + c(y, φ) }

- θ_i is the natural parameter
- Linear predictor: θ_i = x_i'β
- φ is the scale parameter (overdispersion parameter)
- ψ(·) is a function generating the mean and the variance:

  E(Y) = ψ'(θ)
  Var(Y) = φ ψ''(θ)

- Note that, in general, the mean μ and the variance are related:

  Var(Y) = φ ψ''[(ψ')^{−1}(μ)] = φ v(μ)

- The function v(μ) is called the variance function
- The function (ψ')^{−1}, which expresses θ as a function of μ, is called the (canonical) link function; ψ' is the inverse link function
7.2 Examples

7.2.1 The Normal Model

- Model: Y ~ N(μ, σ²)
- Density function:

  f(y | θ, φ) = (2πσ²)^{−1/2} exp{ −(y − μ)²/(2σ²) }
              = exp{ (1/σ²)(yμ − μ²/2) − ln(2πσ²)/2 − y²/(2σ²) }
- Exponential family:

  θ = μ,  φ = σ²,  ψ(θ) = θ²/2,  c(y, φ) = −ln(2πφ)/2 − y²/(2φ)

- Mean and variance function: μ = θ, v(μ) = 1
- Note that, under this normal model, the mean and variance are not related: φ v(μ) = σ²
- The link function is here the identity function: θ = μ
7.2.2 The Bernoulli Model

- Model: Y ~ Bernoulli(π)
- Density function:

  f(y | θ, φ) = π^y (1 − π)^{1−y}
              = exp{ y ln π + (1 − y) ln(1 − π) }
              = exp{ y ln[π/(1 − π)] + ln(1 − π) }
- Exponential family:

  θ = ln[π/(1 − π)],  φ = 1,  ψ(θ) = −ln(1 − π) = ln(1 + exp θ),  c(y, φ) = 0

- Mean and variance function:

  μ = exp θ/(1 + exp θ) = π
  v(μ) = exp θ/(1 + exp θ)² = π(1 − π)

- Note that, under this model, the mean and variance are related: φ v(μ) = μ(1 − μ)
- The link function here is the logit link: θ = ln[μ/(1 − μ)]
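That ψ(θ) = ln(1 + exp θ) generates the Bernoulli mean and variance can be verified by numerical differentiation; a small sketch (helper names hypothetical):

```python
import math

def psi(theta):
    # cumulant function of the Bernoulli model
    return math.log(1 + math.exp(theta))

def num_deriv(f, x, h=1e-5):
    # central finite difference
    return (f(x + h) - f(x - h)) / (2 * h)

theta = 0.4
pi = math.exp(theta) / (1 + math.exp(theta))

mean = num_deriv(psi, theta)                           # psi'(theta) = pi
var = num_deriv(lambda s: num_deriv(psi, s), theta)    # psi''(theta) = pi*(1 - pi)
print(round(mean, 4), round(var, 4))
```

The same two-line check applied to ψ(θ) = θ²/2 reproduces the normal case, with mean θ and constant variance function v(μ) = 1.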
7.3 Generalized Linear Models (GLM)

- Suppose a sample Y_1, ..., Y_N of independent observations is available
- All Y_i have densities f(y_i | θ_i, φ) which belong to the exponential family
- In GLMs, the differences between the θ_i are explained through a linear function of known covariates:

  θ_i = x_i'β

- x_i is a vector of p known covariates
- β is the corresponding vector of unknown regression parameters, to be estimated from the data
7.4 Maximum Likelihood Estimation

- Log-likelihood:

  ℓ(β, φ) = (1/φ) Σ_i [y_i θ_i − ψ(θ_i)] + Σ_i c(y_i, φ)

- The score equations:

  S(β) = Σ_i (∂μ_i/∂β) v_i^{−1} (y_i − μ_i) = 0

- Note that the estimation of β depends on the density only through the means μ_i and the variance functions v_i = v(μ_i)
The score equations need to be solved numerically: iterative (re-)weighted least squares Newton-Raphson Fisher scoring Inference for β is based on classical maximum likelihood theory: asymptotic Wald tests likelihood ratio tests score tests ESALQ Course on Models for Longitudinal and Incomplete Data 80
7.5 Illustration: The Analgesic Trial Early dropout (did the subject drop out after the first or the second visit)? Binary response PROC GENMOD can fit GLMs in general PROC LOGISTIC can fit models for binary (and ordered) responses SAS code for logit link: proc genmod data=earlydrp; model earlydrp = pca0 weight psychiat physfct / dist=b; run; proc logistic data=earlydrp descending; model earlydrp = pca0 weight psychiat physfct; run; ESALQ Course on Models for Longitudinal and Incomplete Data 81
SAS code for probit link: proc genmod data=earlydrp; model earlydrp = pca0 weight psychiat physfct / dist=b link=probit; run; proc logistic data=earlydrp descending; model earlydrp = pca0 weight psychiat physfct / link=probit; run; Selected output: Analysis Of Parameter Estimates Standard Wald 95% Chi- Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq Intercept 1-1.0673 0.7328-2.5037 0.3690 2.12 0.1453 PCA0 1 0.3981 0.1343 0.1349 0.6614 8.79 0.0030 WEIGHT 1-0.0211 0.0072-0.0353-0.0070 8.55 0.0034 PSYCHIAT 1 0.7169 0.2871 0.1541 1.2796 6.23 0.0125 PHYSFCT 1 0.0121 0.0050 0.0024 0.0219 5.97 0.0145 Scale 0 1.0000 0.0000 1.0000 1.0000 NOTE: The scale parameter was held fixed. ESALQ Course on Models for Longitudinal and Incomplete Data 82
Chapter 8 Parametric Modeling Families Continuous outcomes Longitudinal generalized linear models Notation ESALQ Course on Models for Longitudinal and Incomplete Data 83
8.1 Continuous Outcomes
Marginal models: E(Y_ij | x_ij) = x_ij'β
Random-effects models: E(Y_ij | b_i, x_ij) = x_ij'β + z_ij'b_i
Transition models: E(Y_ij | Y_i,j-1, ..., Y_i1, x_ij) = x_ij'β + α Y_i,j-1
ESALQ Course on Models for Longitudinal and Incomplete Data 84
8.2 Longitudinal Generalized Linear Models
Normal case: easy transfer between models
Also non-normal data can be measured repeatedly (over time)
Lack of a key distribution such as the normal [=>]
A lot of modeling options
Introduction of non-linearity
No easy transfer between model families

                     cross-sectional   longitudinal
normal outcome       linear model      LMM
non-normal outcome   GLM               ?
ESALQ Course on Models for Longitudinal and Incomplete Data 85
8.3 Marginal Models Fully specified models & likelihood estimation E.g., multivariate probit model, Bahadur model, odds ratio model Specification and/or fitting can be cumbersome Non-likelihood alternatives Historically: Empirical generalized least squares (EGLS; Koch et al 1977; SAS CATMOD) Generalized estimating equations Pseudo-likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 86
Chapter 9 Generalized Estimating Equations General idea Asymptotic properties Working correlation Special case and application SAS code and output ESALQ Course on Models for Longitudinal and Incomplete Data 87
9.1 General Idea
Univariate GLM, score function of the form (scalar Y_i):
S(β) = Σ_{i=1}^N (∂μ_i/∂β) v_i^{-1} (y_i - μ_i) = 0,  with v_i = Var(Y_i)
In a longitudinal setting: Y = (Y_1, ..., Y_N):
S(β) = Σ_i Σ_j (∂μ_ij/∂β) v_ij^{-1} (y_ij - μ_ij)
     = Σ_{i=1}^N D_i' [V_i(α)]^{-1} (y_i - μ_i) = 0
where
D_i is an (n_i × p) matrix with (j, s)th element ∂μ_ij/∂β_s
y_i and μ_i are n_i-vectors with elements y_ij and μ_ij
V_i is n_i × n_i: diagonal, or more complex?
ESALQ Course on Models for Longitudinal and Incomplete Data 88
V_i = Var(Y_i) is more complex since it involves a set of nuisance parameters α, determining the covariance structure of Y_i:
V_i(β, α) = φ A_i^{1/2}(β) R_i(α) A_i^{1/2}(β)
in which
A_i^{1/2}(β) = diag( √v_i1(μ_i1(β)), ..., √v_in_i(μ_in_i(β)) )
and R_i(α) is the correlation matrix of Y_i, parameterized by α.
Same form as for the full likelihood procedure, but we restrict specification to the first moment only
Liang and Zeger (1986)
ESALQ Course on Models for Longitudinal and Incomplete Data 89
9.2 Large Sample Properties
As N → ∞,
√N (β̂ - β) → N(0, I_0^{-1})
where
I_0 = Σ_{i=1}^N D_i' [V_i(α)]^{-1} D_i
(Unrealistic) conditions:
α is known
the parametric form for V_i(α) is known
This is the naive, purely model-based variance estimator
Solution: working correlation matrix
ESALQ Course on Models for Longitudinal and Incomplete Data 90
9.3 Unknown Covariance Structure
Keep the score equations
S(β) = Σ_{i=1}^N D_i' [V_i(α)]^{-1} (y_i - μ_i) = 0
BUT
suppose V_i(·) is not the true variance of Y_i but only a plausible guess, a so-called working correlation matrix
specify correlations and not covariances, because the variances follow from the mean structure
the score equations are solved as before
ESALQ Course on Models for Longitudinal and Incomplete Data 91
The asymptotic normality results change to
√N (β̂ - β) → N(0, I_0^{-1} I_1 I_0^{-1})
I_0 = Σ_{i=1}^N D_i' [V_i(α)]^{-1} D_i
I_1 = Σ_{i=1}^N D_i' [V_i(α)]^{-1} Var(Y_i) [V_i(α)]^{-1} D_i
This is the robust, empirically corrected, sandwich variance estimator
I_0 is the bread
I_1 is the filling (ham or cheese)
Correct guess ⇒ likelihood variance
ESALQ Course on Models for Longitudinal and Incomplete Data 92
The estimators β̂ are consistent even if the working correlation matrix is incorrect
An estimate is found by replacing the unknown variance matrix Var(Y_i) by (y_i - μ̂_i)(y_i - μ̂_i)'
Even if this estimator is bad for Var(Y_i), it leads to a good estimate of I_1, provided that:
replication in the data is sufficiently large
the same model for μ_i is fitted to groups of subjects
observation times do not vary too much between subjects
A bad choice of working correlation matrix can affect the efficiency of β̂
Care needed with incomplete data (see Part VI)
ESALQ Course on Models for Longitudinal and Incomplete Data 93
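The bread-and-filling structure is easiest to see in the simplest scalar case: an intercept-only model under working independence, where β̂ is just the sample mean. A small Python sketch (illustrative data and working variance; the course itself uses SAS) shows that the sandwich estimate does not depend on the guessed working variance, while the model-based one does:

```python
# Intercept-only model: beta_hat = sample mean, D_i = 1, V_i = v (a guess).
# Model-based variance: I_0^{-1} = v/N.
# Sandwich: I_0^{-1} I_1 I_0^{-1} with Var(Y_i) replaced by (y_i - mu_hat)^2.
y = [2.1, 3.4, 1.8, 4.0, 2.9, 3.3, 2.2, 3.8]   # illustrative data
N = len(y)
mu_hat = sum(y) / N

v = 1.0                                         # working variance (deliberately a guess)
i0 = N / v                                      # bread: sum_i D_i' V_i^{-1} D_i
i1 = sum((yi - mu_hat) ** 2 for yi in y) / v**2 # filling
model_based = 1.0 / i0                          # = v / N  (depends on the guess)
sandwich = i1 / i0**2                           # = sum (y_i - mu_hat)^2 / N^2
print(round(model_based, 4), round(sandwich, 4))
```

Note that the v's cancel in the sandwich: a wrong working variance changes the model-based standard error but not the empirically corrected one.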
9.4 The Working Correlation Matrix
V_i = V_i(β, α, φ) = φ A_i^{1/2}(β) R_i(α) A_i^{1/2}(β)
Variance function: A_i is the (n_i × n_i) diagonal matrix with elements v(μ_ij), the known GLM variance function.
Working correlation: R_i(α) possibly depends on a different set of parameters α.
Overdispersion parameter: φ, assumed 1 or estimated from the data.
The unknown quantities are expressed in terms of the Pearson residuals
e_ij = (y_ij - μ_ij) / √v(μ_ij)
Note that e_ij depends on β.
ESALQ Course on Models for Longitudinal and Incomplete Data 94
9.5 Estimation of Working Correlation
Liang and Zeger (1986) proposed moment-based estimates for the working correlation:

Structure      Corr(Y_ij, Y_ik)   Estimate
Independence   0                  —
Exchangeable   α                  α̂ = (1/N) Σ_{i=1}^N [ 1/(n_i(n_i - 1)) ] Σ_{j≠k} e_ij e_ik
AR(1)          α                  α̂ = (1/N) Σ_{i=1}^N [ 1/(n_i - 1) ] Σ_{j≤n_i-1} e_ij e_i,j+1
Unstructured   α_jk               α̂_jk = (1/N) Σ_{i=1}^N e_ij e_ik

Dispersion parameter: φ̂ = (1/N) Σ_{i=1}^N (1/n_i) Σ_{j=1}^{n_i} e_ij²
ESALQ Course on Models for Longitudinal and Incomplete Data 95
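The exchangeable and dispersion formulas above are simple enough to compute by hand. A toy Python implementation (illustrative residual values; real software such as PROC GENMOD additionally divides by φ and applies degrees-of-freedom corrections, which are omitted here):

```python
# Made-up Pearson residuals for N = 3 subjects, n_i = 3 measurements each.
resid = [
    [0.5, 0.8, 0.2],
    [-0.3, -0.6, -0.1],
    [1.0, 0.4, 0.7],
]
N = len(resid)

# Dispersion: phi_hat = (1/N) sum_i (1/n_i) sum_j e_ij^2
phi_hat = sum(sum(e * e for e in ei) / len(ei) for ei in resid) / N

# Exchangeable: alpha_hat = (1/N) sum_i [ sum_{j != k} e_ij e_ik / (n_i (n_i - 1)) ]
def exch_contrib(ei):
    n = len(ei)
    s = sum(ei) ** 2 - sum(e * e for e in ei)   # sum over j != k of e_ij e_ik
    return s / (n * (n - 1))

alpha_hat = sum(exch_contrib(ei) for ei in resid) / N
print(round(phi_hat, 4), round(alpha_hat, 4))
```

The `(Σe)² − Σe²` trick computes the cross-product sum over all pairs j ≠ k without a double loop.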
9.6 Fitting GEE
The standard procedure, implemented in the SAS procedure GENMOD:
1. Compute initial estimates for β, using a univariate GLM (i.e., assuming independence).
2. Compute the Pearson residuals e_ij. Compute estimates for α and φ.
   Compute R_i(α) and V_i(β, α) = φ A_i^{1/2}(β) R_i(α) A_i^{1/2}(β).
3. Update the estimate for β:
   β^{(t+1)} = β^{(t)} + [ Σ_{i=1}^N D_i' V_i^{-1} D_i ]^{-1} [ Σ_{i=1}^N D_i' V_i^{-1} (y_i - μ_i) ]
4. Iterate 2.–3. until convergence.
Estimates of precision by means of I_0^{-1} and/or I_0^{-1} I_1 I_0^{-1}.
ESALQ Course on Models for Longitudinal and Incomplete Data 96
9.7 Special Case: Linear Mixed Models
Estimate for β:
β̂(α) = ( Σ_{i=1}^N X_i' W_i X_i )^{-1} Σ_{i=1}^N X_i' W_i Y_i
with α replaced by its ML or REML estimate
Conditional on α, β̂ has mean
E[ β̂(α) ] = ( Σ_{i=1}^N X_i' W_i X_i )^{-1} Σ_{i=1}^N X_i' W_i X_i β = β
provided that E(Y_i) = X_i β
Hence, in order for β̂ to be unbiased, it is sufficient that the mean of the response is correctly specified.
ESALQ Course on Models for Longitudinal and Incomplete Data 97
Conditional on α, β̂ has covariance
Var(β̂) = ( Σ_{i=1}^N X_i' W_i X_i )^{-1} ( Σ_{i=1}^N X_i' W_i Var(Y_i) W_i X_i ) ( Σ_{i=1}^N X_i' W_i X_i )^{-1}
        = ( Σ_{i=1}^N X_i' W_i X_i )^{-1}   when Var(Y_i) = W_i^{-1}
Note that this model-based version assumes that the covariance matrix Var(Y_i) is correctly modelled as V_i = Z_i D Z_i' + Σ_i.
An empirically corrected version is:
Var(β̂) = ( Σ_{i=1}^N X_i' W_i X_i )^{-1}  ( Σ_{i=1}^N X_i' W_i Var(Y_i) W_i X_i )  ( Σ_{i=1}^N X_i' W_i X_i )^{-1}
              BREAD                              MEAT                                   BREAD
ESALQ Course on Models for Longitudinal and Incomplete Data 98
Chapter 10 A Family of GEE Methods Classical approach Prentice s two sets of GEE Linearization-based version GEE2 Alternating logistic regressions ESALQ Course on Models for Longitudinal and Incomplete Data 99
10.1 Prentice's GEE
Σ_{i=1}^N D_i' V_i^{-1} (Y_i - μ_i) = 0,   Σ_{i=1}^N E_i' W_i^{-1} (Z_i - δ_i) = 0
where
Z_ijk = (Y_ij - μ_ij)(Y_ik - μ_ik) / √( μ_ij(1 - μ_ij) μ_ik(1 - μ_ik) ),   δ_ijk = E(Z_ijk)
The joint asymptotic distribution of √N(β̂ - β) and √N(α̂ - α) is normal, with variance-covariance matrix consistently estimated by

N  [ A  0 ] [ Λ_11  Λ_12 ] [ A'  B' ]
   [ B  C ] [ Λ_21  Λ_22 ] [ 0   C' ]
ESALQ Course on Models for Longitudinal and Incomplete Data 100
where
A = ( Σ_{i=1}^N D_i' V_i^{-1} D_i )^{-1}
B = ( Σ_{i=1}^N E_i' W_i^{-1} E_i )^{-1} ( Σ_{i=1}^N E_i' W_i^{-1} ∂Z_i/∂β ) ( Σ_{i=1}^N D_i' V_i^{-1} D_i )^{-1}
C = ( Σ_{i=1}^N E_i' W_i^{-1} E_i )^{-1}
Λ_11 = Σ_{i=1}^N D_i' V_i^{-1} Cov(Y_i) V_i^{-1} D_i
Λ_12 = Σ_{i=1}^N D_i' V_i^{-1} Cov(Y_i, Z_i) W_i^{-1} E_i
Λ_21 = Λ_12'
Λ_22 = Σ_{i=1}^N E_i' W_i^{-1} Cov(Z_i) W_i^{-1} E_i
and
Statistic        Estimator
Var(Y_i)         (Y_i - μ_i)(Y_i - μ_i)'
Cov(Y_i, Z_i)    (Y_i - μ_i)(Z_i - δ_i)'
Var(Z_i)         (Z_i - δ_i)(Z_i - δ_i)'
ESALQ Course on Models for Longitudinal and Incomplete Data 101
10.2 GEE Based on Linearization
10.2.1 Model formulation
The previous versions of GEE are formulated directly in terms of the binary outcomes
This approach is based on a linearization:
y_i = μ_i + ε_i
with
η_i = g(μ_i),   η_i = X_i β,   Var(y_i) = Var(ε_i) = Σ_i
η_i is a vector of linear predictors, g(·) is the (vector) link function.
ESALQ Course on Models for Longitudinal and Incomplete Data 102
10.2.2 Estimation (Nelder and Wedderburn 1972)
Solve iteratively:
Σ_{i=1}^N X_i' W_i X_i β = Σ_{i=1}^N X_i' W_i y_i*,
where
W_i = F_i' Σ_i^{-1} F_i,   y_i* = η̂_i + F_i^{-1} (y_i - μ̂_i),
F_i = ∂μ_i/∂η_i,   Σ_i = Var(ε_i),   μ_i = E(y_i)
Remarks:
y_i* is called the working variable or pseudo data.
Basis for the SAS macro and procedure GLIMMIX
For linear models, F_i = I_{n_i} and standard linear regression follows.
ESALQ Course on Models for Longitudinal and Incomplete Data 103
10.2.3 The Variance Structure
Σ_i = φ A_i^{1/2}(β) R_i(α) A_i^{1/2}(β)
φ is a scale (overdispersion) parameter,
A_i = v(μ_i), expressing the mean-variance relation (this is a function of β),
R_i(α) describes the correlation structure:
If independence is assumed, then R_i(α) = I_{n_i}.
Other structures, such as compound symmetry, AR(1), ..., can be assumed as well.
ESALQ Course on Models for Longitudinal and Incomplete Data 104
10.3 GEE2 Model: Marginal mean structure Pairwise association: Odds ratios Correlations Working assumptions: Third and fourth moments Estimation: Second-order estimating equations Likelihood (assuming 3rd and 4th moments are correctly specified) ESALQ Course on Models for Longitudinal and Incomplete Data 105
10.4 Alternating Logistic Regression
Diggle, Heagerty, Liang, and Zeger (2002) and Molenberghs and Verbeke (2005)
When marginal odds ratios are used to model association, α can be estimated using ALR, which is
almost as efficient as GEE2
almost as easy (computationally) as GEE1
μ_ijk as before, and α_ijk = ln(ψ_ijk) the marginal log odds ratio:
logit Pr(Y_ij = 1 | x_ij) = x_ij'β
logit Pr(Y_ij = 1 | Y_ik = y_ik) = α_ijk y_ik + ln[ (μ_ij - μ_ijk) / (1 - μ_ij - μ_ik + μ_ijk) ]
ESALQ Course on Models for Longitudinal and Incomplete Data 106
α ijk can be modelled in terms of predictors the second term is treated as an offset the estimating equations for β and α are solved in turn, and the alternating between both sets is repeated until convergence. this is needed because the offset clearly depends on β. ESALQ Course on Models for Longitudinal and Incomplete Data 107
10.5 Application to the Toenail Data
10.5.1 The model
Consider the model:
Y_ij ~ Bernoulli(μ_ij),   log[ μ_ij / (1 - μ_ij) ] = β_0 + β_1 T_i + β_2 t_ij + β_3 T_i t_ij
Y_ij: severe infection (yes/no) at occasion j for patient i
t_ij: measurement time for occasion j
T_i: treatment group
ESALQ Course on Models for Longitudinal and Incomplete Data 108
10.5.2 Standard GEE
SAS Code:
proc genmod data=test descending;
class idnum timeclss;
model onyresp = treatn time treatn*time / dist=binomial;
repeated subject=idnum / withinsubject=timeclss type=exch covb corrw modelse;
run;
SAS statements:
The REPEATED statement defines the GEE character of the model.
type= : working correlation specification (UN, AR(1), EXCH, IND, ...)
modelse : model-based s.e.'s on top of the default empirically corrected s.e.'s
corrw : printout of the working correlation matrix
withinsubject= : specification of the ordering within subjects
ESALQ Course on Models for Longitudinal and Incomplete Data 109
Selected output: Regression parameters: Analysis Of Initial Parameter Estimates Standard Wald 95% Chi- Parameter DF Estimate Error Confidence Limits Square Intercept 1-0.5571 0.1090-0.7708-0.3433 26.10 treatn 1 0.0240 0.1565-0.2827 0.3307 0.02 time 1-0.1769 0.0246-0.2251-0.1288 51.91 treatn*time 1-0.0783 0.0394-0.1556-0.0010 3.95 Scale 0 1.0000 0.0000 1.0000 1.0000 Estimates from fitting the model, ignoring the correlation structure, i.e., from fitting a classical GLM to the data, using proc GENMOD. The reported log-likelihood also corresponds to this model, and therefore should not be interpreted. The reported estimates are used as starting values in the iterative estimation procedure for fitting the GEE s. ESALQ Course on Models for Longitudinal and Incomplete Data 110
Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept -0.5840 0.1734-0.9238-0.2441-3.37 0.0008 treatn 0.0120 0.2613-0.5001 0.5241 0.05 0.9633 time -0.1770 0.0311-0.2380-0.1161-5.69 <.0001 treatn*time -0.0886 0.0571-0.2006 0.0233-1.55 0.1208 Analysis Of GEE Parameter Estimates Model-Based Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept -0.5840 0.1344-0.8475-0.3204-4.34 <.0001 treatn 0.0120 0.1866-0.3537 0.3777 0.06 0.9486 time -0.1770 0.0209-0.2180-0.1361-8.47 <.0001 treatn*time -0.0886 0.0362-0.1596-0.0177-2.45 0.0143 The working correlation: Exchangeable Working Correlation Correlation 0.420259237 ESALQ Course on Models for Longitudinal and Incomplete Data 111
10.5.3 Alternating Logistic Regression type=exch logor=exch Note that α now is a genuine parameter ESALQ Course on Models for Longitudinal and Incomplete Data 112
Selected output: Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept -0.5244 0.1686-0.8548-0.1940-3.11 0.0019 treatn 0.0168 0.2432-0.4599 0.4935 0.07 0.9448 time -0.1781 0.0296-0.2361-0.1200-6.01 <.0001 treatn*time -0.0837 0.0520-0.1856 0.0182-1.61 0.1076 Alpha1 3.2218 0.2908 2.6519 3.7917 11.08 <.0001 Analysis Of GEE Parameter Estimates Model-Based Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept -0.5244 0.1567-0.8315-0.2173-3.35 0.0008 treatn 0.0168 0.2220-0.4182 0.4519 0.08 0.9395 time -0.1781 0.0233-0.2238-0.1323-7.63 <.0001 treatn*time -0.0837 0.0392-0.1606-0.0068-2.13 0.0329 ESALQ Course on Models for Longitudinal and Incomplete Data 113
10.5.4 Linearization Based Method GLIMMIX macro: %glimmix(data=test, procopt=%str(method=ml empirical), stmts=%str( class idnum timeclss; model onyresp = treatn time treatn*time / solution; repeated timeclss / subject=idnum type=cs rcorr; ), error=binomial, link=logit); GLIMMIX procedure: proc glimmix data=test method=rspl empirical; class idnum; model onyresp (event= 1 ) = treatn time treatn*time / dist=binary solution; random _residual_ / subject=idnum type=cs; run; ESALQ Course on Models for Longitudinal and Incomplete Data 114
Both produce the same results
The GLIMMIX macro is a MIXED core, with GLM-type surrounding statements
The GLIMMIX procedure does not call MIXED; it has its own engine
PROC GLIMMIX combines elements of MIXED and of GENMOD
RANDOM _residual_ is the PROC GLIMMIX way to specify residual correlation
ESALQ Course on Models for Longitudinal and Incomplete Data 115
10.5.5 Results of Models Fitted to Toenail Data Effect Par. IND EXCH UN GEE1 Int. β 0-0.557(0.109;0.171) -0.584(0.134;0.173) -0.720(0.166;0.173) T i β 1 0.024(0.157;0.251) 0.012(0.187;0.261) 0.072(0.235;0.246) t ij β 2-0.177(0.025;0.030) -0.177(0.021;0.031) -0.141(0.028;0.029) T i t ij β 3-0.078(0.039;0.055) -0.089(0.036;0.057) -0.114(0.047;0.052) ALR Int. β 0-0.524(0.157;0.169) T i β 1 0.017(0.222;0.243) t ij β 2-0.178(0.023;0.030) T i t ij β 3-0.084(0.039;0.052) Ass. α 3.222( ;0.291) Linearization based method Int. β 0-0.557(0.112;0.171) -0.585(0.142;0.174) -0.630(0.171;0.172) T i β 1 0.024(0.160;0.251) 0.011(0.196;0.262) 0.036(0.242;0.242) t ij β 2-0.177(0.025;0.030) -0.177(0.022;0.031) -0.204(0.038;0.034) T i t ij β 3-0.078(0.040:0.055) -0.089(0.038;0.057) -0.106(0.058;0.058) estimate (model-based s.e.; empirical s.e.) ESALQ Course on Models for Longitudinal and Incomplete Data 116
10.5.6 Discussion GEE1: All empirical standard errors are correct, but the efficiency is higher for the more complex working correlation structure, as seen in p-values for T i t ij effect: Structure p-value IND 0.1515 EXCH 0.1208 UN 0.0275 Thus, opting for reasonably adequate correlation assumptions still pays off, in spite of the fact that all are consistent and asymptotically normal Similar conclusions for linearization-based method ESALQ Course on Models for Longitudinal and Incomplete Data 117
Model-based s.e. and empirically corrected s.e. in reasonable agreement for UN Typically, the model-based standard errors are much too small as they are based on the assumption that all observations in the data set are independent, hereby overestimating the amount of available information, hence also overestimating the precision of the estimates. ALR: similar inferences but now also α part of the inferences ESALQ Course on Models for Longitudinal and Incomplete Data 118
Part III Generalized Linear Mixed Models for Non-Gaussian Longitudinal Data ESALQ Course on Models for Longitudinal and Incomplete Data 119
Chapter 11 Generalized Linear Mixed Models (GLMM) Introduction: LMM Revisited Generalized Linear Mixed Models (GLMM) Fitting Algorithms Example ESALQ Course on Models for Longitudinal and Incomplete Data 120
11.1 Introduction: LMM Revisited
We re-consider the linear mixed model:
Y_i | b_i ~ N(X_i β + Z_i b_i, Σ_i),   b_i ~ N(0, D)
The implied marginal model equals
Y_i ~ N(X_i β, Z_i D Z_i' + Σ_i)
Hence, even under conditional independence, i.e., all Σ_i equal to σ² I_{n_i}, a marginal association structure is implied through the random effects.
The same ideas can now be applied in the context of GLMs to model association between discrete repeated measures.
ESALQ Course on Models for Longitudinal and Incomplete Data 121
11.2 Generalized Linear Mixed Models (GLMM)
Given a vector b_i of random effects for cluster i, it is assumed that all responses Y_ij are independent, with density
f(y_ij | θ_ij, φ) = exp{ φ^{-1} [ y_ij θ_ij - ψ(θ_ij) ] + c(y_ij, φ) }
θ_ij is now modelled as θ_ij = x_ij'β + z_ij'b_i
As before, it is assumed that b_i ~ N(0, D)
Let f_ij(y_ij | b_i, β, φ) denote the conditional density of Y_ij given b_i; the conditional density of Y_i equals
f_i(y_i | b_i, β, φ) = Π_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ)
ESALQ Course on Models for Longitudinal and Incomplete Data 122
The marginal distribution of Y_i is given by
f_i(y_i | β, D, φ) = ∫ f_i(y_i | b_i, β, φ) f(b_i | D) db_i
                   = ∫ Π_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ) f(b_i | D) db_i
where f(b_i | D) is the density of the N(0, D) distribution.
The likelihood function for β, D, and φ now equals
L(β, D, φ) = Π_{i=1}^N f_i(y_i | β, D, φ)
           = Π_{i=1}^N ∫ Π_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ) f(b_i | D) db_i
ESALQ Course on Models for Longitudinal and Incomplete Data 123
Under the normal linear model, the integral can be worked out analytically. In general, approximations are required: Approximation of integrand Approximation of data Approximation of integral Predictions of random effects can be based on the posterior distribution f(b i Y i = y i ) Empirical Bayes (EB) estimate : Posterior mode, with unknown parameters replaced by their MLE ESALQ Course on Models for Longitudinal and Incomplete Data 124
11.3 Laplace Approximation of Integrand
Integrals in L(β, D, φ) can be written in the form I = ∫ e^{Q(b)} db
Second-order Taylor expansion of Q(b) around its mode b̂ yields
Q(b) ≈ Q(b̂) + (1/2) (b - b̂)' Q''(b̂) (b - b̂)
The quadratic term leads to a re-scaled normal density. Hence,
I ≈ (2π)^{q/2} | -Q''(b̂) |^{-1/2} e^{Q(b̂)}
Exact approximation in the case of normal kernels
Good approximation in case of many repeated measures per subject
ESALQ Course on Models for Longitudinal and Incomplete Data 125
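A one-dimensional sketch of this approximation, in Python for illustration (the kernel, with y = 3 and d = 1, is an invented Poisson-GLMM-type contribution Q(b) = yb − e^b − b²/(2d), not an example from the course): find the mode by Newton-Raphson, apply the Laplace formula, and compare against a brute-force numerical integral.

```python
import math

# Laplace approximation to I = ∫ exp(Q(b)) db for an illustrative
# one-dimensional kernel Q(b) = y*b - exp(b) - b^2/(2 d).
y, d = 3.0, 1.0
Q = lambda b: y * b - math.exp(b) - b * b / (2.0 * d)

# Mode b_hat via Newton-Raphson on Q'(b) = y - exp(b) - b/d
b_hat = 0.0
for _ in range(50):
    grad = y - math.exp(b_hat) - b_hat / d
    hess = -math.exp(b_hat) - 1.0 / d
    b_hat -= grad / hess

# Laplace: I ≈ sqrt(2*pi / |Q''(b_hat)|) * exp(Q(b_hat))  (q = 1 here)
hess = -math.exp(b_hat) - 1.0 / d
laplace = math.sqrt(2.0 * math.pi / -hess) * math.exp(Q(b_hat))

# "Exact" value by a brute-force Riemann sum over [-10, 10]
h = 0.001
exact = h * sum(math.exp(Q(i * h)) for i in range(-10000, 10001))
print(round(laplace, 4), round(exact, 4))
```

For this mildly skewed, log-concave kernel the two values agree to within a few percent; the agreement improves as the kernel becomes more nearly Gaussian, consistent with the remarks above.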
11.4 Approximation of Data
11.4.1 General Idea
Re-write the GLMM as:
Y_ij = μ_ij + ε_ij = h(x_ij'β + z_ij'b_i) + ε_ij
with variance for the errors equal to Var(Y_ij | b_i) = φ v(μ_ij)
Linear Taylor expansion for μ_ij:
Penalized quasi-likelihood (PQL): around current β̂ and b̂_i
Marginal quasi-likelihood (MQL): around current β̂ and b_i = 0
ESALQ Course on Models for Longitudinal and Incomplete Data 126
11.4.2 Penalized Quasi-Likelihood (PQL)
Linear Taylor expansion around the current estimates β̂ and b̂_i:
Y_ij ≈ h(x_ij'β̂ + z_ij'b̂_i)
       + h'(x_ij'β̂ + z_ij'b̂_i) x_ij'(β - β̂)
       + h'(x_ij'β̂ + z_ij'b̂_i) z_ij'(b_i - b̂_i) + ε_ij
     = μ̂_ij + v(μ̂_ij) x_ij'(β - β̂) + v(μ̂_ij) z_ij'(b_i - b̂_i) + ε_ij
In vector notation: Y_i ≈ μ̂_i + V̂_i X_i (β - β̂) + V̂_i Z_i (b_i - b̂_i) + ε_i
Re-ordering terms yields:
Y_i* ≡ V̂_i^{-1} (Y_i - μ̂_i) + X_i β̂ + Z_i b̂_i ≈ X_i β + Z_i b_i + ε_i*
Model fitting by iterating between updating the pseudo responses Y_i* and fitting the above linear mixed model to them.
ESALQ Course on Models for Longitudinal and Incomplete Data 127
11.4.3 Marginal Quasi-Likelihood (MQL)
Linear Taylor expansion around the current estimate β̂ and b_i = 0:
Y_ij ≈ h(x_ij'β̂) + h'(x_ij'β̂) x_ij'(β - β̂) + h'(x_ij'β̂) z_ij'b_i + ε_ij
     = μ̂_ij + v(μ̂_ij) x_ij'(β - β̂) + v(μ̂_ij) z_ij'b_i + ε_ij
In vector notation: Y_i ≈ μ̂_i + V̂_i X_i (β - β̂) + V̂_i Z_i b_i + ε_i
Re-ordering terms yields:
Y_i* ≡ V̂_i^{-1} (Y_i - μ̂_i) + X_i β̂ ≈ X_i β + Z_i b_i + ε_i*
Model fitting by iterating between updating the pseudo responses Y_i* and fitting the above linear mixed model to them.
ESALQ Course on Models for Longitudinal and Incomplete Data 128
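The pseudo-data construction above is easy to work out for a single binary observation under the logit link, where h'(η) = μ(1 − μ) = v(μ). A minimal Python illustration (the values of y and the current linear predictor are invented, and the course software is SAS, not Python):

```python
import math

# PQL/MQL working variable for one binary observation, logit link:
#   y* = eta_hat + (y - mu_hat) / v(mu_hat),  with v(mu) = mu*(1 - mu).
expit = lambda x: 1.0 / (1.0 + math.exp(-x))

y = 1.0
eta_hat = -0.4                 # current x'beta_hat + z'b_hat (illustrative)
mu_hat = expit(eta_hat)
v = mu_hat * (1.0 - mu_hat)    # variance function, = h'(eta_hat) here
y_star = eta_hat + (y - mu_hat) / v
print(round(y_star, 4))
```

The iteration then treats y* as a continuous response in a linear mixed model, recomputes η̂ and μ̂, and repeats until convergence.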
11.4.4 PQL versus MQL
MQL only performs reasonably well if the random-effects variance is (very) small
Both perform badly for binary outcomes with few repeated measurements per cluster
With an increasing number of measurements per subject:
MQL remains biased
PQL becomes consistent
Improvements are possible with higher-order Taylor expansions
ESALQ Course on Models for Longitudinal and Incomplete Data 129
11.5 Approximation of Integral
The likelihood contribution of every subject is of the form ∫ f(z) φ(z) dz
where φ(z) is the density of the (multivariate) normal distribution
Gaussian quadrature methods replace the integral by a weighted sum:
∫ f(z) φ(z) dz ≈ Σ_{q=1}^Q w_q f(z_q)
Q is the order of the approximation. The higher Q, the more accurate the approximation will be
ESALQ Course on Models for Longitudinal and Incomplete Data 130
The nodes (or quadrature points) z q are solutions to the Qth order Hermite polynomial The w q are well-chosen weights The nodes z q and weights w q are reported in tables. Alternatively, an algorithm is available for calculating all z q and w q for any value Q. With Gaussian quadrature, the nodes and weights are fixed, independent of f(z)φ(z). With adaptive Gaussian quadrature, the nodes and weights are adapted to the support of f(z)φ(z). ESALQ Course on Models for Longitudinal and Incomplete Data 131
Graphically (Q =10): ESALQ Course on Models for Longitudinal and Incomplete Data 132
Typically, adaptive Gaussian quadrature needs (much) less quadrature points than classical Gaussian quadrature. On the other hand, adaptive Gaussian quadrature is much more time consuming. Adaptive Gaussian quadrature of order one is equivalent to Laplace transformation. ESALQ Course on Models for Longitudinal and Incomplete Data 133
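The weighted-sum idea can be made concrete with the tabulated order-3 Gauss-Hermite nodes and weights. A small Python illustration (Python here is only a stand-in for the tables; the course itself relies on SAS): after the change of variable z = √2 x, expectations over a standard normal are (1/√π) Σ w_q f(√2 x_q), and the Q = 3 rule is exact for polynomials up to degree 5.

```python
import math

# Order-3 Gauss-Hermite nodes and weights (standard tabulated values):
#   x = 0, ±sqrt(3/2);  w = 2*sqrt(pi)/3, sqrt(pi)/6, sqrt(pi)/6.
nodes = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
weights = [math.sqrt(math.pi) / 6.0,
           2.0 * math.sqrt(math.pi) / 3.0,
           math.sqrt(math.pi) / 6.0]

def gh_mean(f):
    """Approximate E[f(Z)] for Z ~ N(0,1) via Gauss-Hermite, Q = 3."""
    s = sum(w * f(math.sqrt(2.0) * x) for x, w in zip(nodes, weights))
    return s / math.sqrt(math.pi)

ez2 = gh_mean(lambda z: z * z)    # exact: E[Z^2] = 1
ez4 = gh_mean(lambda z: z ** 4)   # exact: E[Z^4] = 3
print(round(ez2, 6), round(ez4, 6))
```

For a genuinely non-polynomial integrand, such as the logistic GLMM likelihood contribution, the rule is only approximate, which is why the choice of Q matters in the toenail example below.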
11.6 Example: Toenail Data
Y_ij is the binary severity indicator for subject i at visit j.
Model:
Y_ij | b_i ~ Bernoulli(π_ij),   log[ π_ij / (1 - π_ij) ] = β_0 + b_i + β_1 T_i + β_2 t_ij + β_3 T_i t_ij
Notation:
T_i: treatment indicator for subject i
t_ij: time point at which the jth measurement is taken for the ith subject
Adaptive as well as non-adaptive Gaussian quadrature, for various Q.
ESALQ Course on Models for Longitudinal and Incomplete Data 134
Results:

Gaussian quadrature
         Q = 3          Q = 5          Q = 10         Q = 20         Q = 50
β_0     -1.52 (0.31)   -2.49 (0.39)   -0.99 (0.32)   -1.54 (0.69)   -1.65 (0.43)
β_1     -0.39 (0.38)    0.19 (0.36)    0.47 (0.36)   -0.43 (0.80)   -0.09 (0.57)
β_2     -0.32 (0.03)   -0.38 (0.04)   -0.38 (0.05)   -0.40 (0.05)   -0.40 (0.05)
β_3     -0.09 (0.05)   -0.12 (0.07)   -0.15 (0.07)   -0.14 (0.07)   -0.16 (0.07)
σ        2.26 (0.12)    3.09 (0.21)    4.53 (0.39)    3.86 (0.33)    4.04 (0.39)
-2ℓ     1344.1         1259.6         1254.4         1249.6         1247.7

Adaptive Gaussian quadrature
         Q = 3          Q = 5          Q = 10         Q = 20         Q = 50
β_0     -2.05 (0.59)   -1.47 (0.40)   -1.65 (0.45)   -1.63 (0.43)   -1.63 (0.44)
β_1     -0.16 (0.64)   -0.09 (0.54)   -0.12 (0.59)   -0.11 (0.59)   -0.11 (0.59)
β_2     -0.42 (0.05)   -0.40 (0.04)   -0.41 (0.05)   -0.40 (0.05)   -0.40 (0.05)
β_3     -0.17 (0.07)   -0.16 (0.07)   -0.16 (0.07)   -0.16 (0.07)   -0.16 (0.07)
σ        4.51 (0.62)    3.70 (0.34)    4.07 (0.43)    4.01 (0.38)    4.02 (0.38)
-2ℓ     1259.1         1257.1         1248.2         1247.8         1247.8
ESALQ Course on Models for Longitudinal and Incomplete Data 135
Conclusions:
(Log-)likelihoods are not comparable
Different Q can lead to considerable differences in estimates and standard errors
For example, using non-adaptive quadrature with Q = 3, we found no difference in time effect between the two treatment groups (t = -0.09/0.05, p = 0.0833).
Using adaptive quadrature with Q = 50, we find a significant interaction between the time effect and the treatment (t = -0.16/0.07, p = 0.0255).
Assuming that Q = 50 is sufficient, the final results are well approximated with smaller Q under adaptive quadrature, but not under non-adaptive quadrature.
ESALQ Course on Models for Longitudinal and Incomplete Data 136
Comparison of fitting algorithms:
Adaptive Gaussian quadrature, Q = 50
MQL and PQL
Summary of results:

Parameter                      QUAD           PQL            MQL
Intercept group A             -1.63 (0.44)   -0.72 (0.24)   -0.56 (0.17)
Intercept group B             -1.75 (0.45)   -0.72 (0.24)   -0.53 (0.17)
Slope group A                 -0.40 (0.05)   -0.29 (0.03)   -0.17 (0.02)
Slope group B                 -0.57 (0.06)   -0.40 (0.04)   -0.26 (0.03)
Var. random intercepts (τ²)   15.99 (3.02)    4.71 (0.60)    2.49 (0.29)

Severe differences between QUAD (gold standard?) and MQL/PQL.
MQL/PQL may yield (very) biased results, especially for binary data.
ESALQ Course on Models for Longitudinal and Incomplete Data 137
Chapter 12 Fitting GLMM s in SAS Proc GLIMMIX for PQL and MQL Proc NLMIXED for Gaussian quadrature ESALQ Course on Models for Longitudinal and Incomplete Data 138
12.1 Procedure GLIMMIX for PQL and MQL Re-consider logistic model with random intercepts for toenail data SAS code (PQL): proc glimmix data=test method=rspl ; class idnum; model onyresp (event= 1 ) = treatn time treatn*time / dist=binary solution; random intercept / subject=idnum; run; MQL obtained with option method=rmpl Inclusion of random slopes: random intercept time / subject=idnum type=un; ESALQ Course on Models for Longitudinal and Incomplete Data 139
12.2 Procedure NLMIXED for Gaussian Quadrature Re-consider logistic model with random intercepts for toenail data SAS program (non-adaptive, Q =3): proc nlmixed data=test noad qpoints=3; parms beta0=-1.6 beta1=0 beta2=-0.4 beta3=-0.5 sigma=3.9; teta = beta0 + b + beta1*treatn + beta2*time + beta3*timetr; expteta = exp(teta); p = expteta/(1+expteta); model onyresp ~ binary(p); random b ~ normal(0,sigma**2) subject=idnum; run; Adaptive Gaussian quadrature obtained by omitting option noad ESALQ Course on Models for Longitudinal and Incomplete Data 140
Automatic search for optimal value of Q in case of no option qpoints= Good starting values needed! The inclusion of random slopes can be specified as follows: proc nlmixed data=test noad qpoints=3; parms beta0=-1.6 beta1=0 beta2=-0.4 beta3=-0.5 d11=3.9 d12=0 d22=0.1; teta = beta0 + b1 + beta1*treatn + beta2*time + b2*time + beta3*timetr; expteta = exp(teta); p = expteta/(1+expteta); model onyresp ~ binary(p); random b1 b2 ~ normal([0, 0], [d11, d12, d22]) subject=idnum; run; ESALQ Course on Models for Longitudinal and Incomplete Data 141
Part IV Marginal Versus Random-effects Models ESALQ Course on Models for Longitudinal and Incomplete Data 142
Chapter 13 Marginal Versus Random-effects Models Interpretation of GLMM parameters Marginalization of GLMM Example: Toenail data ESALQ Course on Models for Longitudinal and Incomplete Data 143
13.1 Interpretation of GLMM Parameters: Toenail Data
We compare our GLMM results for the toenail data with those from fitting GEEs (unstructured working correlation):

                       GLMM                 GEE
Parameter              Estimate (s.e.)      Estimate (s.e.)
Intercept group A     -1.6308 (0.4356)    -0.7219 (0.1656)
Intercept group B     -1.7454 (0.4478)    -0.6493 (0.1671)
Slope group A         -0.4043 (0.0460)    -0.1409 (0.0277)
Slope group B         -0.5657 (0.0601)    -0.2548 (0.0380)
ESALQ Course on Models for Longitudinal and Incomplete Data 144
The strong differences can be explained as follows:
Consider the following GLMM:
Y_ij | b_i ~ Bernoulli(π_ij),   log[ π_ij / (1 - π_ij) ] = β_0 + b_i + β_1 t_ij
The conditional means E(Y_ij | b_i), as functions of t_ij, are given by
E(Y_ij | b_i) = exp(β_0 + b_i + β_1 t_ij) / [ 1 + exp(β_0 + b_i + β_1 t_ij) ]
ESALQ Course on Models for Longitudinal and Incomplete Data 145
The marginal average evolution is now obtained from averaging over the random effects:
E(Y_ij) = E[ E(Y_ij | b_i) ] = E[ exp(β_0 + b_i + β_1 t_ij) / (1 + exp(β_0 + b_i + β_1 t_ij)) ]
        ≠ exp(β_0 + β_1 t_ij) / (1 + exp(β_0 + β_1 t_ij))
ESALQ Course on Models for Longitudinal and Incomplete Data 146
Hence, the parameter vector β in the GEE model needs to be interpreted completely differently from the parameter vector β in the GLMM:
GEE: marginal interpretation
GLMM: conditional interpretation, conditionally upon the level of the random effects
In general, the model for the marginal average is not of the same parametric form as the conditional average in the GLMM.
For logistic mixed models with normally distributed random intercepts, it can be shown that the marginal model can be well approximated by again a logistic model, but with parameters approximately satisfying
β^RE / β^M = √(c² σ² + 1) > 1,
σ² = variance of the random intercepts,   c = 16√3 / (15π)
ESALQ Course on Models for Longitudinal and Incomplete Data 147
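The attenuation factor is a one-line computation. A quick Python check (illustrative only; the constant c and the toenail estimate σ = 4.0164 are taken from the text), reproducing the ratio used on the next slide:

```python
import math

# Approximate ratio between random-effects and marginal logistic
# parameters: beta_RE / beta_M ≈ sqrt(c^2 * sigma^2 + 1),
# with c = 16*sqrt(3) / (15*pi) and sigma the random-intercept s.d.
c = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)
sigma = 4.0164                       # toenail-data estimate
ratio = math.sqrt(c ** 2 * sigma ** 2 + 1.0)
print(round(ratio, 4))
```

The result is ≈ 2.565, matching the value quoted for the toenail application and sitting between the empirical GLMM/GEE ratios in the table that follows.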
For the toenail application, σ was estimated as 4.0164, such that the ratio equals √(c² σ² + 1) = 2.5649.
The ratios between the GLMM and GEE estimates are:

                       GLMM                 GEE
Parameter              Estimate (s.e.)      Estimate (s.e.)     Ratio
Intercept group A     -1.6308 (0.4356)    -0.7219 (0.1656)     2.2590
Intercept group B     -1.7454 (0.4478)    -0.6493 (0.1671)     2.6881
Slope group A         -0.4043 (0.0460)    -0.1409 (0.0277)     2.8694
Slope group B         -0.5657 (0.0601)    -0.2548 (0.0380)     2.2202

Note that this problem does not occur in linear mixed models:
Conditional mean: E(Y_i | b_i) = X_i β + Z_i b_i
Specifically: E(Y_i | b_i = 0) = X_i β
Marginal mean: E(Y_i) = X_i β
ESALQ Course on Models for Longitudinal and Incomplete Data 148
The problem arises from the fact that, in general, E[g(Y)] ≠ g[E(Y)]
So, whenever the random effects enter the conditional mean in a non-linear way, the regression parameters in the marginal model need to be interpreted differently from the regression parameters in the mixed model.
In practice, the marginal mean can be derived from the GLMM output by integrating out the random effects. This can be done numerically via Gaussian quadrature, or based on sampling methods.
ESALQ Course on Models for Longitudinal and Incomplete Data 149
13.2 Marginalization of GLMM: Toenail Data
As an example, we plot the average evolutions based on the GLMM output obtained in the toenail example:
P(Y_ij = 1) = E[ exp(-1.6308 + b_i - 0.4043 t_ij) / (1 + exp(-1.6308 + b_i - 0.4043 t_ij)) ]   (group A)
P(Y_ij = 1) = E[ exp(-1.7454 + b_i - 0.5657 t_ij) / (1 + exp(-1.7454 + b_i - 0.5657 t_ij)) ]   (group B)
ESALQ Course on Models for Longitudinal and Incomplete Data 150
Average evolutions obtained from the GEE analyses:
P(Y_ij = 1) = exp(-0.7219 - 0.1409 t_ij) / (1 + exp(-0.7219 - 0.1409 t_ij))   (group A)
P(Y_ij = 1) = exp(-0.6493 - 0.2548 t_ij) / (1 + exp(-0.6493 - 0.2548 t_ij))   (group B)
ESALQ Course on Models for Longitudinal and Incomplete Data 151
In a GLMM context, rather than plotting the marginal averages, one can also plot the profile for an "average" subject, i.e., a subject with random effect b_i = 0:

Group A: P(Y_ij = 1 | b_i = 0) = exp(−1.6308 − 0.4043 t_ij) / (1 + exp(−1.6308 − 0.4043 t_ij))
Group B: P(Y_ij = 1 | b_i = 0) = exp(−1.7454 − 0.5657 t_ij) / (1 + exp(−1.7454 − 0.5657 t_ij))
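The gap between the marginal (GEE) curve and the conditional curve of an average subject can be computed directly from the group A estimates quoted above; a small illustrative Python sketch (function names are ours):

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def glmm_conditional_a(t, b=0.0):
    # P(Y_ij = 1 | b_i = b) for group A, GLMM estimates
    return expit(-1.6308 + b - 0.4043 * t)

def gee_marginal_a(t):
    # population-averaged P(Y_ij = 1) for group A, GEE estimates
    return expit(-0.7219 - 0.1409 * t)

for t in (0.0, 3.0, 6.0):
    print(t, round(glmm_conditional_a(t), 3), round(gee_marginal_a(t), 3))
```

The conditional curve of the average subject lies well below the population-averaged curve, which is exactly the attenuation discussed in this chapter.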
13.3 Example: Toenail Data Revisited

Overview of all analyses on the toenail data:

Parameter                     QUAD           PQL            MQL            GEE
Intercept group A             −1.63 (0.44)   −0.72 (0.24)   −0.56 (0.17)   −0.72 (0.17)
Intercept group B             −1.75 (0.45)   −0.72 (0.24)   −0.53 (0.17)   −0.65 (0.17)
Slope group A                 −0.40 (0.05)   −0.29 (0.03)   −0.17 (0.02)   −0.14 (0.03)
Slope group B                 −0.57 (0.06)   −0.40 (0.04)   −0.26 (0.03)   −0.25 (0.04)
Var. random intercepts (τ²)   15.99 (3.02)   4.71 (0.60)    2.49 (0.29)

Conclusion (in absolute value): GEE < MQL < PQL < QUAD
Part V Non-linear Models ESALQ Course on Models for Longitudinal and Incomplete Data 154
Chapter 14 Non-Linear Mixed Models From linear to non-linear models Orange Tree Example Song Bird Example ESALQ Course on Models for Longitudinal and Incomplete Data 155
14.1 From Linear to Non-linear Models

We have studied:

linear models for Gaussian data
generalized linear models for non-Gaussian data
for cross-sectional (univariate) data
for correlated data (clustered data, multivariate data, longitudinal data)

In all cases, a certain amount of linearity is preserved. E.g., in generalized linear models, linearity operates at the level of the linear predictor, describing a transformation of the mean (logit, log, probit, ...).

This implies that all predictor functions are linear in the parameters:

β_0 + β_1 x_1i + ... + β_p x_pi
This may still be considered a limiting factor, and non-linearity in the parametric functions may be desirable. Examples: Certain growth curves, especially when they include both a growth spurt as well as asymptote behavior towards the end of growth Dose-response modeling Pharmacokinetic and pharmacodynamic models ESALQ Course on Models for Longitudinal and Incomplete Data 157
14.2 LMM and GLMM

In linear mixed models, the mean is modeled as a linear function of regression parameters and random effects:

E(Y_ij | b_i) = x_ij′ β + z_ij′ b_i

In generalized linear mixed models, apart from a link function, the mean is again modeled as a linear function of regression parameters and random effects:

E(Y_ij | b_i) = h(x_ij′ β + z_ij′ b_i)

In some applications, models are needed in which the mean is no longer modeled as a function of a linear predictor x_ij′ β + z_ij′ b_i. These are called non-linear mixed models.
14.3 Non-linear Mixed Models

In non-linear mixed models, it is assumed that the conditional distribution of Y_ij, given b_i, belongs to the exponential family (normal, binomial, Poisson, ...).

The mean is modeled as:

E(Y_ij | b_i) = h(x_ij, β, z_ij, b_i)

As before, the random effects are assumed to be normally distributed, with mean 0 and covariance D.

Let f_ij(y_ij | b_i, β, φ) be the conditional density of Y_ij given b_i, and let f(b_i | D) be the density of the N(0, D) distribution.
Under conditional independence, the marginal likelihood equals

L(β, D, φ) = ∏_{i=1}^N f_i(y_i | β, D, φ)
           = ∏_{i=1}^N ∫ ∏_{j=1}^{n_i} f_ij(y_ij | b_i, β, φ) f(b_i | D) db_i

The above likelihood is of the same form as the likelihood obtained earlier for a generalized linear mixed model. As before, numerical quadrature is used to approximate the integral.

Non-linear mixed models can also be fitted within the SAS procedure NLMIXED.
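The structure of this integral can be illustrated with a small pure-Python sketch for a Bernoulli-logit model with a random intercept; plain trapezoidal integration stands in here for the adaptive Gaussian quadrature used by NLMIXED (the toy data and all names are ours):

```python
import math

def marginal_likelihood(y, x, beta0, beta1, sd_b, grid=2001, halfwidth=8.0):
    """One subject's contribution: integrate prod_j f(y_j | b) * phi(b) over b."""
    lo, hi = -halfwidth * sd_b, halfwidth * sd_b
    step = (hi - lo) / (grid - 1)
    total = 0.0
    for k in range(grid):
        b = lo + k * step
        # normal density of the random intercept
        dens = math.exp(-b * b / (2 * sd_b ** 2)) / (sd_b * math.sqrt(2 * math.pi))
        cond = 1.0
        for yj, xj in zip(y, x):  # conditional independence given b
            p = 1.0 / (1.0 + math.exp(-(beta0 + b + beta1 * xj)))
            cond *= p if yj == 1 else 1.0 - p
        weight = 0.5 if k in (0, grid - 1) else 1.0  # trapezoid end points
        total += weight * cond * dens
    return total * step

lik = marginal_likelihood([1, 1, 0, 0], [0.0, 1.0, 2.0, 3.0],
                          beta0=-0.5, beta1=-0.3, sd_b=2.0)
print(lik)
```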
14.4 Example: Orange Trees

We consider an experiment in which the trunk circumference (in mm) is measured for 5 orange trees, on 7 different occasions. Data:

         Response
Day      Tree 1   Tree 2   Tree 3   Tree 4   Tree 5
118      30       33       30       32       30
484      58       69       51       62       49
664      87       111      75       112      81
1004     115      156      108      167      125
1231     120      172      115      179      142
1372     142      203      139      209      174
1582     145      203      140      214      177
The following non-linear mixed model has been proposed:

Y_ij = (β_1 + b_i) / (1 + exp[−(t_ij − β_2)/β_3]) + ε_ij

b_i ~ N(0, σ_b²),   ε_ij ~ N(0, σ²)
In SAS PROC NLMIXED, the model can be fitted as follows: proc nlmixed data=tree; parms beta1=190 beta2=700 beta3=350 sigmab=10 sigma=10; num = b; ex = exp(-(day-beta2)/beta3); den = 1 + ex; model y ~ normal(num/den,sigma**2); random b ~ normal(beta1,sigmab**2) subject=tree; run; An equivalent program is given by proc nlmixed data=tree; parms beta1=190 beta2=700 beta3=350 sigmab=10 sigma=10; num = b + beta1; ex = exp(-(day-beta2)/beta3); den = 1 + ex; model y ~ normal(num/den,sigma**2); random b ~ normal(0,sigmab**2) subject=tree; run; ESALQ Course on Models for Longitudinal and Incomplete Data 163
Selected output:

Specifications
Data Set                               WORK.TREE
Dependent Variable                     y
Distribution for Dependent Variable    Normal
Random Effects                         b
Distribution for Random Effects        Normal
Subject Variable                       tree
Optimization Technique                 Dual Quasi-Newton
Integration Method                     Adaptive Gaussian Quadrature

Parameter Estimates
                         Standard
Parameter   Estimate     Error      DF   t Value   Pr > |t|
beta1       192.05       15.6577    4    12.27     0.0003
beta2       727.91       35.2487    4    20.65     <.0001
beta3       348.07       27.0798    4    12.85     0.0002
sigmab      31.6463      10.2614    4    3.08      0.0368
sigma       7.8430       1.0125     4    7.75      0.0015
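From these estimates, the fitted mean curve for an average tree (b_i = 0) can be evaluated directly; a quick Python sketch:

```python
import math

# Fitted logistic growth curve, using the NLMIXED estimates above
# (beta1 = 192.05, beta2 = 727.91, beta3 = 348.07).
def mean_circumference(day, beta1=192.05, beta2=727.91, beta3=348.07, b=0.0):
    return (beta1 + b) / (1.0 + math.exp(-(day - beta2) / beta3))

for day in (118, 664, 1582):
    print(day, round(mean_circumference(day), 1))
```

The predicted means (roughly 28, 87, and 177 mm at days 118, 664, and 1582) track the observed tree averages reasonably well.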
Note that the number of quadrature points was selected adaptively to be equal to only one. Refitting the model with 50 quadrature points yielded identical results. Empirical Bayes estimates, and subject-specific predictions can be obtained as follows: proc nlmixed data=tree; parms beta1=190 beta2=700 beta3=350 sigmab=10 sigma=10; num = b + beta1; ex = exp(-(day-beta2)/beta3); den = 1 + ex; ratio=num/den; model y ~ normal(ratio,sigma**2); random b ~ normal(0,sigmab**2) subject=tree out=eb; predict ratio out=ratio; run; ESALQ Course on Models for Longitudinal and Incomplete Data 165
We can now compare the observed data to the subject-specific predictions:

ŷ_ij = (β̂_1 + b̂_i) / (1 + exp[−(t_ij − β̂_2)/β̂_3])
14.5 Example: Song Birds At the University of Antwerp, a novel in-vivo MRI approach to discern the functional characteristics of specific neuronal populations in a strongly connected brain circuitry has been established. Of particular interest: the song control system (SCS) in songbird brain. The high vocal center (HVC), one of the major nuclei in this circuit, contains interneurons and two distinct types of neurons projecting respectively to the nucleus robustus archistriatalis, RA or to area X. ESALQ Course on Models for Longitudinal and Incomplete Data 167
Schematically:

[Figure: schematic of the song control system circuitry, with HVC projecting to RA and to area X]
14.5.1 The MRI Data

T1-weighted multi-slice SE images were acquired (Van Meir et al 2003).

After acquisition of a set of control images, MnCl_2 was injected in the cannulated HVC.

Two sets of 5 coronal slices (one through HVC and RA, one through area X) were acquired every 15 min, for up to 6 to 7 hours after injection.

This results in 30 data sets of 10 slices for each bird (5 controls and 5 testosterone-treated).

The change of relative SI at each time point is calculated from the mean signal intensity determined within the defined regions of interest (area X and RA) and in an adjacent control area.
14.5.2 Initial Study

The effect of testosterone (T) on the dynamics of Mn²⁺ accumulation in RA and area X of female starlings has been established.

This has been done with dynamic ME-MRI in individual birds injected with manganese in their HVC, using a 2-stage approach:

The individual SI data, determined as the mean intensity of a region of interest, were submitted to a sigmoid curve fitting, providing curve parameters which revealed, upon a two-way repeated-measures ANOVA, that neurons projecting to RA and those projecting to area X were affected differently by testosterone treatment.

However, this approach could be less reliable if the fit parameters were mutually dependent.
Thus: an integrated non-linear modeling approach is necessary: the MRI signal intensities (SI) need to be fitted to a non-linear mixed model assuming that the SI follow a pre-specified distribution depending on a covariate indicating the time after MnCl 2 injection and parameterized through fixed and random (bird-specific) effects. An initial model needs to be proposed and then subsequently simplified. ESALQ Course on Models for Longitudinal and Incomplete Data 172
14.5.3 A Model for RA in the Second Period

Let RA_ij be the measurement at time j for bird i. The following initial non-linear model is assumed:

RA_ij = (φ_0 + φ_1 G_i + f_i) T_ij^(η_0 + η_1 G_i + n_i) / [ (τ_0 + τ_1 G_i + t_i)^(η_0 + η_1 G_i + n_i) + T_ij^(η_0 + η_1 G_i + n_i) ] + γ_0 + γ_1 G_i + ε_ij

The following conventions are used:

G_i is an indicator for group membership (0 for control, 1 for active)
T_ij is a covariate indicating the measurement time
The fixed-effects parameters:
the intercept parameters φ_0, η_0, τ_0, and γ_0
the group-effect parameters φ_1, η_1, τ_1, and γ_1

If the latter are simultaneously zero, there is no difference between both groups. If at least one of them is (significantly) different from zero, then the model indicates a difference between both groups.

There are three bird-specific or random effects, f_i, n_i, and t_i, following a trivariate zero-mean normal distribution with general covariance matrix D.

The residual error terms ε_ij are assumed to be mutually independent, independent of the random effects, and to follow a N(0, σ²) distribution.

Thus, the general form of the model has 8 fixed-effects parameters and 7 variance components (3 variances in D, 3 covariances in D, and σ²).
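The fixed-effects part of this mean structure (with the random effects set to zero) can be sketched as follows; the parameter values are simply the starting values from the SAS program below, used purely for illustration:

```python
# Four-parameter sigmoid in time T with group shifts; random effects
# f_i, n_i, t_i are set to zero here.
def ra_mean(T, G, phi0=0.64, phi1=0.0, eta0=1.88, eta1=0.0,
            tau0=3.68, tau1=0.0, gamma0=-0.01, gamma1=0.0):
    eta = eta0 + eta1 * G
    num = (phi0 + phi1 * G) * T ** eta
    den = (tau0 + tau1 * G) ** eta + T ** eta
    return num / den + gamma0 + gamma1 * G

# The curve rises from gamma0 at T = 0 toward the asymptote phi0 + gamma0,
# and at T = tau0 exactly half of the rise is completed.
print([round(ra_mean(t, G=0), 3) for t in (1.0, 3.68, 8.0)])
```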
SAS program: data hulp2; set m.vincent03; if time <= 0 then delete; if periode = 1 then delete; run; proc nlmixed data=hulp2 qpoints=3; parms phim=0.64 phimdiff=0 eta=1.88 etadiff=0 tau=3.68 taudiff=0 gamma=-0.01 gdiff=0 d11=0.01 sigma2=0.01 d22=0.01 d33=0.01; teller = (phim + phimdiff * groep + vm ) * (time ** (eta + etadiff * groep + n )); noemer = ((tau + t + taudiff * groep ) ** (eta + etadiff * groep +n) ) + (time ** (eta + etadiff * groep + n)); gemid = teller/noemer + gamma + gdiff * groep; model si_ra ~ normal(gemid,sigma2); random vm t n ~ normal([0, 0, 0],[d11,0,d22, 0, 0, d33]) subject=vogel out=m.eb; run; Gaussian quadrature is used. ESALQ Course on Models for Longitudinal and Incomplete Data 175
The covariances in D are set equal to zero, due to computational difficulty. Fitting the model produces the following output. First, the bookkeeping information is provided:

Specifications
Data Set                               WORK.HULP2
Dependent Variable                     SI_RA
Distribution for Dependent Variable    Normal
Random Effects                         vm t n
Distribution for Random Effects        Normal
Subject Variable                       vogel
Optimization Technique                 Dual Quasi-Newton
Integration Method                     Adaptive Gaussian Quadrature

Dimensions
Observations Used        262
Observations Not Used    58
Total Observations       320
Subjects                 10
Max Obs Per Subject      28
Parameters               12
Quadrature Points        3
Next, starting values and the corresponding likelihood are provided:

Parameters
phim   phimdiff   eta    etadiff   tau    taudiff   gamma   gdiff   d11    sigma2   d22    d33    NegLogLike
0.64   0          1.88   0         3.68   0         -0.01   0       0.01   0.01     0.01   0.01   -325.62539

The iteration history tells convergence is not straightforward:

Iteration History
Iter   Calls   NegLogLike   Diff       MaxGrad    Slope
1      14      -571.60061   245.9752   346172     -1341908
2      18      -590.08646   18.48585   25374.59   -3290.57
3      19      -617.20333   27.11687   23321.39   -61.3569
4      21      -630.54411   13.34079   248131.3   -15.4346
5      22      -640.39285   9.848737   7211.013   -21.8031
6      23      -656.41057   16.01772   12295.64   -10.1471
7      25      -666.68301   10.27244   96575.61   -11.8592
8      26      -670.60017   3.917164   100677.4   -5.38138
9      27      -676.45628   5.856102   30099.22   -6.11815
10     30      -677.36448   0.908204   123312.7   -1.98302
20     53      -688.60175   4.468022   32724.37   -4.48668
30     74      -695.15573   0.716814   43575.75   -0.18929
40     93      -697.90828   0.303986   1094.634   -0.51907
50     112     -699.51124   0.002001   166.3506   -0.00338
60     137     -700.79832   0.002657   710.6714   -0.00044
70     157     -701.17648   0.000123   20.20774   -0.00018
71     159     -701.17688   0.000396   125.3658   -0.00006
72     162     -701.21656   0.039676   1426.738   -0.0008
73     164     -701.31217   0.095616   136.553    -0.05932
74     166     -701.31463   0.002454   98.78744   -0.00443
75     168     -701.3147    0.000071   3.915711   -0.00015
76     170     -701.3147    9.862E-7   1.290999   -2.05E-6

NOTE: GCONV convergence criterion satisfied.
The fit statistics can be used for model comparison, as always: Fit Statistics -2 Log Likelihood -1403 AIC (smaller is better) -1379 AICC (smaller is better) -1377 BIC (smaller is better) -1375 Finally, parameter estimates are provided: Parameter Estimates Standard Parameter Estimate Error DF t Value Pr > t Alpha Lower Upper Gradient phim 0.4085 0.06554 7 6.23 0.0004 0.05 0.2535 0.5634 0.000428 phimdiff 0.08167 0.09233 7 0.88 0.4058 0.05-0.1366 0.3000 0.000591 eta 2.2117 0.1394 7 15.87 <.0001 0.05 1.8821 2.5412-9.87E-6 etadiff 0.4390 0.1865 7 2.35 0.0508 0.05-0.00206 0.8801 0.000717 tau 2.8006 0.2346 7 11.94 <.0001 0.05 2.2457 3.3554-0.00028 taudiff 0.07546 0.3250 7 0.23 0.8231 0.05-0.6932 0.8441 0.000046 gamma 0.000237 0.004391 7 0.05 0.9585 0.05-0.01015 0.01062 0.008095 gdiff 0.001919 0.005923 7 0.32 0.7554 0.05-0.01209 0.01592-0.00244 d11 0.02059 0.009283 7 2.22 0.0620 0.05-0.00136 0.04254-0.00547 sigma2 0.000180 0.000017 7 10.66 <.0001 0.05 0.000140 0.000221-1.291 d22 0.2400 0.1169 7 2.05 0.0791 0.05-0.03633 0.5163-0.00053 d33 0.02420 0.03488 7 0.69 0.5101 0.05-0.05829 0.1067-0.00059 ESALQ Course on Models for Longitudinal and Incomplete Data 179
Next, a backward selection is conducted, using likelihood ratio tests.

First, the variance d_33 of n_i was removed. The corresponding test statistic has a χ²_{0:1} null distribution (p = 0.4387).

Next, the fixed-effect parameters γ_0, γ_1, τ_1, and φ_1 are removed.

The final model is:

RA_ij = (φ_0 + f_i) T_ij^(η_0 + η_1 G_i) / [ (τ_0 + t_i)^(η_0 + η_1 G_i) + T_ij^(η_0 + η_1 G_i) ] + ε_ij
Overview of parameter estimates and standard errors for initial and final model: Estimate (s.e.) Effect Parameter Initial Final φ 0 0.4085(0.0655) 0.4526(0.0478) φ 1 0.0817(0.0923) η 0 2.2117(0.1394) 2.1826(0.0802) η 1 0.4390(0.1865) 0.4285(0.1060) τ 0 2.8006(0.2346) 2.8480(0.1761) τ 1 0.0755(0.3250) γ 0 0.00024(0.0044) γ 1 0.0019(0.0059) Var(f i ) d 11 0.0206(0.0092) 0.0225(0.0101) Var(t i ) d 22 0.2400(0.1169) 0.2881(0.1338) Var(n i ) d 33 0.0242(0.0349) Var(ε ij ) σ 2 0.00018(0.000017) 0.00019(0.000017) ESALQ Course on Models for Longitudinal and Incomplete Data 181
14.5.4 AREA.X at the Second Period The same initial model will be fitted to AREA ij. In this case, a fully general model can be fitted, including also the covariances between the random effects. Code to do so: proc nlmixed data=hulp2 qpoints=3; parms phim=0.1124 phimdiff=0.1200 eta=2.4158 etadiff=-0.1582 tau=3.8297 taudiff=-0.05259 gamma=-0.00187 gdiff=0.002502 sigma2=0.000156 d11=0.002667 d22=0.2793 d33=0.08505 d12=0 d13=0 d23=0; teller = (phim + phimdiff * groep + vm) * (time ** (eta + etadiff * groep + n )); noemer = ((tau + taudiff * groep +t ) ** (eta + etadiff * groep + n) ) + (time ** (eta + etadiff * groep +n)); gemid = teller/noemer + gamma + gdiff * groep; model si_area_x ~ normal(gemid,sigma2); random vm t n ~ normal([0, 0, 0],[d11, d12, d22, d13, d23, d33]) subject=vogel out=m.eb2; run; ESALQ Course on Models for Longitudinal and Incomplete Data 182
Model simplification:

First, the random n_i effect is removed (implying the removal of d_33, d_13, and d_23), using a likelihood ratio test statistic with value 4.08 and null distribution χ²_{2:3}. The corresponding p = 0.1914.

Removal of the random t_i effect is not possible, since the likelihood ratio equals 54.95 on 1:2 degrees of freedom (p < 0.0001).

Removal of the covariance between the random t_i and f_i effects is not possible either (G² = 4.35 on 1 d.f., p = 0.0371).

Next, the following fixed-effect parameters were removed: γ_0, γ_1, η_1, and τ_1.

The fixed effect φ_1 was found to be highly significant and therefore could not be removed from the model (G² = 10.5773 on 1 d.f., p = 0.0011). This indicates a significant difference between the two groups.

The resulting final model is:

AREA_ij = (φ_0 + φ_1 G_i + f_i) T_ij^(η_0) / [ (τ_0 + t_i)^(η_0) + T_ij^(η_0) ] + ε_ij
Estimate (s.e.) Effect Parameter Initial Final φ 0 0.1118 (0.0333) 0.1035 (0.0261) φ 1 0.1116 (0.0458) 0.1331 (0.0312) η 0 2.4940 (0.5390) 2.3462 (0.1498) η 1-0.0623 (0.5631) τ 0 3.6614 (0.5662) 3.7264 (0.3262) τ 1-0.1303 (0.6226) γ 0-0.0021 (0.0032) γ 1 0.0029 (0.0048) Var(f i ) d 11 0.0038 (0.0020) 0.004271 (0.0022) Var(t i ) d 22 0.2953 (0.2365) 0.5054 (0.2881) Var(n i ) d 33 0.1315 (0.1858) Cov(f i,t i ) d 12 0.0284 (0.0205) 0.03442 (0.0229) Cov(f i,n i ) d 13-0.0116 (0.0159) Cov(t i,n i ) d 23-0.0095 (0.1615) Var(ε ij ) σ 2 0.00016 (0.000014) 0.00016 (0.000014) ESALQ Course on Models for Longitudinal and Incomplete Data 184
Individual curves, showing data points as well as empirical Bayes predictions for the bird-specific curves, show that the sigmoidal curves describe the data quite well.

[Figure: per-bird panels (1 to 10) of SI.area.X versus time (0 to 7 h), observed data with fitted curves]
We can also explore all individual as well as marginal average fitted curves per group. The marginal effect was obtained using the sampling-based method.

It is clear that Area.X is higher for most treated birds (group 1) compared to the untreated birds (group 0).

[Figure: SI.area.X versus time, individual fitted curves for groups 0 and 1 together with the marginal average curve per group]
Part VI Incomplete Data ESALQ Course on Models for Longitudinal and Incomplete Data 187
Chapter 15 Setting The Scene Orthodontic growth data Notation Taxonomies ESALQ Course on Models for Longitudinal and Incomplete Data 188
15.1 Growth Data Taken from Potthoff and Roy, Biometrika (1964) Research question: Is dental growth related to gender? The distance from the center of the pituitary to the maxillary fissure was recorded at ages 8, 10, 12, and 14, for 11 girls and 16 boys ESALQ Course on Models for Longitudinal and Incomplete Data 189
Individual profiles: Much variability between girls / boys Considerable variability within girls / boys Fixed number of measurements per subject Measurements taken at fixed time points ESALQ Course on Models for Longitudinal and Incomplete Data 190
15.2 Incomplete Longitudinal Data ESALQ Course on Models for Longitudinal and Incomplete Data 191
15.3 Scientific Question In terms of entire longitudinal profile In terms of last planned measurement In terms of last observed measurement ESALQ Course on Models for Longitudinal and Incomplete Data 192
15.4 Notation

Subject i at occasion (time) j = 1, ..., n_i
Measurement Y_ij
Missingness indicator R_ij = 1 if Y_ij is observed, 0 otherwise
Group the Y_ij into a vector Y_i = (Y_i1, ..., Y_in_i)′ = (Y_i^o, Y_i^m):
  Y_i^o contains the Y_ij for which R_ij = 1
  Y_i^m contains the Y_ij for which R_ij = 0
Group the R_ij into a vector R_i = (R_i1, ..., R_in_i)′
D_i: time of dropout: D_i = 1 + Σ_{j=1}^{n_i} R_ij
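For a subject with n_i = 4 planned measurements who drops out after the second one, the notation works out as follows (Python sketch, invented values):

```python
y = [21.0, 20.0, None, None]                  # Y_i, with missing entries as None
r = [1 if v is not None else 0 for v in y]    # R_ij indicators
y_obs = [v for v in y if v is not None]       # Y_i^o
y_mis = [v for v in y if v is None]           # Y_i^m (here: two missing values)
d = 1 + sum(r)                                # dropout time D_i
print(r, y_obs, d)                            # [1, 1, 0, 0] [21.0, 20.0] 3
```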
15.5 Framework

f(Y_i, D_i | θ, ψ)

Selection models: f(Y_i | θ) f(D_i | Y_i^o, Y_i^m, ψ)

  MCAR: f(D_i | ψ)
  MAR: f(D_i | Y_i^o, ψ)
  MNAR: f(D_i | Y_i^o, Y_i^m, ψ)

Pattern-mixture models: f(Y_i | D_i, θ) f(D_i | ψ)

Shared-parameter models: f(Y_i | b_i, θ) f(D_i | b_i, ψ)
f(y i,d i θ, ψ) Selection Models: f(y i θ) f(d i Y o i, Y m i, ψ) MCAR MAR MNAR CC? direct likelihood! joint model!? LOCF? expectation-maximization (EM). sensitivity analysis?! imputation? multiple imputation (MI).. (weighted) GEE! ESALQ Course on Models for Longitudinal and Incomplete Data 195
Chapter 16 Proper Analysis of Incomplete Data Simple methods Bias for LOCF and CC Direct likelihood inference Weighted generalized estimating equations ESALQ Course on Models for Longitudinal and Incomplete Data 196
16.1 Incomplete Longitudinal Data ESALQ Course on Models for Longitudinal and Incomplete Data 197
Data and Modeling Strategies ESALQ Course on Models for Longitudinal and Incomplete Data 198
Modeling Strategies ESALQ Course on Models for Longitudinal and Incomplete Data 199
16.2 Simple Methods

Complete case analysis: delete incomplete subjects
  Standard statistical software
  Loss of information
  Impact on precision and power
  Missingness not MCAR ⇒ bias

Last observation carried forward: impute missing values
  Standard statistical software
  Increase of information
  Constant profile after dropout: unrealistic
  Usually biased
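A toy two-occasion example makes the contrast concrete; here the subjects with high first measurements drop out, so the two methods pull the second-occasion mean in opposite directions (Python; all numbers invented):

```python
# (y1, y2) per subject; None marks a missing second measurement
data = [(20.0, 22.0), (21.0, 23.0), (25.0, None), (26.0, None)]

# Complete case analysis: delete incomplete subjects
cc = [(y1, y2) for (y1, y2) in data if y2 is not None]
cc_mean_t2 = sum(y2 for _, y2 in cc) / len(cc)

# LOCF: carry the last observed value forward
locf = [(y1, y2 if y2 is not None else y1) for (y1, y2) in data]
locf_mean_t2 = sum(y2 for _, y2 in locf) / len(locf)

print(cc_mean_t2, locf_mean_t2)  # 22.5 versus 24.0
```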
Quantifying the Bias

Dropouts: observed at t_ij = 0 only, with probability p_0; treatment indicator T_i = 0, 1;
E(Y_ij) = β_0 + β_1 T_i + β_2 t_ij + β_3 T_i t_ij

Completers: observed at t_ij = 0, 1, with probability 1 − p_0 = p_1; treatment indicator T_i = 0, 1;
E(Y_ij) = γ_0 + γ_1 T_i + γ_2 t_ij + γ_3 T_i t_ij

Bias:

        CC                                        LOCF
MCAR    0                                         (p_1 − p_0)β_2 + (1 − p_1)β_3
MAR     σ[(1 − p_1)(β_0 + β_1 − γ_0 − γ_1)        p_1(γ_0 + γ_1 + γ_2 + γ_3) + (1 − p_1)(β_0 + β_1)
           − (1 − p_0)(β_0 − γ_0)]                   − p_0(γ_0 + γ_2) − (1 − p_0)β_0 − γ_1 − γ_3
16.3 Direct Likelihood Maximization

MAR: f(Y_i^o | θ) f(D_i | Y_i^o, ψ)    (ignorability)

Mechanism is MAR
θ and ψ distinct
Interest in θ
Use observed information matrix

⇒ Likelihood inference is valid

Outcome type     Modeling strategy                 Software
Gaussian         Linear mixed model                SAS proc MIXED
Non-Gaussian     Generalized linear mixed model    SAS proc GLIMMIX, NLMIXED
16.3.1 Original, Complete Orthodontic Growth Data

Model   Mean           Covar          # par
1       unstructured   unstructured   18
2       ≠ slopes       unstructured   14
3       = slopes       unstructured   13
7       ≠ slopes       CS             6
16.3.2 Trimmed Growth Data: Simple Methods

Method               Model   Mean           Covar          # par
Complete case        7a      = slopes       CS             5
LOCF                 2a      quadratic      unstructured   16
Unconditional mean   7a      = slopes       CS             5
Conditional mean     1       unstructured   unstructured   18   (distorting)
16.3.3 Trimmed Growth Data: Direct Likelihood

Model   Mean       Covar   # par
7       ≠ slopes   CS      6
16.3.4 Growth Data: Comparison of Analyses Data Complete cases LOCF imputed data Model Unstructured group by time mean Unstructured covariance matrix All available data Analysis methods Direct likelihood ML REML MANOVA ANOVA per time point ESALQ Course on Models for Longitudinal and Incomplete Data 206
Principle Method Boys at Age 8 Boys at Age 10 Original Direct likelihood, ML 22.88 (0.56) 23.81 (0.49) Direct likelihood, REML MANOVA 22.88 (0.58) 23.81 (0.51) ANOVA per time point 22.88 (0.61) 23.81 (0.53) Direct Lik. Direct likelihood, ML 22.88 (0.56) 23.17 (0.68) Direct likelihood, REML 22.88 (0.58) 23.17 (0.71) MANOVA 24.00 (0.48) 24.14 (0.66) ANOVA per time point 22.88 (0.61) 24.14 (0.74) CC Direct likelihood, ML 24.00 (0.45) 24.14 (0.62) Direct likelihood, REML MANOVA 24.00 (0.48) 24.14 (0.66) ANOVA per time point 24.00 (0.51) 24.14 (0.74) LOCF Direct likelihood, ML 22.88 (0.56) 22.97 (0.65) Direct likelihood, REML MANOVA 22.88 (0.58) 22.97 (0.68) ANOVA per time point 22.88 (0.61) 22.97 (0.72) ESALQ Course on Models for Longitudinal and Incomplete Data 207
16.3.5 Behind the Scenes

R completers out of N subjects; the incompleters have only the first measurement observed.

(Y_i1, Y_i2)′ ~ N( (μ_1, μ_2)′, [σ_11 σ_12; σ_12 σ_22] )

Conditional density: Y_i2 | y_i1 ~ N(β_0 + β_1 y_i1, σ_22.1)

μ_1, frequentist & likelihood: μ̂_1 = (1/N) Σ_{i=1}^N y_i1

μ_2, frequentist: μ̂_2 = (1/R) Σ_{i=1}^R y_i2

μ_2, likelihood: μ̂_2 = (1/N) [ Σ_{i=1}^R y_i2 + Σ_{i=R+1}^N ( ȳ_2 + β̂_1 (y_i1 − ȳ_1) ) ]
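These estimators are easy to sketch in code (Python; invented data, with completers listed first):

```python
def mu2_hat(y1, y2_obs):
    """Frequentist (completers-only) and likelihood (MAR) estimates of mu_2.
    y1: first measurements for all N subjects, completers first;
    y2_obs: second measurements for the R completers."""
    N, R = len(y1), len(y2_obs)
    y1c = y1[:R]                       # completers' first measurements
    ybar1 = sum(y1c) / R
    ybar2 = sum(y2_obs) / R
    # OLS slope of Y2 on Y1 among completers
    beta1 = (sum((a - ybar1) * (b - ybar2) for a, b in zip(y1c, y2_obs))
             / sum((a - ybar1) ** 2 for a in y1c))
    freq = ybar2
    lik = (sum(y2_obs)
           + sum(ybar2 + beta1 * (a - ybar1) for a in y1[R:])) / N
    return freq, lik

# If Y2 = 2 * Y1 exactly, the likelihood estimate recovers the full-data mean
print(mu2_hat([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0]))  # (4.0, 5.0)
```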
16.3.6 Growth Data: Further Comparison of Analyses Principle Method Boys at Age 8 Boys at Age 10 Original Direct likelihood, ML 22.88 (0.56) 23.81 (0.49) Direct Lik. Direct likelihood, ML 22.88 (0.56) 23.17 (0.68) CC Direct likelihood, ML 24.00 (0.45) 24.14 (0.62) LOCF Direct likelihood, ML 22.88 (0.56) 22.97 (0.65) Data Mean Covariance Boys at Age 8 Boys at Age 10 Complete Unstructured Unstructured 22.88 23.81 Unstructured CS 22.88 23.81 Unstructured Independence 22.88 23.81 Incomplete Unstructured Unstructured 22.88 23.17 Unstructured CS 22.88 23.52 Unstructured Independence 22.88 24.14 ESALQ Course on Models for Longitudinal and Incomplete Data 209
ESALQ Course on Models for Longitudinal and Incomplete Data 210
16.3.7 Growth Data: SAS Code for Model 1

IDNR  AGE  SEX  MEASURE
1     8    2    21.0
1     10   2    20.0
1     12   2    21.5
1     14   2    23.0
...
3     8    2    20.5
3     12   2    24.5
3     14   2    26.0
...

SAS code:

proc mixed data = growth method = ml;
  class sex idnr age;
  model measure = sex age*sex / s;
  repeated age / type = un subject = idnr;
run;

Subjects in terms of IDNR blocks
age ensures proper ordering of observations within subjects!
16.3.8 Growth Data: SAS Code for Model 2

IDNR  AGE  SEX  MEASURE
1     8    2    21.0
1     10   2    20.0
1     12   2    21.5
1     14   2    23.0
...
3     8    2    20.5
3     12   2    24.5
3     14   2    26.0
...

SAS code:

data help;
  set growth;
  agecat = age;
run;

proc mixed data = help method = ml;
  class sex idnr agecat;
  model measure = sex age*sex / s;
  repeated agecat / type = un subject = idnr;
run;

The time ordering variable needs to be categorical
Chapter 17 Weighted Generalized Estimating Equations General Principle Analysis of the analgesic trial ESALQ Course on Models for Longitudinal and Incomplete Data 213
17.1 General Principle

MAR and non-ignorable!

Standard GEE inference correct only under MCAR

Under MAR: weighted GEE
  Robins, Rotnitzky & Zhao (JASA, 1995)
  Fitzmaurice, Molenberghs & Lipsitz (JRSSB, 1995)

Decompose dropout time D_i: (R_i1, ..., R_in)′ = (1, ..., 1, 0, ..., 0)′
Weight a contribution by the inverse dropout probability:

ν_{i d_i} ≡ P[D_i = d_i] = ∏_{k=2}^{d_i − 1} (1 − P[R_ik = 0 | R_i2 = ... = R_i,k−1 = 1]) × P[R_i,d_i = 0 | R_i2 = ... = R_i,d_i − 1 = 1]^{I{d_i ≤ T}}

Adjust the estimating equations:

Σ_{i=1}^N (1/ν_{i d_i}) (∂μ_i/∂β′) V_i^{−1} (y_i − μ_i) = 0
17.2 Computing the Weights

Predicted values from (PROC GENMOD) output

The weights are now defined at the individual measurement level:

At the first occasion, the weight is w = 1
At occasions other than the last, the weight is the already accumulated weight, multiplied by 1 minus the predicted dropout probability
At the last occasion within a sequence where dropout occurs, the weight is multiplied by the predicted dropout probability
At the end of the process, the weight is inverted
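The accumulation just described can be sketched as a small function (Python; the interface is ours):

```python
def wgee_weight(hazards, dropped_out):
    """Inverse-probability weight for one subject.
    hazards: predicted dropout probabilities at occasions 2, 3, ... for the
    occasions at which the subject was still at risk;
    dropped_out: True if the subject dropped out at the last risk occasion."""
    w = 1.0  # weight at the first occasion
    if dropped_out:
        for p in hazards[:-1]:
            w *= 1.0 - p        # stayed in at the non-final occasions
        w *= hazards[-1]        # dropped out at the final one
    else:
        for p in hazards:
            w *= 1.0 - p        # completer: stayed in throughout
    return 1.0 / w              # invert at the end

print(wgee_weight([0.2, 0.3], dropped_out=True))   # 1 / (0.8 * 0.3)
print(wgee_weight([0.2, 0.3], dropped_out=False))  # 1 / (0.8 * 0.7)
```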
17.3 Analysis of the Analgesic Trial

A logistic regression for the dropout indicator:

logit[P(D_i = j | D_i ≥ j)] = ψ_0 + ψ_11 I(GSA_{i,j−1} = 1) + ψ_12 I(GSA_{i,j−1} = 2) + ψ_13 I(GSA_{i,j−1} = 3) + ψ_14 I(GSA_{i,j−1} = 4) + ψ_2 PCA0_i + ψ_3 PF_i + ψ_4 GD_i

with
  GSA_{i,j−1} the 5-point outcome at the previous time
  I(·) an indicator function
  PCA0_i pain control assessment at baseline
  PF_i physical functioning at baseline
  GD_i genetic disorder at baseline
Effect             Par.    Estimate (s.e.)
Intercept          ψ_0     -1.80 (0.49)
Previous GSA = 1   ψ_11    -1.02 (0.41)
Previous GSA = 2   ψ_12    -1.04 (0.38)
Previous GSA = 3   ψ_13    -1.34 (0.37)
Previous GSA = 4   ψ_14    -0.26 (0.38)
Basel. PCA         ψ_2     0.25 (0.10)
Phys. func.        ψ_3     0.009 (0.004)
Genetic disfunc.   ψ_4     0.59 (0.24)

There is some evidence for MAR: P(D_i = j | D_i ≥ j) depends on the previous GSA, and furthermore on baseline PCA, physical functioning, and genetic/congenital disorder.
GEE and WGEE:

logit[P(Y_ij = 1 | t_j, PCA0_i)] = β_1 + β_2 t_j + β_3 t_j² + β_4 PCA0_i

Effect       Par.   GEE            WGEE
Intercept    β_1    2.95 (0.47)    2.17 (0.69)
Time         β_2    -0.84 (0.33)   -0.44 (0.44)
Time²        β_3    0.18 (0.07)    0.12 (0.09)
Basel. PCA   β_4    -0.24 (0.10)   -0.16 (0.13)

A hint of potentially important differences between both
Working correlation matrices:

R_UN, GEE =
  1   0.173   0.246   0.201
      1       0.177   0.113
              1       0.456
                      1

R_UN, WGEE =
  1   0.215   0.253   0.167
      1       0.196   0.113
              1       0.409
                      1
17.4 Analgesic Trial: Steps for WGEE in SAS 1. Preparatory data manipulation: %dropout(...) 2. Logistic regression for weight model: proc genmod data=gsac; class prevgsa; model dropout = prevgsa pca0 physfct gendis / pred dist=b; ods output obstats=pred; run; 3. Conversion of predicted values to weights:... %dropwgt(...) ESALQ Course on Models for Longitudinal and Incomplete Data 221
4. Weighted GEE analysis:

proc genmod data=repbin.gsaw;
  scwgt wi;
  class patid timecls;
  model gsabin = time time*time pca0 / dist=b;
  repeated subject=patid / type=un corrw within=timecls;
run;
Chapter 18 Multiple Imputation General idea Use of MI in practice Analysis of the growth data ESALQ Course on Models for Longitudinal and Incomplete Data 223
18.1 General Principles

Valid under MAR

An alternative to direct likelihood and WGEE

Three steps:

1. The missing values are filled in M times ⇒ M complete data sets
2. The M complete data sets are analyzed by using standard procedures
3. The results from the M analyses are combined into a single inference

Rubin (1987), Rubin and Schenker (1986), Little and Rubin (1987)
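Step 3 combines the M analyses with Rubin's rules: average the point estimates, and add the within-imputation variance W and the between-imputation variance B as W + (1 + 1/M)B. A minimal sketch (Python; toy numbers):

```python
def pool(estimates, variances):
    """Rubin's rules for a scalar parameter across M imputations."""
    M = len(estimates)
    qbar = sum(estimates) / M                    # pooled point estimate
    w = sum(variances) / M                       # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (M - 1)  # between-imputation
    total = w + (1.0 + 1.0 / M) * b              # total variance
    return qbar, total

print(pool([22.5, 22.8, 22.8], [0.40, 0.41, 0.39]))
```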
18.2 Use of MI in Practice Many analyses of the same incomplete set of data A combination of missing outcomes and missing covariates As an alternative to WGEE: MI can be combined with classical GEE MI in SAS: Imputation Task: PROC MI Analysis Task: PROC MYFAVORITE Inference Task: PROC MIANALYZE ESALQ Course on Models for Longitudinal and Incomplete Data 225
18.2.1 MI Analysis of the Orthodontic Growth Data

The same Model 1 as before

Focus on boys at ages 8 and 10

Results:

                       Boys at Age 8   Boys at Age 10
Original Data          22.88 (0.58)    23.81 (0.51)
Multiple Imputation    22.88 (0.66)    22.69 (0.81)

Between-imputation variability for the age 10 measurement

Confidence interval for boys at age 10: [21.08, 24.29]
Part VII Topics in Methods and Sensitivity Analysis for Incomplete Data ESALQ Course on Models for Longitudinal and Incomplete Data 227
Chapter 19 An MNAR Selection Model and Local Influence The Diggle and Kenward selection model Mastitis in dairy cattle An informal sensitivity analysis Local influence to conduct sensitivity analysis ESALQ Course on Models for Longitudinal and Incomplete Data 228
19.1 A Full Selection Model

MNAR: ∫ f(Y_i | θ) f(D_i | Y_i, ψ) dY_i^m

f(Y_i | θ): linear mixed model

Y_i = X_i β + Z_i b_i + ε_i

f(D_i | Y_i, ψ): logistic regressions for dropout

logit[P(D_i = j | D_i ≥ j, Y_i,j−1, Y_ij)] = ψ_0 + ψ_1 Y_i,j−1 + ψ_2 Y_ij

Diggle and Kenward (JRSSC 1994)
19.2 Mastitis in Dairy Cattle

- Infectious disease of the udder
- Leads to a reduction in milk yield
- High-yielding cows more susceptible?
- But this cannot be measured directly because of the effect of the disease: the evidence is missing, since infected cows have no reported milk yield
Model for milk yield:

    (Y_{i1}, Y_{i2})' ~ N( (μ, μ + Δ)',  [ σ_1²      ρσ_1σ_2
                                           ρσ_1σ_2   σ_2²    ] )

Model for mastitis:

    logit[P(R_i = 1 | Y_{i1}, Y_{i2})] = ψ_0 + ψ_1 Y_{i1} + ψ_2 Y_{i2}
                                       = 0.37 + 2.25 Y_{i1} − 2.54 Y_{i2}
                                       = 0.37 − 0.29 Y_{i1} − 2.54 (Y_{i2} − Y_{i1})

LR test for H_0: ψ_2 = 0:  G² = 5.11
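The last two lines of the fitted dropout model are the same linear predictor, written once in terms of (Y_{i1}, Y_{i2}) and once in terms of size and increment (Y_{i1}, Y_{i2} − Y_{i1}), since 2.25 − 2.54 = −0.29. A quick numerical check of that reparameterization, with the coefficients taken from the fit above and invented milk-yield values:

```python
# Fitted dropout-model coefficients from the mastitis analysis
psi0, psi1, psi2 = 0.37, 2.25, -2.54

def predictor_raw(y1, y2):
    # logit in terms of (Y_i1, Y_i2)
    return psi0 + psi1 * y1 + psi2 * y2

def predictor_increment(y1, y2):
    # same logit in terms of Y_i1 and the increment Y_i2 - Y_i1
    return psi0 + (psi1 + psi2) * y1 + psi2 * (y2 - y1)

# The two parameterizations agree for any pair of milk yields
for y1, y2 in [(5.0, 6.0), (7.5, 7.0), (6.2, 9.1)]:
    assert abs(predictor_raw(y1, y2) - predictor_increment(y1, y2)) < 1e-9
```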
19.3 Criticism → Sensitivity Analysis

"... estimating the unestimable can be accomplished only by making modelling assumptions, ... The consequences of model misspecification will (...) be more severe in the non-random case." (Laird 1994)

- Change distributional assumptions (Kenward 1998)
- Local and global influence methods
- Pattern-mixture models
- Several plausible models or ranges of inferences
- Semi-parametric framework (Scharfstein et al 1999)
19.4 Kenward's Sensitivity Analysis

- Deletion of #4 and #5: G² for ψ_2 drops from 5.11 to 0.08
- Cows #4 and #5 have unusually large increments
- Kenward conjectures: #4 and #5 ill during the first year
- Kenward (SiM 1998)
19.5 Local Influence

Verbeke, Thijs, Lesaffre, Kenward (Bcs 2001)

Perturbed MAR dropout model:

    logit[P(D_i = 1 | Y_{i1}, Y_{i2})] = ψ_0 + ψ_1 Y_{i1} + ω_i Y_{i2}
or
    logit[P(D_i = 1 | Y_{i1}, Y_{i2})] = ψ_0 + ψ_1 Y_{i1} + ω_i (Y_{i2} − Y_{i1})

Likelihood displacement:

    LD(ω) = 2 [ L_{ω=0}(θ̂, ψ̂) − L_{ω=0}(θ̂_ω, ψ̂_ω) ],  studied locally around ω = 0
19.6 Application to Mastitis Data

- Removing #4, #5, and #66: G² = 0.005
- h_max: different signs for (#4, #5) and #66
Chapter 20
Interval of Ignorance

- The Slovenian Public Opinion Survey
- MAR and MNAR analyses
- Informal sensitivity analysis
- Interval of ignorance & interval of uncertainty
20.1 The Slovenian Plebiscite

- Rubin, Stern, and Vehovar (1995)
- Slovenian Public Opinion (SPO) Survey
- Four weeks prior to the decisive plebiscite
- Three questions:
  1. Are you in favor of Slovenian independence?
  2. Are you in favor of Slovenia's secession from Yugoslavia?
  3. Will you attend the plebiscite?
- Political decision: ABSENCE ≡ NO
- Primary estimand θ: proportion in favor of independence
Slovenian Public Opinion Survey Data:

                                Independence
    Secession   Attendance    Yes     No      ∗
    Yes         Yes          1191      8     21
                No              8      0      4
                ∗             107      3      9
    No          Yes           158     68     29
                No              7     14      3
                ∗              18     43     31
    ∗           Yes            90      2    109
                No              1      2     25
                ∗              19      8     96

(∗ denotes a missing response)
20.2 Slovenian Public Opinion: 1st Analysis

- Pessimistic: all who can say NO will say NO

    θ = 1439/2074 = 0.694

- Optimistic: all who can say YES will say YES

    θ = (1439 + 159 + 144 + 136)/2074 = 1878/2074 = 0.905

- Resulting interval: θ ∈ [0.694; 0.904]
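The bounds follow directly from the counts: 1439 respondents answer YES however their incomplete answers are resolved, and a further 159 + 144 + 136 incomplete respondents could all turn out YES. A minimal sketch reproducing the bounds (up to rounding) over the full sample of 2074:

```python
N = 2074                        # total count in the SPO survey table
definite_yes = 1439             # YES on independence and attendance no matter
                                # how the missing answers are filled in
could_be_yes = 159 + 144 + 136  # incomplete respondents who could all be YES

theta_pess = definite_yes / N                   # everyone who can say NO does
theta_opt = (definite_yes + could_be_yes) / N   # everyone who can say YES does
```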
- Complete cases: all who answered all three questions

    θ = (1191 + 158)/1454 = 0.928 (?)

- Available cases: all who answered both relevant questions

    θ = (1191 + 158 + 90)/1549 = 0.929 (?)

(The question marks flag that both estimates fall outside the pessimistic–optimistic interval.)
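The complete-case and available-case proportions can be checked the same way; both land above the optimistic bound of 0.904, which is why they are flagged. A minimal sketch:

```python
# Complete cases: answered all three questions;
# available cases: answered the two questions that define the estimand
cc_yes, cc_total = 1191 + 158, 1454
ac_yes, ac_total = 1191 + 158 + 90, 1549

theta_cc = cc_yes / cc_total
theta_ac = ac_yes / ac_total

# Both naive estimates exceed the optimistic bound
optimistic_bound = 0.904
assert theta_cc > optimistic_bound and theta_ac > optimistic_bound
```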
20.3 Slovenian Public Opinion: 2nd Analysis

- Missing at Random: non-response is allowed to depend on observed, but not on unobserved, outcomes:

    based on two questions:   θ = 0.892
    based on three questions: θ = 0.883

- Missing Not at Random (NI): non-response is allowed to depend on unobserved measurements:

    θ = 0.782
20.4 Slovenian Public Opinion Survey

    Estimator              θ
    Pessimistic bound     0.694
    Optimistic bound      0.904
    Complete cases        0.928 (?)
    Available cases       0.929 (?)
    MAR (2 questions)     0.892
    MAR (3 questions)     0.883
    MNAR                  0.782
20.5 Slovenian Plebiscite: The Truth?

θ = 0.885

    Estimator              θ
    Pessimistic bound     0.694
    Optimistic bound      0.904
    Complete cases        0.928 (?)
    Available cases       0.929 (?)
    MAR (2 questions)     0.892
    MAR (3 questions)     0.883
    MNAR                  0.782
20.6 Did the MNAR Model Behave Badly?

- Consider a family of MNAR models
- Baker, Rosenberger, and DerSimonian (1992)
- Counts Y_{r1 r2 jk}:

    j, k = 1, 2 indicates YES/NO
    r_1, r_2 = 0, 1 indicates MISSING/OBSERVED
20.6.1 Model Formulation

    E(Y_{11jk}) = m_{jk}
    E(Y_{10jk}) = m_{jk} β_{jk}
    E(Y_{01jk}) = m_{jk} α_{jk}
    E(Y_{00jk}) = m_{jk} α_{jk} β_{jk} γ_{jk}

Interpretation:

- α_{jk}: models non-response on the independence question
- β_{jk}: models non-response on the attendance question
- γ_{jk}: interaction between both non-response indicators (cannot depend on j or k)
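The four mean equations collapse into one rule: start from m_{jk} and multiply in α_{jk} when the independence answer is missing, β_{jk} when the attendance answer is missing, and γ when both are. A minimal sketch of that mean structure, with invented parameter values:

```python
def brd_mean(m_jk, alpha_jk, beta_jk, gamma, r1, r2):
    """Expected count E(Y_{r1 r2 jk}) in the Baker-Rosenberger-DerSimonian family.

    r1, r2 = 1 means observed, 0 means missing, matching the slide's coding
    (r1: independence answer, r2: attendance answer).
    """
    mean = m_jk
    if r1 == 0:              # independence answer missing
        mean *= alpha_jk
    if r2 == 0:              # attendance answer missing
        mean *= beta_jk
    if r1 == 0 and r2 == 0:  # both missing: interaction factor
        mean *= gamma
    return mean

# Hypothetical parameter values for one (j, k) cell
m, a, b, g = 100.0, 0.5, 0.25, 2.0
```

The nine BRD models on the next slide are obtained by restricting how α and β may depend on j and k.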
20.6.2 Identifiable Models

    Model   Structure     d.f.   loglik      θ       C.I.
    BRD1    (α, β)         6    -2495.29   0.892   [0.878; 0.906]
    BRD2    (α, β_j)       7    -2467.43   0.884   [0.869; 0.900]
    BRD3    (α_k, β)       7    -2463.10   0.881   [0.866; 0.897]
    BRD4    (α, β_k)       7    -2467.43   0.765   [0.674; 0.856]
    BRD5    (α_j, β)       7    -2463.10   0.844   [0.806; 0.882]
    BRD6    (α_j, β_j)     8    -2431.06   0.819   [0.788; 0.849]
    BRD7    (α_k, β_k)     8    -2431.06   0.764   [0.697; 0.832]
    BRD8    (α_j, β_k)     8    -2431.06   0.741   [0.657; 0.826]
    BRD9    (α_k, β_j)     8    -2431.06   0.867   [0.851; 0.884]
20.6.3 An Interval of MNAR Estimates

θ = 0.885

    Estimator                      θ
    [Pessimistic; optimistic]     [0.694; 0.904]
    Complete cases                0.928
    Available cases               0.929
    MAR (2 questions)             0.892
    MAR (3 questions)             0.883
    MNAR                          0.782
    MNAR interval                 [0.741; 0.892]
20.7 A More Formal Look

- Statistical Imprecision
- Statistical Uncertainty
- Statistical Ignorance
Statistical Imprecision: due to finite sampling

- A fundamental concept of mathematical statistics
- Consistency, efficiency, precision, testing, ...
- Disappears as the sample size increases

Statistical Ignorance: due to incomplete observations

- Has received less attention
- Can invalidate conclusions
- Does not disappear with increasing sample size

Kenward, Goetghebeur, and Molenberghs (StatMod 2001)
20.8 Slovenian Public Opinion: 3rd Analysis

    Model      Structure      d.f.   loglik      θ                C.I.
    BRD1       (α, β)          6    -2495.29   0.892            [0.878; 0.906]
    BRD2       (α, β_j)        7    -2467.43   0.884            [0.869; 0.900]
    BRD3       (α_k, β)        7    -2463.10   0.881            [0.866; 0.897]
    BRD4       (α, β_k)        7    -2467.43   0.765            [0.674; 0.856]
    BRD5       (α_j, β)        7    -2463.10   0.844            [0.806; 0.882]
    BRD6       (α_j, β_j)      8    -2431.06   0.819            [0.788; 0.849]
    BRD7       (α_k, β_k)      8    -2431.06   0.764            [0.697; 0.832]
    BRD8       (α_j, β_k)      8    -2431.06   0.741            [0.657; 0.826]
    BRD9       (α_k, β_j)      8    -2431.06   0.867            [0.851; 0.884]
    Model 10   (α_k, β_jk)     9    -2431.06   [0.762; 0.893]   [0.744; 0.907]
    Model 11   (α_jk, β_j)     9    -2431.06   [0.766; 0.883]   [0.715; 0.920]
    Model 12   (α_jk, β_jk)   10    -2431.06   [0.694; 0.904]

(For Models 10–12 the θ column is an interval of ignorance and the C.I. column an interval of uncertainty.)
20.9 Every MNAR Model Has Got a MAR Bodyguard

- Fit an MNAR model to a set of incomplete data
- Change the conditional distribution of the unobserved outcomes, given the observed ones, to comply with MAR
- The resulting new model will have exactly the same fit as the original MNAR model
- The missing-data mechanism has changed
- This implies that definitively testing for MAR versus MNAR is not possible
20.10 Slovenian Public Opinion: 4th Analysis

    Model      Structure      d.f.   loglik      θ                C.I.             θ_MAR
    BRD1       (α, β)          6    -2495.29   0.892            [0.878; 0.906]   0.8920
    BRD2       (α, β_j)        7    -2467.43   0.884            [0.869; 0.900]   0.8915
    BRD3       (α_k, β)        7    -2463.10   0.881            [0.866; 0.897]   0.8915
    BRD4       (α, β_k)        7    -2467.43   0.765            [0.674; 0.856]   0.8915
    BRD5       (α_j, β)        7    -2463.10   0.844            [0.806; 0.882]   0.8915
    BRD6       (α_j, β_j)      8    -2431.06   0.819            [0.788; 0.849]   0.8919
    BRD7       (α_k, β_k)      8    -2431.06   0.764            [0.697; 0.832]   0.8919
    BRD8       (α_j, β_k)      8    -2431.06   0.741            [0.657; 0.826]   0.8919
    BRD9       (α_k, β_j)      8    -2431.06   0.867            [0.851; 0.884]   0.8919
    Model 10   (α_k, β_jk)     9    -2431.06   [0.762; 0.893]   [0.744; 0.907]   0.8919
    Model 11   (α_jk, β_j)     9    -2431.06   [0.766; 0.883]   [0.715; 0.920]   0.8919
    Model 12   (α_jk, β_jk)   10    -2431.06   [0.694; 0.904]                    0.8919
θ = 0.885

    Estimator                      θ
    [Pessimistic; optimistic]     [0.694; 0.904]
    MAR (3 questions)             0.883
    MNAR                          0.782
    MNAR interval                 [0.753; 0.891]
    Model 10                      [0.762; 0.893]
    Model 11                      [0.766; 0.883]
    Model 12                      [0.694; 0.904]
Chapter 21
Concluding Remarks

- MCAR/simple: CC biased; LOCF inefficient; not simpler than MAR methods
- MAR: direct likelihood, easy to conduct; weighted GEE for Gaussian & non-Gaussian outcomes
- MNAR: variety of methods; strong, untestable assumptions; most useful in sensitivity analysis
Part VIII
Case Study: Age-related Macular Degeneration
Chapter 22
The Dataset

- Visual acuity
- Age-related macular degeneration
- Missingness
22.1 Visual Acuity

[Mock vision chart: letters shrinking line by line spell out "IT IS A PLEASURE TO BE WITH YOU IN PIRACICABA, CAPITAL OF BRAZIL"]
22.2 Age-related Macular Degeneration Trial

- Pharmacological Therapy for Macular Degeneration Study Group (1997)
- An ocular pressure disease which makes patients progressively lose vision
- 240 patients enrolled in a multi-center trial (190 completers)
- Treatment: interferon-α (6 million units) versus placebo
- Visits: baseline and follow-up at 4, 12, 24, and 52 weeks
- Continuous outcome: visual acuity: number of letters correctly read on a vision chart
- Binary outcome: change in visual acuity versus baseline: ≥ 0 or < 0
Missingness:

    Measurement occasion
    4 wks   12 wks   24 wks   52 wks    Number      %
    Completers
    O       O        O        O          188     78.33
    Dropouts
    O       O        O        M           24     10.00
    O       O        M        M            8      3.33
    O       M        M        M            6      2.50
    M       M        M        M            6      2.50
    Non-monotone missingness
    O       O        M        O            4      1.67
    O       M        M        O            1      0.42
    M       O        O        O            2      0.83
    M       O        M        M            1      0.42
Chapter 23
Weighted Generalized Estimating Equations

- Model for the weights
- Incorporating the weights within GEE
23.1 Analysis of the ARMD Trial

Model for the weights:

    logit[P(D_i = j | D_i ≥ j)] = ψ_0 + ψ_1 y_{i,j−1} + ψ_2 T_i + ψ_31 L_{1i} + ψ_32 L_{2i} + ψ_33 L_{3i}
                                  + ψ_41 I(t_j = 2) + ψ_42 I(t_j = 3)

with

- y_{i,j−1} the binary outcome at the previous time t_{i,j−1} = t_{j−1} (since time is common to all subjects)
- T_i = 1 for interferon-α and T_i = 0 for placebo
- L_{ki} = 1 if the patient's eye lesion is of level k = 1, ..., 4 (since one dummy variable is redundant, only three are used)
- I(·) an indicator function
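From the fitted dropout model, each subject's contribution at an occasion is weighted by the inverse probability of still being under observation there, built by cumulating the conditional probabilities of not dropping out. One common construction, sketched with invented conditional dropout probabilities (in the analysis these come from the fitted logistic model):

```python
def ipw_weights(p_dropout):
    """Inverse-probability weights for monotone dropout.

    p_dropout[j] is the fitted P(D_i = j+1 | D_i >= j+1): the probability of
    dropping out at occasion j+1 given the subject is still in the study.
    Returns one weight per occasion for a subject observed throughout.
    """
    weights = []
    p_still_in = 1.0
    for p in p_dropout:
        p_still_in *= (1.0 - p)          # P(still observed at this occasion)
        weights.append(1.0 / p_still_in)  # inverse-probability weight
    return weights

# Hypothetical conditional dropout probabilities at the four follow-up visits
w = ipw_weights([0.1, 0.2, 0.0, 0.25])
```

Subjects whose observed history makes continued observation unlikely receive large weights, so the observed completers stand in for similar subjects who dropped out.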
Results for the weights model:

    Effect             Parameter   Estimate (s.e.)
    Intercept          ψ_0          0.14 (0.49)
    Previous outcome   ψ_1          0.04 (0.38)
    Treatment          ψ_2         -0.86 (0.37)
    Lesion level 1     ψ_31        -1.85 (0.49)
    Lesion level 2     ψ_32        -1.91 (0.52)
    Lesion level 3     ψ_33        -2.80 (0.72)
    Time 2             ψ_41        -1.75 (0.49)
    Time 3             ψ_42        -1.38 (0.44)
GEE: logit[p (Y ij =1 T i,t j )] = β j1 + β j2 T i with T i =0for placebo and T i =1for interferon-α t j (j =1,...,4) refers to the four follow-up measurements Classical GEE and linearization-based GEE Comparison between CC, LOCF, and GEE analyses SAS code:... proc genmod data=armdhlp descending; class trt prev lesion time; model dropout = prev trt lesion time ods output obstats=pred; ods listing exclude obstats; run; / pred dist=b; ESALQ Course on Models for Longitudinal and Incomplete Data 264
...
proc genmod data=armdwgee;
   title 'data as is - WGEE';
   weight wi;
   class time treat subject;
   model bindif = time treat*time / noint dist=binomial;
   repeated subject=subject / withinsubject=time type=exch modelse;
run;

proc glimmix data=armdwgee empirical;
   title 'data as is - WGEE - linearized version - empirical';
   weight wi;
   nloptions maxiter=50 technique=newrap;
   class time treat subject;
   model bindif = time treat*time / noint solution dist=binary;
   random _residual_ / subject=subject type=cs;
run;
Results (standard GEE; the last two columns use the observed data):

    Effect   Par.         CC                 LOCF            Unweighted          WGEE
    Int.4    β_11   -1.01 (0.24;0.24)  -0.87 (0.20;0.21)  -0.87 (0.21;0.21)  -0.98 (0.10;0.44)
    Int.12   β_21   -0.89 (0.24;0.24)  -0.97 (0.21;0.21)  -1.01 (0.21;0.21)  -1.78 (0.15;0.38)
    Int.24   β_31   -1.13 (0.25;0.25)  -1.05 (0.21;0.21)  -1.07 (0.22;0.22)  -1.11 (0.15;0.33)
    Int.52   β_41   -1.64 (0.29;0.29)  -1.51 (0.24;0.24)  -1.71 (0.29;0.29)  -1.72 (0.25;0.39)
    Tr.4     β_12    0.40 (0.32;0.32)   0.22 (0.28;0.28)   0.22 (0.28;0.28)   0.80 (0.15;0.67)
    Tr.12    β_22    0.49 (0.31;0.31)   0.55 (0.28;0.28)   0.61 (0.29;0.29)   1.87 (0.19;0.61)
    Tr.24    β_32    0.48 (0.33;0.33)   0.42 (0.29;0.29)   0.44 (0.30;0.30)   0.73 (0.20;0.52)
    Tr.52    β_42    0.40 (0.38;0.38)   0.34 (0.32;0.32)   0.44 (0.37;0.37)   0.74 (0.31;0.52)
    Corr.    ρ       0.39               0.44               0.39               0.33
Results (linearization-based GEE; the last two columns use the observed data):

    Effect   Par.         CC                 LOCF            Unweighted          WGEE
    Int.4    β_11   -1.01 (0.24;0.24)  -0.87 (0.21;0.21)  -0.87 (0.21;0.21)  -0.98 (0.18;0.44)
    Int.12   β_21   -0.89 (0.24;0.24)  -0.97 (0.21;0.21)  -1.01 (0.22;0.21)  -1.78 (0.26;0.42)
    Int.24   β_31   -1.13 (0.25;0.25)  -1.05 (0.21;0.21)  -1.07 (0.23;0.22)  -1.19 (0.25;0.38)
    Int.52   β_41   -1.64 (0.29;0.29)  -1.51 (0.24;0.24)  -1.71 (0.29;0.29)  -1.81 (0.39;0.48)
    Tr.4     β_12    0.40 (0.32;0.32)   0.22 (0.28;0.28)   0.22 (0.29;0.29)   0.80 (0.26;0.67)
    Tr.12    β_22    0.49 (0.31;0.31)   0.55 (0.28;0.28)   0.61 (0.28;0.29)   1.85 (0.32;0.64)
    Tr.24    β_32    0.48 (0.33;0.33)   0.42 (0.29;0.29)   0.44 (0.30;0.30)   0.98 (0.33;0.60)
    Tr.52    β_42    0.40 (0.38;0.38)   0.34 (0.32;0.32)   0.44 (0.37;0.37)   0.97 (0.49;0.65)
             σ²      0.62               0.57               0.62               1.29
             τ²      0.39               0.44               0.39               1.85
    Corr.    ρ       0.39               0.44               0.39               0.59
Chapter 24
Multiple Imputation

- Settings and models
- Results for GEE
- Results for GLMM
24.1 MI Analysis of the ARMD Trial

- M = 10 imputations
- GEE:

    logit[P(Y_ij = 1 | T_i, t_j)] = β_j1 + β_j2 T_i

- GLMM:

    logit[P(Y_ij = 1 | T_i, t_j, b_i)] = β_j1 + b_i + β_j2 T_i,   b_i ~ N(0, τ²)

- T_i = 0 for placebo and T_i = 1 for interferon-α
- t_j (j = 1, ..., 4) refers to the four follow-up measurements
- Imputation based on the continuous outcome
Results:

    Effect      Par.       GEE            GLMM
    Int.4       β_11   -0.84 (0.20)   -1.46 (0.36)
    Int.12      β_21   -1.02 (0.22)   -1.75 (0.38)
    Int.24      β_31   -1.07 (0.23)   -1.83 (0.38)
    Int.52      β_41   -1.61 (0.27)   -2.69 (0.45)
    Trt.4       β_12    0.21 (0.28)    0.32 (0.48)
    Trt.12      β_22    0.60 (0.29)    0.99 (0.49)
    Trt.24      β_32    0.43 (0.30)    0.67 (0.51)
    Trt.52      β_42    0.37 (0.35)    0.52 (0.56)
    R.I. s.d.   τ                      2.20 (0.26)
    R.I. var.   τ²                     4.85 (1.13)
24.2 SAS Code for MI

1. Preparatory data analysis so that there is one line per subject
2. The imputation task:

proc mi data=armd13 seed=486048 out=armd13a simple nimpute=10 round=0.1;
   var lesion diff4 diff12 diff24 diff52;
   by treat;
run;

Note that the imputation task is conducted on the continuous outcome diff, indicating the difference in number of letters versus baseline

3. Then, data manipulation takes place to define the binary indicators and to create a longitudinal version of the dataset
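Because the imputation model here is multivariate normal, each missing diff value is drawn from its conditional normal distribution given the observed values. The core of one such draw, for a bivariate case with known parameters, can be sketched as follows (a toy illustration, not the actual PROC MI algorithm, which additionally draws the model parameters from their posterior; all numeric values below are invented):

```python
import random

def conditional_params(y1, mu1, mu2, s1, s2, rho):
    """Mean and s.d. of Y2 | Y1 = y1 under a bivariate normal model."""
    mean = mu2 + rho * (s2 / s1) * (y1 - mu1)
    sd = s2 * (1.0 - rho ** 2) ** 0.5
    return mean, sd

rng = random.Random(486048)  # seed echoes the PROC MI call above
# M = 10 imputations of one missing follow-up value, given the observed baseline
draws = [rng.gauss(*conditional_params(55.0, 50.0, 48.0, 10.0, 12.0, 0.6))
         for _ in range(10)]
```

Each of the 10 draws feeds one completed data set; the variability between them is exactly the between-imputation component that Rubin's rules account for.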
4. The analysis task (GEE):

proc genmod data=armd13c;
   class time subject;
   by _imputation_;
   model bindif = time1 time2 time3 time4
                  trttime1 trttime2 trttime3 trttime4
                  / noint dist=binomial covb;
   repeated subject=subject / withinsubject=time type=exch modelse;
   ods output ParameterEstimates=gmparms parminfo=gmpinfo CovB=gmcovb;
run;
5. The analysis task (GLMM):

proc nlmixed data=armd13c qpoints=20 maxiter=100 technique=newrap cov ecov;
   by _imputation_;
   eta = beta11*time1 + beta12*time2 + beta13*time3 + beta14*time4 + b
       + beta21*trttime1 + beta22*trttime2 + beta23*trttime3 + beta24*trttime4;
   p = exp(eta)/(1+exp(eta));
   model bindif ~ binary(p);
   random b ~ normal(0, tau*tau) subject=subject;
   estimate 'tau2' tau*tau;
   ods output ParameterEstimates=nlparms CovMatParmEst=nlcovb
              AdditionalEstimates=nlparmsa CovMatAddEst=nlcovba;
run;
6. The inference task (GEE):

proc mianalyze parms=gmparms covb=gmcovb parminfo=gmpinfo wcov bcov tcov;
   modeleffects time1 time2 time3 time4 trttime1 trttime2 trttime3 trttime4;
run;

7. The inference task (GLMM):

proc mianalyze parms=nlparms covb=nlcovb wcov bcov tcov;
   modeleffects beta11 beta12 beta13 beta14 beta21 beta22 beta23 beta24;
run;