ESALQ Course on Models for Longitudinal and Incomplete Data
|
|
- Pierce Beasley
- 8 years ago
- Views:
Transcription
1 ESALQ Course on Models for Longitudinal and Incomplete Data Geert Verbeke Biostatistical Centre K.U.Leuven, Belgium Geert Molenberghs Center for Statistics Universiteit Hasselt, Belgium ESALQ Course, Diepenbeek, November 2006
2 Contents 0 Related References... 1 I Continuous Longitudinal Data 6 1 Introduction A Model for Longitudinal Data Estimation and Inference in the Marginal Model Inference for the Random Effects ESALQ Course on Models for Longitudinal and Incomplete Data i
3 II Marginal Models for Non-Gaussian Longitudinal Data 64 5 The Toenail Data The Analgesic Trial Generalized Linear Models Parametric Modeling Families Generalized Estimating Equations A Family of GEE Methods III Generalized Linear Mixed Models for Non-Gaussian Longitudinal Data Generalized Linear Mixed Models (GLMM) Fitting GLMM s in SAS IV Marginal Versus Random-effects Models Marginal Versus Random-effects Models ESALQ Course on Models for Longitudinal and Incomplete Data ii
4 V Non-linear Models Non-Linear Mixed Models VI Incomplete Data Setting The Scene Proper Analysis of Incomplete Data Weighted Generalized Estimating Equations Multiple Imputation VII Topics in Methods and Sensitivity Analysis for Incomplete Data An MNAR Selection Model and Local Influence Interval of Ignorance Concluding Remarks ESALQ Course on Models for Longitudinal and Incomplete Data iii
5 VIII Case Study: Age-related Macular Degeneration The Dataset Weighted Generalized Estimating Equations Multiple Imputation ESALQ Course on Models for Longitudinal and Incomplete Data iv
6 Chapter 0 Related References Aerts, M., Geys, H., Molenberghs, G., and Ryan, L.M. (2002). Topics in Modelling of Clustered Data. London: Chapman & Hall. Brown, H. and Prescott, R. (1999). Applied Mixed Models in Medicine. New York: John Wiley & Sons. Crowder, M.J. and Hand, D.J. (1990). Analysis of Repeated Measures. London: Chapman & Hall. Davidian, M. and Giltinan, D.M. (1995). Nonlinear Models For Repeated Measurement Data. London: Chapman & Hall. ESALQ Course on Models for Longitudinal and Incomplete Data 1
7 Davis, C.S. (2002). Statistical Methods for the Analysis of Repeated Measurements. New York: Springer. Demidenko, E. (2004). Mixed Models: Theory and Applications. New York: John Wiley & Sons. Diggle, P.J., Heagerty, P.J., Liang, K.Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data. (2nd edition). Oxford: Oxford University Press. Fahrmeir, L. and Tutz, G. (2002). Multivariate Statistical Modelling Based on Generalized Linear Models (2nd ed). New York: Springer. Fitzmaurice, G.M., Laird, N.M., and Ware, J.H. (2004). Applied Longitudinal Analysis. New York: John Wiley & Sons. Goldstein, H. (1979). The Design and Analysis of Longitudinal Studies. London: Academic Press. ESALQ Course on Models for Longitudinal and Incomplete Data 2
8 Goldstein, H. (1995). Multilevel Statistical Models. London: Edward Arnold. Hand, D.J. and Crowder, M.J. (1995). Practical Longitudinal Data Analysis. London: Chapman & Hall. Hedeker, D. and Gibbons, R.D. (2006). Longitudinal Data Analysis. New York: John Wiley & Sons. Jones, B. and Kenward, M.G. (1989). Design and Analysis of Crossover Trials. London: Chapman & Hall. Kshirsagar, A.M. and Smith, W.B. (1995). Growth Curves. New York: Marcel Dekker. Leyland, A.H. and Goldstein, H. (2001) Multilevel Modelling of Health Statistics. Chichester: John Wiley & Sons. Lindsey, J.K. (1993). Models for Repeated Measurements. Oxford: Oxford ESALQ Course on Models for Longitudinal and Incomplete Data 3
9 University Press. Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., and Schabenberger, O. (2005). SAS for Mixed Models (2nd ed.). Cary: SAS Press. Longford, N.T. (1993). Random Coefficient Models. Oxford: Oxford University Press. McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (second edition). London: Chapman & Hall. Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data. New York: Springer-Verlag. Pinheiro, J.C. and Bates D.M. (2000). Mixed effects models in S and S-Plus. New York: Springer. Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components. ESALQ Course on Models for Longitudinal and Incomplete Data 4
10 New-York: Wiley. Senn, S.J. (1993). Cross-over Trials in Clinical Research. Chichester: Wiley. Verbeke, G. and Molenberghs, G. (1997). Linear Mixed Models In Practice: A SAS Oriented Approach, Lecture Notes in Statistics 126. New-York: Springer. Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer Series in Statistics. New-York: Springer. Vonesh, E.F. and Chinchilli, V.M. (1997). Linear and Non-linear Models for the Analysis of Repeated Measurements. Basel: Marcel Dekker. Weiss, R.E. (2005). Modeling Longitudinal Data. New York: Springer. Wu, H. and Zhang, J.-T. (2006). Nonparametric Regression Methods for Longitudinal Data Analysis. New York: John Wiley & Sons. ESALQ Course on Models for Longitudinal and Incomplete Data 5
11 Part I Continuous Longitudinal Data ESALQ Course on Models for Longitudinal and Incomplete Data 6
12 Chapter 1 Introduction Repeated Measures / Longitudinal data Examples ESALQ Course on Models for Longitudinal and Incomplete Data 7
13 1.1 Repeated Measures / Longitudinal Data Repeated measures are obtained when a response is measured repeatedly on a set of units Units: Subjects, patients, participants,... Animals, plants,... Clusters: families, towns, branches of a company, Special case: Longitudinal data ESALQ Course on Models for Longitudinal and Incomplete Data 8
14 1.2 Rat Data Research question (Dentistry, K.U.Leuven): How does craniofacial growth depend on testosteron production? Randomized experiment in which 50 male Wistar rats are randomized to: Control (15 rats) Low dose of Decapeptyl (18 rats) High dose of Decapeptyl (17 rats) ESALQ Course on Models for Longitudinal and Incomplete Data 9
15 Treatment starts at the age of 45 days; measurements taken every 10 days, from day 50 on. The responses are distances (pixels) between well defined points on x-ray pictures of the skull of each rat: ESALQ Course on Models for Longitudinal and Incomplete Data 10
16 Measurements with respect to the roof, base and height of the skull. Here, we consider only one response, reflecting the height of the skull. Individual profiles: ESALQ Course on Models for Longitudinal and Incomplete Data 11
17 Complication: Dropout due to anaesthesia (56%): #Observations Age (days) Control Low High Total Remarks: Much variability between rats, much less variability within rats Fixed number of measurements scheduled per subject, but not all measurements available due to dropout, for known reason. Measurements taken at fixed time points ESALQ Course on Models for Longitudinal and Incomplete Data 12
18 1.3 Prostate Data References: Carter et al (1992, Cancer Research). Carter et al (1992, Journal of the American Medical Association). Morrell et al (1995, Journal of the American Statistical Association). Pearson et al (1994, Statistics in Medicine). Prostate disease is one of the most common and most costly medical problems in the United States Important to look for markers which can detect the disease at an early stage Prostate-Specific Antigen is an enzyme produced by both normal and cancerous prostate cells ESALQ Course on Models for Longitudinal and Incomplete Data 13
19 PSA level is related to the volume of prostate tissue. Problem: Patients with Benign Prostatic Hyperplasia also have an increased PSA level Overlap in PSA distribution for cancer and BPH cases seriously complicates the detection of prostate cancer. Research question (hypothesis based on clinical practice): Can longitudinal PSA profiles be used to detect prostate cancer in an early stage? ESALQ Course on Models for Longitudinal and Incomplete Data 14
20 A retrospective case-control study based on frozen serum samples: 16 control patients 20 BPH cases 14 local cancer cases 4 metastatic cancer cases Complication: No perfect match for age at diagnosis and years of follow-up possible Hence, analyses will have to correct for these age differences between the diagnostic groups. ESALQ Course on Models for Longitudinal and Incomplete Data 15
21 Individual profiles: ESALQ Course on Models for Longitudinal and Incomplete Data 16
22 Remarks: Much variability between subjects Little variability within subjects Highly unbalanced data ESALQ Course on Models for Longitudinal and Incomplete Data 17
23 Chapter 2 A Model for Longitudinal Data Introduction The 2-stage model formulation Example: Rat data The general linear mixed-effects model Hierarchical versus marginal model Example: Rat data Example: Bivariate observations ESALQ Course on Models for Longitudinal and Incomplete Data 18
24 2.1 Introduction In practice: often unbalanced data: unequal number of measurements per subject measurements not taken at fixed time points Therefore, multivariate regression techniques are often not applicable Often, subject-specific longitudinal profiles can be well approximated by linear regression functions This leads to a 2-stage model formulation: Stage 1: Linear regression model for each subject separately Stage 2: Explain variability in the subject-specific regression coefficients using known covariates ESALQ Course on Models for Longitudinal and Incomplete Data 19
25 2.2 A 2-stage Model Formulation Stage 1 Response Y ij for ith subject, measured at time t ij, i =1,...,N, j =1,...,n i Response vector Y i for ith subject: Y i =(Y i1,y i2,...,y ini ) Stage 1 model: Y i = Z i β i + ε i ESALQ Course on Models for Longitudinal and Incomplete Data 20
26 Z i is a (n i q) matrix of known covariates β i is a q-dimensional vector of subject-specific regression coefficients ε i N(0, Σ i ),oftenσ i = σ 2 I ni Note that the above model describes the observed variability within subjects ESALQ Course on Models for Longitudinal and Incomplete Data 21
27 2.2.2 Stage 2 Between-subject variability can now be studied from relating the β i to known covariates Stage 2 model: β i = K i β + b i K i is a (q p) matrix of known covariates β is a p-dimensional vector of unknown regression parameters b i N(0,D) ESALQ Course on Models for Longitudinal and Incomplete Data 22
28 2.3 Example: The Rat Data Individual profiles: ESALQ Course on Models for Longitudinal and Incomplete Data 23
29 Transformation of the time scale to linearize the profiles: Age ij t ij =ln[1+(age ij 45)/10)] Note that t =0corresponds to the start of the treatment (moment of randomization) Stage 1 model: Y ij = β 1i + β 2i t ij + ε ij, j =1,...,n i Matrix notation: Y i = Z i β i + ε i with Z i = 1 t i1 1 t i2.. 1 t ini ESALQ Course on Models for Longitudinal and Incomplete Data 24
30 In the second stage, the subject-specific intercepts and time effects are related to the treatment of the rats Stage 2 model: β 1i = β 0 + b 1i, β 2i = β 1 L i + β 2 H i + β 3 C i + b 2i, L i, H i,andc i are indicator variables: L i = 1 if low dose 0 otherwise H i = 1 if high dose 0 otherwise C i = 1 if control 0 otherwise ESALQ Course on Models for Longitudinal and Incomplete Data 25
31 Parameter interpretation: β 0 : average response at the start of the treatment (independent of treatment) β 1, β 2,andβ 3 : average time effect for each treatment group ESALQ Course on Models for Longitudinal and Incomplete Data 26
32 2.4 The General Linear Mixed-effects Model A 2-stage approach can be performed explicitly in the analysis However, this is just another example of the use of summary statistics: Y i is summarized by β i summary statistics β i analysed in second stage The associated drawbacks can be avoided by combining the two stages into one model: Y i = Z i β i + ε i β i = K i β + b i = Y i = Z i K i } {{ } X i β + Z i b i + ε i = X i β + Z i b i + ε i ESALQ Course on Models for Longitudinal and Incomplete Data 27
33 General linear mixed-effects model: Y i = X i β + Z i b i + ε i b i N(0,D), ε i N(0, Σ i ), b 1,...,b N, ε 1,...,ε N independent Terminology: Fixed effects: β Random effects: b i Variance components: elements in D and Σ i ESALQ Course on Models for Longitudinal and Incomplete Data 28
34 2.5 Hierarchical versus Marginal Model The general linear mixed model is given by: Y i = X i β + Z i b i + ε i b i N(0,D), ε i N(0, Σ i ), b 1,...,b N, ε 1,...,ε N independent It can be rewritten as: Y i b i N(X i β + Z i b i, Σ i ), b i N(0,D) ESALQ Course on Models for Longitudinal and Incomplete Data 29
35 It is therefore also called a hierarchical model: AmodelforY i given b i Amodelforb i Marginally, we have that Y i is distributed as: Y i N(X i β,z i DZ i +Σ i ) Hence, very specific assumptions are made about the dependence of mean and covariance onn the covariates X i and Z i : Implied mean : X i β Implied covariance : V i = Z i DZ i +Σ i Note that the hierarchical model implies the marginal one, NOT vice versa ESALQ Course on Models for Longitudinal and Incomplete Data 30
36 2.6 Example: The Rat Data Stage 1 model: Y ij = β 1i + β 2i t ij + ε ij, j =1,...,n i Stage 2 model: β 1i = β 0 + b 1i, β 2i = β 1 L i + β 2 H i + β 3 C i + b 2i, Combined: Y ij =(β 0 + b 1i )+(β 1 L i + β 2 H i + β 3 C i + b 2i )t ij + ε ij = β 0 + b 1i +(β 1 + b 2i )t ij + ε ij, if low dose β 0 + b 1i +(β 2 + b 2i )t ij + ε ij, if high dose β 0 + b 1i +(β 3 + b 2i )t ij + ε ij, if control. ESALQ Course on Models for Longitudinal and Incomplete Data 31
37 Implied marginal mean structure: Linear average evolution in each group Equal average intercepts Different average slopes Implied marginal covariance structure (Σ i = σ 2 I ni ): Cov(Y i (t 1 ), Y i (t 2 )) = 1 t 1 D 1 t 2 + σ 2 δ {t1,t 2 } = d 22 t 1 t 2 + d 12 (t 1 + t 2 )+d 11 + σ 2 δ {t1,t 2 }. Note that the model implicitly assumes that the variance function is quadratic over time, with positive curvature d 22. ESALQ Course on Models for Longitudinal and Incomplete Data 32
38 A model which assumes that all variability in subject-specific slopes can be ascribed to treatment differences can be obtained by omitting the random slopes b 2i from the above model: Y ij =(β 0 + b 1i )+(β 1 L i + β 2 H i + β 3 C i )t ij + ε ij = β 0 + b 1i + β 1 t ij + ε ij, if low dose β 0 + b 1i + β 2 t ij + ε ij, if high dose β 0 + b 1i + β 3 t ij + ε ij, if control. This is the so-called random-intercepts model The same marginal mean structure is obtained as under the model with random slopes ESALQ Course on Models for Longitudinal and Incomplete Data 33
39 Implied marginal covariance structure (Σ i = σ 2 I ni ): Cov(Y i (t 1 ), Y i (t 2 )) = 1 D 1 + σ 2 δ {t1,t 2 } = d 11 + σ 2 δ {t1,t 2 }. Hence, the implied covariance matrix is compound symmetry: constant variance d 11 + σ 2 constant correlation ρ I = d 11 /(d 11 + σ 2 ) between any two repeated measurements within the same rat ESALQ Course on Models for Longitudinal and Incomplete Data 34
40 2.7 Example: Bivariate Observations Balanced data, two measurements per subject (n i =2), two models: Model 1: Random intercepts + heterogeneous errors V = 1 1 (d) (1 1) + σ σ 2 2 = d + σ 2 1 d d d+ σ 2 2 Model 2: Uncorrelated intercepts and slopes + measurement error V = d d σ σ 2 = d 1 + σ 2 d 1 d 1 d 1 + d 2 + σ 2 ESALQ Course on Models for Longitudinal and Incomplete Data 35
41 Different hierarchical models can produce the same marginal model Hence, a good fit of the marginal model cannot be interpreted as evidence for any of the hierarchical models. A satisfactory treatment of the hierarchical model is only possible within a Bayesian context. ESALQ Course on Models for Longitudinal and Incomplete Data 36
42 Chapter 3 Estimation and Inference in the Marginal Model ML and REML estimation Fitting linear mixed models in SAS Negative variance components Inference ESALQ Course on Models for Longitudinal and Incomplete Data 37
43 3.1 ML and REML Estimation Recall that the general linear mixed model equals Y i = X i β + Z i b i + ε i b i N(0,D) ε i N(0, Σ i ) independent The implied marginal model equals Y i N(X i β,z i DZ i +Σ i ) Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects ESALQ Course on Models for Longitudinal and Incomplete Data 38
44 Notation: β: vector of fixed effects (as before) α: vector of all variance components in D and Σ i θ =(β, α ) : vector of all parameters in marginal model Marginal likelihood function: L ML (θ) = N i=1 (2π) n i/2 V i (α) 1 2 exp 1 2 (Y i X i β) Vi 1 (α)(y i X i β) If α were known, MLE of β equals where W i equals V 1 i. β(α) = N i=1 X iw i X i 1 N i=1 X iw i y i, ESALQ Course on Models for Longitudinal and Incomplete Data 39
45 In most cases, α is not known, and needs to be replaced by an estimate α Two frequently used estimation methods for α: Maximum likelihood Restricted maximum likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 40
46 3.2 Fitting Linear Mixed Models in SAS A model for the prostate data: ln(psa ij +1) = β 1 Age i + β 2 C i + β 3 B i + β 4 L i + β 5 M i +(β 6 Age i + β 7 C i + β 8 B i + β 9 L i + β 10 M i ) t ij +(β 11 Age i + β 12 C i + β 13 B i + β 14 L i + β 15 M i ) t 2 ij + b 1i + b 2i t ij + b 3i t 2 ij + ε ij. SAS program (Σ i = σ 2 I ni ): proc mixed data=prostate method=reml; class id group timeclss; model lnpsa = group age time group*time age*time time2 group*time2 age*time2 / solution; random intercept time time2 / type=un subject=id; run; ESALQ Course on Models for Longitudinal and Incomplete Data 41
47 ML and REML estimates for fixed effects: Effect Parameter MLE (s.e.) REMLE (s.e.) Age effect β (0.013) (0.014) Intercepts: Control β (0.919) (0.976) BPH β (1.026) (1.090) L/R cancer β (0.997) (1.059) Met. cancer β (1.022) (1.086) Age time effect β (0.020) (0.021) Time effects: Control β (1.359) (1.473) BPH β (1.511) (1.638) L/R cancer β (1.469) (1.593) Met. cancer β (1.499) (1.626) Age time 2 effect β (0.008) (0.009) Time 2 effects: Control β (0.549) (0.610) BPH β (0.604) (0.672) L/R cancer β (0.590) (0.656) Met. cancer β (0.598) (0.666) ESALQ Course on Models for Longitudinal and Incomplete Data 42
48 ML and REML estimates for variance components: Effect Parameter MLE (s.e.) REMLE (s.e.) Covariance of b i : var(b 1i ) d (0.083) (0.098) var(b 2i ) d (0.187) (0.230) var(b 3i ) d (0.032) (0.041) cov(b 1i,b 2i ) d 12 = d (0.113) (0.136) cov(b 2i,b 3i ) d 23 = d (0.076) (0.095) cov(b 3i,b 1i ) d 13 = d (0.043) (0.053) Residual variance: var(ε ij ) σ (0.002) (0.002) Log-likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 43
49 Fitted average profiles at median age at diagnosis: ESALQ Course on Models for Longitudinal and Incomplete Data 44
50 3.3 Negative Variance Components Reconsider the model for the rat data: Y ij =(β 0 + b 1i )+(β 1 L i + β 2 H i + β 3 C i + b 2i )t ij + ε ij REML estimates obtained from SAS procedure MIXED: Effect Parameter REMLE (s.e.) Intercept β (0.325) Time effects: Low dose β (0.228) High dose β (0.231) Control β (0.285) Covariance of b i : var(b 1i ) d (1.123) var(b 2i ) d ( ) cov(b 1i,b 2i ) d 12 = d (0.381) Residual variance: var(ε ij ) σ (0.145) REML log-likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 45
51 This suggests that the REML likelihood could be further increased by allowing negative estimates for d 22 In SAS, this can be done by adding the option nobound to the PROC MIXED statement. Results: Parameter restrictions for α d ii 0,σ 2 0 d ii IR, σ 2 IR Effect Parameter REMLE (s.e.) REMLE (s.e.) Intercept β (0.325) (0.313) Time effects: Low dose β (0.228) (0.198) High dose β (0.231) (0.198) Control β (0.285) (0.254) Covariance of b i : var(b 1i ) d (1.123) (1.019) var(b 2i ) d ( ) (0.169) cov(b 1i,b 2i ) d 12 = d (0.381) (0.357) Residual variance: var(ε ij ) σ (0.145) (0.165) REML log-likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 46
52 Note that the REML log-likelihood value has been further increased and a negative estimate for d 22 is obtained. Brown & Prescott (1999, p. 237) : ESALQ Course on Models for Longitudinal and Incomplete Data 47
53 Meaning of negative variance component? Fitted variance function: Var(Y i (t)) = 1 t D 1 t + σ 2 = d 22 t 2 +2 d 12 t + d 11 + σ 2 = 0.287t t The suggested negative curvature in the variance function is supported by the sample variance function: ESALQ Course on Models for Longitudinal and Incomplete Data 48
54 This again shows that the hierarchical and marginal models are not equivalent: The marginal model allows negative variance components, as long as the marginal covariances V i = Z i DZ i + σ 2 I ni are positive definite The hierarchical interpretation of the model does not allow negative variance components ESALQ Course on Models for Longitudinal and Incomplete Data 49
55 3.4 Inference for the Marginal Model Estimate for β: β(α) = N i=1 X iw i X i with α replacedbyitsmlorremlestimate 1 N i=1 X iw i y i Conditional on α, β(α) is multivariate normal with mean β and covariance Var( β) = = N i=1 X iw i X i N i=1 X iw i X i 1 N 1 i=1 X iw i var(y i )W i X i N i=1 X iw i X i 1 In practice one again replaces α by its ML or REML estimate ESALQ Course on Models for Longitudinal and Incomplete Data 50
56 Inference for β: Wald tests t-andf -tests LR tests (not with REML) Inference for α: Wald tests LR tests (even with REML) Caution 1: Marginal vs. hierarchical model Caution 2: Boundary problems ESALQ Course on Models for Longitudinal and Incomplete Data 51
57 Chapter 4 Inference for the Random Effects Empirical Bayes inference Example: Prostate data Shrinkage Example: Prostate data ESALQ Course on Models for Longitudinal and Incomplete Data 52
58 4.1 Empirical Bayes Inference Random effects b i reflect how the evolution for the ith subject deviates from the expected evolution X i β. Estimation of the b i helpful for detecting outlying profiles This is only meaningful under the hierarchical model interpretation: Y i b i N(X i β + Z i b i, Σ i ) b i N(0,D) Since the b i are random, it is most natural to use Bayesian methods Terminology: prior distribution N(0,D) for b i ESALQ Course on Models for Longitudinal and Incomplete Data 53
59 Posterior density: f(b i y i ) f(b i Y i = y i ) = f(y i b i ) f(b i ) f(y i b i ) f(b i ) f(y i b i ) f(b i ) db i... exp 1 2 (b i DZ iw i (y i X i β)) Λ 1 i (b i DZ iw i (y i X i β)) for some positive definite matrix Λ i. Posterior distribution: b i y i N (DZ iw i (y i X i β), Λ i ) ESALQ Course on Models for Longitudinal and Incomplete Data 54
60 Posterior mean as estimate for b i : b i (θ) =E [b i Y i = y i ] = b i f(b i y i ) db i = DZ iw i (α)(y i X i β) Parameters in θ are replaced by their ML or REML estimates, obtained from fitting the marginal model. b i = b i ( θ) is called the Empirical Bayes estimate of b i. Approximate t- andf -tests to account for the variability introduced by replacing θ by θ, similar to tests for fixed effects. ESALQ Course on Models for Longitudinal and Incomplete Data 55
61 4.2 Example: Prostate Data We reconsider our linear mixed model: ln(psa ij +1) = β 1 Age i + β 2 C i + β 3 B i + β 4 L i + β 5 M i +(β 6 Age i + β 7 C i + β 8 B i + β 9 L i + β 10 M i ) t ij +(β 11 Age i + β 12 C i + β 13 B i + β 14 L i + β 15 M i ) t 2 ij + b 1i + b 2i t ij + b 3i t 2 ij + ε ij. In SAS the estimates can be obtained from adding the option solution to the random statement: random intercept time time2 / type=un subject=id solution; ods listing exclude solutionr; ods output solutionr=out; ESALQ Course on Models for Longitudinal and Incomplete Data 56
62 The ODS statements are used to write the EB estimates into a SAS output data set, and to prevent SAS from printing them in the output window. In practice, histograms and scatterplots of certain components of b i are used to detect model deviations or subjects with exceptional evolutions over time ESALQ Course on Models for Longitudinal and Incomplete Data 57
63 ESALQ Course on Models for Longitudinal and Incomplete Data 58
64 Strong negative correlations in agreement with correlation matrix corresponding to fitted D: Dcorr = Histograms and scatterplots show outliers Subjects #22, #28, #39, and #45, have highest four slopes for time 2 and smallest four slopes for time, i.e., with the strongest (quadratic) growth. Subjects #22, #28 and #39 have been further examined and have been shown to be metastatic cancer cases which were misclassified as local cancer cases. Subject #45 is the metastatic cancer case with the strongest growth ESALQ Course on Models for Longitudinal and Incomplete Data 59
65 4.3 Shrinkage Estimators b i Consider the prediction of the evolution of the ith subject: Y i X i β + Z i b i = X i β + Z i DZ iv i 1 (y i X i β) = ( I ni Z i DZ iv 1 i = Σ i Vi 1 X i β + ( I ni Σ i Vi 1 ) Xi β + Z i DZ iv i 1 y i ) yi, Hence, Ŷi is a weighted mean of the population-averaged profile X i β and the observed data y i, with weights Σ i V i 1 and I ni Σ i V i 1 respectively. ESALQ Course on Models for Longitudinal and Incomplete Data 60
66 Note that X i β gets much weight if the residual variability is large in comparison to the total variability. This phenomenon is usually called shrinkage : The observed data are shrunk towards the prior average profile X i β. This is also reflected in the fact that for any linear combination λ b i of random effects, var(λ b i ) var(λ b i ). ESALQ Course on Models for Longitudinal and Incomplete Data 61
67 4.4 Example: Prostate Data Comparison of predicted, average, and observed profiles for the subjects #15 and #28, obtained under the reduced model: ESALQ Course on Models for Longitudinal and Incomplete Data 62
68 Illustration of the shrinkage effect : Var( b i ) = , D = ESALQ Course on Models for Longitudinal and Incomplete Data 63
69 Part II Marginal Models for Non-Gaussian Longitudinal Data ESALQ Course on Models for Longitudinal and Incomplete Data 64
70 Chapter 5 The Toenail Data Toenail Dermatophyte Onychomycosis: Common toenail infection, difficult to treat, affecting more than 2% of population. Classical treatments with antifungal compounds need to be administered until the whole nail has grown out healthy. New compounds have been developed which reduce treatment to 3 months Randomized, double-blind, parallel group, multicenter study for the comparison of two such new compounds (A and B) for oral treatment. ESALQ Course on Models for Longitudinal and Incomplete Data 65
71 Research question: Severity relative to treatment of TDO? patients randomized, 36 centers 48 weeks of total follow up (12 months) 12 weeks of treatment (3 months) measurements at months 0, 1, 2, 3, 6, 9, 12. ESALQ Course on Models for Longitudinal and Incomplete Data 66
72 Frequencies at each visit (both treatments): ESALQ Course on Models for Longitudinal and Incomplete Data 67
73 Chapter 6 The Analgesic Trial single-arm trial with 530 patients recruited (491 selected for analysis) analgesic treatment for pain caused by chronic nonmalignant disease treatment was to be administered for 12 months we will focus on Global Satisfaction Assessment (GSA) GSA scale goes from 1=very good to 5=very bad GSA was rated by each subject 4 times during the trial, at months 3, 6, 9, and 12. ESALQ Course on Models for Longitudinal and Incomplete Data 68
74 Research questions: Evolution over time Relation with baseline covariates: age, sex, duration of the pain, type of pain, disease progression, Pain Control Assessment (PCA),... Investigation of dropout Frequencies: GSA Month 3 Month 6 Month 9 Month % % % % % % % % % % % % % % % % % % % 3 1.4% Tot ESALQ Course on Models for Longitudinal and Incomplete Data 69
75 Missingness: Measurement occasion Month 3 Month 6 Month 9 Month 12 Number % Completers O O O O Dropouts O O O M O O M M O M M M Non-monotone missingness O O M O O M O O O M O M O M M O M O O O M O O M M O M O M O M M ESALQ Course on Models for Longitudinal and Incomplete Data 70
76 Chapter 7 Generalized Linear Models The model Maximum likelihood estimation Examples McCullagh and Nelder (1989) ESALQ Course on Models for Longitudinal and Incomplete Data 71
77 7.1 The Generalized Linear Model Suppose a sample Y 1,...,Y N of independent observations is available All Y i have densities f(y i θ i,φ) which belong to the exponential family: f(y θ i,φ)=exp { φ 1 [yθ i ψ(θ i )] + c(y, φ) } θ i the natural parameter Linear predictor: θ i = x i β φ is the scale parameter (overdispersion parameter) ESALQ Course on Models for Longitudinal and Incomplete Data 72
78 ψ( ) is a function, generating mean and variance: E(Y )=ψ (θ) Var(Y )=φψ (θ) Note that, in general, the mean μ and the variance are related: Var(Y ) = φψ [ ψ 1 (μ) ] = φv(μ) The function v(μ) is called the variance function. The function ψ 1 which expresses θ as function of μ is called the link function. ψ is the inverse link function ESALQ Course on Models for Longitudinal and Incomplete Data 73
79 7.2 Examples The Normal Model Model: Y N(μ, σ 2 ) Density function: f(y θ, φ) = 1 2πσ 2 exp 1 σ 2(y μ)2 =exp 1 σ 2 yμ μ2 2 + ln(2πσ 2 ) 2 y2 2σ 2 ESALQ Course on Models for Longitudinal and Incomplete Data 74
80 Exponential family: θ= μ φ= σ 2 ψ(θ) =θ 2 /2 c(y, φ) = ln(2πφ) 2 y2 2φ Mean and variance function: μ= θ v(μ) =1 Note that, under this normal model, the mean and variance are not related: φv(μ) = σ 2 The link function is here the identity function: θ = μ ESALQ Course on Models for Longitudinal and Incomplete Data 75
81 7.2.2 The Bernoulli Model Model: Y Bernoulli(π) Density function: f(y θ, φ) =π y (1 π) 1 y =exp{y ln π +(1 y) ln(1 π)} =exp y ln π + ln(1 π) 1 π ESALQ Course on Models for Longitudinal and Incomplete Data 76
82 Exponential family: θ=ln ( ) π φ=1 1 π ψ(θ) = ln(1 π) = ln(1 + exp(θ)) c(y, φ) =0 Mean and variance function: μ= v(μ) = exp θ 1+exp θ = π exp θ (1+exp θ) 2 = π(1 π) Note that, under this model, the mean and variance are related: φv(μ) = μ(1 μ) The link function here is the logit link: θ =ln ( μ 1 μ ) ESALQ Course on Models for Longitudinal and Incomplete Data 77
83 7.3 Generalized Linear Models (GLM) Suppose a sample Y 1,...,Y N of independent observations is available All Y i have densities f(y i θ i,φ) which belong to the exponential family In GLM s, it is believed that the differences between the θ i can be explained through a linear function of known covariates: θ i = x i β x i is a vector of p known covariates β is the corresponding vector of unknown regression parameters, to be estimated from the data. ESALQ Course on Models for Longitudinal and Incomplete Data 78
84 7.4 Maximum Likelihood Estimation Log-likelihood: l(β,φ) = 1 φ i [y iθ i ψ(θ i )] + i c(y i,φ) The score equations: S(β) = i μ i β v 1 i (y i μ i ) = 0 Note that the estimation of β depends on the density only through the means μ i and the variance functions v i = v(μ i ). ESALQ Course on Models for Longitudinal and Incomplete Data 79
85 The score equations need to be solved numerically: iterative (re-)weighted least squares Newton-Raphson Fisher scoring Inference for β is based on classical maximum likelihood theory: asymptotic Wald tests likelihood ratio tests score tests ESALQ Course on Models for Longitudinal and Incomplete Data 80
86 7.5 Illustration: The Analgesic Trial Early dropout (did the subject drop out after the first or the second visit)? Binary response PROC GENMOD can fit GLMs in general PROC LOGISTIC can fit models for binary (and ordered) responses SAS code for logit link: proc genmod data=earlydrp; model earlydrp = pca0 weight psychiat physfct / dist=b; run; proc logistic data=earlydrp descending; model earlydrp = pca0 weight psychiat physfct; run; ESALQ Course on Models for Longitudinal and Incomplete Data 81
87 SAS code for probit link: proc genmod data=earlydrp; model earlydrp = pca0 weight psychiat physfct / dist=b link=probit; run; proc logistic data=earlydrp descending; model earlydrp = pca0 weight psychiat physfct / link=probit; run; Selected output: Analysis Of Parameter Estimates Standard Wald 95% Chi- Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq Intercept PCA WEIGHT PSYCHIAT PHYSFCT Scale NOTE: The scale parameter was held fixed. ESALQ Course on Models for Longitudinal and Incomplete Data 82
88 Chapter 8 Parametric Modeling Families Continuous outcomes Longitudinal generalized linear models Notation ESALQ Course on Models for Longitudinal and Incomplete Data 83
89 8.1 Continuous Outcomes Marginal Models: E(Y ij x ij )=x ijβ Random-Effects Models: E(Y ij b i, x ij )=x ijβ + z ijb i Transition Models: E(Y ij Y i,j 1,...,Y i1, x ij )=x ijβ + αy i,j 1 ESALQ Course on Models for Longitudinal and Incomplete Data 84
90 8.2 Longitudinal Generalized Linear Models Normal case: easy transfer between models Also non-normal data can be measured repeatedly (over time) Lack of key distribution such as the normal [= ] A lot of modeling options Introduction of non-linearity No easy transfer between model families crosssectional longitudinal normal outcome linear model LMM non-normal outcome GLM? ESALQ Course on Models for Longitudinal and Incomplete Data 85
91 8.3 Marginal Models Fully specified models & likelihood estimation E.g., multivariate probit model, Bahadur model, odds ratio model Specification and/or fitting can be cumbersome Non-likelihood alternatives Historically: Empirical generalized least squares (EGLS; Koch et al 1977; SAS CATMOD) Generalized estimating equations Pseudo-likelihood ESALQ Course on Models for Longitudinal and Incomplete Data 86
92 Chapter 9 Generalized Estimating Equations General idea Asymptotic properties Working correlation Special case and application SAS code and output ESALQ Course on Models for Longitudinal and Incomplete Data 87
93 9.1 General Idea Univariate GLM, score function of the form (scalar Y i ): S(β) = N i=1 μ i β v 1 i (y i μ i )=0 with v i = Var(Y i ) In longitudinal setting: Y =(Y 1,...,Y N ): where S(β) = i j μ ij β v 1 ij (y ij μ ij ) = N i=1 D i [V i (α)] 1 (y i μ i ) = 0 D i is an n i p matrix with (i, j)th elements μ ij β y i and μ i are n i -vectors with elements y ij and μ ij Is V i n i n i diagonal or more complex? ESALQ Course on Models for Longitudinal and Incomplete Data 88
94 V i = Var(Y i ) is more complex since it involves a set of nuisance parameters α, determining the covariance structure of Y i : in which V i (β, α) =φa 1/2 i (β)r i (α)a 1/2 i (β) A 1/2 i (β) = vi1 (μ i1 (β)) v ini (μ ini (β)) and R i (α) is the correlation matrix of Y i, parameterized by α. Same form as for full likelihood procedure, but we restrict specification to the first moment only Liang and Zeger (1986) ESALQ Course on Models for Longitudinal and Incomplete Data 89
95 9.2 Large Sample Properties As N where N(ˆβ β) N(0,I 1 0 ) I 0 = N i=1 D i[v i (α)] 1 D i (Unrealistic) Conditions: α is known the parametric form for V i (α) is known This is the naive purely model based variance estimator Solution: working correlation matrix ESALQ Course on Models for Longitudinal and Incomplete Data 90
96 9.3 Unknown Covariance Structure Keep the score equations BUT S(β) = N i=1 [D i] [V i (α)] 1 (y i μ i )=0 suppose V i (.) is not the true variance of Y i but only a plausible guess, a so-called working correlation matrix specify correlations and not covariances, because the variances follow from the mean structure the score equations are solved as before ESALQ Course on Models for Longitudinal and Incomplete Data 91
97 The asymptotic normality results change to N(ˆβ β) N(0,I 1 0 I 1 I 1 0 ) I 0 = N i=1 D i[v i (α)] 1 D i I 1 = N i=1 D i[v i (α)] 1 Var(Y i )[V i (α)] 1 D i. This is the robust empirically corrected sandwich variance estimator I 0 is the bread I 1 is the filling (ham or cheese) Correct guess = likelihood variance ESALQ Course on Models for Longitudinal and Incomplete Data 92
98 The estimators ˆβ are consistent even if the working correlation matrix is incorrect An estimate is found by replacing the unknown variance matrix Var(Y i ) by (Y i ˆμ i )(Y i ˆμ i ). Even if this estimator is bad for Var(Y i ) it leads to a good estimate of I 1, provided that: replication in the data is sufficiently large same model for μ i is fitted to groups of subjects observation times do not vary too much between subjects A bad choice of working correlation matrix can affect the efficiency of ˆβ Care needed with incomplete data (see Part VI) ESALQ Course on Models for Longitudinal and Incomplete Data 93
99 9.4 The Working Correlation Matrix V i = V i (β, α,φ)=φa 1/2 i (β)r i (α)a 1/2 i (β) Variance function: A i is (n i n i ) diagonal with elements v(μ ij ),theknownglm variance function. Working correlation: R i (α) possibly depends on a different set of parameters α. Overdispersion parameter: φ, assumed 1 or estimated from the data. The unknown quantities are expressed in terms of the Pearson residuals Note that e ij depends on β. e ij = y ij μ ij v(μij ). ESALQ Course on Models for Longitudinal and Incomplete Data 94
100 9.5 Estimation of Working Correlation Liang and Zeger (1986) proposed moment-based estimates for the working correlation. Corr(Y ij,y ik ) Estimate Independence 0 Exchangeable α ˆα = 1 N AR(1) α ˆα = 1 N N i=1 1 n i (n i 1) j k e ij e ik N i=1 1 n i 1 j ni 1 e ij e i,j+1 Dispersion parameter: ˆφ = 1 N N i=1 1 n i n i j=1 e2 ij. Unstructured α jk ˆα jk = 1 N N i=1 e ij e ik ESALQ Course on Models for Longitudinal and Incomplete Data 95
101 9.6 Fitting GEE The standard procedure, implemented in the SAS procedure GENMOD. 1. Compute initial estimates for β, using a univariate GLM (i.e., assuming independence). 2. Compute Pearson residuals e ij. Compute estimates for α and φ. Compute R i (α) and V i (β, α) =φa 1/2 i (β)r i (α)a 1/2 i (β). 3. Update estimate for β: β (t+1) = β (t) N 4. Iterate until convergence. i=1 D ivi 1 D i 1 N i=1 D iv 1 Estimates of precision by means of I 1 0 and/or I 1 0 I 1 I 1 0. i (y i μ i ). ESALQ Course on Models for Longitudinal and Incomplete Data 96
102 9.7 Special Case: Linear Mixed Models Estimate for β: β(α) = i=1 X iw i X i with α replacedbyitsmlorremlestimate Conditional on α, β has mean N 1 N i=1 X iw i Y i E [ β(α) ] = N i=1 X iw i X i 1 N i=1 X iw i X i β = β provided that E(Y i )=X i β Hence, in order for β to be unbiased, it is sufficient that the mean of the response is correctly specified. ESALQ Course on Models for Longitudinal and Incomplete Data 97
103 Conditional on α, β has covariance Var( β) = N i=1 X iw i X i 1 N i=1 X iw i Var(Y i )W i X i N i=1 X iw 1 i X i = N i=1 X iw 1 i X i Note that this model-based version assumes that the covariance matrix Var(Y i ) is correctly modelled as V i = Z i DZ i +Σ i. An empirically corrected version is: Var( β) = N i=1 X iw i X i 1 } {{ } } {{ } } {{ } BREAD N i=1 X iw i Var(Y i )W i X i MEAT N i=1 X iw i X i BREAD 1 ESALQ Course on Models for Longitudinal and Incomplete Data 98
104 Chapter 10 A Family of GEE Methods Classical approach Prentice s two sets of GEE Linearization-based version GEE2 Alternating logistic regressions ESALQ Course on Models for Longitudinal and Incomplete Data 99
105 10.1 Prentice s GEE where N i=1 D iv 1 i (Y i μ i ) = 0, N i=1 E iw 1 i (Z i δ i ) = 0 Z ijk = (Y ij μ ij )(Y ik μ ik ) μij (1 μ ij )μ ik (1 μ ik ), δ ijk = E(Z ijk ) The joint asymptotic distribution of N(ˆβ β) and N( ˆα α) normal with variance-covariance matrix consistently estimated by N A 0 B C Λ 11 Λ 12 Λ 21 Λ 22 AB 0 C ESALQ Course on Models for Longitudinal and Incomplete Data 100
106 where A = B = C = N i=1 N i=1 N i=1 D iv 1 i E i W 1 i E i W 1 i 1 D i, E 1 N i i=1 1 E i, E i W 1 i Z i β N i=1 D i V 1 i 1 D i, Λ 11 = N i=1 Λ 12 = N i=1 Λ 21 = Λ 12, Λ 22 = N i=1 D iv i 1 Cov(Y i )Vi 1 D i, D iv i 1 Cov(Y i, Z i )Wi 1 E i, E iw i 1 Cov(Z i )Wi 1 E i, and Statistic Estimator Var(Y i ) (Y i μ i )(Y i μ i ) Cov(Y i, Z i ) (Y i μ i )(Z i δ i ) Var(Z i ) (Z i δ i )(Z i δ i ) ESALQ Course on Models for Longitudinal and Incomplete Data 101
107 10.2 GEE Based on Linearization Model formulation Previous version of GEE are formulated directly in terms of binary outcomes This approach is based on a linearization: with y i = μ i + ε i η i = g(μ i ), η i = X i β, Var(y i ) = Var(ε i )=Σ i. η i is a vector of linear predictors, g(.) is the (vector) link function. ESALQ Course on Models for Longitudinal and Incomplete Data 102
108 Estimation (Nelder and Wedderburn 1972) Solve iteratively: where N i=1 X iw i X i β = N i=1 W iy i, W i = F iσ 1 i F i, y i = ˆη i +(y i ˆμ i )F 1 i, F i = μ i η i, Σ i = Var(ε), μ i = E(y i ). Remarks: y i is called working variable or pseudo data. Basis for SAS macro and procedure GLIMMIX For linear models, D i = I ni and standard linear regression follows. ESALQ Course on Models for Longitudinal and Incomplete Data 103
109 The Variance Structure Σ i = φa 1/2 i (β)r i (α)a 1/2 i (β) φ is a scale (overdispersion) parameter, A i = v(μ i ), expressing the mean-variance relation (this is a function of β), R i (α) describes the correlation structure: If independence is assumed then R i (α) =I ni. Other structures, such as compound symmetry, AR(1),...can be assumed as well. ESALQ Course on Models for Longitudinal and Incomplete Data 104
110 10.3 GEE2 Model: Marginal mean structure Pairwise association: Odds ratios Correlations Working assumptions: Third and fourth moments Estimation: Second-order estimating equations Likelihood (assuming 3rd and 4th moments are correctly specified) ESALQ Course on Models for Longitudinal and Incomplete Data 105
111 10.4 Alternating Logistic Regression Diggle, Heagerty, Liang, and Zeger (2002) and Molenberghs and Verbeke (2005) When marginal odds ratios are used to model association, α can be estimated using ALR, which is almost as efficient as GEE2 almost as easy (computationally) than GEE1 μ ijk as before and α ijk =ln(ψ ijk ) the marginal log odds ratio: logit Pr(Y ij =1 x ij )=x ij β logit Pr(Y ij =1 Y ik = y ik )=α ijk y ik +ln μ ij μ ijk 1 μ ij μ ik + μ ijk ESALQ Course on Models for Longitudinal and Incomplete Data 106
112 α ijk can be modelled in terms of predictors the second term is treated as an offset the estimating equations for β and α are solved in turn, and the alternating between both sets is repeated until convergence. this is needed because the offset clearly depends on β. ESALQ Course on Models for Longitudinal and Incomplete Data 107
113 10.5 Application to the Toenail Data The model Consider the model: Y ij Bernoulli(μ ij ), log μ ij 1 μ ij = β 0 + β 1 T i + β 2 t ij + β 3 T i t ij Y ij : severe infection (yes/no) at occasion j for patient i t ij : measurement time for occasion j T i : treatment group ESALQ Course on Models for Longitudinal and Incomplete Data 108
114 Standard GEE SAS Code: proc genmod data=test descending; class idnum timeclss; model onyresp = treatn time treatn*time / dist=binomial; repeated subject=idnum / withinsubject=timeclss type=exch covb corrw modelse; run; SAS statements: The REPEATED statements defines the GEE character of the model. type= : working correlation specification (UN, AR(1), EXCH, IND,...) modelse : model-based s.e. s on top of default empirically corrected s.e. s corrw : printout of working correlation matrix withinsubject= : specification of the ordering within subjects ESALQ Course on Models for Longitudinal and Incomplete Data 109
115 Selected output: Regression parameters: Analysis Of Initial Parameter Estimates Standard Wald 95% Chi- Parameter DF Estimate Error Confidence Limits Square Intercept treatn time treatn*time Scale Estimates from fitting the model, ignoring the correlation structure, i.e., from fitting a classical GLM to the data, using proc GENMOD. The reported log-likelihood also corresponds to this model, and therefore should not be interpreted. The reported estimates are used as starting values in the iterative estimation procedure for fitting the GEE s. ESALQ Course on Models for Longitudinal and Incomplete Data 110
116 Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept treatn time <.0001 treatn*time Analysis Of GEE Parameter Estimates Model-Based Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept <.0001 treatn time <.0001 treatn*time The working correlation: Exchangeable Working Correlation Correlation ESALQ Course on Models for Longitudinal and Incomplete Data 111
117 Alternating Logistic Regression type=exch logor=exch Note that α now is a genuine parameter ESALQ Course on Models for Longitudinal and Incomplete Data 112
118 Selected output: Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept treatn time <.0001 treatn*time Alpha <.0001 Analysis Of GEE Parameter Estimates Model-Based Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept treatn time <.0001 treatn*time ESALQ Course on Models for Longitudinal and Incomplete Data 113
119 Linearization Based Method GLIMMIX macro: %glimmix(data=test, procopt=%str(method=ml empirical), stmts=%str( class idnum timeclss; model onyresp = treatn time treatn*time / solution; repeated timeclss / subject=idnum type=cs rcorr; ), error=binomial, link=logit); GLIMMIX procedure: proc glimmix data=test method=rspl empirical; class idnum; model onyresp (event= 1 ) = treatn time treatn*time / dist=binary solution; random _residual_ / subject=idnum type=cs; run; ESALQ Course on Models for Longitudinal and Incomplete Data 114
120 Both produce the same results The GLIMMIX macro is a MIXED core, with GLM-type surrounding statements The GLIMMIX procedure does not call MIXED, it has its own engine PROC GLIMMIX combines elements of MIXED and of GENMOD RANDOM residual is the PROC GLIMMIX way to specify residual correlation ESALQ Course on Models for Longitudinal and Incomplete Data 115
121 Results of Models Fitted to Toenail Data Effect Par. IND EXCH UN GEE1 Int. β (0.109;0.171) (0.134;0.173) (0.166;0.173) T i β (0.157;0.251) 0.012(0.187;0.261) 0.072(0.235;0.246) t ij β (0.025;0.030) (0.021;0.031) (0.028;0.029) T i t ij β (0.039;0.055) (0.036;0.057) (0.047;0.052) ALR Int. β (0.157;0.169) T i β (0.222;0.243) t ij β (0.023;0.030) T i t ij β (0.039;0.052) Ass. α 3.222( ;0.291) Linearization based method Int. β (0.112;0.171) (0.142;0.174) (0.171;0.172) T i β (0.160;0.251) 0.011(0.196;0.262) 0.036(0.242;0.242) t ij β (0.025;0.030) (0.022;0.031) (0.038;0.034) T i t ij β (0.040:0.055) (0.038;0.057) (0.058;0.058) estimate (model-based s.e.; empirical s.e.) ESALQ Course on Models for Longitudinal and Incomplete Data 116
122 Discussion GEE1: All empirical standard errors are correct, but the efficiency is higher for the more complex working correlation structure, as seen in p-values for T i t ij effect: Structure p-value IND EXCH UN Thus, opting for reasonably adequate correlation assumptions still pays off, in spite of the fact that all are consistent and asymptotically normal Similar conclusions for linearization-based method ESALQ Course on Models for Longitudinal and Incomplete Data 117
123 Model-based s.e. and empirically corrected s.e. in reasonable agreement for UN Typically, the model-based standard errors are much too small as they are based on the assumption that all observations in the data set are independent, hereby overestimating the amount of available information, hence also overestimating the precision of the estimates. ALR: similar inferences but now also α part of the inferences ESALQ Course on Models for Longitudinal and Incomplete Data 118
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationIntroduction to mixed model and missing data issues in longitudinal studies
Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationChapter 29 The GENMOD Procedure. Chapter Table of Contents
Chapter 29 The GENMOD Procedure Chapter Table of Contents OVERVIEW...1365 WhatisaGeneralizedLinearModel?...1366 ExamplesofGeneralizedLinearModels...1367 TheGENMODProcedure...1368 GETTING STARTED...1370
More information200612 - ADL - Longitudinal Data Analysis
Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 200 - FME - School of Mathematics and Statistics 715 - EIO - Department of Statistics and Operations Research 749 - MAT - Department
More informationHighlights the connections between different class of widely used models in psychological and biomedical studies. Multiple Regression
GLMM tutor Outline 1 Highlights the connections between different class of widely used models in psychological and biomedical studies. ANOVA Multiple Regression LM Logistic Regression GLM Correlated data
More informationHow to use SAS for Logistic Regression with Correlated Data
How to use SAS for Logistic Regression with Correlated Data Oliver Kuss Institute of Medical Epidemiology, Biostatistics, and Informatics Medical Faculty, University of Halle-Wittenberg, Halle/Saale, Germany
More informationAnalysis of Correlated Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Correlated Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Heagerty, 6 Course Outline Examples of longitudinal data Correlation and weighting Exploratory data
More informationOverview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models
Overview 1 Introduction Longitudinal Data Variation and Correlation Different Approaches 2 Mixed Models Linear Mixed Models Generalized Linear Mixed Models 3 Marginal Models Linear Models Generalized Linear
More informationSPPH 501 Analysis of Longitudinal & Correlated Data September, 2012
SPPH 501 Analysis of Longitudinal & Correlated Data September, 2012 TIME & PLACE: Term 1, Tuesday, 1:30-4:30 P.M. LOCATION: INSTRUCTOR: OFFICE: SPPH, Room B104 Dr. Ying MacNab SPPH, Room 134B TELEPHONE:
More informationLinear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure
Technical report Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure Table of contents Introduction................................................................ 1 Data preparation
More informationSPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg
SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way
More informationLongitudinal Data Analysis
Longitudinal Data Analysis Acknowledge: Professor Garrett Fitzmaurice INSTRUCTOR: Rino Bellocco Department of Statistics & Quantitative Methods University of Milano-Bicocca Department of Medical Epidemiology
More informationReview of the Methods for Handling Missing Data in. Longitudinal Data Analysis
Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationModels for discrete longitudinal data Analysing discrete correlated data using R. Christophe Lalanne
Models for discrete longitudinal data Analysing discrete correlated data using R Christophe Lalanne Novembre 2007 Preface This textbook is intended to the R user already accustomed to the Generalized Linear
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationLecture 6: Poisson regression
Lecture 6: Poisson regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction EDA for Poisson regression Estimation and testing in Poisson regression
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationOverview of Methods for Analyzing Cluster-Correlated Data. Garrett M. Fitzmaurice
Overview of Methods for Analyzing Cluster-Correlated Data Garrett M. Fitzmaurice Laboratory for Psychiatric Biostatistics, McLean Hospital Department of Biostatistics, Harvard School of Public Health Outline
More informationShort-Course: Introduction to regression analysis for longitudinal data (Part 1)
Short-Course: Introduction to regression analysis for longitudinal data (Part 1) Instructor: Juan Carlos Salazar-Uribe *. School of Statistics. Faculty of Sciences National University of Colombia at Medellin
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationLecture 18: Logistic Regression Continued
Lecture 18: Logistic Regression Continued Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina
More informationLecture 14: GLM Estimation and Logistic Regression
Lecture 14: GLM Estimation and Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South
More informationRandom Effects Models for Longitudinal Data
Random Effects Models for Longitudinal Data Geert Verbeke 1,2 Geert Molenberghs 2,1 Dimitris Rizopoulos 3 1 I-BioStat, Katholieke Universiteit Leuven, Kapucijnenvoer 35, B-3000 Leuven, Belgium 2 I-BioStat,
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More informationIndividual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA
Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals
More informationDepartment of Epidemiology and Public Health Miller School of Medicine University of Miami
Department of Epidemiology and Public Health Miller School of Medicine University of Miami BST 630 (3 Credit Hours) Longitudinal and Multilevel Data Wednesday-Friday 9:00 10:15PM Course Location: CRB 995
More informationMATH5885 LONGITUDINAL DATA ANALYSIS
MATH5885 LONGITUDINAL DATA ANALYSIS Semester 1, 2013 CRICOS Provider No: 00098G 2013, School of Mathematics and Statistics, UNSW MATH5885 Course Outline Information about the course Course Authority: William
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationVI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationThe Probit Link Function in Generalized Linear Models for Data Mining Applications
Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications
More informationPrediction for Multilevel Models
Prediction for Multilevel Models Sophia Rabe-Hesketh Graduate School of Education & Graduate Group in Biostatistics University of California, Berkeley Institute of Education, University of London Joint
More informationTechnical report. in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE
Linear mixedeffects modeling in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE Table of contents Introduction................................................................3 Data preparation for MIXED...................................................3
More informationRandom Effects Models for Longitudinal Data
Chapter 2 Random Effects Models for Longitudinal Data Geert Verbeke, Geert Molenberghs, and Dimitris Rizopoulos Abstract Mixed models have become very popular for the analysis of longitudinal data, partly
More informationi=1 In practice, the natural logarithm of the likelihood function, called the log-likelihood function and denoted by
Statistics 580 Maximum Likelihood Estimation Introduction Let y (y 1, y 2,..., y n be a vector of iid, random variables from one of a family of distributions on R n and indexed by a p-dimensional parameter
More informationLecture 8: Gamma regression
Lecture 8: Gamma regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Models with constant coefficient of variation Gamma regression: estimation and testing
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationAPPLIED MISSING DATA ANALYSIS
APPLIED MISSING DATA ANALYSIS Craig K. Enders Series Editor's Note by Todd D. little THE GUILFORD PRESS New York London Contents 1 An Introduction to Missing Data 1 1.1 Introduction 1 1.2 Chapter Overview
More informationA course on Longitudinal data analysis : what did we learn? COMPASS 11/03/2011
A course on Longitudinal data analysis : what did we learn? COMPASS 11/03/2011 1 Abstract We report back on methods used for the analysis of longitudinal data covered by the course We draw lessons for
More informationModel Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc.
Paper 264-26 Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Abstract: There are several procedures in the SAS System for statistical modeling. Most statisticians who use the SAS
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationAn extension of the factoring likelihood approach for non-monotone missing data
An extension of the factoring likelihood approach for non-monotone missing data Jae Kwang Kim Dong Wan Shin January 14, 2010 ABSTRACT We address the problem of parameter estimation in multivariate distributions
More informationAssignments Analysis of Longitudinal data: a multilevel approach
Assignments Analysis of Longitudinal data: a multilevel approach Frans E.S. Tan Department of Methodology and Statistics University of Maastricht The Netherlands Maastricht, Jan 2007 Correspondence: Frans
More informationSAS Syntax and Output for Data Manipulation:
Psyc 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (in preparation) chapter 5. We will be examining
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationFactorial experimental designs and generalized linear models
Statistics & Operations Research Transactions SORT 29 (2) July-December 2005, 249-268 ISSN: 1696-2281 www.idescat.net/sort Statistics & Operations Research c Institut d Estadística de Transactions Catalunya
More information10 Dichotomous or binary responses
10 Dichotomous or binary responses 10.1 Introduction Dichotomous or binary responses are widespread. Examples include being dead or alive, agreeing or disagreeing with a statement, and succeeding or failing
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationUsing Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses G. Gordon Brown, Celia R. Eicheldinger, and James R. Chromy RTI International, Research Triangle Park, NC 27709 Abstract
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationMultivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
More informationMissing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13
Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional
More informationPower and sample size in multilevel modeling
Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,
More informationReject Inference in Credit Scoring. Jie-Men Mok
Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business
More informationLogistic Regression (1/24/13)
STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used
More informationPATTERN MIXTURE MODELS FOR MISSING DATA. Mike Kenward. London School of Hygiene and Tropical Medicine. Talk at the University of Turku,
PATTERN MIXTURE MODELS FOR MISSING DATA Mike Kenward London School of Hygiene and Tropical Medicine Talk at the University of Turku, April 10th 2012 1 / 90 CONTENTS 1 Examples 2 Modelling Incomplete Data
More informationMissing data and net survival analysis Bernard Rachet
Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics Warwick, 27-29 July 2015 Missing data and net survival analysis Bernard Rachet General context Population-based,
More informationLogistic Regression (a type of Generalized Linear Model)
Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge
More informationUsing PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research
Using PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research Sawako Suzuki, DePaul University, Chicago Ching-Fan Sheu, DePaul University,
More informationPrinciple Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
More informationMultivariate Statistical Inference and Applications
Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationLife Table Analysis using Weighted Survey Data
Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using
More informationIntroduction to Longitudinal Data Analysis
Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction
More informationSun Li Centre for Academic Computing lsun@smu.edu.sg
Sun Li Centre for Academic Computing lsun@smu.edu.sg Elementary Data Analysis Group Comparison & One-way ANOVA Non-parametric Tests Correlations General Linear Regression Logistic Models Binary Logistic
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationModels for Longitudinal and Clustered Data
Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations
More informationAn Introduction to Modeling Longitudinal Data
An Introduction to Modeling Longitudinal Data Session I: Basic Concepts and Looking at Data Robert Weiss Department of Biostatistics UCLA School of Public Health robweiss@ucla.edu August 2010 Robert Weiss
More informationApplied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationA Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models
A Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models Grace Y. Yi 13, JNK Rao 2 and Haocheng Li 1 1. University of Waterloo, Waterloo, Canada
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationReal longitudinal data analysis for real people: Building a good enough mixed model
Tutorial in Biostatistics Received 28 July 2008, Accepted 30 September 2009 Published online 10 December 2009 in Wiley Interscience (www.interscience.wiley.com) DOI: 10.1002/sim.3775 Real longitudinal
More informationLogit Models for Binary Data
Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response
More informationBIO 226: APPLIED LONGITUDINAL ANALYSIS COURSE SYLLABUS. Spring 2015
BIO 226: APPLIED LONGITUDINAL ANALYSIS COURSE SYLLABUS Spring 2015 Instructor: Teaching Assistants: Dr. Brent Coull HSPH Building II, Room 413 Phone: (617) 432-2376 E-mail: bcoull@hsph.harvard.edu Office
More informationAdequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection
Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics
More informationAn Application of the G-formula to Asbestos and Lung Cancer. Stephen R. Cole. Epidemiology, UNC Chapel Hill. Slides: www.unc.
An Application of the G-formula to Asbestos and Lung Cancer Stephen R. Cole Epidemiology, UNC Chapel Hill Slides: www.unc.edu/~colesr/ 1 Acknowledgements Collaboration with David B. Richardson, Haitao
More informationActuarial Statistics With Generalized Linear Mixed Models
Actuarial Statistics With Generalized Linear Mixed Models Katrien Antonio Jan Beirlant Revision Feburary 2006 Abstract Over the last decade the use of generalized linear models (GLMs) in actuarial statistics
More informationIntroducing the Multilevel Model for Change
Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.
More informationbusiness statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
More informationStatistical Rules of Thumb
Statistical Rules of Thumb Second Edition Gerald van Belle University of Washington Department of Biostatistics and Department of Environmental and Occupational Health Sciences Seattle, WA WILEY AJOHN
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationLOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as
LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More informationBayesX - Software for Bayesian Inference in Structured Additive Regression
BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich
More informationCHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA
Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationChapter 1. Longitudinal Data Analysis. 1.1 Introduction
Chapter 1 Longitudinal Data Analysis 1.1 Introduction One of the most common medical research designs is a pre-post study in which a single baseline health status measurement is obtained, an intervention
More information2. Making example missing-value datasets: MCAR, MAR, and MNAR
Lecture 20 1. Types of missing values 2. Making example missing-value datasets: MCAR, MAR, and MNAR 3. Common methods for missing data 4. Compare results on example MCAR, MAR, MNAR data 1 Missing Data
More informationEDMS 769L: Statistical Analysis of Longitudinal Data 1809 PAC, Th 4:15-7:00pm 2009 Spring Semester
Instructor Dr. Jeffrey Harring 1230E Benjamin Building Phone: (301) 405-3630 Email: harring@umd.edu Office Hours Tuesday 2:00-3:00pm, or by appointment Course Objectives, Description and Prerequisites
More information15 Ordinal longitudinal data analysis
15 Ordinal longitudinal data analysis Jeroen K. Vermunt and Jacques A. Hagenaars Tilburg University Introduction Growth data and longitudinal data in general are often of an ordinal nature. For example,
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationOverview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
More information