Overview: Longitudinal Data, Variation and Correlation, Different Approaches. Linear Mixed Models and Generalized Linear Mixed Models.
1 Overview

1 Introduction
  Longitudinal Data
  Variation and Correlation
  Different Approaches
2 Mixed Models
  Linear Mixed Models
  Generalized Linear Mixed Models
3 Marginal Models
4 Transition Models
5 Further Topics
2 The General Linear Mixed Model (LMM)

Y = X\beta + Zb + \varepsilon,   (1)

X and Z known design matrices (n \times p and n \times r)
\beta vector of unknown fixed parameters
b vector of random effects
\varepsilon vector of unobservable random errors

Assumptions: independence of b and \varepsilon, and

E\begin{pmatrix} b \\ \varepsilon \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}   and   Cov\begin{pmatrix} b \\ \varepsilon \end{pmatrix} = \begin{pmatrix} D & 0 \\ 0 & R \end{pmatrix}.   (2)

Typically, multivariate normality of (b', \varepsilon')' is also assumed.   (3)

Advanced Longitudinal Data Analysis, Johns Hopkins University, Spring
3 The Longitudinal Linear Mixed Model

The longitudinal linear mixed model is typically given on the level of the subject:

Y_i = X_i\beta + Z_i b_i + \varepsilon_i,   b_i \sim N(0, D_0),   \varepsilon_i \sim N(0, R_i),   (4)

where the subscript i indicates subject-specific quantities (Laird & Ware, 1982).

A typical longitudinal model is the growth-curve model

Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij},   j = 1, ..., n_i,   i = 1, ..., I,

with

\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \overset{iid}{\sim} N\left(0, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}\right)   independent of   \varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2).

Here, D_0 = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} and R_i = \sigma^2 I_{n_i}.

Example: a two-stage analysis (blackboard).
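As a quick sketch, the growth-curve model above can be simulated directly. All parameter values, the function name and the common time grid below are illustrative assumptions, not from the course:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_growth_curves(I=50, n_i=5, beta=(1.0, 0.5),
                           D0=((1.0, 0.3), (0.3, 0.25)), sigma_eps=0.5):
    """Simulate Y_ij = beta_0 + beta_1 t_ij + b_0i + b_1i t_ij + eps_ij."""
    D0 = np.asarray(D0)
    t = np.arange(n_i, dtype=float)                      # common time grid t_ij = j - 1
    b = rng.multivariate_normal([0.0, 0.0], D0, size=I)  # (b_0i, b_1i) ~ N(0, D_0)
    eps = rng.normal(0.0, sigma_eps, size=(I, n_i))      # measurement error
    Y = beta[0] + beta[1] * t + b[:, [0]] + b[:, [1]] * t + eps
    return t, Y

t, Y = simulate_growth_curves()  # Y: one row per subject, one column per occasion
```

Each row of Y is one subject's profile, with a subject-specific intercept and slope drawn from D_0.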
4 The Longitudinal Linear Mixed Model

The longitudinal linear mixed model (LLMM) is a special case of the LMM (1): let D, R and Z be block-diagonal with blocks D_0, R_i and Z_i, respectively, and stack the Y_i, X_i, b_i and \varepsilon_i to obtain Y, X, b and \varepsilon.

The LLMM assumes that Y can be divided into independent subvectors Y_i, one for each subject.

The LMM (1) is more general: it allows the inclusion of other model components, such as smooth functions modeled using penalized splines in the mixed model framework. This comes at the loss of the independence structure, but yields very flexible models.

The LMM (1) can also incorporate additional clusters (e.g. growth curves for children clustered in schools), or nested structures of random effects.
5 Excursus: Penalized Splines in a Nutshell

Consider the nonparametric regression problem

Y_i = m(x_i) + \varepsilon_i,   i = 1, ..., n,

with unknown smooth function m(\cdot). Approximate m(\cdot) by a linear combination of (many) spline basis functions, e.g. truncated polynomials,

m(x) \approx \sum_{j=0}^{q} \beta_j x^j + \sum_{j=1}^{K} b_j (x - \kappa_j)_+^q,

where \kappa_1, ..., \kappa_K is a sequence of knots, and u_+^q = (\max\{0, u\})^q.

This approximation gives a piecewise polynomial function of degree q with certain smoothness properties (continuity of the (q-1)th derivative).
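A minimal numpy sketch of this basis; the function name, knot placement and degree are illustrative assumptions:

```python
import numpy as np

def truncated_poly_basis(x, knots, q=2):
    """Columns: 1, x, ..., x^q, then (x - kappa_j)_+^q for each knot kappa_j."""
    x = np.asarray(x, dtype=float)
    poly = np.column_stack([x**j for j in range(q + 1)])
    trunc = np.column_stack([np.clip(x - k, 0.0, None)**q for k in knots])
    return np.column_stack([poly, trunc])

x = np.linspace(0.0, 1.0, 101)
B = truncated_poly_basis(x, knots=[0.25, 0.5, 0.75], q=2)  # (101, q + 1 + K) = (101, 6)
```

The first q + 1 columns span the polynomial subspace; each remaining column is zero to the left of its knot, which is what makes the fit piecewise.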
6 Excursus: Penalized Splines in a Nutshell

[Figure: the quadratic spline basis and three example functions, over knots \kappa_1, ..., \kappa_5.]
7 Excursus: Penalized Splines in a Nutshell

Estimation uses a regularization penalty, to avoid an overly wiggly function:

(Y - X\beta - Zb)'(Y - X\beta - Zb) + \frac{1}{\lambda} b'b \to \min_{\beta, b},

with X and Z the design matrices for the spline basis functions x^j resp. (x - \kappa_j)_+^q, and \lambda a smoothing parameter.

Penalized least squares estimation is equivalent to estimation in the LMM with
- fixed effects \beta_0, ..., \beta_q: the subspace of polynomial functions (degree q)
- random effects b_1, ..., b_K iid N(0, \sigma_b^2): deviations from that subspace.

The smoothing parameter \lambda = \sigma_b^2 / \sigma_\varepsilon^2 can then be estimated data-driven.

This works analogously for other basis choices (e.g. B-splines) and for spatial effects, interaction surfaces, varying coefficients, ... and can be combined with random effects for longitudinal data in a general LMM (additive mixed model). Examples: smooth dose-response functions for covariates, smooth profiles over time, time-varying effects, spatial surfaces (spatial information on subjects), ...

More detailed information: Ruppert, Wand & Carroll, 2003.
8 D, R and V

D, R and Z together imply the correlation structure for Y, as

V = Cov(Y) = ZDZ' + R.

Zb with D models differences between subjects, e.g. in their average level and average evolution over time.

R models the residual serial correlation not explained by Zb, as well as measurement error.
9 Some common models and implications for V

The random intercept model

Y_{ij} = x_{ij}'\beta + b_i + \varepsilon_{ij},   b_i \overset{iid}{\sim} N(0, \sigma_b^2),   \varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma_\varepsilon^2),

with R = \sigma_\varepsilon^2 I_n, D = \sigma_b^2 I_I, D_0 = \sigma_b^2 and Z_i = (1, ..., 1)', implies that V is block-diagonal, with the ith block for the ith subject equal to

\begin{pmatrix}
\sigma_\varepsilon^2 + \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\
\sigma_b^2 & \sigma_\varepsilon^2 + \sigma_b^2 & \cdots & \sigma_b^2 \\
\vdots & & \ddots & \vdots \\
\sigma_b^2 & \sigma_b^2 & \cdots & \sigma_\varepsilon^2 + \sigma_b^2
\end{pmatrix}.

This is called compound symmetry. In a marginal model, this is often written with \sigma^2 for \sigma_\varepsilon^2 + \sigma_b^2 and \rho\sigma^2 for \sigma_b^2. (Then the correlation \rho could be negative.)

This implies that all observations on a subject are equally correlated, i.e. it assumes that observations on the same subject closer in time are not more similar than observations further apart in time. (E.g. a blood marker with short half-life.)
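A small sketch of one compound-symmetry block; the variance values are arbitrary illustrative choices:

```python
import numpy as np

def compound_symmetry_block(n_i, sigma_b2, sigma_eps2):
    """ith block of V under the random intercept model:
    sigma_eps^2 + sigma_b^2 on the diagonal, sigma_b^2 off the diagonal."""
    return sigma_eps2 * np.eye(n_i) + sigma_b2 * np.ones((n_i, n_i))

V_i = compound_symmetry_block(4, sigma_b2=0.5, sigma_eps2=1.0)
# common within-subject correlation: rho = sigma_b^2 / (sigma_eps^2 + sigma_b^2)
rho = 0.5 / (1.0 + 0.5)
```

Every off-diagonal entry is the same, so all pairs of observations on a subject are equally correlated, regardless of their distance in time.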
10 Some common models and implications for V

The growth-curve model

Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij},   b_i \overset{iid}{\sim} N\left(0, \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}\right),   \varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma_\varepsilon^2),

with D block-diagonal with blocks \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}, Z_i = \begin{pmatrix} 1 & t_{i1} \\ \vdots & \vdots \\ 1 & t_{i n_i} \end{pmatrix} and R = \sigma_\varepsilon^2 I_n, implies

Var(Y_{ij}) = \sigma_1^2 + 2\sigma_{12} t_{ij} + \sigma_2^2 t_{ij}^2 + \sigma_\varepsilon^2

and

Cov(Y_{ij}, Y_{ik}) = \sigma_1^2 + \sigma_{12} t_{ij} + \sigma_{12} t_{ik} + \sigma_2^2 t_{ij} t_{ik},   j \neq k.

In particular, the variance function is quadratic in time. Adding a quadratic term b_{2i} t_{ij}^2 would make it a fourth-order polynomial in time. (Extrapolation??)
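These implied moments can be checked numerically against Z_i D_0 Z_i' + R_i; the variance parameters and time points below are arbitrary illustrative values:

```python
import numpy as np

s1, s12, s2, se = 1.0, 0.3, 0.25, 0.5   # sigma_1^2, sigma_12, sigma_2^2, sigma_eps^2
D0 = np.array([[s1, s12], [s12, s2]])
t = np.array([0.0, 1.0, 2.0, 3.0])
Z_i = np.column_stack([np.ones_like(t), t])   # rows (1, t_ij)

# ith block of V = Cov(Y_i):
V_i = Z_i @ D0 @ Z_i.T + se * np.eye(len(t))

# closed-form variance function, quadratic in time:
var_formula = s1 + 2 * s12 * t + s2 * t**2 + se
```

The diagonal of V_i reproduces the quadratic variance function, and the off-diagonal entries reproduce the covariance formula above.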
11 Residual Covariance Structure

Often, the conditional independence model R = \sigma_\varepsilon^2 I_n is chosen, which implies that the responses of a subject i are independent given \beta and b_i. This may be unrealistically simplistic.

If the covariance structure cannot be meaningfully explained by random effects alone, more general forms of R should be chosen. R is typically chosen to be block-diagonal, with blocks depending on i only through n_i and the time points t_{ij}, j = 1, ..., n_i. Usually it is modeled as

Cov(\varepsilon_{ij}, \varepsilon_{ik}) = \sigma^2 1\{j = k\} + \tau^2 g(|t_{ij} - t_{ik}|)

with a suitable decreasing function g(\cdot) with g(0) = 1 and \lim_{u \to \infty} g(u) = 0. Two frequently used choices are the exponential and Gaussian correlation functions

g(u) = \exp(-\phi u)   and   g(u) = \exp(-\phi u^2)   (with \phi > 0).

In applications, however, the effect of serial correlation is often dominated by the combination of random effects and measurement error (which also causes estimation problems when all are present in the model).
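A small sketch of the two correlation functions evaluated on all pairs of time points; the function name and the choice phi = 1 are illustrative assumptions:

```python
import numpy as np

def serial_corr(times, kind="exponential", phi=1.0):
    """Matrix of g(|t_ij - t_ik|) for all pairs; g(0) = 1 and g decays to 0."""
    u = np.abs(np.subtract.outer(times, times))   # pairwise time lags
    if kind == "exponential":
        return np.exp(-phi * u)
    if kind == "gaussian":
        return np.exp(-phi * u**2)
    raise ValueError(kind)

t = np.array([0.0, 0.5, 1.0, 3.0])
G_exp = serial_corr(t, "exponential", phi=1.0)
G_gau = serial_corr(t, "gaussian", phi=1.0)
```

Both choices give correlation 1 on the diagonal and correlations that shrink as the time lag grows, unlike compound symmetry.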
12 Conditional and Marginal Perspectives

Conditional perspective on the mixed model:

Y \mid b \sim N(X\beta + Zb, \sigma^2 I_n),   b \sim N(0, D).   (5)

Interpretation: random effects are subject-specific effects on the mean that vary within the population and are estimated subject to a regularization constraint.

Marginal perspective on the mixed model:

Y \sim N(X\beta, V),   (6)

where V = Cov(Y) = ZDZ' + R.

Interpretation: the random effects induce a correlation structure and thus enable a proper statistical analysis of the correlated data.
13 Inference in the Linear Mixed Model

What is the likelihood for the linear mixed model (1)-(3)? Inference is usually based on the marginal model (6) for Y (unless Bayesian methods are used). Then the unknown parameters are (\beta, \theta), where \theta contains the unknown parameters in D and R (and thus V).

(5) implies (6), but they are not equivalent! Example: random intercept + independent errors gives compound symmetry for V. The covariance \sigma_b^2 in (6) could be negative, while the variance \sigma_b^2 in (5) has to be non-negative.

Also, (6) does not include b. This is not much of a problem if interest is in (\beta, \theta), but it does not work if interest is in b.
14 Estimation of \beta

For a given \theta, (6) is simply a general linear model. We can estimate \beta using generalized least squares,

(Y - X\beta)' V^{-1} (Y - X\beta) \to \min_\beta,

which gives

\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Y.   (7)

(We assume that the inverses V^{-1} and (X'V^{-1}X)^{-1} exist. Generalizations use generalized inverses.)

This is also the maximum likelihood (ML) estimator under the normality assumption (3): the log-likelihood for \beta and \theta from the marginal model (6) is

l(\beta, \theta) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log|V| - \frac{1}{2}(Y - X\beta)'V^{-1}(Y - X\beta),   (8)

and

\frac{\partial}{\partial\beta} l(\beta, \theta) = X'V^{-1}(Y - X\beta) \overset{!}{=} 0   \Rightarrow   \hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Y.
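A minimal numpy sketch of (7), using linear solves rather than explicit inverses; the simulated data are purely illustrative. With V = I it reduces to ordinary least squares:

```python
import numpy as np

def gls(X, Y, V):
    """beta_hat = (X' V^{-1} X)^{-1} X' V^{-1} Y."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_Y = np.linalg.solve(V, Y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_Y)

rng = np.random.default_rng(0)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_gls = gls(X, Y, np.eye(n))                  # V = I: reduces to OLS
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
```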
15 Estimation of \beta

\hat{\beta} is also the BLUE, the best linear unbiased estimator (Gauss-Markov theorem):

Let \tilde{\beta} = a + BY be another linear unbiased estimator of \beta. Then E(\tilde{\beta}) = a + BX\beta = \beta for all \beta implies a = 0 and BX = I_p. For all c \in R^p,

Var(c'\tilde{\beta}) - Var(c'\hat{\beta}) = c'BVB'c - c'(X'V^{-1}X)^{-1}c
  = c'BVB'c - c'BX(X'V^{-1}X)^{-1}X'B'c
  = c'BV^{1/2}\left[I_n - V^{-1/2}X(X'V^{-1}X)^{-1}X'V^{-1/2}\right]V^{1/2}B'c \geq 0,

as the middle matrix is a projection matrix and thus positive semi-definite. As \hat{\beta} is unbiased with smallest variance, it is the BLUE.

With the normality assumption (3), \hat{\beta} can even be shown to be the best unbiased estimator.
16 Estimation of \theta

Maximum likelihood estimation: substitute \hat{\beta} = \hat{\beta}(\theta) into (8) to obtain the profile log-likelihood l(\hat{\beta}(\theta), \theta). Maximize it to obtain the ML estimate for \theta. There is no closed-form solution in general, so numerical maximization is used.

Restricted maximum likelihood estimation: maximum likelihood estimates for variances are biased downward.

Example: Y_i \overset{iid}{\sim} N(\mu, \sigma^2). The ML estimator is \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})^2, but E(\hat{\sigma}^2) = \frac{n-1}{n}\sigma^2.

Everyone uses \tilde{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^n (Y_i - \bar{Y})^2. This is the restricted maximum likelihood estimate. It accounts for the estimation of \mu by \bar{Y}.
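The ML/REML relationship in this iid example is just the 1/n versus 1/(n-1) divisor; a quick numerical check on simulated data (sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
Y = rng.normal(5.0, 2.0, size=30)
n = Y.size

sigma2_ml = ((Y - Y.mean())**2).sum() / n          # ML: biased downward
sigma2_reml = ((Y - Y.mean())**2).sum() / (n - 1)  # REML: accounts for estimating mu
```

The two estimates differ exactly by the factor (n - 1)/n, so the ML estimate is always the smaller of the two.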
17 Restricted Maximum Likelihood Estimation of \theta

Idea: use the likelihood not of Y, but of (n - p) linearly independent error contrasts AY, whose distribution is independent of the unknown \beta (Patterson & Thompson, 1971).

Split the log-likelihood into two parts, one for AY (for \theta) and one for BY (for \beta), with
1. A of rank n - p and B of rank p,
2. the two parts statistically independent, i.e. (here) Cov(AY, BY) = 0,
3. AY error contrasts, i.e. E(AY) = 0,
4. BX of rank p (so BY estimates a one-to-one function of \beta).

Suitable matrices: A = I_n - X(X'X)^{-1}X' and B = (X'V^{-1}X)^{-1}X'V^{-1}.

An alternative name is residual maximum likelihood estimation.

From the first two points, the log-likelihood is l = l_A + l_B (up to an additive constant), with l_A for AY and l_B for BY.
18 Restricted Maximum Likelihood Estimation of \theta

Maximizing l_B (from BY) with respect to \beta again gives the MLE/BLUE \hat{\beta} (depending on \theta).

Maximizing l_A (from AY) with respect to \theta: Harville (1974) showed that the restricted log-likelihood l_A is (up to an additive constant) independent of the precise error contrasts used. He derived the restricted log-likelihood, which is (up to a constant)

l_R(\theta) = -\frac{1}{2}\log|V| - \frac{1}{2}\log|X'V^{-1}X| - \frac{1}{2}(Y - X\hat{\beta})'V^{-1}(Y - X\hat{\beta})
     = l(\hat{\beta}(\theta), \theta) - \frac{1}{2}\log|X'V^{-1}X|.

(Proof: blackboard.)

Maximizing gives the restricted maximum likelihood (REML) estimate for \theta.

Example: in a linear model, this gives the usual estimate for \sigma^2 (dividing by n - p). It takes into account the loss in degrees of freedom from the estimation of \beta.
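The linear-model example can be verified numerically: for V = \sigma^2 I, maximizing l_R over \sigma^2 on a grid lands at RSS/(n - p). This is a sketch under that special case, with arbitrary simulated data:

```python
import numpy as np

def profile_loglik(sigma2, X, Y):
    """Profile log-likelihood l(beta_hat(theta), theta) for V = sigma^2 I."""
    n = X.shape[0]
    V = sigma2 * np.eye(n)
    beta = np.linalg.solve(X.T @ np.linalg.solve(V, X), X.T @ np.linalg.solve(V, Y))
    r = Y - X @ beta
    return (-n / 2 * np.log(2 * np.pi) - 0.5 * np.linalg.slogdet(V)[1]
            - 0.5 * r @ np.linalg.solve(V, r))

def restricted_loglik(sigma2, X, Y):
    """l_R(theta) = l(beta_hat(theta), theta) - 1/2 log|X' V^{-1} X|."""
    V = sigma2 * np.eye(X.shape[0])
    XtVinvX = X.T @ np.linalg.solve(V, X)
    return profile_loglik(sigma2, X, Y) - 0.5 * np.linalg.slogdet(XtVinvX)[1]

rng = np.random.default_rng(3)
n, p = 25, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ rng.normal(size=p) + rng.normal(size=n)
rss = ((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0])**2).sum()

grid = np.linspace(0.2, 3.0, 2000)
reml_hat = grid[np.argmax([restricted_loglik(s, X, Y) for s in grid])]
# reml_hat should sit at the unbiased estimate rss / (n - p), not rss / n
```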
19 Restricted Maximum Likelihood Estimation of θ The restricted likelihood is also the conditional likelihood for θ given the sufficient statistic BY for β and it can be argued that this entails no information loss (Smyth & Verbyla, 1996). The concept that BY contains no information about θ absent knowledge about β has also been called marginal sufficiency (Sprott, 1975; Harville, 1977). Advanced Longitudinal Data Analysis Johns Hopkins University, Spring
20 Implementation of ML or REML

Lindstrom & Bates (JASA, 1988) derive the first- and second-order derivatives of the log-likelihood and restricted log-likelihood, and give details of the implementation of a Newton-Raphson algorithm for estimation.

Note that the requirements for \theta in the marginal model (V positive (semi-)definite) differ from those in the conditional model (D_0 and R_i positive (semi-)definite). Software packages thus often maximize over a larger parameter space for \theta than the conditional model implies. E.g. SAS proc mixed (at least v6.12) by default only requires the diagonal elements of D_0 and R_i to be positive.

Also be aware of other numerical issues. For example, R's lme() maximizes the (restricted) log-likelihood with respect to the scaled logarithm of the variances, and thus can never find a maximum at zero (Pinheiro & Bates, 2000, section 2.2).

An alternative is the EM algorithm, treating the random effects as missing data, which, however, can be slow to converge. R's lme() uses a hybrid of EM followed by Newton-Raphson.
21 EBLUE and EBLUP

After estimation of \theta, we can substitute \hat{\theta} into the formula \hat{\beta} = \hat{\beta}(\theta) to obtain an estimate. This is the estimated BLUE (EBLUE).

Often it is also of interest to obtain predictions of b (e.g. to find unusual subjects). Then we have to go back to the conditional model formulation (5). The best linear unbiased predictor (BLUP) for b is (see next slide)

\hat{b} = DZ'V^{-1}(Y - X\hat{\beta}).

The combined BLUE and BLUP (also collectively referred to as BLUPs) can be written as

\begin{pmatrix} \hat{\beta} \\ \hat{b} \end{pmatrix} = (C'R^{-1}C + G)^{-1} C'R^{-1}Y,

where C = (X \; Z) and G = \begin{pmatrix} 0 & 0 \\ 0 & D^{-1} \end{pmatrix}.

The estimated BLUP (EBLUP) is obtained by substituting \hat{\theta} for \theta in D, V and \hat{\beta}.
22 BLUE and BLUP

The BLUE and BLUP (collectively referred to as BLUP) \hat{\beta} and \hat{b} are the solution to Henderson's simultaneous mixed model equations (Henderson, 1950):

X'R^{-1}X\hat{\beta} + X'R^{-1}Z\hat{b} = X'R^{-1}Y
Z'R^{-1}X\hat{\beta} + (Z'R^{-1}Z + D^{-1})\hat{b} = Z'R^{-1}Y   (9)

There are (at least) four different justifications for these equations (Robinson, 1991):

a) They arise when maximizing the joint density of Y and b.
b) In a Bayesian approach, regard \beta as a parameter with a uniform improper prior distribution, independent of b. Then (9) gives the posterior mode for \beta and b. The resulting estimates are thus also called empirical Bayes estimates.
c) \hat{\beta} and \hat{b} have minimum mean squared error within the class of linear unbiased estimates for \beta and b. (Unbiasedness for b: E(\hat{b}) = E(b) = 0.)
d) Goldberger's derivation: for a future observation Y^* = x^{*\prime}\beta + z^{*\prime}b + \varepsilon^*, the best linear unbiased predictor is x^{*\prime}\hat{\beta} + z^{*\prime}\hat{b}.
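The equivalence between solving (9) and the direct GLS/BLUP formulas can be checked numerically; the dimensions and covariance choices below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, r = 30, 2, 6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, r))
D = 0.7 * np.eye(r)        # Cov(b)
R = 0.5 * np.eye(n)        # Cov(eps)
Y = rng.normal(size=n)
V = Z @ D @ Z.T + R

# direct formulas: GLS for beta, then BLUP for b
beta = np.linalg.solve(X.T @ np.linalg.solve(V, X), X.T @ np.linalg.solve(V, Y))
b = D @ Z.T @ np.linalg.solve(V, Y - X @ beta)

# Henderson's mixed model equations (9), solved jointly
Rinv = np.linalg.inv(R)
lhs = np.vstack([
    np.hstack([X.T @ Rinv @ X, X.T @ Rinv @ Z]),
    np.hstack([Z.T @ Rinv @ X, Z.T @ Rinv @ Z + np.linalg.inv(D)]),
])
rhs = np.concatenate([X.T @ Rinv @ Y, Z.T @ Rinv @ Y])
sol = np.linalg.solve(lhs, rhs)   # first p entries: beta_hat, rest: b_hat
```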
23 Shrinkage

\hat{b} is a shrinkage estimator of b, i.e. its components have less spread than the generalized least squares estimators would have when treating b as fixed effects.

This seems to contradict the unbiasedness property of the BLUP. However, as Robinson (1991) points out, unbiasedness here means E(\hat{b}) = E(b) = 0, not E(\hat{b} \mid b) = b for all b.
24 Shrinkage

\hat{Y}_i = X_i\hat{\beta} + Z_i\hat{b}_i
    = X_i\hat{\beta} + Z_iDZ_i'V_i^{-1}(Y_i - X_i\hat{\beta})
    = (V_i - Z_iDZ_i')V_i^{-1}X_i\hat{\beta} + Z_iDZ_i'V_i^{-1}Y_i
    = R_iV_i^{-1}X_i\hat{\beta} + Z_iDZ_i'V_i^{-1}Y_i

The BLUP for Y_i is thus a weighted average of the population-averaged profile X_i\hat{\beta} and the individual data Y_i. The observed data are shrunk towards the population-averaged profile ("borrowing of strength"). Note that V_i = R_i + Z_iDZ_i'.

If R_iV_i^{-1} is large, i.e. the residual variability (R_i) is large compared to the between-subject variability (Z_iDZ_i'), much weight is given to the population-averaged profile X_i\hat{\beta}. If R_iV_i^{-1} is small, the opposite is the case.
25 Inference for \beta

For known \theta, we have \hat{\beta} \sim (\beta, (X'V^{-1}X)^{-1}). Given the normality assumption (3), or asymptotically, \hat{\beta} is also normal.

In practice, for unknown \theta, the covariance has to be estimated as (X'\hat{V}^{-1}X)^{-1} = (X'V(\hat{\theta})^{-1}X)^{-1}. Reported standard errors and confidence intervals are usually based on this estimated covariance matrix. Note that this ignores the uncertainty in \hat{\theta}.

This covariance estimate is model-based.
26 Robust Variance Estimation

What happens if our model for Cov(Y) is wrong? For example, we could assume compound symmetry with constant variance over time, while in fact variances increase over time and correlations decrease with time distance.

Suppose the true covariance is Cov(Y) = V, but our model instead specifies W, the so-called working covariance matrix. Then \hat{\beta} = (X'W^{-1}X)^{-1}X'W^{-1}Y, and \hat{\beta} is still consistent. However, the covariance changes to

Cov(\hat{\beta}) = (X'W^{-1}X)^{-1} X'W^{-1}VW^{-1}X (X'W^{-1}X)^{-1},

and we can use \hat{\beta} together with an estimated robust covariance matrix.
27 Robust Variance Estimation

A consistent estimator (for I \to \infty) of Cov(\hat{\beta}) is (Liang & Zeger, 1986)

\left(\sum_{i=1}^I X_i'\hat{W}_i^{-1}X_i\right)^{-1} \left\{\sum_{i=1}^I X_i'\hat{W}_i^{-1}(Y_i - X_i\hat{\beta})(Y_i - X_i\hat{\beta})'\hat{W}_i^{-1}X_i\right\} \left(\sum_{i=1}^I X_i'\hat{W}_i^{-1}X_i\right)^{-1}.

This is the sandwich estimator originally due to Huber (1967) and White (1980), also called the robust or empirical variance estimator.

This estimator is valid under more general conditions, though the model-based estimator has less variance (and no asymptotic bias) when the variance model is correctly specified, i.e. the robust estimator is less efficient than the model-based variance estimator when we do a good job of estimating the covariance.

It has also been shown that the empirical variance estimator can be highly biased when the number of clusters (here, I) is small.
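A sketch of this sandwich formula; the toy data use clusters of size one with residuals chosen so that the "meat" equals the "bread", in which case the sandwich collapses to the model-based covariance (this special construction is purely for checking the algebra):

```python
import numpy as np

def sandwich_cov(X_blocks, Y_blocks, W_blocks, beta):
    """Robust covariance for beta_hat under working covariances W_i."""
    bread = sum(X.T @ np.linalg.solve(W, X) for X, W in zip(X_blocks, W_blocks))
    meat = np.zeros_like(bread)
    for X, Y, W in zip(X_blocks, Y_blocks, W_blocks):
        u = np.linalg.solve(W, Y - X @ beta)   # W_i^{-1} (Y_i - X_i beta)
        meat += X.T @ np.outer(u, u) @ X
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv

rng = np.random.default_rng(5)
I, p = 20, 2
beta = np.array([1.0, -0.5])
X_blocks = [rng.normal(size=(1, p)) for _ in range(I)]
W_blocks = [np.array([[0.5 + i * 0.1]]) for i in range(I)]
# residual r_i with r_i r_i' = W_i, so the sandwich reduces to the model-based covariance
Y_blocks = [Xb @ beta + np.sqrt(Wb[0, 0]) for Xb, Wb in zip(X_blocks, W_blocks)]

cov_rob = sandwich_cov(X_blocks, Y_blocks, W_blocks, beta)
```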
28 Hypothesis Testing for \beta

We can use our variance estimators to construct Wald tests of

H_0: L\beta = 0   versus   H_A: L\beta \neq 0   (10)

or to construct confidence intervals. For this, we use that

T_W = (\hat{\beta} - \beta)'L'\left(L(X'\hat{V}^{-1}X)^{-1}L'\right)^{-1}L(\hat{\beta} - \beta)

asymptotically follows a chi-square distribution with rank(L) degrees of freedom. (Or use the corresponding robust covariance estimate.)

Note: the results given here for testing (of fixed effects or variance components) assume the longitudinal linear mixed model (4). Asymptotics are with regard to I \to \infty. Results for more general types of linear mixed models often remain elusive (Ruppert, Wand & Carroll, 2003, section 4.8).
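A small sketch of the Wald statistic under H_0: L\beta = 0; the estimates and covariance below are made-up illustrative numbers, and 5.99 is the familiar 95% critical value of \chi^2_2:

```python
import numpy as np

def wald_stat(beta_hat, cov_beta, L):
    """T_W = (L beta_hat)' (L Cov L')^{-1} (L beta_hat), df = rank(L), under H0: L beta = 0."""
    L = np.atleast_2d(L)
    d = L @ beta_hat
    T = float(d @ np.linalg.solve(L @ cov_beta @ L.T, d))
    return T, np.linalg.matrix_rank(L)

beta_hat = np.array([0.1, 2.5, -1.8])
cov_beta = np.diag([0.04, 0.04, 0.04])
L = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # H0: beta_1 = beta_2 = 0
T, df = wald_stat(beta_hat, cov_beta, L)
# compare T against the chi-square critical value with df = rank(L) = 2
```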
29 Hypothesis Testing for \beta

The Wald test statistic is based on estimated standard errors, which underestimate the true variability in \hat{\beta}, as they ignore the variability introduced by estimating \theta. This problem is often alleviated by using approximate t- or F-statistics instead:

For a single parameter \beta_j in \beta, the distribution of (\hat{\beta}_j - \beta_j)/\widehat{SE}(\hat{\beta}_j) is approximated by a t-distribution.

For general hypotheses (10), an F-approximation to the distribution of

F = \frac{(\hat{\beta} - \beta)'L'\left(L(X'\hat{V}^{-1}X)^{-1}L'\right)^{-1}L(\hat{\beta} - \beta)}{rank(L)}

is used, with numerator degrees of freedom rank(L).
30 Hypothesis Testing for \beta

The t degrees of freedom, or the denominator degrees of freedom for the F-distribution, are estimated from the data. Common methods include a so-called Satterthwaite-type approximation (Satterthwaite, Psychometrika, 1941). (The general idea is to match the moments of the distributions.)

For small samples, the method by Kenward and Roger (Biometrics, 1997) gives better results, as it first estimates the amount by which the asymptotic covariance matrix underestimates (in a matrix sense) Cov(\hat{\beta}), inflates the variance-covariance matrix by this amount, and then uses a Satterthwaite-type approach.

Computation of Satterthwaite-type degrees of freedom is computationally expensive and can add substantially to computation time.
31 Hypothesis Testing for \beta

We could also use a likelihood ratio test (LRT) based on large-sample theory. When n is not that large, the LRT tends to be more reliable than the Wald test. For an LRT, we use that the test statistic

T_{LRT} = 2 \sup_{H_A} l(\beta, \theta) - 2 \sup_{H_0} l(\beta, \theta)

under the null hypothesis converges in distribution to a \chi^2_{rank(L)} variable.

We need nested models with the same covariance structure in each model. And don't use this with REML estimation! (Due to the different fixed effects under the null and alternative, the error contrasts are different and the two likelihoods are thus not comparable.)

Also, be aware that the approximation can be very poor when n - p is small.
32 Hypothesis Testing for D

Suppose now we want to test for certain variance/covariance parameters. Let K be the dimension of D_0,

D_0 = \begin{pmatrix} d_{11} & \cdots & d_{1K} \\ \vdots & \ddots & \vdots \\ d_{1K} & \cdots & d_{KK} \end{pmatrix},

and let D_1 denote its upper-left (K-1) \times (K-1) block. We might be interested in testing

H_0: D_0 = \begin{pmatrix} D_1 & 0 \\ 0' & 0 \end{pmatrix} with D_1 positive-definite ((K-1) \times (K-1))

versus

H_A: D_0 positive-semidefinite (K \times K).

An example would be to test for a random intercept model versus a growth-curve model (K = 2), or for no random effect versus a random intercept (K = 1).
33 Hypothesis Testing for D

This case violates one of the standard regularity assumptions, namely that the tested parameters be in the interior of the parameter space. Instead, d_{KK} as a variance has to be non-negative, and d_{KK} = 0 thus lies on the boundary of the parameter space. (Actually, D_0 and D_1 have to fulfill more complicated restrictions to be positive-(semi)definite.)

One can show that for the above hypothesis, T_{LRT} under the null converges in distribution to a variable with a 0.5 : 0.5 mixture of a \chi^2_{K-1} and a \chi^2_K distribution (Self & Liang, JASA, 1987; Stram & Lee, Biometrics, 1994, 1995; Giampaoli & Singer, JSPI, 2009).

The asymptotics assume the longitudinal linear mixed model (4) with I \to \infty. Crainiceanu & Ruppert (JRSS-B, 2004) show that if Y cannot be subdivided into a large number of independent subvectors Y_i, this mixture distribution is typically a bad approximation.
34 Hypothesis Testing for D

Crainiceanu & Ruppert (JRSS-B, 2004) also derive the exact distribution for the case K = 1 (for the general LMM (1)). This is implemented in the R package RLRsim. Some extensions can be found in Greven et al. (JCGS, 2008) and Scheipl, Greven & Küchenhoff (CSDA, 2008).

Note also that the \chi^2_{K-1} : \chi^2_K mixture distribution given above is not valid if a nuisance parameter is also on the boundary (e.g. if, when testing for a random slope, the random intercept is also (near) zero). Self & Liang (JASA, 1987) show that then the whole geometry of the problem changes.

For more complex hypotheses for D_0, more complex chi-square mixtures can arise (Self & Liang, JASA, 1987; Stram & Lee, Biometrics, 1994, 1995).

When testing only for variance components, one can also use a restricted LRT (RLRT) after REML estimation. The asymptotics are the same.
35 Hypothesis Testing for D

Molenberghs & Verbeke (American Statistician, 2007) look at score tests and Wald tests in addition to LRTs and discuss how the same asymptotics apply to these two tests. However, they are more difficult to compute in the constrained case, as they require additional derivatives as well as numerical constrained optimization.
36 Precision of the BLUPs

Remember

\begin{pmatrix} \hat{\beta} \\ \hat{b} \end{pmatrix} = (C'R^{-1}C + G)^{-1} C'R^{-1}Y,

where C = (X \; Z) and G = \begin{pmatrix} 0 & 0 \\ 0 & D^{-1} \end{pmatrix}.

The common variance-covariance matrix can be shown to be (blackboard)

Cov\begin{pmatrix} \hat{\beta} \\ \hat{b} - b \end{pmatrix} = (C'R^{-1}C + G)^{-1}.

Note that Cov(\hat{b} - b) is used rather than Cov(\hat{b}), as the latter does not recognize the random variation in b.

Estimated versions can be used to construct approximate confidence/prediction intervals for expressions involving the BLUPs (such as \hat{Y}_i or new observations).
37 Model Selection

Mixed model software also gives AIC and BIC values. These are based on the marginal likelihood, i.e. the likelihood from the marginal model (6). For example, the Akaike information criterion (AIC) is typically defined as

AIC = -2\, l(\hat{\beta}, \hat{\theta}) + 2k,

where k = p + d is the number of unknown parameters (with d the number of unknown parameters in D). (The AIC using the marginal restricted log-likelihood is defined analogously.)

However, be aware that the derivation of the AIC assumes a parameter space R^k and independent and identically distributed (i.i.d.) observations. The AIC thus suffers from problems similar to those of LRTs (boundary!) when it is used to select parameters in D / random effects. It will choose the smaller model more often than you think.
38 Model Selection

There are approaches that use an AIC based on the conditional log-likelihood for this case (Vaida & Blanchard, Biometrika, 2005; Liang, Wu & Zou, Biometrika, 2008). However, they still suffer from problems, and more research is needed: the first approach always chooses the larger model, and the second approach is computationally very demanding and not very robust (Greven & Kneib, TechReport, 2008).
39 Model Diagnostics

Fully covering model diagnostics would lead too far here. A few remarks:

Normality assumption for the random effects: due to the shrinkage effect, the BLUPs \hat{b} can look normal even if the true distribution of b is not normal (e.g. bimodal). Verbeke & Molenberghs, 2000 discuss fitting mixture distributions for b instead, to check for normality.

Residual diagnostics: remember that Cov(Y_i - X_i\beta) = V_i. Thus the residual vector r_i = Y_i - X_i\hat{\beta} will have mean zero, but will be correlated. Diagnostics are typically based on standardized residuals r_i^* = L_i^{-1}r_i, where V_i = L_iL_i' is the Cholesky decomposition. (E.g. residual plots, Mahalanobis distance between observed and predicted responses, semi-variogram.) The r_i^* are approximately uncorrelated with unit variance.

One could also use the subject-specific residuals Y_i - X_i\hat{\beta} - Z_i\hat{b}_i to, e.g., look at serial correlation. However, \hat{b}_i depends strongly on the normality assumption for b and is also influenced by the assumed structure for V_i. These residuals are therefore not well suited to check assumptions on V_i.

Influence diagnostics: we can remove one subject from the analysis and recompute the parameter estimates. There are several distance measures to quantify the influence of a single subject.
40 In R

See examples_mixed_models.r for example code for
- linear mixed models
- additive mixed models
- testing for variance components.

A reference for using the lme() function in the nlme package in R: Pinheiro & Bates, 2000.
A reference for additive mixed models or generalized additive mixed models using the mgcv package in R: Wood.
For many examples in SAS, see Verbeke & Molenberghs, 2000.
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationPenalized Splines - A statistical Idea with numerous Applications...
Penalized Splines - A statistical Idea with numerous Applications... Göran Kauermann Ludwig-Maximilians-University Munich Graz 7. September 2011 1 Penalized Splines - A statistical Idea with numerous Applications...
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationCS229 Lecture notes. Andrew Ng
CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually
More informationAdequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection
Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics
More informationChapter 6: Multivariate Cointegration Analysis
Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationMissing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13
Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional
More informationReview of the Methods for Handling Missing Data in. Longitudinal Data Analysis
Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics
More informationLinear Models for Continuous Data
Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationLeast-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationApplied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationMaster s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
More informationReal longitudinal data analysis for real people: Building a good enough mixed model
Tutorial in Biostatistics Received 28 July 2008, Accepted 30 September 2009 Published online 10 December 2009 in Wiley Interscience (www.interscience.wiley.com) DOI: 10.1002/sim.3775 Real longitudinal
More information2013 MBA Jump Start Program. Statistics Module Part 3
2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More information5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationCHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA
Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationHow To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationStatistics in Retail Finance. Chapter 6: Behavioural models
Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationChapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data.
Chapter 15 Mixed Models A flexible approach to correlated data. 15.1 Overview Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects, e.g., students within classrooms,
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More informationDepartment of Epidemiology and Public Health Miller School of Medicine University of Miami
Department of Epidemiology and Public Health Miller School of Medicine University of Miami BST 630 (3 Credit Hours) Longitudinal and Multilevel Data Wednesday-Friday 9:00 10:15PM Course Location: CRB 995
More informationChapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
More informationSmoothing and Non-Parametric Regression
Smoothing and Non-Parametric Regression Germán Rodríguez grodri@princeton.edu Spring, 2001 Objective: to estimate the effects of covariates X on a response y nonparametrically, letting the data suggest
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationMultivariate normal distribution and testing for means (see MKB Ch 3)
Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationImputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
More informationPenalized Logistic Regression and Classification of Microarray Data
Penalized Logistic Regression and Classification of Microarray Data Milan, May 2003 Anestis Antoniadis Laboratoire IMAG-LMC University Joseph Fourier Grenoble, France Penalized Logistic Regression andclassification
More informationUsing Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses
Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses G. Gordon Brown, Celia R. Eicheldinger, and James R. Chromy RTI International, Research Triangle Park, NC 27709 Abstract
More information11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression
Frank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c11 2013/9/9 page 221 le-tex 221 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationE(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F
Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,
More information**BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1.
**BEGINNING OF EXAMINATION** 1. You are given: (i) The annual number of claims for an insured has probability function: 3 p x q q x x ( ) = ( 1 ) 3 x, x = 0,1,, 3 (ii) The prior density is π ( q) = q,
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:
More informationAn extension of the factoring likelihood approach for non-monotone missing data
An extension of the factoring likelihood approach for non-monotone missing data Jae Kwang Kim Dong Wan Shin January 14, 2010 ABSTRACT We address the problem of parameter estimation in multivariate distributions
More informationAN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,
More informationIndividual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA
Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More informationThe Basic Two-Level Regression Model
2 The Basic Two-Level Regression Model The multilevel regression model has become known in the research literature under a variety of names, such as random coefficient model (de Leeuw & Kreft, 1986; Longford,
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More information3.1 Least squares in matrix form
118 3 Multiple Regression 3.1 Least squares in matrix form E Uses Appendix A.2 A.4, A.6, A.7. 3.1.1 Introduction More than one explanatory variable In the foregoing chapter we considered the simple regression
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationClustering in the Linear Model
Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationRegression Modeling Strategies
Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions
More informationIntroducing the Multilevel Model for Change
Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.
More informationi=1 In practice, the natural logarithm of the likelihood function, called the log-likelihood function and denoted by
Statistics 580 Maximum Likelihood Estimation Introduction Let y (y 1, y 2,..., y n be a vector of iid, random variables from one of a family of distributions on R n and indexed by a p-dimensional parameter
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationPrediction for Multilevel Models
Prediction for Multilevel Models Sophia Rabe-Hesketh Graduate School of Education & Graduate Group in Biostatistics University of California, Berkeley Institute of Education, University of London Joint
More information