Slide 1 of 34: Beckman HLM Reading Group: Questions, Answers and Examples
Carolyn J. Anderson
Department of Educational Psychology
University of Illinois at Urbana-Champaign
Slide 2 of 34: Outline
  Logistic regression example
  Reaction time example
  Question 3
Feel free to browse lecture notes at: and Slides at
Slide 3 of 34: Binary Logistic Regression
Response variable: $Y_{ijk}$ = targetfix (target fixation?)
  $y_{ijk} = 0$: n = 70,198 (58.89%)
  $y_{ijk} = 1$: n = 48,996 (41.11%)
where $i$ = subject, $j$ = trial ID, $k$ = replication.
39 subjects, 44 trials, 72 replications.
Total number of observations = 119,194.
Time in seconds: $\bar{x} = 0.649$, $s = 0.207$, min = 0.3, max = 1.0.
Gender of speaker (half female, half male).
Looks like a fully crossed design: Subject × Trial ID (× Replication).
Slide 4 of 34: Random Effect Logistic Regression Model
Random component:
$$y_{ijk} \mid U_{0i}, U_{0j} \sim \text{binomial}(\pi_{ijk})$$
Link function: logit (natural log of odds)
$$\ln\left(\frac{P(Y_{ijk}=1)}{P(Y_{ijk}=0)}\right) = \ln\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \eta_{ijk}$$
Linear predictor:
$$\eta_{ijk} = \beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j},$$
where
$$\begin{pmatrix} U_{0i} \\ U_{0j} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau^2_{Ss} & 0 \\ 0 & \tau^2_{Tr} \end{pmatrix} \right)$$
Slide 5 of 34: Putting it All Together
The conditional model:
$$\ln\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}$$
or
$$P(Y_{ijk}=1 \mid U_{0i}, U_{0j}) = \frac{\exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}{1 + \exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}$$
The $U_{0i}$ and $U_{0j}$ are unobserved contributions to the intercept of the model. We assume that they are random and estimate their variances.
The model is collapsed over the Us; they are integrated out. The model that is estimated is
$$P(Y_{ijk}=1) = \int\!\!\int \frac{\exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}{1 + \exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]} \, f(U_{0i}, U_{0j}) \, dU_{0i} \, dU_{0j}$$
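As a rough illustration of what "integrating out" the Us means, the sketch below averages the conditional probability over the random intercepts numerically. It relies on the model's assumption that $U_{0i}$ and $U_{0j}$ are independent normals, so that their sum is normal with variance $\tau^2_{Ss} + \tau^2_{Tr}$. The function names, the simple trapezoid rule, and every numeric value are assumptions made for illustration; this is not the algorithm the estimation software uses.

```python
import math

def expit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def marginal_prob(eta_fixed, var_subject, var_trial, n_steps=2000):
    """Approximate P(Y = 1) by averaging expit(eta_fixed + u) over
    u = U_0i + U_0j ~ N(0, var_subject + var_trial) with a trapezoid rule."""
    total_var = var_subject + var_trial
    if total_var == 0.0:
        return expit(eta_fixed)             # no random effects: nothing to integrate
    sd = math.sqrt(total_var)
    lo, hi = -8.0 * sd, 8.0 * sd            # +/- 8 sd covers essentially all the mass
    step = (hi - lo) / n_steps
    num = 0.0
    den = 0.0
    for m in range(n_steps + 1):
        u = lo + m * step
        w = math.exp(-0.5 * (u / sd) ** 2)  # unnormalised normal density
        if m == 0 or m == n_steps:
            w *= 0.5                        # trapezoid end-point weights
        num += w * expit(eta_fixed + u)
        den += w
    return num / den                        # dividing by den normalises the density

# Hypothetical fixed part of the linear predictor and illustrative variances.
eta = 1.0
print(expit(eta))                       # conditional probability at U_0i = U_0j = 0
print(marginal_prob(eta, 0.12, 0.14))   # marginal probability, pulled toward 0.5
```

The marginal probability sits closer to 0.5 than the conditional one, which is why coefficients from a random effects model and from a marginal model are not directly comparable.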
Slide 6 of 34: Estimation
For Normal Models: MLE and REML.
For Others: A difficult problem.
  Gauss-Hermite quadrature: MLE, but problematic for a large number of random effects.
  Laplace: Does pretty well (close to MLE).
  Bayesian: Difficult and very time consuming.
  Others: Can lead to very biased results, especially estimates of variances and covariances (i.e., τs). Active area of development.
If only interested in the population model, then use GEE (a marginal model and not a random effects one).
Slide 7 of 34: Summary of Logistic Models fit To Data
Entries are Est. (Std Err).

Effect                 No Random    Random Ss    Random Trial   Random Both
Fixed Effects (regression coefficients)
  Intercept  β         … (.02)      3.04 (.06)   3.04 (.08)     3.12 (.10)
  female     β         … (.01)      0.39 (.01)   0.39 (.11)     0.40 (.11)
  male
  time       β         … (.03)      3.73 (.03)   3.74 (.03)     3.84 (.03)
Random Effects (Variances)
  Subject  $\tau^2_{Ss}$              … (.03)                     0.12 (.03)
  Trial    $\tau^2_{Tr}$                           0.13 (.03)     0.14 (.03)
# param                …
lnlike                 146,…
AIC                    146,…
BIC                    146,…
Slide 8 of 34: An even better model?
Fit Statistics
  -2 Log Likelihood            …
  AIC (smaller is better)      …
  BIC (smaller is better)      …
Covariance Parameter Estimates
  Cov Parm     Subject    Estimate    Standard Error
  Intercept    trialid    …           …
  Intercept    subject    …           …
Slide 9 of 34: An even better model...
Solutions for Fixed Effects
  Effect         Estimate   Standard Error   t Value   Pr > |t|
  Intercept      …          …                …         <.0001
  Female         …          …                …         …
  Male           0          .                .         .
  time           …          …                …         <.0001
  time*female    …          …                …         <.0001
  time*male      …          …                Infty     <.0001
A problem here?
Slide 10 of 34: What's Going On
Message in the LOG file:
  NOTE: Convergence criterion (GCONV=1E-8) satisfied.
  NOTE: At least one element of the gradient is greater than 1e-3.
  NOTE: PROCEDURE GLIMMIX used (Total process time):
        real time           … seconds
        cpu time            … seconds
Elements of the gradient should be 0 at the maximum of the likelihood.
In this model, the largest element of the gradient is …
Estimation methods other than Laplace did not converge.
Model too complex for the data? The model is not a good one for the data.
Slide 11 of 34
Reaction times are (generally) not normal: non-negative, continuous, and positively skewed.
$y \sim \text{Gamma}(\mu, \phi)$, and possibly ln as the link.
[Figure: gamma densities $f(y)$ against $y$ for Gamma(4,1), Gamma(6,0.50), Gamma(4,0.50), and Gamma(6,0.33).]
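To make the "positively skewed" point concrete, here is a small sketch that evaluates gamma densities and their moments for the four curves named in the figure. It assumes the figure labels are (shape, scale) pairs, which the slide does not state; the function names are likewise illustrative.

```python
import math

def gamma_pdf(y, shape, scale):
    """Gamma density in the shape/scale parameterization (an assumption here)."""
    if y <= 0.0:
        return 0.0                      # the gamma has no mass at or below zero
    return (y ** (shape - 1.0) * math.exp(-y / scale)
            / (math.gamma(shape) * scale ** shape))

def gamma_summary(shape, scale):
    """Mean, variance, and skewness of a gamma distribution."""
    mean = shape * scale
    variance = shape * scale ** 2
    skewness = 2.0 / math.sqrt(shape)   # always positive: right-skewed
    return mean, variance, skewness

# The four (shape, scale) pairs are read off the figure labels under that assumption.
for shape, scale in [(4, 1.0), (6, 0.50), (4, 0.50), (6, 0.33)]:
    mean, variance, skew = gamma_summary(shape, scale)
    print(f"Gamma({shape},{scale}): mean={mean:.2f} var={variance:.2f} skew={skew:.2f}")
```

Under this parameterization the variance grows with the mean, which is one reason a gamma model can suit reaction times better than a normal model with constant variance.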
Slide 12 of 34: The Data: Overall Distribution [figure]
Slide 13 of 34: Distribution of RT without Outliers [figure]
Slides 14-16 of 34: Distribution of RT1 by Verb [figures]
Slide 17 of 34: How about distribution for some Subjects [figure]
Slide 18 of 34: A couple more [figure]
Slide 19 of 34: ... and more [figure]
Slide 20 of 34: ... and even some more [figure]
Slide 21 of 34: A Model for the Data
$y_{ijk}$ = reaction time for subject $i$ on item $j$ on replication $k$.
Random component:
$$y_{ijk} \mid U_{0i}, U_{0j} \sim \text{Gamma}(\mu_{ijk}, V_{ijk})$$
Link:
$$\ln(\mu_{ijk}) = \eta_{ijk}$$
Linear predictor:
$$\eta_{ijk} = \underbrace{\beta_0 + U_{0i} + U_{0j}}_{\text{intercept}} + \underbrace{\beta_1 \text{age}_i + \beta_2 \text{gender}_i}_{\text{subject specific}} + \underbrace{\beta_3 DO_j + \beta_4 M_j + \beta_5 MM_j + \beta_6 O_j}_{\text{item specific}}$$
where $DO_j$, $M_j$, $MM_j$ and $O_j$ are dummy codes for verb bias (note: all equal 0 when verb bias is SO).
Generalized Linear Mixed Model, in the scale of the data:
$$\mu_{ijk} = \exp[\beta_0 + U_{0i} + U_{0j} + \beta_1 \text{age}_i + \beta_2 \text{gender}_i + \beta_3 DO_j + \beta_4 M_j + \beta_5 MM_j + \beta_6 O_j]$$
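The following sketch shows how the linear predictor and the log link combine to give a mean on the scale of the data, and how the dummy codes for verb bias enter. All coefficient values are hypothetical placeholders (the fitted estimates are not reproduced here), and the function and variable names are invented for the example.

```python
import math

# Hypothetical coefficients, for illustration only (not the fitted values).
BETA = {"intercept": 6.0, "age": 0.01, "gender": -0.05,
        "DO": 0.10, "M": 0.04, "MM": 0.05, "O": 0.02}

VERB_BIAS_LEVELS = ("DO", "M", "MM", "O")   # SO is the reference category

def dummy_codes(verb_bias):
    """Return the four dummy codes; all are 0 when verb_bias is 'SO'."""
    if verb_bias != "SO" and verb_bias not in VERB_BIAS_LEVELS:
        raise ValueError(f"unknown verb bias: {verb_bias}")
    return {level: 1 if verb_bias == level else 0 for level in VERB_BIAS_LEVELS}

def conditional_mean(age, gender, verb_bias, u_subject=0.0, u_item=0.0):
    """mu_ijk = exp(eta_ijk) for given covariates and random intercepts."""
    codes = dummy_codes(verb_bias)
    eta = (BETA["intercept"] + u_subject + u_item
           + BETA["age"] * age + BETA["gender"] * gender
           + sum(BETA[level] * codes[level] for level in VERB_BIAS_LEVELS))
    return math.exp(eta)

# With a log link, effects are multiplicative on the data scale:
ratio = conditional_mean(20, 1, "DO") / conditional_mean(20, 1, "SO")
print(ratio, math.exp(BETA["DO"]))   # the two agree up to rounding
```

Because of the log link, $\exp(\beta_3)$ is the ratio of the expected reaction time for a DO item to that for an SO item, holding the other terms fixed.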
Slide 22 of 34: Parameters of the Model
Since subjects and items are viewed as random samples, we assume a distribution for the unobserved effects $U_{0i}$ and $U_{0j}$:
$$\begin{pmatrix} U_{0i} \\ U_{0j} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau^2_{Ss} & 0 \\ 0 & \tau^2_{It} \end{pmatrix} \right)$$
We collapse over, or integrate out, the random effects:
$$\mu_{ijk} = \int\!\!\int \exp[\beta_0 + U_{0i} + U_{0j} + \beta_1 \text{age}_i + \beta_2 \text{gender}_i + \beta_3 DO_j + \beta_4 M_j + \beta_5 MM_j + \beta_6 O_j] \, f(U_{0i}) \, f(U_{0j}) \, dU_{0i} \, dU_{0j}$$
The parameters of the distribution are the βs and the τs.
The variance $V_{ijk}$ is complicated...
Slide 23 of 34: Summary of Models fit to Data
Each model column reports est and se.

Fixed Effect           No Random    Random Subject    Random Item    Subject & Item
  intercept  $\beta_0$
  age        $\beta_1$
  female     $\beta_2$
  DO         $\beta_3$
  M          $\beta_4$
  MM         $\beta_5$
  O          $\beta_6$
Subject  $\tau^2_{Ss}$
Item     $\tau^2_{It}$
Scale    $\phi$
# params
lnLike                 170,…
AIC                    170,…
BIC                    170,…
Slide 24 of 34: A little Modification
Since the estimated parameters for M and MM are similar in value (relative to their standard errors), verb bias for these two categories was recoded as
$$x_{M,MM} = \begin{cases} 1 & \text{if verb bias was M or MM} \\ 0 & \text{otherwise} \end{cases}$$
Each model column reports est and se.

                       Original     Revised      Normal dist.
Subject  $\tau^2_{Ss}$
Item     $\tau^2_{It}$
Scale    $\phi$
# params
lnLike                 164,…
AIC                    164,…
BIC                    164,…
Slide 25 of 34: Question 1
Question: How does the modeling and estimation of random effects differ from fixed effects? Specifically, what is the term whose distribution is modeled as random, and how many parameters are estimated in this process?
Answer: We think of the subjects as a random sample and the trial IDs as a random sample. Rather than estimating an effect for each individual and each trial ID, we assume a distribution for them (the Us) and estimate the parameters of that distribution.
Since we are assuming the distribution $U \sim N(0, \tau^2)$, we only estimate its variance; that is 1 parameter ($\tau^2$) for each random effect, rather than 39 $U_{0i}$s (one for each subject) and 44 $U_{0j}$s (one for each trial type).
The Us can be estimated AFTER the model is fit to the data. They are estimated using Bayesian methods: BLUPs.
Slide 26 of 34: a
Question: Do the models calculate intercepts for each random effect? If so, is it the case that each random effect estimates an additional constant that is ADDED to the general intercept of the model, and which varies randomly according to some distribution?
... I think the examples on the previous slides may have answered this...
Slide 27 of 34: b
Question: VARIANCE COVARIANCE STRUCTURES
  What are they?
  How do we choose which one(s) to use?
  What consequences does this choice have?
Answer: If you form a model using a multilevel perspective, they are implied by the model you specify.
For Normal data:
$$\mathrm{var}(y) = \underbrace{Z T Z'}_{\text{Level 2}} + \underbrace{\sigma^2 I}_{\text{Level 1}}$$
You can also specify a particular form for $T$ (usually this is unstructured) and/or a particular form for the level 1 covariance matrix (e.g., AR(lag) for longitudinal data).
The conditional variance (regardless of the distribution of $y$) is $\mathrm{var}(y \mid U) = Z T Z'$. The marginal covariance matrix is much more complicated.
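For the Normal-data case, the implied structure can be seen by building $ZTZ' + \sigma^2 I$ for a single cluster. The sketch below does this for a random-intercept model, where $Z$ is a column of ones and $T$ is the 1×1 matrix holding $\tau^2$; the helper name and the numbers are hypothetical.

```python
def marginal_covariance(Z, T, sigma2):
    """Return Z T Z' + sigma2 * I for one cluster, using plain nested lists."""
    n, q = len(Z), len(T)
    # ZT is the n x q product of Z and T.
    ZT = [[sum(Z[r][a] * T[a][b] for a in range(q)) for b in range(q)]
          for r in range(n)]
    # V starts as the n x n product (ZT) Z'.
    V = [[sum(ZT[r][b] * Z[c][b] for b in range(q)) for c in range(n)]
         for r in range(n)]
    for r in range(n):
        V[r][r] += sigma2               # add the level-1 variance on the diagonal
    return V

# Random-intercept model for a cluster of 3 observations (hypothetical numbers).
Z = [[1.0], [1.0], [1.0]]
T = [[0.5]]                             # tau^2
sigma2 = 1.5
V = marginal_covariance(Z, T, sigma2)
for row in V:
    print(row)

icc = V[0][1] / V[0][0]                 # intraclass correlation tau^2 / (tau^2 + sigma^2)
print(icc)
```

A random intercept alone therefore implies a compound-symmetric covariance: equal variances on the diagonal and a common covariance $\tau^2$ everywhere else. Adding a random slope would add a column to $Z$ and a row and column to $T$.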
Slide 28 of 34: Testing a Random Effect
Question: What is the recommended procedure for adding/taking factors out of models when you are model testing?
Answer:
If it is a fixed effect, SAS gives tests for the effect and tests for individual parameters. If not significant, then I remove it from the model and do a likelihood ratio test (LR is more powerful and isn't as sensitive to multicollinearity).
If it is a random effect (variance) and you're using MLE (or Laplace), then
  Fit a model with and without the random effect (a model with the variances and covariances for it, and one without).
  Compare the LR statistic to a mixture of chi-square distributions. This is done by getting a p-value from $\chi^2_{df}$, where df is computed in the usual way, and a p-value from $\chi^2_{df-1}$. Take the average of these two.
Slide 29 of 34: Testing a Random Effect
Using the data from the logistic regression example...
$$H_0: \tau^2_{Tr} = 0$$
Using the models with both subject and trial effects:
LR = … = …
The p-value comparing this to $\chi^2_1$ is tiny, and the p-value from $\chi^2_0$ = 0. So in this case just take half the p-value from $\chi^2_1$.
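The averaging rule for testing a variance can be sketched in a few lines. The code below computes chi-square upper-tail probabilities for integer df with only the standard library and then averages the $\chi^2_{df}$ and $\chi^2_{df-1}$ p-values; $\chi^2_0$ is treated as a point mass at zero, so its p-value is 0 for any positive LR. The function names and the LR value are hypothetical, since the statistic from the example is not reproduced here.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 0.
    df = 0 is treated as a point mass at zero."""
    if df < 0:
        raise ValueError("df must be non-negative")
    if x <= 0.0:
        return 1.0                      # a non-positive LR gives a p-value of 1
    half = x / 2.0
    if df % 2 == 0:
        tail, d = 0.0, 0                # start from chi-square with 0 df
    else:
        tail, d = math.erfc(math.sqrt(half)), 1   # chi-square with 1 df
    while d < df:
        # Recurrence: Q(d + 2) = Q(d) + (x/2)^(d/2) * exp(-x/2) / Gamma(d/2 + 1)
        tail += half ** (d / 2.0) * math.exp(-half) / math.gamma(d / 2.0 + 1.0)
        d += 2
    return tail

def mixture_pvalue(lr, df):
    """Average of the p-values from chi-square(df) and chi-square(df - 1)."""
    return 0.5 * chi2_sf(lr, df) + 0.5 * chi2_sf(lr, df - 1)

# Hypothetical LR statistic for testing one variance (df = 1).
lr = 3.84
print(chi2_sf(lr, 1))         # naive p-value from chi-square(1)
print(mixture_pvalue(lr, 1))  # half of the naive p-value
```

With df = 1 the mixture p-value is exactly half the naive one, which matches the "just take half" shortcut on the slide.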
Slide 30 of 34: a
Question: When do you include all the factors that you want to control for in your final models ("psychology/descriptive approach") vs. including only the predictors that significantly improve the fit of the model to the data ("statistical/predictive approach")?
Answer: The fixed effects that you include in the model affect the variances, and the random effects that you include in the model affect the fixed effects. I think the best approach is to find the best model for the data both in terms of fixed effects and random effects (i.e., if effects that you're controlling for are not significant, take them out).
Slide 31 of 34: b
Question: What do you conclude when model comparison indicates a factor should be in the model (say according to a deviance statistic or AIC), but the factor is not significant according to a test statistic for that factor (say a z-statistic)?
Answer: It depends on
  Does your theory say it should be significant?
  Is the effect important? Is the effect large or small?
  When you take it out of the model or include it, do the other estimated parameters stay basically the same or do they change?
  How many tests have you done?
By deviance, I'm assuming you mean the likelihood ratio test statistic. LR tests are more powerful than z, t, and score tests.
In the abstract, I'd say keep it in the model (at least at this point in modeling; revisit it when you have a final model).
Slide 32 of 34: c
Question: For repeated-measures designs in which a set of items is rotated through the experimental conditions across a series of lists, is it good to include list as a factor in the model (and if so, how do you test with the resulting confounding of subjects, items, etc.)?
Answer: I'm not sure I understand the design... so at this point, I don't have an answer.
Slide 33 of 34
Dummy vs Effect Coding: For random intercept models, the only difference is how to interpret the parameters. Choose the one that's easier or more natural (e.g., I used dummy in the examples). For random slope models, the coding may matter depending on whether the variable(s) has a random slope.
MLM = Multilevel Logistic Model, and HLM = Hierarchical Linear Model. MLM and HLM are both special cases of GLMMs (Generalized Linear Mixed Models).
Ordinal dependent variables: Use a model designed for ordinal logistic regression and add random effects (e.g., proportional odds models, etc.).
Multicollinearity: The problem is basically the same as for normal linear regression. There are some additional problems with GLMMs and special cases of them (e.g., separation).
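The point about dummy versus effect coding can be checked with a two-level factor: the two codings give the same predictions and differ only in what the intercept and slope mean. The sketch below uses hypothetical coefficients and assumes effect coding with values -1 and +1.

```python
def predict(intercept, slope, code):
    """Fixed part of the linear predictor for a single coded factor."""
    return intercept + slope * code

# Hypothetical dummy-coded fit: male = 0, female = 1.
b0_dummy, b1_dummy = 3.0, 0.4

# Equivalent effect-coded parameters: male = -1, female = +1.
b0_effect = b0_dummy + b1_dummy / 2.0   # average of the two group predictions
b1_effect = b1_dummy / 2.0              # half the female-minus-male difference

for label, dummy_code, effect_code in [("male", 0, -1), ("female", 1, 1)]:
    print(label,
          predict(b0_dummy, b1_dummy, dummy_code),
          predict(b0_effect, b1_effect, effect_code))
```

Under dummy coding the intercept is the prediction for the reference group; under effect coding it is the average over groups. The fitted values agree up to floating-point rounding.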
Slide 34 of 34: Miscellaneous continued
I don't use R for these kinds of models (yet), so I don't know how to code a 3-level model. In SAS you just add another RANDOM statement and indicate nesting.
How can a CI be used to determine the level of a factor that is driving a significant effect?
Answer: Check to see whether 0 is in the interval, or test whether the parameter for a level is significantly different from 0. If it is, or some look very similar, try recoding, refit the model, and do a likelihood ratio test.
More informationRandom effects and nested models with SAS
Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/
More informationMath 141. Lecture 29: Logistic Regression Details. Albyn Jones 1. 1 Library jones/courses/141
Math 141 Lecture 29: Logistic Regression Details Albyn Jones 1 1 Library 304 jones@reed.edu www.people.reed.edu/ jones/courses/141 Extending The Linear Model Nonlinear Least Squares: briefly... Generalized
More informationBasic Statistical and Modeling Procedures Using SAS
Basic Statistical and Modeling Procedures Using SAS OneSample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom
More informationGeneralized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)
Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through
More informationLecture 17: Logistic Regression: Testing Homogeneity of the OR
Lecture 17: Logistic Regression: Testing Homogeneity of the OR Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University
More informationOrdered Logit Models Vartanian: SW 541
Ordered Logit Models Vartanian: SW 541 You are examining an ordered logit model the age at first marriage. You have a 3 level dependent variable for age at first marriage. Level 1 indicates the person
More informationE(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F
Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,
More informationMixed models in R using the lme4 package Part 5: Generalized linear mixed models
Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates 8 th International Amsterdam Conference on Multilevel Analysis 20110316 Douglas Bates
More informationIntroduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group
Introduction to Multilevel Modeling Using HLM 6 By ATS Statistical Consulting Group Multilevel data structure Students nested within schools Children nested within families Respondents nested within interviewers
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationModels of binary outcomes with 3level data: A comparison of some options within SAS. CAPS Methods Core Seminar April 19, 2013.
Models of binary outcomes with 3level data: A comparison of some options within SAS CAPS Methods Core Seminar April 19, 2013 Steve Gregorich SEGregorich 1 April 19, 2013 Designs I. Cluster Randomized
More informationHLM software has been one of the leading statistical packages for hierarchical
Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush
More informationStructural Equation Models for Comparing Dependent Means and Proportions. Jason T. Newsom
Structural Equation Models for Comparing Dependent Means and Proportions Jason T. Newsom How to Do a Paired ttest with Structural Equation Modeling Jason T. Newsom Overview Rationale Structural equation
More informationLecture 20: Logit Models for Multinomial Responses
Lecture 20: Logit Models for Multinomial Responses Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South
More informationThe sinking of the Titanic
The sinking of the Titanic The logistic regression model is a member of a general class of models called log linear models. These models are particularly useful when studying contingency tables (tables
More informationEPID 766: Analysis of Longitudinal Data from Epidemiologic Studies
Epid 766 D. Zhang EPID 766: Analysis of Longitudinal Data from Epidemiologic Studies Daowen Zhang zhang@stat.ncsu.edu http://www4.stat.ncsu.edu/ dzhang2 Graduate Summer Session in Epidemiology Slide 1
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationPerform hypothesis testing
Multivariate hypothesis tests for fixed effects Testing homogeneity of level1 variances In the following sections, we use the model displayed in the figure below to illustrate the hypothesis tests. Partial
More informationLab 5: GROWTH CURVE MODELING (from pages and of the old textbook edition and starting on page 210 of the new edition)
BIO656 008 Lab 5: GROWTH CURVE MODELING (from pages 7887 and 994 of the old textbook edition and starting on page 0 of the new edition) Data: Weight gain in Asian children in Britain. Variables id: child
More informationAnalyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest
Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t
More information5 Logistic Regression
5 Logistic Regression I The logistic regression model Data (x i, Y i ) x i = covariate, indep or explanatory variable continuous Y i = response Y i x i Bin(n i, π(x i )) E(Y i /n i x i ) = π(x i ) var(y
More informationBinary Logistic Regression
Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including
More informationLogistic Regression. Psy 524 Ainsworth
Logistic Regression Psy 524 Ainsworth What is Logistic Regression? Form of regression that allows the prediction of discrete variables by a mix of continuous and discrete predictors. Addresses the same
More informationOutline. Topic 31  Multiple Logistic Regression. Multiple Logistic Regression. Example Page 573
Topic 31  Multiple Logistic Regression  Fall 2013 Outline Multiple Logistic Regression Model Inference Diagnostics and remedies Polytomous Logistic Regression Ordinal Nominal Topic 31 2 Multiple Logistic
More informationMLM 2007 Marginal vs RE models, Ordinal Responses (and other musings )
MLM 2007 Marginal vs RE models, Ordinal Responses (and other musings ) Michael Griswold Guest Lecture Discussion Outline MLM review: Goals & Concepts Marginal & RandomEffect Models: Logistic: PA & SS
More informationIllustration (and the use of HLM)
Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we does the use of the software HLM. In addition, we will
More informationUsing Stata for Logistic Regression
Using Stata for Logistic Regression NOTE: The routines spostado and lrdrop1 are used in this handout. Use the findit command to locate and install them. See related handouts for the statistical theory
More information5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits
Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. HosmerLemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional
More informationdata on Down's syndrome
DATA a; INFILE 'downs.dat' ; INPUT AgeL AgeU BirthOrd Cases Births ; MidAge = (AgeL + AgeU)/2 ; Rate = 1000*Cases/Births; LogRate = Log( (Cases+0.5)/Births ); LogDenom = Log(Births); age_c = MidAge  30;
More informationGeneralized Multilevel Models for Non Normal Data
Generalized Multilevel Models for Non Normal Data Applied Multilevel Models for Cross Sectional Data Lecture 14 ICPSR Summer Workshop University of Colorado Boulder Lecture 14: Generalized Models 1 Topics
More informationSession 6. Logistic Regression
Session 6 Logistic Regression page Analysis of Binary Data 62 Models for Binary Data 63 Hypothesis Testing 64 Interpreting Logistic Regression in SPSS 65 Logistic Regression in SPSS 66 1. Regression
More informationExamples of Using R for Modeling Ordinal Data
Examples of Using R for Modeling Ordinal Data Alan Agresti Department of Statistics, University of Florida Supplement for the book Analysis of Ordinal Categorical Data, 2nd ed., 2010 (Wiley), abbreviated
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationLinda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents
Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén
More informationMultilevel Models for Categorical Data Using SAS PROC GLIMMIX: The Basics
Paper 34302015 Multilevel Models for Categorical Data Using SAS PROC GLIMMIX: The Basics ABSTRACT Mihaela Ene, Elizabeth A. Leighton, Genine L. Blue, and Bethany A. Bell University of South Carolina Multilevel
More informationSTA 4504/5503 Sample questions for exam 2 Courtesy of Alan Agresti. 1. TrueFalse questions.
STA 4504/5503 Sample questions for exam 2 Courtesy of Alan Agresti 1. TrueFalse questions. (a) For General Social Survey data on Y = political ideology (categories liberal, moderate, conservative), X
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population
More informationUsing PROC MIXED in Hierarchical Linear Models: Examples from two and threelevel schooleffect analysis, and metaanalysis research
Using PROC MIXED in Hierarchical Linear Models: Examples from two and threelevel schooleffect analysis, and metaanalysis research Sawako Suzuki, DePaul University, Chicago ChingFan Sheu, DePaul University,
More informationPolytomous Logistic Regression
page Polytomous Logistic Regression Consider now the situation where the outcome variable is not dichotomous but rather has many levels ( i.e. polytomous). For example, we will consider an outcome variable
More informationPanel Data 4: Fixed Effects vs Random Effects Models
Panel Data 4: Fixed Effects vs Random Effects Models Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 29, 2016 These notes borrow very heavily, sometimes verbatim,
More informationAIC and BIC for Survey Data
AIC and BIC for Survey Data Thomas Lumley & Alastair Scott* Department of Statistics University of Auckland t.lumley@auckland.ac.nz, a.scott@auckland.ac.nz Analysing survey data Analyzing survey data has
More informationLogit, Probit and Tobit: Models for Categorical and Limited
Logit, Probit and Tobit: Models for Categorical and Limited Dependent Variables By Rajulton Fernando Presented at PLCS/RDC Statistics and Data Series at Western March 23, 2011 Introduction In social science
More informationAn Introduction to Generalized Estimating Equations
An Introduction to Generalized Estimating Equations Cancer Prevention and Control Tutorial 16 October 2008 An Introduction to Generalized Estimating Equations p. 1/14 Repeated measures ANOVA limitations
More informationAnalysis of Longitudinal Data in Stata, Splus and SAS
Analysis of Longitudinal Data in Stata, Splus and SAS Rino Bellocco, Sc.D. Department of Medical Epidemiology Karolinska Institutet Stockholm, Sweden rino@mep.ki.se March 12, 2001 NASUGS, 2001 OUTLINE
More informationLimited Dependent Variable Models I
Limited Dependent Variable Models I Fall 2008 Environmental Econometrics (GR03) LDV Fall 2008 1 / 20 Limited Dependent Variables A limited dependent variable, Y, is de ned as a dependent variable whose
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationLogistic Regression, Part III: Hypothesis Testing, Comparisons to OLS
Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals heavily
More information11. Analysis of Casecontrol Studies Logistic Regression
Research methods II 113 11. Analysis of Casecontrol Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationRepeated Measures ANOVA/GLM Multivariate ANOVA/GLM in PROC MIXED
Repeated Measures ANOVA/GLM Multivariate ANOVA/GLM in PROC MIXED Multivariate Methods in Education ERSH 8350 Lecture #8 October 5, 2011 ERSH 8350: Lecture 8 Today s Class Using PROC MIXED for: Repeated
More informationSlides Prepared by JOHN S. LOUCKS St. Edward s s University Thomson/SouthWestern. Slide
s Prepared by JOHN S. LOUCKS St. Edward s s University 1 Chapter 13 Multiple Regression Multiple Regression Model Least Squares Method Multiple Coefficient of Determination Model Assumptions Testing for
More informationData handling rules for RMR, TEE and AREE Residuals
RMR Residuals One of the two Primary Endpoints of the study is resting metabolic rate (RMR) corrected for changes in body composition. This endpoint variable is referred to as the RMR Residual. To correct
More informationLecture 19: Conditional Logistic Regression
Lecture 19: Conditional Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina
More informationBiases. Confounding Bias: Definition. OUTLINE Review Confounding bias Multiple linear regression Inclass questions
OUTLINE Review Confounding bias Multiple linear regression Inclass questions Biases Selection bias Information bias Confounding bias Bias is an error in an epidemiologic study that results in an incorrect
More informationRepeated Measures Analysis with Discrete Data Using the SAS System
Repeated Measures Analysis with Discrete Data Using the SAS System Gordon Johnston, SAS Institute Inc, Cary, NC Abstract The analysis of correlated data arising from repeated measurements when the measurements
More informationModels for Count Data With Overdispersion
Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extrapoisson variation and the negative binomial model, with brief appearances
More informationThe Latent Variable Growth Model In Practice. Individual Development Over Time
The Latent Variable Growth Model In Practice 37 Individual Development Over Time y i = 1 i = 2 i = 3 t = 1 t = 2 t = 3 t = 4 ε 1 ε 2 ε 3 ε 4 y 1 y 2 y 3 y 4 x η 0 η 1 (1) y ti = η 0i + η 1i x t + ε ti
More informationUniversità degli Studi di Bari Italia
Page 1 of 12 Università degli Studi di Bari Italia Statistical Models for Management February 1517, 2010 Assessment There are 90 marks in total and 18 pages. please answer all questions show all working
More informationLogistic Regression for Ordinal Response Variables
Logistic Regression for Ordinal Response Variables Edpsy/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbanachampaign c Board of Trustees,
More informationUse of deviance statistics for comparing models
A likelihoodratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter
More informationCentering Predictors and Variance Decomposition
Centering Predictors and Variance Decomposition Applied Multilevel Models for Cross Sectional Data Lecture 6 ICPSR Summer Workshop University of Colorado Boulder Covered this Section We will expand on
More informationPoisson Regression or Regression of Counts (& Rates)
Poisson Regression or Regression of (& Rates) Carolyn J. Anderson Department of Educational Psychology University of Illinois at UrbanaChampaign Generalized Linear Models Slide 1 of 51 Outline Outline
More informationDeveloping Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@
Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,
More informationCentering. The meaning of the intercept in the Level1 model depends upon the location of the level1
HLM offers the options to use predictors as they are, or to use them after grand or groupmean centering them. The choice of centering method is dictated by the question studied, and great care should
More information