# I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN


## Transcription

Beckman HLM Reading Group: Questions, Answers and Examples

Carolyn J. Anderson, Department of Educational Psychology, University of Illinois at Urbana-Champaign

Linear Algebra Slide 1 of 34

Outline

- Logistic regression example
- Reaction time example
- Question 3

Feel free to browse lecture notes at: and Slides at

Binary Logistic Regression

Response variable: $Y_{ijk}$ = targetfix (target fixation?)

- $y_{ijk} = 0$: n = 70,198 (58.89%)
- $y_{ijk} = 1$: n = 48,996 (41.11%)

where $i$ = subject, $j$ = trialid, $k$ = replication.

- 39 subjects × 44 trials × 72 replications
- Total number of observations = 119,194
- Time in seconds: $\bar{x} = 0.649$, $s = 0.207$, min = 0.3, max = 1.0
- Gender of speaker (half female, half male)
- Looks like a fully crossed design: Subject × Trial ID (× Replication)

Random Effect Logistic Regression Model

Random component:

$$y_{ijk} \mid U_{0i}, U_{0j} \sim \text{binomial}(\pi_{ijk})$$

Link function: logit (natural log of odds)

$$\ln\left(\frac{P(Y_{ijk}=1)}{P(Y_{ijk}=0)}\right) = \ln\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \eta_{ijk}$$

Linear predictor:

$$\eta_{ijk} = \beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j},$$

where

$$\begin{pmatrix} U_{0i} \\ U_{0j} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau^2_{Ss} & 0 \\ 0 & \tau^2_{Tr} \end{pmatrix} \right)$$

Putting it All Together

The conditional model:

$$\ln\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}$$

or

$$P(Y_{ijk}=1 \mid U_{0i}, U_{0j}) = \frac{\exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}{1 + \exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}$$

The $U_{0i}$ and $U_{0j}$ are unobserved contributions to the intercept of the model. We assume that they are random and estimate their variances. The model is collapsed over the $U$s; they are integrated out. The model that is estimated is

$$P(Y_{ijk}=1) = \int\!\!\int \frac{\exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}{1 + \exp[\beta_0 + \beta_1(\text{male})_{ijk} + \beta_2(\text{time})_{ijk} + U_{0i} + U_{0j}]}\, f(U_{0i}, U_{0j})\, dU_{0i}\, dU_{0j}$$
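A minimal sketch of what "integrating out" the random effects means for a single observation. Because only the sum $U_{0i} + U_{0j}$ enters the linear predictor and the two effects are assumed independent normals, the sum is $N(0, \tau^2_{Ss} + \tau^2_{Tr})$, so the marginal probability of one observation reduces to a one-dimensional integral. The function names and all coefficient and variance values below are hypothetical, chosen only for illustration; they are not estimates from the slides, and the midpoint rule is just one simple way to approximate the integral.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_prob(fixed_part, var_subject, var_trial, n_steps=4000, width=8.0):
    """Approximate P(Y = 1) by averaging inv_logit(fixed_part + u) over
    u ~ N(0, var_subject + var_trial), using a midpoint rule on +/- width SDs."""
    sd = math.sqrt(var_subject + var_trial)
    if sd == 0.0:
        return inv_logit(fixed_part)  # no random effects: nothing to integrate
    lower = -width * sd
    step = 2.0 * width * sd / n_steps
    total = 0.0
    for s in range(n_steps):
        u = lower + (s + 0.5) * step
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += inv_logit(fixed_part + u) * density * step
    return total


# Hypothetical values, for illustration only.
b0, b1, b2 = -3.0, 0.4, 3.8
eta_fixed = b0 + b1 * 1 + b2 * 0.649   # male = 1, time = 0.649 seconds
print("conditional (U = 0):", inv_logit(eta_fixed))
print("marginal:", marginal_prob(eta_fixed, var_subject=0.12, var_trial=0.14))
```

The marginal probability is pulled toward 0.5 relative to the conditional probability at $U = 0$, which is one reason conditional (random effects) and marginal (population) coefficients are not interchangeable.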

Estimation

For normal models: MLE and REML.

For others: a difficult problem.

- Gauss-Hermite quadrature: MLE, but problematic for a large number of random effects.
- Laplace: does pretty well (close to MLE).
- Bayesian: difficult and very time consuming.
- Others: can lead to very biased results, especially estimates of variances and covariances (i.e., the $\tau$s).

Active area of development.

If only interested in the population model, then use GEE (a marginal model and not a random effects one).
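To make the quadrature remark concrete, here is a small sketch of a three-point Gauss-Hermite rule used to approximate an expectation over a normal random effect, plus a count of how many grid points a full tensor-product rule needs as the number of random effects grows. The helper names are hypothetical, and real software typically uses more nodes and adaptive centering and scaling; this only illustrates the idea and the cost.

```python
import math

# Three-point Gauss-Hermite rule (physicists' form): nodes and weights for
# integrals of the form  integral of exp(-x^2) * g(x) dx.
GH3_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH3_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(g, sd):
    """Approximate E[g(U)] for U ~ N(0, sd^2) with the change of variable
    u = sqrt(2) * sd * x, which turns the normal density into exp(-x^2)."""
    total = sum(w * g(math.sqrt(2.0) * sd * x)
                for x, w in zip(GH3_NODES, GH3_WEIGHTS))
    return total / math.sqrt(math.pi)


def grid_size(points_per_dimension, n_random_effects):
    """Number of evaluations for a tensor-product rule: Q ** d."""
    return points_per_dimension ** n_random_effects


# The three-point rule is exact for low-order polynomials, e.g. E[U^2] = sd^2.
print(normal_expectation(lambda u: u * u, sd=2.0))

# Cost grows exponentially with the number of random effects.
for d in (1, 2, 5, 10):
    print(d, grid_size(10, d))
```

With 10 nodes per dimension, 5 correlated random effects already require 100,000 evaluations per cluster, which is why quadrature becomes problematic for a large number of random effects.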

Summary of Logistic Models Fit to Data

Estimates with standard errors in parentheses; … marks a value missing from the transcription.

| Effect | No Random | Random Ss | Random Trial | Random Both |
|---|---|---|---|---|
| Fixed effects (regression coefficients) | | | | |
| Intercept $\beta_0$ | … (.02) | 3.04 (.06) | 3.04 (.08) | 3.12 (.10) |
| female $\beta_1$ | … (.01) | 0.39 (.01) | 0.39 (.11) | 0.40 (.11) |
| male | | | | |
| time $\beta_2$ | … (.03) | 3.73 (.03) | 3.74 (.03) | 3.84 (.03) |
| Random effects (variances) | | | | |
| Subject $\tau^2_{Ss}$ | | … (.03) | | 0.12 (.03) |
| Trial $\tau^2_{Tr}$ | | | 0.13 (.03) | 0.14 (.03) |

The slide also reports # param, lnlike, AIC, and BIC for each model; only the leading digits "146," of the lnlike, AIC, and BIC entries remain.

An even better model?

Fit Statistics (values missing from the transcription): -2 Log Likelihood, AIC (smaller is better), BIC (smaller is better).

Covariance Parameter Estimates

| Cov Parm | Subject | Estimate | Standard Error |
|---|---|---|---|
| Intercept | trialid | … | … |
| Intercept | subject | … | … |

An even better model...

Solutions for Fixed Effects (… marks a value missing from the transcription)

| Effect | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | … | … | … | <.0001 |
| Female | … | … | … | … |
| Male | 0 | . | . | . |
| time | … | … | … | <.0001 |
| time*female | … | … | … | <.0001 |
| time*male | … | … | Infty | <.0001 |

A problem here?

What's Going On

Message in the LOG file:

- NOTE: Convergence criterion (GCONV=1E-8) satisfied.
- NOTE: At least one element of the gradient is greater than 1e-3.
- NOTE: PROCEDURE GLIMMIX used (Total process time): real time … seconds, cpu time … seconds

Elements of the gradient should be 0 at the maximum of the likelihood. In this model, the largest element of the gradient is …

Estimation methods other than Laplace did not converge.

Model too complex for the data?

The model is not a good one for the data.
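The statement that the gradient should be 0 at the maximum of the likelihood can be checked by hand in a much simpler setting. The sketch below uses an intercept-only logistic model with no random effects (a deliberate simplification, not the GLIMMIX model from the slides) and the outcome counts from the data summary: 48,996 ones out of 119,194 observations. For that model the log-likelihood gradient with respect to the intercept is $\sum y - n\pi$, and the MLE is the sample logit.

```python
import math

N_TOTAL = 119_194   # total number of observations (from the data summary)
N_ONES = 48_996     # observations with y = 1 (from the data summary)


def gradient(beta0):
    """Derivative of the intercept-only logistic log-likelihood:
    sum(y) - n * pi, where pi = inv_logit(beta0)."""
    pi = 1.0 / (1.0 + math.exp(-beta0))
    return N_ONES - N_TOTAL * pi


# Closed-form MLE for this simplified model: the log odds of the sample proportion.
beta0_hat = math.log(N_ONES / (N_TOTAL - N_ONES))

print("gradient at beta0 = 0:", gradient(0.0))        # far from zero
print("gradient at the MLE:  ", gradient(beta0_hat))  # zero up to rounding error
```

A gradient element above 1e-3 at the reported solution, as in the log note, suggests the optimizer stopped somewhere that is not a clean maximum.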

Reaction times are (generally) not normal: non-negative, continuous, and positively skewed.

$y \sim \text{Gamma}(\mu, \phi)$, and possibly ln as the link.

[Figure: densities $f(y)$ plotted against $y$ for Gamma(4,1), Gamma(6,0.50), Gamma(4,0.50), and Gamma(6,0.33).]
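A short sketch of a Gamma density that is non-negative and positively skewed. The transcript does not say how the two arguments in the figure legend are parameterized, so this code assumes the mean/dispersion form common in GLM software: shape $= 1/\phi$ and scale $= \mu\phi$, giving $E[y] = \mu$ and $\text{Var}[y] = \phi\mu^2$. That mapping, and the function names, are assumptions made for illustration.

```python
import math


def gamma_pdf(y, mu, phi):
    """Gamma density under the ASSUMED mean/dispersion parameterization:
    shape = 1/phi, scale = mu*phi, so E[y] = mu and Var[y] = phi * mu**2."""
    if y <= 0:
        return 0.0  # the distribution has no mass on non-positive values
    shape = 1.0 / phi
    scale = mu * phi
    log_pdf = ((shape - 1.0) * math.log(y) - y / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def skewness(phi):
    """Skewness of a Gamma distribution: 2 / sqrt(shape) = 2 * sqrt(phi)."""
    return 2.0 * math.sqrt(phi)


# Evaluate the density at a few points for the settings named in the figure,
# read here as (mu, phi) pairs under the assumption stated above.
for mu, phi in [(4, 1.0), (6, 0.50), (4, 0.50), (6, 0.33)]:
    values = [round(gamma_pdf(y, mu, phi), 4) for y in (1, 2, 4, 8)]
    print(f"mu={mu}, phi={phi}: f(1,2,4,8) = {values}, skewness = {skewness(phi):.2f}")
```

Under this parameterization the skewness is always positive and shrinks as $\phi$ decreases, which matches the qualitative description of reaction times.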

[Figure: The Data: Overall Distribution]

[Figure: Distribution of RT without Outliers]

[Figures, three slides: Distribution of RT1 by Verb]

[Figure: How about distribution for some Subjects]

[Figure: A couple more]

[Figure: ... and more]

[Figure: ... and even some more]

A Model for the Data

$y_{ijk}$ = reaction time for subject $i$ on item $j$ on replication $k$.

Random component:

$$y_{ijk} \mid U_{0i}, U_{0j} \sim \text{Gamma}(\mu_{ijk}, V_{ijk})$$

Link:

$$\ln(\mu_{ijk}) = \eta_{ijk}$$

Linear predictor:

$$\eta_{ijk} = \underbrace{\beta_0 + U_{0i} + U_{0j}}_{\text{intercept}} + \underbrace{\beta_1\,\text{age}_i + \beta_2\,\text{gender}_i}_{\text{subject specific}} + \underbrace{\beta_3\,DO_j + \beta_4\,M_j + \beta_5\,MM_j + \beta_6\,O_j}_{\text{item specific}}$$

where $DO_j$, $M_j$, $MM_j$ and $O_j$ are dummy codes for verb bias (note: all equal 0 when verb bias is SO).

Generalized Linear Mixed Model, in the scale of the data:

$$\mu_{ijk} = \exp[\beta_0 + U_{0i} + U_{0j} + \beta_1\,\text{age}_i + \beta_2\,\text{gender}_i + \beta_3\,DO_j + \beta_4\,M_j + \beta_5\,MM_j + \beta_6\,O_j]$$
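A small sketch of how the dummy codes and the log link combine to give the conditional mean. The coefficient values, the dictionary keys, and the function names are hypothetical and serve only to show the mechanics; the slides do not give these numbers in recoverable form.

```python
import math

# Verb bias levels that get a dummy code; SO is the reference category.
VERB_BIAS_LEVELS = ("DO", "M", "MM", "O")


def dummy_codes(verb_bias):
    """Return the four dummy codes; all are 0 when verb bias is SO."""
    if verb_bias != "SO" and verb_bias not in VERB_BIAS_LEVELS:
        raise ValueError(f"unknown verb bias: {verb_bias!r}")
    return {level: 1 if verb_bias == level else 0 for level in VERB_BIAS_LEVELS}


def conditional_mean(beta, age, gender, verb_bias, u_subject=0.0, u_item=0.0):
    """mu = exp(eta), with eta built from the intercept part, the
    subject-specific part, and the item-specific part."""
    d = dummy_codes(verb_bias)
    eta = (beta["intercept"] + u_subject + u_item
           + beta["age"] * age + beta["gender"] * gender
           + sum(beta[level] * d[level] for level in VERB_BIAS_LEVELS))
    return math.exp(eta)


# Hypothetical coefficients, for illustration only.
beta = {"intercept": 6.0, "age": 0.01, "gender": -0.05,
        "DO": 0.10, "M": 0.20, "MM": 0.22, "O": 0.15}

base = conditional_mean(beta, age=20, gender=1, verb_bias="SO")
do = conditional_mean(beta, age=20, gender=1, verb_bias="DO")
print("ratio of means, DO vs SO:", do / base)
```

With a log link, each coefficient acts multiplicatively on the mean: the ratio of means for DO versus SO is $\exp(\beta_3)$ regardless of the other covariates or the random effects.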

Parameters of the Model

Since subjects and items are viewed as random samples, we assume a distribution for the unobserved effects $U_{0i}$ and $U_{0j}$:

$$\begin{pmatrix} U_{0i} \\ U_{0j} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau^2_{Ss} & 0 \\ 0 & \tau^2_{It} \end{pmatrix} \right)$$

We collapse or integrate out the random effects:

$$\mu_{ijk} = \int\!\!\int \exp[\beta_0 + U_{0i} + U_{0j} + \beta_1\,\text{age}_i + \beta_2\,\text{gender}_i + \beta_3\,DO_j + \beta_4\,M_j + \beta_5\,MM_j + \beta_6\,O_j]\, f(U_{0i})\, f(U_{0j})\, dU_{0i}\, dU_{0j}$$

The parameters of the distribution are the $\beta$s and the $\tau$s. The variance $V_{ijk}$ is complicated...
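For the log link with independent normal random intercepts, this particular integral has a closed form, which may help in reading the marginal mean. The step below is a worked sketch that uses only the standard result $E[e^{U}] = e^{\tau^2/2}$ for $U \sim N(0, \tau^2)$; writing the fixed part as $x_{ij}'\beta$ is shorthand introduced here, not notation from the slides.

$$\begin{aligned}
\int\!\!\int \exp[x_{ij}'\beta + U_{0i} + U_{0j}]\, f(U_{0i})\, f(U_{0j})\, dU_{0i}\, dU_{0j}
&= \exp[x_{ij}'\beta]\; E\!\left[e^{U_{0i}}\right] E\!\left[e^{U_{0j}}\right] \\
&= \exp\!\left[x_{ij}'\beta + \tfrac{1}{2}\left(\tau^2_{Ss} + \tau^2_{It}\right)\right]
\end{aligned}$$

So under these assumptions the random intercepts shift the marginal mean only through the intercept, by $(\tau^2_{Ss} + \tau^2_{It})/2$ on the log scale, while the other coefficients keep their multiplicative interpretation.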

Summary of Models fit to Data

The slide tabulates estimates (est) and standard errors (se) for four models: No Random, Random Subject, Random Item, and Subject & Item. The numeric entries are missing from the transcription; the rows are:

- Fixed effects: intercept $\beta_0$, age $\beta_1$, female $\beta_2$, DO $\beta_3$, M $\beta_4$, MM $\beta_5$, O $\beta_6$
- Random effects: Subject $\tau^2_{Ss}$, Item $\tau^2_{It}$
- Scale $\phi$
- # params, lnLike, AIC, BIC (only the leading digits "170," remain)

A little Modification

Since the estimated parameters for M and MM are similar in value (relative to their standard errors), verb bias for these two categories was re-coded as

$$x_{M,MM} = \begin{cases} 1 & \text{if verb bias was M or MM} \\ 0 & \text{otherwise} \end{cases}$$

The slide then compares three fits (Original, Revised, and Normal dist.), giving est and se for Subject $\tau^2_{Ss}$, Item $\tau^2_{It}$, and Scale $\phi$, along with # params, lnLike, AIC, and BIC. The numeric entries are missing from the transcription (only the leading digits "164," remain).

Question 1

Question: How does the modeling and estimation of random effects differ from fixed effects? Specifically, what is the term whose distribution is modeled as random and how many parameters are estimated in this process?

Answer: We think of the subjects as a random sample and the trial IDs as a random sample. Rather than estimating an effect for each individual and each trial ID, we assume a distribution for them (the $U$s) and estimate the parameters of that distribution. Since we are assuming $U \sim N(0, \tau^2)$, we only estimate its variance; that is 1 parameter ($\tau^2$) for each random effect, rather than 39 $U_{0i}$s (one for each subject) and 44 $U_{0j}$s (one for each trial type).

The $U$s can be estimated AFTER the model is fit to the data. They are estimated using Bayesian methods: BLUPs.
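A sketch of the parameter count from the answer, together with the shrinkage idea behind BLUPs. The shrinkage formula shown is the textbook one for a normal random-intercept model with known variances, $\hat{U}_i = \frac{\tau^2}{\tau^2 + \sigma^2/n_i}(\bar{y}_i - \beta_0)$; it is used here only as a simplified illustration and is not how predictions are computed for the logistic or Gamma models in these slides. All numeric inputs are hypothetical.

```python
# Parameter count: variances of the random effects versus one effect per unit.
N_SUBJECTS = 39
N_TRIALS = 44
n_effects_if_fixed = N_SUBJECTS + N_TRIALS   # one U per subject and per trial ID
n_variance_params = 2                        # tau^2 for subjects, tau^2 for trials


def shrinkage_factor(tau2, sigma2, n_i):
    """Weight given to a cluster's own data in a normal random-intercept model
    with between-cluster variance tau2 and residual variance sigma2."""
    return tau2 / (tau2 + sigma2 / n_i)


def predict_random_intercept(cluster_mean, grand_mean, tau2, sigma2, n_i):
    """Predicted U_i: the cluster's deviation from the grand mean, shrunk toward 0."""
    return shrinkage_factor(tau2, sigma2, n_i) * (cluster_mean - grand_mean)


print(n_effects_if_fixed, "effects versus", n_variance_params, "variance parameters")

# Hypothetical numbers: more observations per cluster means less shrinkage.
for n_i in (1, 4, 100):
    print(n_i, predict_random_intercept(12.0, 10.0, tau2=1.0, sigma2=4.0, n_i=n_i))
```

The prediction is pulled toward 0 when a cluster contributes little data or when $\tau^2$ is small relative to $\sigma^2$, which is the sense in which the $U$s are estimated after the variances.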

a

Question: Do the models calculate intercepts for each random effect? If so, is it the case that each random effect estimates an additional constant that is ADDED to the general intercept of the model, and which varies randomly according to some distribution?

... I think the examples on the previous slides may have answered this...

b

Question: VARIANCE COVARIANCE STRUCTURES

- What are they?
- How do we choose which one(s) to use?
- What consequences does this choice have?

Answer: If you form a model using a multilevel perspective, they are implied by the model you specify.

For normal data:

$$\text{var}(y) = \underbrace{Z T Z'}_{\text{Level 2}} + \underbrace{\sigma^2 I}_{\text{Level 1}}$$

You can also specify a particular form for $T$ (usually this is unstructured) and/or a particular form for the level 1 covariance matrix (e.g., AR(lag) for longitudinal data).

The conditional variance (regardless of the distribution of $y$) is $\text{var}(y \mid U) = Z T Z'$.

The marginal covariance matrix is much more complicated.
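A minimal sketch of how the normal-data covariance $Z T Z' + \sigma^2 I$ is implied by the model, worked out for the simplest case: one cluster of three observations with a random intercept only. The helper functions and the numeric values ($\tau^2 = 0.5$, $\sigma^2 = 2.0$) are hypothetical and use plain lists so the arithmetic is visible.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def marginal_cov(Z, T, sigma2):
    """var(y) = Z T Z' + sigma2 * I for normal data."""
    ztz = matmul(matmul(Z, T), transpose(Z))
    n = len(Z)
    return [[ztz[r][c] + (sigma2 if r == c else 0.0) for c in range(n)]
            for r in range(n)]


# Random intercept only: Z is a column of ones and T holds a single variance.
Z = [[1.0], [1.0], [1.0]]
T = [[0.5]]                      # hypothetical tau^2
V = marginal_cov(Z, T, sigma2=2.0)
for row in V:
    print(row)
```

A random intercept therefore implies equal variances $\tau^2 + \sigma^2$ on the diagonal and a common covariance $\tau^2$ off the diagonal (compound symmetry); adding a random slope column to $Z$ and a 2 × 2 $T$ would make the covariance depend on the covariate values.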

Question: What is the recommended procedure for adding/taking factors out of models when you are model testing?

Answer:

If it is a fixed effect, SAS gives tests for the effect and tests for individual parameters. If it is not significant, then I remove it from the model and do a likelihood ratio test (LR is more powerful and isn't as sensitive to multicollinearity).

If it is a random effect (variance) and you're using MLE (or Laplace), then

- Fit a model with and without the random effect (a model with variances and covariances for it and one without).
- Compare the LR statistic to a mixture of chi-square distributions. This is done by getting a p-value from $\chi^2_{df}$, where df is computed in the normal way, and a p-value from $\chi^2_{df-1}$. Take the average of these two.

Testing a Random Effect

Using the data from the logistic regression example...

$$H_0: \tau^2_{Tr} = 0$$

Using the models with both subject and trial effects: LR = …

The p-value comparing this to $\chi^2_1$ is tiny, and the p-value from $\chi^2_0$ is 0. So in this case just take half the p-value from $\chi^2_1$.
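The averaging rule can be sketched for the simplest case, testing a single variance with no covariances (for example $H_0: \tau^2_{Tr} = 0$), where the reference distribution is a 50:50 mixture of $\chi^2_0$ (a point mass at 0) and $\chi^2_1$. The function names are hypothetical, and the LR value in the usage lines is made up because the slide's value did not survive; the $\chi^2_1$ tail probability uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$.

```python
import math


def chi2_sf_df1(x):
    """P(chi-square with 1 df > x), using the fact that chi2_1 is a squared
    standard normal."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p_value_one_variance(lr):
    """Average of the p-values from chi2_0 and chi2_1 for a likelihood ratio
    statistic lr. chi2_0 is a point mass at 0, so its p-value is 0 whenever lr > 0."""
    p_df0 = 1.0 if lr <= 0 else 0.0
    return 0.5 * p_df0 + 0.5 * chi2_sf_df1(lr)


lr = 3.84  # hypothetical LR statistic, for illustration only
print("naive chi2_1 p-value:", chi2_sf_df1(lr))
print("mixture p-value:     ", mixture_p_value_one_variance(lr))
```

For any positive LR the mixture p-value is exactly half the $\chi^2_1$ p-value, which is the shortcut the slide describes. Testing a random effect that brings covariances with it needs $\chi^2_{df}$ and $\chi^2_{df-1}$ with larger df, which this sketch does not cover.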

a

Question: When do you include all the factors that you want to control for in your final models ("psychology/descriptive approach") vs. including only the predictors that significantly improve the fit of the model to the data ("statistical/predictive approach")?

Answer: The fixed effects that you include in the model affect the variances, and the random effects that you include in the model affect the fixed effects. I think the best approach is to find the best model for the data both in terms of fixed effects and random effects (i.e., if effects that you're controlling for are not significant, take them out).

b

Question: What do you conclude when model comparison indicates a factor should be in the model (say according to a deviance statistic or AIC), but the factor is not significant according to a test statistic for that factor (say a z-statistic)?

Answer: It depends on:

- Does your theory say it should be significant?
- Is the effect important?
- Is the effect large or small?
- When you take it out of the model or include it, do the other estimated parameters stay basically the same or do they change?
- How many tests have you done?

By deviance, I'm assuming you mean the likelihood ratio test statistic. LR tests are more powerful than z, t, and score tests.

In the abstract, I'd say keep it in the model (at least at this point in modeling; re-visit it when you have a final model).

c

Question: For repeated-measures designs in which a set of items are rotated through the experimental conditions across a series of lists, is it good to include list as a factor in the model (and if so, how do you test with the resulting confounding of subjects, items, etc.)?

Answer: I'm not sure I understand the design... so at this point, I don't have an answer.

Miscellaneous

- Dummy vs Effect Coding: For random intercept models, the only difference is how to interpret the parameters. Choose the one that's easier or more natural (e.g., I used dummy in the examples). For random slope models, the coding may matter depending on whether the variable(s) has a random slope.
- MLM = Multilevel Logistic Model, and HLM = Hierarchical Linear Model. MLM and HLM are both special cases of GLMMs (Generalized Linear Mixed Models).
- Ordinal dependent variables: Use a model designed for ordinal logistic regression and add random effects (e.g., proportional odds models, etc.).
- Multicollinearity: The problem is basically the same as for normal linear regression. There are some additional problems with GLMMs and special cases of them (e.g., separation).

Miscellaneous continued

I don't use R for these kinds of models (yet), so I don't know how to code a 3-level model. In SAS you just add another RANDOM statement and indicate nesting.

How can a CI be used to determine the level of a factor that is driving a significant effect?

Answer: Check to see whether 0 is in the interval, or test whether the parameter for a level is significantly different from 0. If it is, or some look very similar, try recoding, re-fit the model, and do a likelihood ratio test.

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Lecture 16: Logistic regression diagnostics, splines and interactions. Sandy Eckel 19 May 2007

Lecture 16: Logistic regression diagnostics, splines and interactions Sandy Eckel seckel@jhsph.edu 19 May 2007 1 Logistic Regression Diagnostics Graphs to check assumptions Recall: Graphing was used to

### How Do We Test Multiple Regression Coefficients?

How Do We Test Multiple Regression Coefficients? Suppose you have constructed a multiple linear regression model and you have a specific hypothesis to test which involves more than one regression coefficient.

### LMM: Linear Mixed Models and FEV1 Decline

LMM: Linear Mixed Models and FEV1 Decline We can use linear mixed models to assess the evidence for differences in the rate of decline for subgroups defined by covariates. S+ / R has a function lme().

### The 3-Level HLM Model

James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 1 2 Basic Characteristics of the 3-level Model Level-1 Model Level-2 Model Level-3 Model

### SAS Syntax and Output for Data Manipulation:

Psyc 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (in preparation) chapter 5. We will be examining

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

### Generalized Linear Mixed Modeling and PROC GLIMMIX

Generalized Linear Mixed Modeling and PROC GLIMMIX Richard Charnigo Professor of Statistics and Biostatistics Director of Statistics and Psychometrics Core, CDART RJCharn2@aol.com Objectives First ~80

### Multivariate Logistic Regression

1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

### Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses

Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses Applied Multilevel Models for Cross Sectional Data Lecture 11 ICPSR Summer Workshop University

### VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

### Models of binary outcomes with 3-level data: A comparison of some options within SAS. CAPS Methods Core Seminar April 19, 2013.

Models of binary outcomes with 3-level data: A comparison of some options within SAS CAPS Methods Core Seminar April 19, 2013 Steve Gregorich SEGregorich 1 April 19, 2013 Designs I. Cluster Randomized

### Generalized Mixed Models for Ordinal Longitudinal Outcomes using PROC GLIMMIX

Generalized Mixed Models for Ordinal Longitudinal Outcomes using PROC GLIMMIX SAS Data Manipulation: * Reading in all data; DATA alldata; SET annk.annknewfinal; WHERE NMISS(age80, mmse16)=0; cam012=cam;

### Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group

Introduction to Multilevel Modeling Using HLM 6 By ATS Statistical Consulting Group Multilevel data structure Students nested within schools Children nested within families Respondents nested within interviewers

### Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995.

Lecture 18 1. Random intercepts and slopes 2. Notation for mixed effects models 3. Comparing nested models 4. Multilevel/Hierarchical models 5. SAS versions of R models in Gelman and Hill, chapter 12 1

### E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,

### HLM software has been one of the leading statistical packages for hierarchical

Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

### Yiming Peng, Department of Statistics. February 12, 2013

Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

### Ordinal Regression. Chapter

Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

### Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

### Random effects and nested models with SAS

Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/

### Examples of Using R for Modeling Ordinal Data

Examples of Using R for Modeling Ordinal Data Alan Agresti Department of Statistics, University of Florida Supplement for the book Analysis of Ordinal Categorical Data, 2nd ed., 2010 (Wiley), abbreviated

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

### Structural Equation Models for Comparing Dependent Means and Proportions. Jason T. Newsom

Structural Equation Models for Comparing Dependent Means and Proportions Jason T. Newsom How to Do a Paired t-test with Structural Equation Modeling Jason T. Newsom Overview Rationale Structural equation

### Generalized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)

Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through

### Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

### data on Down's syndrome

DATA a; INFILE 'downs.dat' ; INPUT AgeL AgeU BirthOrd Cases Births ; MidAge = (AgeL + AgeU)/2 ; Rate = 1000*Cases/Births; LogRate = Log( (Cases+0.5)/Births ); LogDenom = Log(Births); age_c = MidAge - 30;

### Getting Started with HLM 5. For Windows

For Windows August 2012 Table of Contents Section 1: Overview... 3 1.1 About this Document... 3 1.2 Introduction to HLM... 3 1.3 Accessing HLM... 3 1.4 Getting Help with HLM... 4 Section 2: Accessing Data

### Illustration (and the use of HLM)

Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we demonstrate the use of the software HLM. In addition, we will

Mplus Short Courses Topic 2 Regression Analysis, Exploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Introducing the Multilevel Model for Change

Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.

### Developing Risk Adjustment Techniques Using the SAS® System for Assessing Health Care Quality in the IMSystem®

Developing Risk Adjustment Techniques Using the SAS® System for Assessing Health Care Quality in the IMSystem® Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

### ECON Introductory Econometrics. Lecture 15: Binary dependent variables

ECON4150 - Introductory Econometrics Lecture 15: Binary dependent variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 11 Lecture Outline 2 The linear probability model Nonlinear probability

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Logit and Probit. Brad Jones 1. April 21, 2009. University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science

Logit and Probit Brad Jones, Department of Political Science, University of California, Davis April 21, 2009 Logit, redux Logit resolves the functional form problem (in terms of the response function in the

### Perform hypothesis testing

Multivariate hypothesis tests for fixed effects Testing homogeneity of level-1 variances In the following sections, we use the model displayed in the figure below to illustrate the hypothesis tests. Partial

### 5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits

Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. Hosmer-Lemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional

### Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are

### Lecture #2 Overview. Basic IRT Concepts, Models, and Assumptions. Lecture #2 ICPSR Item Response Theory Workshop

Basic IRT Concepts, Models, and Assumptions Lecture #2 ICPSR Item Response Theory Workshop Lecture #2: 1of 64 Lecture #2 Overview Background of IRT and how it differs from CFA Creating a scale An introduction

### Binary Logistic Regression

Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here's a simple model including

### Using PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research

Using PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research Sawako Suzuki, DePaul University, Chicago Ching-Fan Sheu, DePaul University,

### Poisson Regression or Regression of Counts (& Rates)

Poisson Regression or Regression of Counts (& Rates) Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Generalized Linear Models Slide 1 of 51 Outline

### Lecture Outline (week 13)

Lecture Outline (week 13) Analysis of Covariance in Randomized studies Mixed models: Randomized block models Repeated Measures models Pretest-posttest models Analysis of Covariance in Randomized studies

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### Use of deviance statistics for comparing models

A likelihood-ratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter

### Lecture 19: Conditional Logistic Regression

Lecture 19: Conditional Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina

### Analysis of Longitudinal Data in Stata, Splus and SAS

Analysis of Longitudinal Data in Stata, Splus and SAS Rino Bellocco, Sc.D. Department of Medical Epidemiology Karolinska Institutet Stockholm, Sweden rino@mep.ki.se March 12, 2001 NASUGS, 2001 OUTLINE

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

### Logit Models for Binary Data

Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

### Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference that is frequently used concerns tests of hypotheses.

### data visualization and regression

data visualization and regression [Figure: Sepal.Length by Species (I. setosa, I. versicolor, I. virginica)]

### Introduction to Hierarchical Linear Modeling with R

Introduction to Hierarchical Linear Modeling with R [Figure: grid of panels numbered 1 to 16; axis label SCIENCE]

### Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA

Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals

### The Latent Variable Growth Model In Practice. Individual Development Over Time

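The Latent Variable Growth Model In Practice. Individual Development Over Time. [Figure: trajectories of y for individuals i = 1, 2, 3 over t = 1, 2, 3, 4; path diagram with outcomes y_1 to y_4, residuals ε_1 to ε_4, covariate x, and growth factors η_0 and η_1] (1) $y_{ti} = \eta_{0i} + \eta_{1i} x_t + \varepsilon_{ti}$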

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions

Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions Douglas Bates 8 th International Amsterdam Conference on Multilevel Analysis 2011-03-16 Douglas

### Multilevel Modeling of Complex Survey Data

Multilevel Modeling of Complex Survey Data Sophia Rabe-Hesketh, University of California, Berkeley and Institute of Education, University of London Joint work with Anders Skrondal, London School of Economics

### Lecture 13: Introduction to generalized linear models

Lecture 13: Introduction to generalized linear models 21 November 2007 1 Introduction Recall that we've looked at linear models, which specify a conditional probability density P(Y | X) of the form Y = α

### Statistical Modeling Using SAS

Statistical Modeling Using SAS Xiangming Fang Department of Biostatistics East Carolina University SAS Code Workshop Series 2012 Xiangming Fang (Department of Biostatistics) Statistical Modeling Using

### Chapter 6: Answers. Omnibus Tests of Model Coefficients. Chi-square df Sig Step Block Model.

Task Chapter 6: Answers Recent research has shown that lecturers are among the most stressed workers. A researcher wanted to know exactly what it was about being a lecturer that created this stress and

### Multiple Choice Models II

Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical

### How to set the main menu of STATA to default factory settings standards

University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

### Logistic regression diagnostics

Logistic regression diagnostics Biometry 755 Spring 2009 Logistic regression diagnostics p. 1/28 Assessing model fit A good model is one that fits the data well, in the sense that the values predicted

### 11/20/2014. Correlational research is used to describe the relationship between two or more naturally occurring variables.

Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

### Chapter 13 Introduction to Nonlinear Regression (非線性迴歸)

Chapter 13 Introduction to Nonlinear Regression (非線性迴歸) and Neural Networks (類神經網路) 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples

### Example: Credit card default, we may be more interested in predicting the probability of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Mixed models for the analysis of categorical repeated measures

Mixed models for the analysis of categorical repeated measures Geert Verbeke geert.verbeke@med.kuleuven.be Biostatistical Centre, K.U.Leuven, Belgium Joint work with Geert Molenberghs and many others PAGE,

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about exploring linear relationships between a dependent variable and one or more independent variables. Regression

### ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

DATABASE MARKETING Fall 2015, max 24 credits Deadline 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.

### Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

Research Article TheScientificWorldJOURNAL (2011) 11, 42–76 TSW Child Health & Human Development ISSN 1537-744X; DOI 10.1100/tsw.2011.2 Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts,

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 2005 ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2 Question 1: a. Below are the scatter plots of hourly wages

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### Logistic regression: Model selection

Logistic regression: April 14 The WCGS data Measures of predictive power Today we will look at issues of model selection and measuring the predictive power of a model in logistic regression Our data set

### The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities

The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities Elizabeth Garrett-Mayer, PhD Assistant Professor Sidney Kimmel Comprehensive Cancer Center Johns Hopkins University 1

### Linear Discrimination. Linear Discrimination. Linear Discrimination. Linearly Separable Systems Pairwise Separation. Steven J Zeil.

Steven J Zeil Old Dominion Univ. Fall 200 Discriminant-Based Classification Linearly Separable Systems Pairwise Separation 2 Posteriors 3 Logistic Discrimination 2 Discriminant-Based Classification Likelihood-based:

### Bivariate Analysis. Correlation. Correlation. Pearson's Correlation Coefficient. Variable 1. Variable 2

Bivariate Analysis Variable 2 LEVELS >2 LEVELS CONTINUOUS Correlation Used when you measure two continuous variables. Variable 2 2 LEVELS X 2 >2 LEVELS X 2 CONTINUOUS t-test X 2 X 2 ANOVA (F-test) t-test ANOVA

### These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher

### Milk Data Analysis. 1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED

1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED 2. Introduction to SAS PROC MIXED The MIXED procedure provides you with flexibility

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 2 Simple Linear Regression

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 2 Simple Linear Regression Hi, this is my second lecture in module one and on simple

### Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics

### Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h: X → Y, X features

### Regression in ANOVA. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Regression in ANOVA James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Regression in ANOVA 1 Introduction 2 Basic Linear

### Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

### Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

### Introduction to latent variable models

Introduction to latent variable models lecture 1 Francesco Bartolucci Department of Economics, Finance and Statistics University of Perugia, IT bart@stat.unipg.it Outline [2/24] Latent variables and their