Introduction to Data Analysis in Hierarchical Linear Models
April 20, 2007
Noah Shamosh & Frank Farach
Social Sciences StatLab, Yale University

Scope & Prerequisites
- Strong applied emphasis
- Focus on the HLM software
  - Has special functionality
  - Other options: SPSS, SAS, MLwiN, R
- Familiarity with regression assumed

Road to HLM Happiness
- Conceptualize model hierarchically
- Prepare data
- Import data into HLM
- Build statistical models
- Estimate and interpret models
- Graph models

What is HLM?
- Hierarchical Linear Model
  - A multilevel statistical model
  - A software program used for such models
- Deconstructing the name (in reverse)
  - Model: It's a statistical model
  - Linear: The model must be linear in the parameters
  - Hierarchical: Nested data structures are explicitly modeled

When are data hierarchical?
- When units are grouped within higher-level units of analysis
- Such data may be nested within higher levels (i.e., units) of analysis
- Nesting can occur between subjects
  - Children nested within classrooms
  - Classrooms nested within schools
- and/or within subjects
  - Repeated observations on the same individuals over time (observations nested within individuals)

Why not use regular regression on nested data?
- Increased Type I error (standard errors are underestimated when dependence among observations is ignored)
- Model misspecification
- Missed opportunity to examine potentially interesting contextual questions
- These problems increase as observations become less independent

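One common way to put a number on the cost of ignoring nesting is the Kish design effect, 1 + (m - 1) x ICC, where m is the cluster size (assumed equal across clusters) and ICC is the intraclass correlation. The sketch below is illustrative only: the function names are made up for this example, the sample of 100 schools with 20 children each is hypothetical, and the ICC of .23 is borrowed from the example reported later in these slides.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def effective_n(n_total, cluster_size, icc):
    """Approximate number of independent observations the nested sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical design: 100 schools x 20 children = 2000 observations, ICC = .23
deff = design_effect(cluster_size=20, icc=0.23)
print(f"design effect = {deff:.2f}")
print(f"effective N   = {effective_n(2000, 20, 0.23):.0f}")
```

Under these assumptions the 2,000 children carry roughly the information of a few hundred independent observations, which is why ordinary regression standard errors would be too small.
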
Hierarchical Model Conceptualization
- What kind of hierarchical relations might be present?
- What factors could I incorporate in my model to reflect this organization?

HLM Caveats
- Adding levels of nesting increases the complexity of the model exponentially
- HLM can handle up to three levels
- Must have several times more lower-level observations than upper-level observations
- Parameter estimation uses maximum likelihood instead of least squares

Prep, prep, prep!
- This is the most labor-intensive part of the workflow, and is the source of many problems that come to us at the StatLab
- Two obstacles
  - HLM doesn't do data manipulation or basic data description
  - HLM requires a special data structure
- Solutions
  - Plan ahead
  - Do all data screening, variable transformations, exploratory analyses, and assumption-checking beforehand

Data prep: SPSS example (extensively adapted from Bryk & Raudenbush, 2002, and Bauer, 2005)
- Data set: IQ_v & language achievement
- Two files
  - Level 1: dependent variable (language achievement) and other child characteristics (e.g., IQ_v)
  - Level 2: school characteristics (e.g., SES)
- Children are nested within schools

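Before importing, it can help to confirm that the two files actually line up. The sketch below assumes, as is typical for two-level setups, that both files share a school ID and should be sorted by it. The field names (`school_id`, `lang`, `iq_v`, `ses`), the records, and the helper `check_and_sort` are all hypothetical and only illustrate the kind of check meant here.

```python
def check_and_sort(level1, level2, key="school_id"):
    """Verify that Level 1 rows link to Level 2 rows, then sort both by the ID."""
    l2_ids = [row[key] for row in level2]
    if len(l2_ids) != len(set(l2_ids)):
        raise ValueError("Level 2 file has duplicate IDs")
    l1_ids = {row[key] for row in level1}
    orphans = sorted(l1_ids - set(l2_ids))
    if orphans:
        raise ValueError(f"Level 1 rows with no Level 2 match: {orphans}")
    # Level 2 units with no Level 1 rows are reported, not treated as errors
    empty_units = sorted(set(l2_ids) - l1_ids)
    by_id = lambda row: row[key]
    return sorted(level1, key=by_id), sorted(level2, key=by_id), empty_units


# Hypothetical child-level (Level 1) and school-level (Level 2) records
level1 = [
    {"school_id": 2, "lang": 41.0, "iq_v": 0.5},
    {"school_id": 1, "lang": 35.0, "iq_v": -1.0},
    {"school_id": 2, "lang": 44.0, "iq_v": 1.5},
]
level2 = [
    {"school_id": 3, "ses": 18.0},
    {"school_id": 1, "ses": 23.0},
    {"school_id": 2, "ses": 10.0},
]

l1_sorted, l2_sorted, empty_units = check_and_sort(level1, level2)
print([row["school_id"] for row in l1_sorted])
print("schools without children:", empty_units)
```
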
Creating the Multivariate Data Matrix (MDM)
- Making an MDM file
  - A caveat
  - The procedure
- Check your summary statistics before building any models (cross-reference them with the descriptives from your statistics package)
- Main window: are all of your variables there?

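For the cross-reference step, a small set of descriptives computed outside HLM gives something to compare against. This is a minimal sketch that assumes the summary to be checked lists N, mean, standard deviation, minimum, and maximum per variable; the helper name `describe`, the use of `None` for missing values, and the data are hypothetical.

```python
import statistics


def describe(values):
    """Return N, mean, sample SD, min, and max, skipping missing (None) values."""
    clean = [v for v in values if v is not None]
    return {
        "n": len(clean),
        "mean": statistics.mean(clean),
        "sd": statistics.stdev(clean),  # sample SD (n - 1 denominator)
        "min": min(clean),
        "max": max(clean),
    }


# Hypothetical language achievement scores with one missing value
summary = describe([1.0, 2.0, None, 3.0, 4.0])
for name, value in summary.items():
    print(f"{name:>4}: {value}")
```

If the N or the range differs from what the imported file reports, the usual suspects are missing-value codes and unmatched IDs.
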
Build statistical models
- Basic model: random-effects ANOVA
  - Test for mean group differences in population
  - Between-group vs. total variance
  - Key assumption check of HLM

Random-effects ANOVA
- Choose outcome variable
- Terms
- Toggle Level 2 error term
- Level 1 (r) vs. Level 2 (u) error terms
- The Mixed window

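Written out, the random-effects ANOVA can be sketched as follows. The r and u error terms come from the slide; the remaining symbols (the grand mean \gamma_{00} and the variances \sigma^2 and \tau_{00}) follow common multilevel notation and are assumptions here, with i indexing children and j indexing schools.

```latex
\begin{align*}
\text{Level 1:} \quad & Y_{ij} = \beta_{0j} + r_{ij},
  & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2:} \quad & \beta_{0j} = \gamma_{00} + u_{0j},
  & u_{0j} &\sim N(0, \tau_{00}) \\
\text{Mixed:} \quad & Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}
\end{align*}
```

The mixed form is simply the Level 2 equation substituted into the Level 1 equation.
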
Random-effects ANOVA
[Figure: language achievement axis with labels M1, M2, M3, and GM]

Random-effects ANOVA: Results
- Fixed effects: the intercept
  - Is the grand mean significantly different from zero?
- Variance components (random effects)
  - Level 2 (U0): significant variability between groups?
  - Level 1 (R): significant variability within groups?

Random-effects ANOVA: Intraclass correlation (ICC)
- Proportion of total variance accounted for by between-group differences
- Level 2 variance component divided by the sum of the Level 1 and Level 2 variance components
- Ours is .23; HLM is warranted

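The ICC definition above translates directly into a one-line calculation. In this sketch the function name and the variance components are hypothetical (they are not the estimates behind the .23 on the slide); `tau00` stands for the Level 2 variance component and `sigma2` for the Level 1 variance component.

```python
def intraclass_correlation(tau00, sigma2):
    """ICC = Level 2 variance / (Level 1 variance + Level 2 variance)."""
    if tau00 < 0 or sigma2 < 0:
        raise ValueError("variance components must be non-negative")
    total = tau00 + sigma2
    if total == 0:
        raise ValueError("total variance is zero")
    return tau00 / total


# Hypothetical variance components, chosen only to illustrate the arithmetic
icc = intraclass_correlation(tau00=3.0, sigma2=9.0)
print(f"ICC = {icc:.2f}")  # 3 / (3 + 9) = 0.25
```
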
Random-effects regression
- Test for relationship between a Level 1 IV and the DV
- Test whether an IV explains any between-groups variance
- Terms
- We are assuming a fixed slope

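With one Level 1 predictor and a fixed slope, the model can be sketched as below. The notation is an assumption based on common multilevel conventions: X_{ij} is the Level 1 IV (IQ in the running example), \gamma_{10} is the common slope, and the slope equation has no error term because the slope is fixed.

```latex
\begin{align*}
\text{Level 1:} \quad & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij},
  & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2:} \quad & \beta_{0j} = \gamma_{00} + u_{0j},
  & u_{0j} &\sim N(0, \tau_{00}) \\
                      & \beta_{1j} = \gamma_{10} \\
\text{Mixed:} \quad & Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + u_{0j} + r_{ij}
\end{align*}
```
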
Random-effects regression
[Figure: language achievement vs. IQ]

Random-effects regression: Results
- Fixed effects
  - Level 1 intercept: Mean of DV where IV is zero
  - Level 1 slope: Change in DV with one unit of change in IV (just like OLS regression)
- Random effects
  - Intercept: Between-group variance that is not explained by IV
  - Residual variance: Within-group variance that is not explained by IV

Random-effects regression: Variance accounted for by IV
- Level 1: Compare residual variance component to random-effects ANOVA model
  - (8.0 - 6.5) / 8.0 = .19
- Level 2: Do the same for the random intercept variance component
  - (19.6 - 9.6) / 19.6 = .51

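The two calculations above are the same proportional-reduction formula applied at each level. A minimal sketch using the variance components quoted on the slide (the function name is made up for this example):

```python
def proportion_reduction(var_null, var_model):
    """Share of a variance component explained relative to the baseline model."""
    if var_null <= 0:
        raise ValueError("baseline variance must be positive")
    return (var_null - var_model) / var_null


# Variance components as quoted on the slide:
# random-effects ANOVA (baseline) vs. random-effects regression
level1 = proportion_reduction(var_null=8.0, var_model=6.5)
level2 = proportion_reduction(var_null=19.6, var_model=9.6)
print(f"Level 1 variance explained: {level1:.2f}")
print(f"Level 2 variance explained: {level2:.2f}")
```

Note that this quantity can come out negative if a variance component grows after adding a predictor, so it is best read as a descriptive index and not as a strict R-squared.
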
Fixed slopes
[Figure: language achievement vs. IQ]

Random slopes
[Figure: language achievement vs. IQ]

Random slopes
- Goal: test whether the IV-DV relationship varies between groups
- Add only if supported by theory
- Toggle Level 2b error term
- In output, look at slope variance component

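Letting the slope vary adds an error term to the slope equation. The sketch below uses conventional notation that is assumed here, not taken from the slides: u_{1j} is the slope error term, \tau_{11} is the slope variance component to look for in the output, and \tau_{01} is the intercept-slope covariance.

```latex
\begin{align*}
\text{Level 1:} \quad & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2:} \quad & \beta_{0j} = \gamma_{00} + u_{0j} \\
                      & \beta_{1j} = \gamma_{10} + u_{1j} \\
& \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  \sim N\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
  \right)
\end{align*}
```
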
Slopes as outcomes
- Goal: test cross-level interactions
- Does the IV-DV relation vary between groups as a function of a systematic factor?
- Add Level 2 predictor
- Terms

Slopes as outcomes
- Fixed effects
  - For Level 1 intercept
    - Intercept: predicted score on DV at mean value of Level 1 IV
    - Slope: Influence of Level 2 IV on DV
  - For Level 1 slope
    - Intercept: Influence of Level 1 IV on DV
    - Slope: Influence of Level 2 IV on Level 1 IV-DV relation
- Random effects (same as before)

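The four fixed effects listed above map onto the coefficients of the following sketch. The notation is assumed: W_j is the Level 2 predictor (e.g., school SES), and X_{ij} is the Level 1 IV, taken to be centered so that the intercept refers to its mean.

```latex
\begin{align*}
\text{Level 1:} \quad & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2:} \quad & \beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
                      & \beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j} \\
\text{Mixed:} \quad & Y_{ij} = \gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij}
  + \gamma_{11} W_j X_{ij} + u_{0j} + u_{1j} X_{ij} + r_{ij}
\end{align*}
```

The product term \gamma_{11} W_j X_{ij} in the mixed form is the cross-level interaction.
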
Graph: Simple slopes
- Useful for visualizing cross-level interactions
- Just like simple slope plots in regression
- Graph Equations > Model graphs
- Useful for categorical or continuous data

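The lines in a simple slopes plot can also be computed by hand from the fixed effects. In this sketch the coefficient names follow conventional slopes-as-outcomes notation (gamma00, gamma01, gamma10, gamma11), which is an assumption, and every numeric value is hypothetical. The slope of the Level 1 IV at a chosen value w of the Level 2 predictor is gamma10 + gamma11 * w.

```python
def simple_slope(gamma10, gamma11, w):
    """Slope of the Level 1 IV on the DV at Level 2 predictor value w."""
    return gamma10 + gamma11 * w


def predicted(gamma00, gamma01, gamma10, gamma11, x, w):
    """Fixed-effects prediction of the DV at Level 1 value x and Level 2 value w."""
    return gamma00 + gamma01 * w + gamma10 * x + gamma11 * w * x


# Hypothetical fixed effects; w = -1, 0, 1 could stand for low, average, and
# high values of a standardized Level 2 predictor
g00, g01, g10, g11 = 40.0, 0.5, 2.0, -0.25
slopes = {w: simple_slope(g10, g11, w) for w in (-1, 0, 1)}
for w, slope in slopes.items():
    start = predicted(g00, g01, g10, g11, x=-1, w=w)
    end = predicted(g00, g01, g10, g11, x=1, w=w)
    print(f"w = {w:+d}: slope = {slope:.2f}, line from {start:.2f} to {end:.2f}")
```

Each printed line gives the two endpoints needed to draw one simple slope over the range x = -1 to x = 1.
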
Graph: Level-1 equations
- Useful for:
  - Visualizing variability in intercepts and slopes
  - Identifying moderators
- Graph Equations > Level 1 equation graphing

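A rough way to preview this kind of variability outside HLM is to fit a separate ordinary least squares line within each group and compare the intercepts and slopes. This is only an exploratory sketch and not necessarily how HLM estimates the lines it graphs; the helper names and the data are hypothetical.

```python
from collections import defaultdict


def ols_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x has no variance within this group")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def group_lines(rows):
    """Fit one line per group from (group, x, y) rows."""
    groups = defaultdict(list)
    for group, x, y in rows:
        groups[group].append((x, y))
    return {
        group: ols_line([x for x, _ in pts], [y for _, y in pts])
        for group, pts in sorted(groups.items())
    }


# Hypothetical (school, IQ, language achievement) rows
rows = [
    ("A", 0, 1.0), ("A", 1, 3.0), ("A", 2, 5.0),
    ("B", 0, 4.0), ("B", 1, 4.5), ("B", 2, 5.0),
]
lines = group_lines(rows)
for group, (intercept, slope) in lines.items():
    print(f"school {group}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```
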
Recommended Reading
- Bickel, R. (2007). Multilevel analysis for applied research: It's just regression! New York: Guilford Press.
- Bryk, A., & Raudenbush, S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
- Luke, D. (2004). Multilevel modeling. Thousand Oaks, CA: Sage.
- Heck, R. H., & Thomas, S. L. (2000). An introduction to multilevel modeling techniques. Lawrence Erlbaum Associates.
- Kreft, I., & de Leeuw, J. (1998). Introducing multilevel modeling. Sage.
- Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press. (Longitudinal focus)

HLM Resources on the Web
- UCLA's HLM portal: http://statcomp.ats.ucla.edu/mlm
- Excellent example of analysis: http://www.ats.ucla.edu/stat/hlm/seminars/hlm_mlm/mlm_hlm_seminar.htm