Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Similar documents

Introduction to Longitudinal Data Analysis

Illustration (and the use of HLM)

The Basic Two-Level Regression Model

A Basic Introduction to Missing Data

Introduction to Data Analysis in Hierarchical Linear Models

Introducing the Multilevel Model for Change

HLM software has been one of the leading statistical packages for hierarchical

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Missing Data. A Typology Of Missing Data. Missing At Random Or Not Missing At Random

Descriptive Statistics

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

A REVIEW OF CURRENT SOFTWARE FOR HANDLING MISSING DATA

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Assignments Analysis of Longitudinal data: a multilevel approach

Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM. Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών

Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

Introduction to Fixed Effects Methods

SPSS Guide: Regression Analysis

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

Longitudinal Meta-analysis

SPSS Explore procedure

The 3-Level HLM Model

Multilevel Modeling Tutorial. Using SAS, Stata, HLM, R, SPSS, and Mplus

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Research Methods & Experimental Design

1 Theory: The General Linear Model

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

10. Analysis of Longitudinal Studies Repeat-measures analysis

Binary Logistic Regression

Module 5: Multiple Regression Analysis

Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling

Additional sources Compilation of sources:

Pearson's Correlation Tests

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Chapter 7: Simple linear regression Learning Objectives

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Experimental Design and Hypothesis Testing. Rick Balkin, Ph.D.

Chapter 15. Mixed Models Overview. A flexible approach to correlated data.

Data Analysis Tools. Tools for Summarizing Data

Problem of Missing Data

Introduction to Hierarchical Linear Modeling with R

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

II. DISTRIBUTIONS distribution normal distribution. standard scores

Example: Boats and Manatees

MULTIPLE REGRESSION WITH CATEGORICAL DATA

Directions for using SPSS

Data analysis process

Part 2: Analysis of Relationship Between Two Variables

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

Premaster Statistics Tutorial 4 Full solutions

Bill Burton Albert Einstein College of Medicine April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Module 3: Correlation and Covariance

1.5 Oneway Analysis of Variance

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

A Mixed Model Approach for Intent-to-Treat Analysis in Longitudinal Clinical Trials with Missing Values

Qualitative vs Quantitative research & Multilevel methods

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Introduction to proc glm

Module 4 - Multiple Logistic Regression

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Simple Regression Theory II 2010 Samuel L. Baker

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Moderation. Moderation

Introduction to mixed model and missing data issues in longitudinal studies

SAS Syntax and Output for Data Manipulation:

Power and sample size in multilevel modeling

Mixed 2 x 3 ANOVA. Notes

Financial Risk Management Exam Sample Questions/Answers

Chapter 23. Inferences for Regression

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

Multivariate Analysis of Variance (MANOVA)

Handling attrition and non-response in longitudinal data

An introduction to hierarchical linear modeling

Electronic Thesis and Dissertations UCLA

Univariate Regression

Introduction to Regression and Data Analysis

Multiple Choice: 2 points each

Ordinal Regression. Chapter

Imputing Missing Data using SAS

Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

Formula for linear models. Prediction, extrapolation, significance test against zero slope.

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

5. Linear Regression

Regression III: Advanced Methods

Multivariate Logistic Regression

Analyzing Structural Equation Models With Missing Data

Didacticiel - Études de cas

T-test & factor analysis

Data Cleaning and Missing Data Analysis

Final Exam Practice Problem Answers

Traditional Conjoint Analysis with Excel

Transcription:

Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t - test - Y C 2 Better Design: Have Pretest R Y 1E X Y 2E Y 1C - Y 2C Random assignment Experimental + Control group Pretest + Posttest Why Better? Analysis? 3 1

Why is Pretest/Posttest Better? Much larger power But depends on analysis Imagine effect: 4 Analysis Possibilities Covariance analysis (ancova) highest power t -Test on difference scores less power Interaction in 2-way analysis of variance (manova) less power 5 How Much Power? Example: assume posttest only Effect size is medium: Cohen s d = 0.5 Sample size is 25 + 25 = 50 Alpha = 0.05, test is two-sided Power of t - test is 0.41 6 2

How Much Power? Example: assume pretest-posttest d = 0.5, N = 25 + 25 = 50, α = 0.05 Power depends on correlation pretestposttest, which is typically high Assume r = 0.3 (Cohen: medium) Power of ancova - test is 0.46 Assume r = 0.5 (Cohen: high) Power of ancova - test is 0.55 7 How Much Power? Same example: pretest-posttest, r = 0.6 Power of ancova - test is 0.57 Power of t - test on difference scores (assume equal variances) is 0.49 (t - test on posttest was 0.41) 8 How Much Power? Same example: pretest-posttest, r = 0.6 Power of ancova - test is 0.57 Power of interaction test in manova (interaction pre/posttest with exp/con) is 0.49 (t - test on posttest was 0.41) 9 3

How Much Power? Power of ancova - test is 0.57 Power t - test on difference scores 0.49 Power of interaction test in manova 0.49 Same hypothesis is tested, thus power is in general about equal But difference scores much simpler 10 Still Better Design: Have Pretest and Two Posttests R Y 1E X Y 2E Y 3E First posttest immediately after intervention: effect Second posttest much later: retention Y 1C - Y 2C Y 3C Best analysis: covariance analysis on posttests, separately or simultaneous 11 Why Multilevel Analysis? If your data have a grouping structure Interventions in organizations e.g. anti-smoking campaign in school classes, intervention done at class level If you have a panel design with substantial dropout It is not unusual to loose 25% at each wave 12 4

If Data Are Grouped Assume t-test, with natural groups (e.g. compare two intact schools) Significance test very biased! Dependence given by intraclass correlation Formal α = 0.05, but the actual alpha is larger! n per intraclass correlation group.00.01.05.20.40.80 10.05.06.11.28.46.75 25.05.08.19.46.63.84 50.05.11.30.59.74.89 100.05.17.43.70.81.92 13 If You Have Panel Dropout SPSS ancova & manova completely handle random: dropout by casewise deletion do you really Cases with missings anywhere believe that? are deleted Minor disadvantage: waste of power Major disadvantage: assumes dropout is completely at random 14 Why Multilevel Analysis If your data do not have a grouping structure If your longitudinal study has negligible dropout Say at most 5% (Little & Rubin, 1987) Multilevel analysis has no advantages Use (m)anova with pretest as covariate 15 5

Example Intervention Data Experimental/Control group Pretest/Posttest: N = 2 25, r = 0.6 Part of data file: file: example.sav 16 Anova on Intervention Data 17 Anova on Intervention Data SPSS - output: 18 6

However If your data have a grouping structure and / or If you have a panel design with substantial dropout You need multilevel analysis! 19 Multilevel Data Three level data structure Groups at may have different sizes Response variable at lowest level Explanatory variables at all levels Model assumes sampling at all levels 20 Longitudinal Data As Multilevel Six persons measured on up to four occasions Levels are occasion level, and person level We can mix time variant (occasion level) and time invariant (person level) predictors Note that missing occasions are no problem 21 7

The Multilevel Regression Model: the Lowest (Individual) Level Ordinary regression, one explanatory variable X: Y i = β 0 + β 1 X i + e i β 0 intercept, β 1 regression slope, e i residual error term In multilevel regression, at the lowest level: Y ij = β 0j + β 1j X ij + e ij β 0j intercept, β 1j regression slope, e ij residual error subscript i for individuals, j for groups each group has its own intercept coefficient β 0j and its own slope coefficient β 1j Intercept and slope coefficients vary across the groups, hence the term random coefficient model 22 The Multilevel Regression Model: the Second (Group) Level At the lowest level: Y ij = β 0j + β 1j X ij + e ij Intercept and slope coefficients vary across the groups We predict intercept and slope with a second level regression model: β 0j = γ 00 + γ 01 Z j + u 0j β 1j = γ 10 + γ 11 Z j + u 1j γ 00 and γ 01 are the intercept and slope to predict β 0j from Z j, with u 0j the residual error term γ 10 and γ 11 are the intercept and slope to predict ß 1j from Z j, with u 1j the residual error term Coefficients γ do not vary across groups 23 The Multilevel Regression Model: Single Equation Version At the lowest (individual) level we have Y ij = β 0j + β 1j X ij + e ij and at the second (group) level β 0j = γ 00 + γ 01 Z j + u 0j β 1j = γ 10 + γ 11 Z j + u 1j Combining (substitution and rearranging terms) gives Y ij = γ 00 + γ 10 X ij + γ 01 Z j + γ 11 Z j X ij + u 1j X ij + u 0j + e ij Looks like an ordinary regression model with complicated error terms 24 8

The Multilevel Regression Model: Single Equation Version Y ij = [γ 00 + γ 10 X ij + γ 01 Z j + γ 11 Z j X ij ] + [u 1j X ij + u 0j + e ij ] This equation has two distinct parts [γ 00 + γ 10 X ij + γ 01 Z j + γ 11 Z j X ij ] contains all the fixed coefficients, it is the fixed part of the model [u 1j X ij + u 0j + e ij ] contains all the random error terms, it is the random part of the model Plus: The cross-level interaction Z j X ij results from modeling the regression slope β 1j of individual level variable X ij with the group level variable Z j the error term u 1j is connected to X ij. Thus the residuals are larger for large values of X ij, implying heteroscedasticity 25 The Multilevel Regression Model: Interpretation Y ij = [γ 00 + γ 10 X ij + γ 01 Z j + γ 11 Z j X ij ] + [u 1j X ij + u 0j + e ij ] Fixed part is an ordinary regression model Complicated error term: [u 1j X ij + u 0j + e ij ] Several error variances σ e ² variance of the lowest level errors e ij σ 00 variance of the highest level errors u 0j σ 11 variance of the highest level errors u 1j σ 01 covariance of u 0j and u 1j Note: σ 00, σ 11, and σ 01 also interpreted as (co)variances of lowest level coefficients β j 26 The Multilevel Regression Model: Estimation At the lowest (individual) level we have Y ij = β 0j + β 1j X ij + e ij and at the second (group) level β 0j = γ 00 + γ 01 Z j + u 0j β 1j = γ 10 + γ 11 Z j + u 1j Maximum Likelihood (ML) estimation Gamma coefficients, standard errors, p-values Variance components σ e ² and σ 00, σ 01, σ 11 (significance of (co)variances in Σ) Model deviance 27 9

Data Structure Multilevel Regression Model MLwiN & most other software All data in one single file Levels identified by identification numbers school number, class number, pupil number HLM Separate file for each level Reads SPSS, SAS, EXCEL & other files Note: SPSS 11.0 mixed model module is seriously flawed, do not use 28 Preparing Data for MLwiN Group and individual identification numbers Regression constant computed in SPSS const = 1 (not visible here) Write data out as fixed ascii -file 29 Porting Data to MLwiN Write data out as fixed ascii -file Using Save As Save gives all variables Or Paste into syntax window and replace all with list of variables WRITE OUTFILE = 'd:\joop\intervention\example.dat' TABLE / id group expcon pretest posttest const. EXECUTE. 30 10

Start MLwiN & Input Data Under File open fixed ascii file Specify columns = # of variables Specify location & file 31 MLwiN: Next Nothing seems to happen Click on Data Manipulation to assign names Assigning names to columns is important c1=person, c2=group, c3=expcon, c4=pretest, c5=posttest, c6=const 32 MLwiN: Next Select column to assign name Type name in textbox, next hit Enter or click on unmarked button 33 11

MLwiN: Next Go to Model, open Equations window Which produces this: The Equations window is where we specify the model 34 The Equations Window: Heart of MLwiN Specify the outcome variable Specify the variables that identify levels Add predictors Including the regression constant Specify which 1 st level predictors have random (=varying) regression coefficients across level 2 units Including the regression constant Almost always random at all levels 35 Specify Outcome & Levels click on y 36 12

Specify Constant as 1 st Predictor click on x constant always varies at all levels indicate variation here 37 Things to do in the Equations Window show names show random part add predictors show estimates not symbols 38 Things to do in the MLwiN Window start / stop estimating complicated: RTFM like SPSS compute & recode specify estimation method Logical order in analysis of grouped (multilevel) data Only constant Add 1st level predictors, then add 2nd level predictors, etc Specify varying coefficients 39 13

Start Computations As by magic, estimates appear Constant plus the two variance components The intraclass correlation is (σ11/(σ11+ σe2)) Here (25.79/(25.79 + 78.04) = 0.25 25% of variance at group level 40 Multilevel Analysis Add predictor variables Assess significance Like usual multiple regression BUT 1st level predictors may have regression coefficients varying across groups Interesting hypothesis: is intervention effect constant across groups? If not: which group variables explain differences in intervention effect? 41 Back to Longitudinal Data Six persons measured on up to four occasions Levels are occasion level, and person level We can mix time variant (occasion level) and time invariant (person level) predictors Note that missing occasions are no problem 42 14

Typical Longitudinal Data File 43 Typical Longitudinal Data File, Prepared for Multilevel Analysis Multilevel longitudinal data file as raw data and as SPSS file SPSS file useful for preliminaries: inspect distributions, outliers, exploratory analyses 44 Intervention Data File Person Identification, Exp/Con group, 1 pretest, immediate posttest, follow-up posttest 45 15

Creating the Multilevel Longitudinal Data File SPSS-file intervention.sav In SPSS: make sure each variable has enough columns No system missing values but explicit missing data code Write raw data to interventionflat.dat Data are read back in SPSS in free format (variables separated by spaces) in new data layout Saved in SPSS file interventionflat.sav 46 Creating the Multilevel Longitudinal Data File Data are read back in SPSS in free format (variables separated by spaces) in new data layout Saved in SPSS file interventionflat.sav 47 Creating the Multilevel Longitudinal Data File Remove in this file all cases ( = occasions) that have a missing value on test ( = missed occasion) Write remainder to raw data file, transport to MLwiN Male levels: 1 st = occasion, 2 nd = person (ident) Dummies for pretest & 2 posttests Effect of expcon modeled as interaction with post1 or post2 dummy 48 16

Creating the Multilevel Longitudinal Data File Note: effect of expcon modeled as interaction with post1 or post2 dummy means less power Just as in manova But testscore may be missing at any occasion Alternative (pretest as covariate) only fesible if both pretest and posttest1 virtually no missings Which does happen 49 Usual Longitudinal Multilevel Analysis Baseline model always includes occasions as linear predictor or series of dummies Check for autonomous shift over time Add covariate expcon as interaction with post1 and / or post2 dummy We may add covariate expcon as interaction with pretest to check initial equality of groups 50 Remember If your data have a grouping structure and / or If you have a panel design with substantial dropout You need multilevel analysis! 51 17

Because SPSS ancova & manova completely handle random: dropout by casewise deletion do you really Cases with missings anywhere believe that? are deleted Minor disadvantage: waste of power Major disadvantage: assumes dropout is completely at random 52 Just to Hammer it in The advantage of multilevel models in pretest - posttest designs Is the softer assumption of missing at random (MAR) Instead completely at random (MCAR) MAR is plausible if dropout can be predicted from available variables Including the pretest! 53 Let s get moving! OK 54 18