Using Mixture Latent Markov Models for Analyzing Change with Longitudinal Data



Similar documents
15 Ordinal longitudinal data analysis

CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA

Chapter 5 Analysis of variance SPSS Analysis of variance

Introduction to Longitudinal Data Analysis

Statistical Innovations Online course: Latent Class Discrete Choice Modeling with Scale Factors

Ordinal Regression. Chapter

LOGISTIC REGRESSION ANALYSIS

Statistics in Retail Finance. Chapter 6: Behavioural models

The Latent Variable Growth Model In Practice. Individual Development Over Time

Association Between Variables

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Module 5: Multiple Regression Analysis

Introduction to Regression and Data Analysis

LOGIT AND PROBIT ANALYSIS

Multivariate Logistic Regression

A Quick Guide to Constructing an SPSS Code Book Prepared by Amina Jabbar, Centre for Research on Inner City Health

II. DISTRIBUTIONS distribution normal distribution. standard scores

When to Use a Particular Statistical Test

Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

IBM SPSS Direct Marketing 23

Traffic Safety Facts Research Note

IBM SPSS Direct Marketing 22

SAMPLE SIZE TABLES FOR LOGISTIC REGRESSION

5.2 Customers Types for Grocery Shopping Scenario

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

4. Simple regression. QBUS6840 Predictive Analytics.

Generalized Linear Models

Weight of Evidence Module

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012

Multinomial and Ordinal Logistic Regression

Univariate and Multivariate Methods PEARSON. Addison Wesley

How to set the main menu of STATA to default factory settings standards

HIDDEN MARKOV MODELS FOR ALCOHOLISM TREATMENT TRIAL DATA

LATENT GOLD CHOICE 4.0

Causes of Inflation in the Iranian Economy

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

Organizing Your Approach to a Data Analysis

SAS Software to Fit the Generalized Linear Model

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Efficient and Practical Econometric Methods for the SLID, NLSCY, NPHS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

Abbas S. Tavakoli, DrPH, MPH, ME 1 ; Nikki R. Wooten, PhD, LISW-CP 2,3, Jordan Brittingham, MSPH 4

10. Analysis of Longitudinal Studies Repeat-measures analysis

Joseph Twagilimana, University of Louisville, Louisville, KY

SPSS Guide: Regression Analysis

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Standard errors of marginal effects in the heteroskedastic probit model

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Module 14: Missing Data Stata Practical

Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

USE OF ARIMA TIME SERIES AND REGRESSORS TO FORECAST THE SALE OF ELECTRICITY

Multiple Linear Regression

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén Table Of Contents

Lean Six Sigma Analyze Phase Introduction. TECH QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY

HLM software has been one of the leading statistical packages for hierarchical

A course on Longitudinal data analysis : what did we learn? COMPASS 11/03/2011

Two Correlated Proportions (McNemar Test)

Gender Differences in Employed Job Search Lindsey Bowen and Jennifer Doyle, Furman University

11. Analysis of Case-control Studies Logistic Regression

IBM SPSS Forecasting 21

Regression Analysis: A Complete Example

Assignments Analysis of Longitudinal data: a multilevel approach

International Statistical Institute, 56th Session, 2007: Phil Everson

BayesX - Software for Bayesian Inference in Structured Additive Regression

Problem of Missing Data

Logit Models for Binary Data

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

Logs Transformation in a Regression Equation

COUPLE OUTCOMES IN STEPFAMILIES

IBM SPSS Forecasting 22

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CHAPTER 4 EXAMPLES: EXPLORATORY FACTOR ANALYSIS

Additional sources Compilation of sources:

Sensitivity Analysis in Multiple Imputation for Missing Data

ANALYSING LIKERT SCALE/TYPE DATA, ORDINAL LOGISTIC REGRESSION EXAMPLE IN R.

UNDERSTANDING THE TWO-WAY ANOVA

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry

It is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences

Sun Li Centre for Academic Computing

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

LCMM: a R package for the estimation of latent class mixed models for Gaussian, ordinal, curvilinear longitudinal data and/or time-to-event data

Modeling Churn and Usage Behavior in Contractual Settings

Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study

Transcription:

Using Mixture Latent Markov Models for Analyzing Change with Longitudinal Data Jay Magidson, Ph.D. President, Statistical Innovations Inc. Belmont, MA., U.S. Presented at Modern Modeling Methods (M3) 2013, University of Connecticut 1

Abstract Mixture latent Markov (MLM) models are latent class models containing both time-constant and time-varying discrete latent variables. We introduce and illustrate the utility of latent Markov models in 3 real world examples using the new GUI in Latent GOLD 5.0. MLM models often fit longitudinal data better than comparable mixture latent growth models because of the inclusion of transition probabilities as model parameters which directly account for first-order autocorrelation in the data. Lack of fit can be localized (e.g., first-order, second-order autocorrelation) using new longitudinal bivariate residuals (L-BVRs) to suggest model modification. 2

Outline of Presentation Introduction to latent Markov (LM) and mixture latent Markov (MLM) Models* Example 1: LM model with time-homogeneous transition probabilities Example 2: MLM model with time-heterogeneous transitions Example 3: MLM model with time-heterogeneous transitions and covariates In each of these examples latent Markov outperformed the corresponding latent growth model, the largest improvement being in the Lag1 BVRs (i.e., first-order autocorrelation). * (Mixture) latent Markov models are also known as (mixture) hidden Markov models 3

Latent (Hidden) Markov Models are Defined by 3 Sets of Equations Initial State probs: P(X 0 = s) categories of X are called latent states (rather than latent classes) since a person may change from one state to another over time. P( x0 s) Logit model may include covariates Z (e.g., AGE, SEX) 0 P( x r x s) log P ( x x s ) t t 1 Transition probs: P(X t = r X t-1 = s) t 1 t 1 Logit may include time-varying and fixed predictors log s Px ( 1) Measurement model probs: P(y t = j X t = s) One or more dependent variables Y, of possibly different scale types (e.g., continuous, count, dichotomous, ordinal, nominal) P( yit xt s) log P ( y 1 x s ) it t 0 1 s 0 0r 1rs 2rt 3rst * Equations can be customized using Latent GOLD 5.0 GUI and/or syntax. 4

Equations: Latent Markov (LM): Initial latent state & Transition sub-models K K K P( y z )... P( x, x,..., x z ) P( y x, x,..., x, z ) i i 0 1 T i i 0 1 T i x 1 x 1 x 1 0 1 T P( x, x,..., x z ) P( x z ) P( x x, z ) 0 1 T i 0 i0 t t 1 it t 1 T Measurement sub-model T T J P( y x, x,..., x, z ) P( y x, z ) P( y x, z ) i 0 1 T i it t it itj t it t 0 t 0 j 1 Mixture Latent Markov (MLM): L K K K P( y z )... P( w, x, x,..., x z ) P( y w, x, x,..., x, z ) i i 0 1 T i i 0 1 T i w 1 x0 1 x1 1 xt 1 T P( w, x, x,..., x z ) P( w z ) P( x w, z ) P( x x, w, z ) 0 1 T i i 0 i0 t t 1 it t 1 T T J P( y w, x, x,..., x, z ) P( y x, w, z ) P( y x, w, z ) i 0 1 T i it t it itj t it t 0 t 0 j 1 5

Relationship between Mixture Latent Markov and Mixture Latent Growth Models Classification of latent class models for longitudinal research Model name Transition structure Unobserved heterogeneity Measurement error I Mixture latent Markov yes yes yes II Mixture Markov yes yes no III Latent Markov yes no yes IV Standard Markov* yes no no V Mixture latent growth no yes yes VI Mixture growth no yes no VII Standard latent class no no yes VIII Independence* no no no *This model is not a latent class model. Vermunt, Tran and Magidson (2008) Latent class models in longitudinal research, chapter 23 in Handbook of Longitudinal Research, S. Menard Editor, Academic Press. 6

Longitudinal Bivariate Residuals (L-BVRs) Longitudinal bivariate residuals quantify for each response variable Y k how well the overall trend as well as the first- and second-order autocorrelations are predicted by the model. BVR.Time=BVR k (time, y k ), BVR.Lag1 =BVRk(y k [t-1], y k [t]) and BVR.Lag2 =BVR k (y k [t-2], y k [t]) residual autocorrelations remaining unexplained by the model. Lag1:

Example 1: Loyalty Data in Long File Format N=631 respondents T= 5 time points Dichotomous Y: Choose Brand A? 1=Yes 0=No Y=(Y 1,Y 2,Y 3,Y 4,Y 5 ) 2 5 = 32 response patterns id=1,2,,32 freq is used as a case weight 8

Loyalty Model with Time-homogeneous Transitions Measurement equivalence Initial State Probability (b 0 ) Transition Probabilities (b) Measurement Model Probabilities (a) 9

Example 1: LM Model Outperforms Latent Growth Model 2-state time-homogeneous latent Markov model fits well: (p=.77), small L-BVRs 2-class latent growth model ( 2-class Regression ) is rejected (p=.0084) L-BVR 2-class Regression LM Time 0.000.0785 Lag1 8.739* (p =.003).0115 Lag2 0.065.3992 Lag1 BVR pinpoints problem in LC growth model as failure to explain 1 st order autocorrelation 10

Example 2: Life Satisfaction Model with Time-Heterogeneous Transitions N=5,147 respondents T= 5 time points Dichotomous Y: Satisfied with life? 1=No 2=Yes Y=(Y 1,Y 2,Y 3,Y 4,Y 5 ) 2 5 = 32 response patterns id=1,2,,32 weight is used as a case weight Models: Null Time heterogeneous LM 2-class mixture LM Restricted (Mover-Stayer) 11

Ex. 2: Time-heterogeneous Models Latent Markov Model Separate transition probability parameters exist for each pair of adjacent time points 12

Example 2: Model Parameters Time-heterogeneous Model Initial State Probability (b 0 ) Transition Probabilities (b t ) Measurement Model Probabilities (a) Estimated values for 2-state LM model 13

Ex. 2: Mover-Stayer Time-heterogeneous Latent Markov Model Both the unrestricted and Mover-Stayer 2-class MLM models fit well (p=.95 and.71), the BIC statistic preferring the Mover-Stayer model. L-BVR 2-class Regression 2-class LM 2-class LM Mover-Stayer Time 0.0 0.02 0.03 Lag1 55.1 (p=1.1e-13) 0.00 0.00 Lag2 1.7 0.10 1.97 14

Mover-Stayer Time-heterogeneous 2-class MLM Model Estimated Values output for 2-state time-heterogeneous MLM model with 2 classes Class Size Initial State Transition Probabilities Measurement model 15

Example 2: Mover-Stayer Time-heterogeneous Latent Markov Model The Estimated Values output shows that 52.25% of respondents are in the Stayer class, who tend to be mostly Satisfied with their lives throughout this 5 year period -- 67.85% are in state 1 ( Satisfied' state) initially and remain in that state. In contrast, among respondents whose life satisfaction changed during this 5 year period (the Mover class), fewer (54.82%) were in the Satisfied state during the initial year. Class Size Initial State Transition Probabilities note that class 1 probability of staying in the same state has been restricted to 1. Measurement model 16

Example 2: Longitudinal Profile Plot for the Mover-Stayer LM model Stayer class showing 61.75% satisfied each year Mover class showing changes over time 17

Example 2: Longitudinal-Plot with Predicted Probability of Being Satisfied ( Overall Prob ) Appended 18

Example 2: L-BVRs for the Mover-Stayer Model 19

Example 3: Latent GOLD Longitudinal Analysis of Sparse Data N=1725 pupils who were of age 11-17 at the initial measurement occasion (in 1976) Survey conducted annually from 1976 to 1980 and at three year intervals after 1980 23 time points (T+1=23), where t=0 corresponds to age 11 and the last time point to age 33. For each subject, data is observed for at most 9 time points (the average is 7.93) which means that responses for the other time points are treated as missing. (See Figure 2) Dichotomous dependent variable drugs indicating whether respondent used hard drugs during the past year (1=yes; 0=no). Time-varying predictors are time (t) and time_2 (t 2 ); time-constant predictors are male and ethn4 (ethnicity). 20

Example 3: Latent GOLD Longitudinal Analysis of Sparse Data The plot on the left shows the overall trend in drug usage during this period is nonlinear, with zero usage reported for 11 year olds, increasing to a peak in the early 20s and then declining through age 33. The plot on the right plots the results from a mixture latent Markov model suggesting that the population consists of 2 distinct segments with different growth rates, Class 2 consisting primarily of non-users. 21

Example 3: 2-class MLM Model Class size Initial state probabilities by class Transition probabilities by class Measurement model probabilities 22

Example 3: Including Gender and Ethnicity as Covariates in Model 23

Example 3: Including Gender and Ethnicity as Covariates in Model Model Summary output, showing that adding gender and ethnicity improves the fit. 24

Example 3: Including Gender and Ethnicity as Covariates in Model 25

Example 3: Including Gender and Ethnicity as Covariates in Model For concreteness we will focus on 18 year olds (see highlighted cells in figure). We see that 18 year olds who were in the lower usage state (State 1) at age 17 have a probability of.1876 of switching to the higher usage state (State 2) if they are in Class 2 compared to a probability of only.0211 of switching if they were in Class 1. In addition, if they were in the higher use state (State 2) at age 17, they have a probability of.9589 of remaining in that state compared to only.3636 if they were in Class 1. Thus, based on these different transition probabilities we see that Class 2 is more likely to move to and remain in a higher drug usage state than Class 1. 26

Appendix: Equations Latent Markov: K K K P( y )... P( x, x,..., x ) P( y x, x,..., x ) i 0 1 T i 0 1 T x 1 x 1 x 1 0 1 T P( x, x,..., x ) P( x ) P( x x ) 0 1 T 0 t t 1 t 1 T T T J P( y x, x,..., x ) P( y x ) P( y x ) i 0 1 T it t itj t t 0 t 0 j 1 Latent Growth: L P( y z ) P( w z ) P( y w, z ) i i i it it w 1 t 0 T 27

References Vermunt, J.K., Tran, B. and Magidson, J (2008). Latent Markov model. Vermunt, J.K. Langeheine, R., Böckenholt, U. (1999). Discrete-time discrete-state latent Markov models with time-constant and time-varying covariates. Journal of Educational and Behavioral Statistics, 24, 178-205. 28