Introduction to structural equation modeling using the sem command

Similar documents
Lecture 15. Endogeneity & Instrumental Variable Estimation

Introduction to Path Analysis

MULTIPLE REGRESSION EXAMPLE

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

August 2012 EXAMINATIONS Solution Part I

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, Last revised February 21, 2015

Standard errors of marginal effects in the heteroskedastic probit model

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

Module 14: Missing Data Stata Practical

Applications of Structural Equation Modeling in Social Sciences Research

Correlation and Regression

From the help desk: Swamy s random-coefficients model

xtmixed & denominator degrees of freedom: myth or magic

Multiple Linear Regression

Sample Size Calculation for Longitudinal Studies

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week (0.052)

Nonlinear Regression Functions. SW Ch 8 1/54/

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

Moderator and Mediator Analysis

Handling missing data in Stata a whirlwind tour

From this it is not clear what sort of variable that insure is so list the first 10 observations.

Introduction to Regression and Data Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Introduction to Structural Equation Modeling. Course Notes

From the help desk: hurdle models

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Multinomial and Ordinal Logistic Regression

Presentation Outline. Structural Equation Modeling (SEM) for Dummies. What Is Structural Equation Modeling?

Rockefeller College University at Albany

An Introduction to Path Analysis. nach 3

Simple Linear Regression Inference

Structural Equation Modelling (SEM)

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

From the help desk: Bootstrapped standard errors

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

Moderation. Moderation

The Latent Variable Growth Model In Practice. Individual Development Over Time

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Competing-risks regression

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Data Analysis Methodology 1

Correlational Research

Milk Data Analysis. 1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED

lavaan: an R package for structural equation modeling

SPSS Guide: Regression Analysis

Curriculum - Doctor of Philosophy

The following postestimation commands for time series are available for regress:

Analyzing Structural Equation Models With Missing Data

Common factor analysis

An Empirical Study on the Effects of Software Characteristics on Corporate Performance

16 : Demand Forecasting

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012

Discussion Section 4 ECON 139/ Summer Term II

Data Mining Techniques Chapter 6: Decision Trees

The average hotel manager recognizes the criticality of forecasting. However, most

From the help desk: Demand system estimation

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects

Nonlinear relationships Richard Williams, University of Notre Dame, Last revised February 20, 2015

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector

Chapter 4 and 5 solutions

Regression Analysis: A Complete Example

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Additional sources Compilation of sources:

Testing for Lack of Fit

Panel Data Analysis Josef Brüderl, University of Mannheim, March 2005

Multicollinearity Richard Williams, University of Notre Dame, Last revised January 13, 2015

Stata Walkthrough 4: Regression, Prediction, and Forecasting

Missing Data & How to Deal: An overview of missing data. Melissa Humphries Population Research Center

Pharmacoeconomic, Epidemiology, and Pharmaceutical Policy and Outcomes Research (PEPPOR) Graduate Program

Linear Regression Models with Logarithmic Transformations

An introduction to GMM estimation using Stata

MODELING AUTO INSURANCE PREMIUMS

Least Squares Estimation

Introduction. Survival Analysis. Censoring. Plan of Talk

SEM Analysis of the Impact of Knowledge Management, Total Quality Management and Innovation on Organizational Performance

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Maximum likelihood estimation of a bivariate ordered probit model: implementation and Monte Carlo simulations

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén Table Of Contents

FACTOR ANALYSIS. Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables.

Social Media Marketing Management 社 會 媒 體 行 銷 管 理 確 認 性 因 素 分 析. (Confirmatory Factor Analysis) 1002SMMM12 TLMXJ1A Tue 12,13,14 (19:20-22:10) D325

Interaction effects between continuous variables (Optional)

Example G Cost of construction of nuclear power plants

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

Transcription:

Introduction to structural equation modeling using the sem command Gustavo Sanchez Senior Econometrician StataCorp LP Mexico City, Mexico Gustavo Sanchez (StataCorp) November 13, 2014 1 / 33

Outline Outline Structural Equation Models (SEM): Applications, concepts and components Mediation Model Measurement Models SEM Model Other Models Gustavo Sanchez (StataCorp) November 13, 2014 2 / 33

Outline Outline Structural Equation Models (SEM): Applications, concepts and components Mediation Model Measurement Models SEM Model Other Models Gustavo Sanchez (StataCorp) November 13, 2014 2 / 33

Outline Outline Structural Equation Models (SEM): Applications, concepts and components Mediation Model Measurement Models SEM Model Other Models Gustavo Sanchez (StataCorp) November 13, 2014 2 / 33

Structural Equation Models: Applications, Concepts and Components Applications SEM: Applications Psychology (e.g. Behavioral analysis, depression) Sociology (e.g. Social network, work environment) Marketing (e.g. Consumer satisfaction, new products development) Academic research (e.g. Analysis of learning abilities) Medicine (e.g. Sleep disorders, population health services) And more Gustavo Sanchez (StataCorp) November 13, 2014 3 / 33

Structural Equation Models: Applications, Concepts and Components Applications SEM: Applications Example: Path diagram for a SEM model Gustavo Sanchez (StataCorp) November 13, 2014 4 / 33

Structural Equation Models: Applications, Concepts and Components Applications SEM: Applications Models Linear regression ANOVA Multivariate regression Simultaneous equation models Path analysis Simultaneous equation models Mediation analysis Confirmatory factor analysis Reliability estimation Full structural equations models Multiple indicators and multiple causes (MIMIC) Latent growth curve Multiple group models Gustavo Sanchez (StataCorp) November 13, 2014 5 / 33

Structural Equation Models: Applications, Concepts and Components Concepts SEM: Concepts: SEM Structural equation modeling was developed by geneticists (Wright 1921) and economists (Haavelmo 1943; Koopmans 1950, 1953) so that qualitative cause-effect information could be combined with statistical data to provide quantitative assessment of cause-effect relationships among variables of interest Pearl (2000). SEM is a class of statistical techniques used for estimating and testing hypotheses on causal relationships among a set of associated variables A significant number of models can be expressed as particular cases of structural equation models. Gustavo Sanchez (StataCorp) November 13, 2014 6 / 33

Structural Equation Models: Applications, Concepts and Components Concepts SEM: Concepts: SEM Structural equation modeling was developed by geneticists (Wright 1921) and economists (Haavelmo 1943; Koopmans 1950, 1953) so that qualitative cause-effect information could be combined with statistical data to provide quantitative assessment of cause-effect relationships among variables of interest Pearl (2000). SEM is a class of statistical techniques used for estimating and testing hypotheses on causal relationships among a set of associated variables A significant number of models can be expressed as particular cases of structural equation models. Gustavo Sanchez (StataCorp) November 13, 2014 6 / 33

Structural Equation Models: Applications, Concepts and Components Concepts SEM: Concepts: SEM Structural equation modeling was developed by geneticists (Wright 1921) and economists (Haavelmo 1943; Koopmans 1950, 1953) so that qualitative cause-effect information could be combined with statistical data to provide quantitative assessment of cause-effect relationships among variables of interest Pearl (2000). SEM is a class of statistical techniques used for estimating and testing hypotheses on causal relationships among a set of associated variables A significant number of models can be expressed as particular cases of structural equation models. Gustavo Sanchez (StataCorp) November 13, 2014 6 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components SEM: Concepts and components: Types of variables Observed and Latent A variable is observed if it is a variable in your dataset A variable is latent if it is not observed. It is not in your dataset but you wish it were Errors are a special case of latent variables Exogenous and Endogenous A variable, observed or latent, is exogenous (determined outside the system) if paths only originate from it (no path points to it) A variable, observed or latent, is endogenous (determined by the system) if any path points to it. Gustavo Sanchez (StataCorp) November 13, 2014 7 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components SEM: Concepts and components: Types of variables Observed and Latent A variable is observed if it is a variable in your dataset A variable is latent if it is not observed. It is not in your dataset but you wish it were Errors are a special case of latent variables Exogenous and Endogenous A variable, observed or latent, is exogenous (determined outside the system) if paths only originate from it (no path points to it) A variable, observed or latent, is endogenous (determined by the system) if any path points to it. Gustavo Sanchez (StataCorp) November 13, 2014 7 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components SEM: Concepts and components: Types of variables Observed and Latent A variable is observed if it is a variable in your dataset A variable is latent if it is not observed. It is not in your dataset but you wish it were Errors are a special case of latent variables Exogenous and Endogenous A variable, observed or latent, is exogenous (determined outside the system) if paths only originate from it (no path points to it) A variable, observed or latent, is endogenous (determined by the system) if any path points to it. Gustavo Sanchez (StataCorp) November 13, 2014 7 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components SEM: Concepts and components: Path diagrams A Path diagram is a graphical representation of the model Boxes contain observed variables Ovals contain latent variables Circles contain the equation errors Straight arrows represent effects from one variable to another Curved arrows indicate correlation between a pair of variables. sem (income lag_consumption -> consumption) /// > (income -> investment), /// > cov(e.investment*e.consumption) Gustavo Sanchez (StataCorp) November 13, 2014 8 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components SEM: Concepts and components: Path diagrams A Path diagram is a graphical representation of the model Boxes contain observed variables Ovals contain latent variables Circles contain the equation errors Straight arrows represent effects from one variable to another Curved arrows indicate correlation between a pair of variables. sem (income lag_consumption -> consumption) /// > (income -> investment), /// > cov(e.investment*e.consumption) Gustavo Sanchez (StataCorp) November 13, 2014 8 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components. sem (income lag_consumption -> consumption) (income -> investment), /// > cov(e.investment*e.consumption) nolog nodescribe (1 observations with missing values excluded) Structural equation model Number of obs = 91 Estimation method = ml Log likelihood = -1932.0358 OIM Coef. Std. Err. z P> z [95% Conf. Interval] Structural consumption <- income.3414577.0444842 7.68 0.000.2542703.4286451 lag_consumption.6026804.0531652 11.34 0.000.4984785.7068823 _cons 13.01908 2.511575 5.18 0.000 8.096479 17.94167 investment <- income.2959361.0058258 50.80 0.000.2845177.3073544 _cons 71.16166 8.916877 7.98 0.000 53.6849 88.63842 var(e.consumption) 108.4106 16.07291 81.07234 144.9674 var(e.investment) 1480.696 219.512 1107.327 1979.957 cov(e.consumption, e.investment) 30.54975 42.37582 0.72 0.471-52.50532 113.6048 LR test of model vs. saturated: chi2(1) = 1.11, Prob > chi2 = 0.2925 Gustavo Sanchez (StataCorp) November 13, 2014 9 / 33

Structural Equation Models: Applications, Concepts and Components Concepts and components. estat gof Fit statistic Value Description Likelihood ratio chi2_ms(1) 1.108 model vs. saturated p > chi2 0.293 chi2_bs(5) 1042.569 baseline vs. saturated p > chi2 0.000. estat eqgof Equation-level goodness of fit Variance depvars fitted predicted residual R-squared mc mc2 observed consumption 342915.7 342807.3 108.4106.9996839.9998419.9996839 investment 43467.46 41986.77 1480.696.9659355.9828202.9659355 overall.999688 mc = correlation between depvar and its prediction mc2 = mc^2 is the Bentler-Raykov squared multiple correlation coefficient Gustavo Sanchez (StataCorp) November 13, 2014 10 / 33

Mediation model Example 1: Mediation models The explanatory variables may have a direct effect on the outcome and also an indirect effect that is transmitted by a mediator variable The traditional mediation analysis was based on a series of linear regressions with no correlated errors (Baron and Kenny (1986)) With SEM we can fit one simultaneous equation model and get estimates for the indirect and total effects The model can be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 11 / 33

Mediation model Example 1: Mediation models The explanatory variables may have a direct effect on the outcome and also an indirect effect that is transmitted by a mediator variable The traditional mediation analysis was based on a series of linear regressions with no correlated errors (Baron and Kenny (1986)) With SEM we can fit one simultaneous equation model and get estimates for the indirect and total effects The model can be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 11 / 33

Mediation model Example 1: Mediation models The explanatory variables may have a direct effect on the outcome and also an indirect effect that is transmitted by a mediator variable The traditional mediation analysis was based on a series of linear regressions with no correlated errors (Baron and Kenny (1986)) With SEM we can fit one simultaneous equation model and get estimates for the indirect and total effects The model can be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 11 / 33

Mediation model Example 1: Mediation models The explanatory variables may have a direct effect on the outcome and also an indirect effect that is transmitted by a mediator variable The traditional mediation analysis was based on a series of linear regressions with no correlated errors (Baron and Kenny (1986)) With SEM we can fit one simultaneous equation model and get estimates for the indirect and total effects The model can be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 11 / 33

Mediation model Example 1: Mediation models Example from Alan Acock (2013) The researcher wants to analyze whether children with better span skills at four have advantages in their academic development for math at a later age Variables attention4: span of attention at 4 math7: performance on math at 7 read7: performance on reading at 7 math21: performance on math at 21. sem (math7 <- attention4) (read7 <- attention4) (math21 <- attention4 math7 read7) Gustavo Sanchez (StataCorp) November 13, 2014 12 / 33

Mediation model Example 1: Mediation models Example from Alan Acock (2013) The researcher wants to analyze whether children with better span skills at four have advantages in their academic development for math at a later age Variables attention4: span of attention at 4 math7: performance on math at 7 read7: performance on reading at 7 math21: performance on math at 21. sem (math7 <- attention4) (read7 <- attention4) (math21 <- attention4 math7 read7) Gustavo Sanchez (StataCorp) November 13, 2014 12 / 33

Mediation model. sem (math7 <- attention4) (read7 <- attention4) (math21 <- attention4 math7 read7), /// > nolog noheader nodescribe (92 observations with missing values excluded) OIM Coef. Std. Err. z P> z [95% Conf. Interval] Structural math7 <- attention4.1129353.0493128 2.29 0.022.016284.2095865 _cons 8.719295.899051 9.70 0.000 6.957187 10.4814 read7 <- attention4.3024263.1441203 2.10 0.036.0199556.5848969 _cons 26.41214 2.627544 10.05 0.000 21.26225 31.56204 math21 <- math7.2938502.0475641 6.18 0.000.2006263.3870741 read7.0825399.0162747 5.07 0.000.0506421.1144378 attention4.0813543.0421041 1.93 0.053 -.0011683.1638769 _cons 3.987873.9165827 4.35 0.000 2.191404 5.784342 var(e.math7) 7.841701.6032078 6.744244 9.117743 var(e.read7) 66.97947 5.152267 57.6056 77.87871 var(e.math21) 5.58987.42999 4.80756 6.499482 LR test of model vs. saturated: chi2(1) = 23.72, Prob > chi2 = 0.0000 Gustavo Sanchez (StataCorp) November 13, 2014 13 / 33

Mediation model. estat teffects,compact nodirect Indirect effects OIM Coef. Std. Err. z P> z [95% Conf. Interval] Structural math21 <- attention4.0581483.0197686 2.94 0.003.0194026.096894 Total effects OIM Coef. Std. Err. z P> z [95% Conf. Interval] Structural math7 <- attention4.1129353.0493128 2.29 0.022.016284.2095865 read7 <- attention4.3024263.1441203 2.10 0.036.0199556.5848969 math21 <- math7.2938502.0475641 6.18 0.000.2006263.3870741 read7.0825399.0162747 5.07 0.000.0506421.1144378 attention4.1395026.045661 3.06 0.002.0500086.2289966 Gustavo Sanchez (StataCorp) November 13, 2014 14 / 33

Measurement models - one factor Example 2.1: Measurement model - one factor The researcher is interested in a latent variable (e.g. Consumer satisfaction, verbal abilities, alienation) The model specifies the relation between latent variables and measured indicator variables Modification indices are normally used to refine the model The model can also be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 15 / 33

Measurement models - one factor Example 2.1: Measurement model - one factor The researcher is interested in a latent variable (e.g. Consumer satisfaction, verbal abilities, alienation) The model specifies the relation between latent variables and measured indicator variables Modification indices are normally used to refine the model The model can also be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 15 / 33

Measurement models - one factor Example 2.1: Measurement model - one factor The researcher is interested in a latent variable (e.g. Consumer satisfaction, verbal abilities, alienation) The model specifies the relation between latent variables and measured indicator variables Modification indices are normally used to refine the model The model can also be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 15 / 33

Measurement models - one factor Example 2.1: Measurement model - one factor The researcher is interested in a latent variable (e.g. Consumer satisfaction, verbal abilities, alienation) The model specifies the relation between latent variables and measured indicator variables Modification indices are normally used to refine the model The model can also be incorporated as part of a larger model Gustavo Sanchez (StataCorp) November 13, 2014 15 / 33

Measurement models - one factor Example 2: Measurement model - one factor Example with Holzinger and Swineford (1939) Data The researcher wants to analyze the verbal ability based on indices associated to tests on word classification, sentence completion and paragraph comprehension Variables: Verbal: latent variable for verbal ability wordc: scores on word classification test sentence: scores on sentence completion test paragraph: scores on paragraph comprehension test general: scores on general information test Gustavo Sanchez (StataCorp) November 13, 2014 16 / 33

Measurement models - one factor. sem (Verbal -> wordc sentence paragraph general) OIM Coef. Std. Err. z P> z [95% Conf. Interval] Measurement wordc <- Verbal 1 (constrained) _cons 26.12625.3265833 80.00 0.000 25.48615 26.76634 sentence <- Verbal 1.080072.0698177 15.47 0.000.9432322 1.216912 _cons 17.36213.2970317 58.45 0.000 16.77995 17.9443 paragraph <- Verbal.6603575.0471812 14.00 0.000.5678841.7528309 _cons 9.182724.200961 45.69 0.000 8.788848 9.576601 general <- Verbal 2.351367.1645309 14.29 0.000 2.028892 2.673842 _cons 40.59136.7124269 56.98 0.000 39.19503 41.98769 var(e.wordc) 13.8878 1.336773 11.50008 16.77127 var(e.sentence) 5.306741.813477 3.92958 7.166542 var(e.paragr ~ h) 4.21255.4386365 3.43489 5.166273 var(e.general) 52.05898 5.597825 42.16646 64.27234 var(verbal) 18.21586 2.46415 13.97345 23.7463 Gustavo Sanchez (StataCorp) November 13, 2014 17 / 33

Measurement models - one factor. sem (Verbal -> wordc sentence paragraph general), standardize OIM Standardized Coef. Std. Err. z P> z [95% Conf. Interval] Measurement wordc <- Verbal.7532647.0289445 26.02 0.000.6965345.8099949 _cons 4.611049.1965727 23.46 0.000 4.225774 4.996324 sentence <- Verbal.8945235.0185182 48.31 0.000.8582284.9308185 _cons 3.369123.1489219 22.62 0.000 3.077242 3.661005 paragraph <- Verbal.8083679.0243277 33.23 0.000.7606865.8560492 _cons 2.633762.1218401 21.62 0.000 2.39496 2.872565 general <- Verbal.811936.0245424 33.08 0.000.7638338.8600382 _cons 3.284053.1457311 22.54 0.000 2.998425 3.56968 var(e.wordc).4325923.0436058.3550395.5270852 var(e.sentence).1998278.03313.1443886.2765533 var(e.paragr ~ h).3465414.0393314.2774255.4328764 var(e.general).34076.0398537.2709543.4285496 var(verbal) 1... Gustavo Sanchez (StataCorp) November 13, 2014 18 / 33

Measurement models - one factor. sem (Verbal -> wordc sentence paragraph general), standardize nomeans OIM Standardized Coef. Std. Err. z P> z [95% Conf. Interval] Measurement wordc <- Verbal.7532647.0289445 26.02 0.000.6965345.8099949 sentence <- Verbal.8945235.0185182 48.31 0.000.8582284.9308185 paragraph <- Verbal.8083679.0243277 33.23 0.000.7606865.8560492 general <- Verbal.811936.0245424 33.08 0.000.7638338.8600382 var(e.wordc).4325923.0436058.3550395.5270852 var(e.sentence).1998278.03313.1443886.2765533 var(e.paragr ~ h).3465414.0393314.2774255.4328764 var(e.general).34076.0398537.2709543.4285496 var(verbal) 1... Gustavo Sanchez (StataCorp) November 13, 2014 19 / 33

Measurement models - two factors Example 2.2: Measurement model - two factors Holzinger and Swineford (1939) Data Variables Verbal: latent variable for verbal ability wordc: scores on word classification test sentence: scores on sentence completion test paragraph: scores on paragraph comprehension test general: scores on general information test Memory: latent variable for Memory condition wordr: scores on word recognition test number: scores on number recognition test figurer: scores on figurer recognition test object: scores on object-number test Gustavo Sanchez (StataCorp) November 13, 2014 20 / 33

Measurement models - two factors Example 2.2: Measurement model - two factors Holzinger and Swineford (1939) Data Variables Verbal: latent variable for verbal ability wordc: scores on word classification test sentence: scores on sentence completion test paragraph: scores on paragraph comprehension test general: scores on general information test Memory: latent variable for Memory condition wordr: scores on word recognition test number: scores on number recognition test figurer: scores on figurer recognition test object: scores on object-number test Gustavo Sanchez (StataCorp) November 13, 2014 20 / 33

Example 2.2: Measurement model - two factors Gustavo Sanchez (StataCorp) November 13, 2014 21 / 33

. sem (Verbal -> wordc sentence paragraph general) /// > (Memory -> wordr numberr figurer objectn), standardize /// > cov( Memory*Verbal) nomeans noheader nocnsreport nolog Endogenous variables Measurement: wordc sentence paragraph general wordr numberr figurer objectn Exogenous variables Latent: Verbal Memory OIM Standardized Coef. Std. Err. z P> z [95% Conf. Interval] Measurement wordc <- Verbal.7558212.0287657 26.28 0.000.6994414.812201 sentence <- Verbal.890738.0186103 47.86 0.000.8542625.9272136 paragraph <- Verbal.8118064.0241118 33.67 0.000.7645481.8590646 general <- Verbal.8114395.02443 33.21 0.000.7635576.8593214 Gustavo Sanchez (StataCorp) November 13, 2014 22 / 33

wordr <- Memory.697023.0530879 13.13 0.000.5929727.8010734 numberr <- Memory.5658826.0555916 10.18 0.000.456925.6748401 figurer <- Memory.5741969.0559689 10.26 0.000.4645.6838939 objectn <- Memory.4994731.0575943 8.67 0.000.3865904.6123558 var(e.wordc).4287343.0434835.3514445.5230216 var(e.sentence).2065857.0331538.1508327.282947 var(e.paragraph).3409704.0391482.2722618.4270185 var(e.general).3415659.0396469.2720646.4288219 var(e.wordr).5141589.074007.3877726.681738 var(e.numberr).6797769.0629166.5670007.8149843 var(e.figurer).6702979.0642743.5554524.8088889 var(e.objectn).7505266.0575336.6458253.8722022 var(verbal) 1... var(memory) 1... cov(verbal,memory).2579165.0705776 3.65 0.000.1195869.3962461 LR test of model vs. saturated: chi2(19) = 41.13, Prob > chi2 = 0.0023 Gustavo Sanchez (StataCorp) November 13, 2014 23 / 33

. estat mindices Modification indices Standard MI df P>MI EPC EPC Measurement sentence <- paragraph <- numberr <- figurer <- Memory 5.272 1 0.02 -.064809 -.1007062 Memory 6.152 1 0.01.0518729.1191386 Verbal 8.098 1 0.00 -.3086542 -.1712958 Verbal 5.759 1 0.02.2570111.144435 cov(e.wordc,e.paragraph) 4.056 1 0.04-1.228282 -.1626198 cov(e.wordc,e.figurer) 9.964 1 0.00 4.905813.2119483 cov(e.sentence,e.numberr) 4.560 1 0.03-2.528942 -.1697062 EPC = expected parameter change Gustavo Sanchez (StataCorp) November 13, 2014 24 / 33

Based on estat mindices, let s add figurer and numberr to the equation for verbal, and also cov(e.wordc*e.paragraph): (but we should first check the theoretical framework) sem (Verbal -> wordc sentence paragraph general figurer numberr ) /// (Memory -> wordr numberr figurer objectn), standardize nocnsreport /// nolog nomeans cov( Memory*Verbal) noheader cov(e.wordc*e.paragraph) Compare the chi2 ms before and after adding the suggestions from estat mindices:. estat gof /* Before adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(19) 41.132 model vs. saturated p > chi2 0.002. estat gof /* After adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(16) 25.107 model vs. saturated p > chi2 0.068 Gustavo Sanchez (StataCorp) November 13, 2014 25 / 33

Based on estat mindices, let s add figurer and numberr to the equation for verbal, and also cov(e.wordc*e.paragraph): (but we should first check the theoretical framework) sem (Verbal -> wordc sentence paragraph general figurer numberr ) /// (Memory -> wordr numberr figurer objectn), standardize nocnsreport /// nolog nomeans cov( Memory*Verbal) noheader cov(e.wordc*e.paragraph) Compare the chi2 ms before and after adding the suggestions from estat mindices:. estat gof /* Before adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(19) 41.132 model vs. saturated p > chi2 0.002. estat gof /* After adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(16) 25.107 model vs. saturated p > chi2 0.068 Gustavo Sanchez (StataCorp) November 13, 2014 25 / 33

Based on estat mindices, let s add figurer and numberr to the equation for verbal, and also cov(e.wordc*e.paragraph): (but we should first check the theoretical framework) sem (Verbal -> wordc sentence paragraph general figurer numberr ) /// (Memory -> wordr numberr figurer objectn), standardize nocnsreport /// nolog nomeans cov( Memory*Verbal) noheader cov(e.wordc*e.paragraph) Compare the chi2 ms before and after adding the suggestions from estat mindices:. estat gof /* Before adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(19) 41.132 model vs. saturated p > chi2 0.002. estat gof /* After adding suggestions from estat mindices */ Fit statistic Value Description Likelihood ratio chi2_ms(16) 25.107 model vs. saturated p > chi2 0.068 Gustavo Sanchez (StataCorp) November 13, 2014 25 / 33

Structural equation model Example 3: Structural equation model The model Integrates two components A structural component that specifies the relationship among the latent variables A measurement component that specifies the relationship between the latent variables and the corresponding indicators Modification indices are normally used to refine the model The model can incorporate causal relationships among the latent variables Gustavo Sanchez (StataCorp) November 13, 2014 26 / 33

Structural equation model Example 3: Structural equation model The model Integrates two components A structural component that specifies the relationship among the latent variables A measurement component that specifies the relationship between the latent variables and the corresponding indicators Modification indices are normally used to refine the model The model can incorporate causal relationships among the latent variables Gustavo Sanchez (StataCorp) November 13, 2014 26 / 33

Structural equation model Example 3: Structural equation model (Fictitious data) Gustavo Sanchez (StataCorp) November 13, 2014 27 / 33

Structural equation model Example 3: Structural equation model (Fictitious data) Gustavo Sanchez (StataCorp) November 13, 2014 28 / 33

Other models Example 4: Dynamic Factor Unobserved (factor) effect on unemployment in four different regions use http://www.stata-press.com/data/r13/urate,clear dfactor (D.(west south ne midwest) =, ) (Unemp_factor = ) sem (D.(west south ne midwest) <- Unemp_factor),var(Unemp_factor@1) Gustavo Sanchez (StataCorp) November 13, 2014 29 / 33

Other models Example 5: VAR model VAR model for consumption and investment with income as an exogenous variable use http://www.stata-press.com/data/r13/lutkepohl,clear var dlconsumption dlinvestment,lags(1/2) exog(dlincome) sem (dlconsumption <- l.dlconsumption l2.dlconsumption /// l.dlinvestment l2.dlinvestment dlincome) /// (dlinvestment <- l.dlconsumption l2.dlconsumption /// l.dlinvestment l2.dlinvestment dlincome), /// cov(e.dlinvestment*e.dlconsumption) Gustavo Sanchez (StataCorp) November 13, 2014 30 / 33

Summary Summary Structural Equation Models (SEM): Applications, Concepts and components Mediation Model Measurement Model SEM Model Other Models What is next? Generalization (gsem) Extensions for nonlinear models Multilevel models and more Gustavo Sanchez (StataCorp) November 13, 2014 31 / 33

Summary Summary Structural Equation Models (SEM): Applications, Concepts and components Mediation Model Measurement Model SEM Model Other Models What is next? Generalization (gsem) Extensions for nonlinear models Multilevel models and more Gustavo Sanchez (StataCorp) November 13, 2014 31 / 33

References References Acock, Alan 2013. Discovering structural equation modeling using Stata. Stata Press Holzinger, K. J. and F. Swineford, 1939. A study in factor analysis: The stability of a bi-factor solution. Supplementary Education Monographs, 48. Chicago, IL: University of Chicago. Haavelmo, T. 1943. The statistical implications of a system of simultaneous equations. Econometrica 11 (1), 1 12. Koopmans, T. C. 1950. When is an Equation System Complete for Statistical Purposes? (in Statistical Inference in Dynamic Economic Models, T. C. Koopmans (ed.), Cowles Commission Monograph 10, Wiley, New York, 1950, pp. 393 409) pp. 527 537 Koopmans, T.C. 1953 Identification Problems in Economic Model Construction (in Studies in Econometric Method, Cowles Commission Monograph 14), New Haven, Yale University Press Pearl J. 2000 Causality: Models, Reasoning, and Inference. Cambridge University Press. Wright, Sewall S. 1921. Correlation and causation. Journal of Agricultural Research 20: 557 85. Gustavo Sanchez (StataCorp) November 13, 2014 32 / 33

References Questions - Comments Gustavo Sanchez (StataCorp) November 13, 2014 33 / 33