
Session A: Basics of Structural Equation Modeling and the Mplus Computer Program
Kevin Grimm, University of California, Davis
June 9, 2008

Outline
1. Basics of Path Diagrams and Path Analysis
2. Regression and Structural Regression
3. Structural Expectations (Covariance Expectations, Mean Expectations)
4. The Common Factor Model
5. 5 Steps of SEM
6. The Mplus Computer Program

Various Definitions: Basics of Path Diagrams and Path Analysis
- A formal statistical statement about the relations among chosen variables
- A hypothesized pattern of (linear) relationships among a set of variables
- A collection of statistical techniques that allow the examination of a set of relationships between one or more IVs (continuous or discrete) and one or more DVs (continuous or discrete)

Some Advantages of SEM
- Explicit representation of theory (no default model)
- Representation of complex multivariate theories involving latent entities is possible
- Direct and indirect effects can be teased apart
- Analysis of multiple groups
- Variables can be both outcomes (DVs) and predictors (IVs)
- Missing data can be handled

Some Disadvantages of SEM
- Need for strong substantive theory regarding the relationships among variables
- Sample size requirements
- Need for some basic understanding of statistics
- Yields to temptation, when the tail wags the dog
- Equivalent models
- Yields to statements of causality too easily

Special Cases of SEM
- General linear model (t-test, regression, multiple regression, ANOVA, etc.)
- Path model
- Confirmatory factor analysis
- Latent variable path model
- Latent growth curve analysis

Some SEM Software
- LISREL, Mplus, AMOS, SAS PROC CALIS, Mx, COSAN, SEPATH, LISCOMP, Systat RAMONA

Underlying Principles
- Because SEM concerns relationships among variables, the emphasis is mainly on moment matrices (i.e., the structure of the data)
- The hypothesized model attempts to explain the structure of the data with parsimony and accuracy
- Some statistical statement is needed about the match between the observed data and the hypothesized model (e.g., fit statistics)

Some SEM Terminology
- Manifest variables: observed (i.e., measured) variables defining the structure we wish to model
- Latent variables: unobserved (i.e., unmeasured) variables implied by the covariances among two or more manifest variables
- Specification: the exercise of formally stating a model

Some SEM Terminology (2)
- Association: non- (or bi-) directional (reciprocal) relation between variables
- Direct effect: (uni-) directional (non-reciprocal) relation between variables (IV and DV)
- Indirect effect: effect of an IV on a DV through one or more intervening or mediating variables
- Total effect: sum of the direct and indirect effects of an IV on a DV

SEM Path Diagram Key
- Squares = observed variables
- Circles = latent variables
- Double-headed arrows = variances/covariances (association)
- Single-headed arrows = regressions (direct effect)
- Triangle = assigned variable = constant (= 1.0) for modeling means

Regression & Structural Regression

Y_n = β0 + β1*X_n + e_n

- β0 = intercept, the predicted value of Y when the predictors (X) are zero
- β1 = slope coefficient, the predicted amount of change in Y for a one-unit change in X
- e = residual, the part of Y not predicted by X, uncorrelated with X
- The variance of Y is decomposed into variance explained by X and unexplained variance (e)

Structural Regression
- The same equation, Y_n = β0 + β1*X_n + e_n, drawn as a path diagram: X (with its variance) predicting Y, with residual e and the intercept carried by the constant triangle
- A short-hand diagram collapses the residual and intercept notation

Structural Expectations
- Every structural model has a set of structural expectations: variance/covariance expectations and mean expectations
- These are used to estimate parameters
- The difference between the structural expectations and the observed statistics (covariances/means) is the model misfit
- Expectations can be calculated from a path diagram (path tracing) or computed algebraically
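The variance decomposition above can be checked numerically. A minimal sketch in Python, with assumed illustrative values (β0 = 2, β1 = .5; not the session's numbers): simulate Y = β0 + β1*X + e, estimate the regression from sample moments, and confirm that the variance of Y splits into explained plus residual variance.

```python
import numpy as np

# Hypothetical illustration: simulate Y = b0 + b1*X + e and verify that
# var(Y) decomposes into explained variance plus residual variance.
rng = np.random.default_rng(0)
n = 100_000
b0, b1 = 2.0, 0.5                      # assumed population values
x = rng.normal(0.0, 2.0, n)            # var(X) = 4
e = rng.normal(0.0, 1.0, n)            # var(e) = 1, independent of X
y = b0 + b1 * x + e

# Least-squares estimates from the sample moments
b1_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b0_hat = y.mean() - b1_hat * x.mean()
resid = y - (b0_hat + b1_hat * x)

explained = b1_hat**2 * np.var(x, ddof=1)
print(round(b1_hat, 2))                # close to 0.5
print(round(np.var(y, ddof=1) - (explained + np.var(resid, ddof=1)), 8))
```

The decomposition is exact in-sample because the OLS residual is uncorrelated with the predictor.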

Calculating Covariance Expectations

By path tracing on the diagram for Y_n = β0 + β1*X_n + e_n:

Variance/Covariance Expectations (select)
- E[X, X] = σ²_X
- E[X, Y] = β1*σ²_X
- E[Y, Y] = β1²*σ²_X + σ²_e

Calculating Mean Expectations

Mean Expectations (select)
- E[X] = μ_X
- E[Y] = β0 + β1*μ_X

Expected Moments for the X → Y Model

Expected variances:
- E[XX] = σ²_X
- E[YY] = β1²*σ²_X + σ²_e

Expected covariance:
- E[XY] = β1*σ²_X

Expected means:
- E[X] = μ_X
- E[Y] = β0 + β1*μ_X

Confirmatory Factor Analysis (CFA)
- Used to study how well a hypothesized structure fits a sample of measurements
- Hypothesis-driven: explicitly tests a priori hypotheses (theory) about the structures that underlie the data, that is, the number of, characteristics of, and interrelations among underlying factors
- Specifies a common measurement base for comparisons across groups/occasions (factorial invariance)
- Testing an a priori hypothesis requires specific expectations regarding the number of factors, which variables reflect given factors, and how the factors are related to one another
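These expectations can be checked against simulated data. A sketch in Python with assumed parameter values (not the session's): compute E[X,Y] = β1*σ²_X, E[Y,Y] = β1²*σ²_X + σ²_e, and E[Y] = β0 + β1*μ_X, then compare to sample statistics.

```python
import numpy as np

# Hypothetical parameter values for Y = b0 + b1*X + e:
b0, b1, mu_x, var_x, var_e = 1.0, 0.8, 3.0, 2.0, 0.5

exp_cov_xy = b1 * var_x              # E[X,Y] = b1*var(X)
exp_var_y = b1**2 * var_x + var_e    # E[Y,Y] = b1^2*var(X) + var(e)
exp_mean_y = b0 + b1 * mu_x          # E[Y]   = b0 + b1*E[X]

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(mu_x, np.sqrt(var_x), n)
y = b0 + b1 * x + rng.normal(0.0, np.sqrt(var_e), n)

print(round(exp_cov_xy, 2), round(np.cov(x, y)[0, 1], 2))
print(round(exp_var_y, 2), round(np.var(y), 2))
print(round(exp_mean_y, 2), round(y.mean(), 2))
```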

One-Factor Model

A single common factor f (variance ψ) with loadings λ1, λ2, λ3 on Y1, Y2, Y3 and unique variances u1, u2, u3.

Covariance Expectations for the Single Common Factor (path tracing rules)
- Variances: E[Y_i, Y_i] = λ_i²*ψ + u_i
- Covariances: E[Y_i, Y_j] = λ_i*λ_j*ψ

Note: The main-diagonal variances include the unique variances, u_j, but the off-diagonal covariances do not. All terms include the variance of the common factor, ψ.

Structural Expectations: Identification Constraints
- Additional constraints are often needed to obtain a unique set of estimates; because the latent variable has no meaningful scaling, we typically fix its variance: E[f, f] = ψ = 1.0
- After this scaling, each covariance is simply a product of the pair of loadings: E[Y_i, Y_j] = λ_i*λ_j

Numerical Expectations for a Population Common Factor with ψ = 1 and λ = [.8, .7, .6]

With each unique variance chosen so that the total variance equals 1.0 (u = .36, .51, .64):
- Variances: λ_i² + u_i = 1.0
- Covariances: .8 × .7 = .56, .8 × .6 = .48, .7 × .6 = .42

Note: Each covariance expectation, σ_ij, follows a simple pattern determined by the product of the respective loadings, λ_i*λ_j.

Structural Equation Modeling
- Covariance expectations from a single common-factor model are a simple product of the loadings
- The extension to multiple factors is straightforward
- However, as models become more complicated (read: realistic), the structural expectations become more complex as well
- Structural equation modeling (covariance analysis) is simply the method by which we test our expectations against our data

Introducing Means into the CFA Model
- Consider the common factor model: f with indicators Y1, Y2, Y3 and unique factors u_y1, u_y2, u_y3
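The numerical expectations above can be reproduced directly. A sketch in Python using the slide's values (loadings [.8, .7, .6], ψ fixed at 1, and unique variances filling each total variance to 1.0):

```python
import numpy as np

# One-factor implied covariance matrix: Sigma = psi*lam*lam' + diag(u)
lam = np.array([0.8, 0.7, 0.6])
psi = 1.0
u = 1.0 - psi * lam**2            # unique variances: .36, .51, .64
sigma = psi * np.outer(lam, lam) + np.diag(u)
print(sigma.round(2))
# diagonal: 1.00; off-diagonals: .8*.7 = .56, .8*.6 = .48, .7*.6 = .42
```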

Introducing Means into the CFA Model

Structural Expectations: Observed Variable Means
- With the factor mean fixed at zero, each observed mean is simply the measurement intercept obtained by path tracing through the constant: E[Y_i] = ν_i

Structural Expectations: Latent Variable Mean
- With a latent variable mean μ_f, each observed mean is carried through the loading by path tracing through the constant and the factor: E[Y_i] = λ_i*μ_f

5 Steps in SEM Analyses

SEM is often viewed as an advanced and novel form of analysis, but this approach is not new:
1. Theory-Data: form some basic ideas for merging theory and data
2. Specification: form explicit hypotheses, using regression and factor analysis concepts to form structural restrictions
3. Estimation: use specialized computer software to estimate coefficients, standard errors, and various statistical indicators
4. Evaluation: compare alternative structural restrictions in a series of statistical tests
5. Re-evaluation & Extension: explore new ideas/models

Step 1: Theory-Data
- "The purpose of statistical procedures is to assist in establishing the plausibility of a theoretical model" (Cooley, 1978)
- SEM is a general statistical framework that allows researchers to be explicit about theory and how it might be reflected in one's data
- Statistical models are where theories and data collide
- However, in their assumptions, statistical models invoke a particular notion of reality that may or may not match one's theoretical ideas
- SEM is a confirmatory framework for testing a priori hypotheses about the structures in the data
- It requires specific expectations regarding one's theory, how that theory may be reflected in one's data, and the selection of persons, variables, and occasions

Step 2: Model Specification
- There are many ways to use SEM programs (e.g., Mplus, AMOS, LISREL, Mx) to produce the same result
- Specify a set of expectations that match the theory to be tested
- Path diagram, matrix, and multiple-equation specifications are functionally equivalent

Step 3: Parameter Estimation
- A series of computational steps is taken, each successive step minimizing the value of the fit function equation, a weighted distance between the observations and the expectations
- At each step in the minimization process, a vector of model parameter estimates is updated so that the model reproduces the observed covariance matrix as closely as possible
- When the parameter estimates no longer improve the fit over the previous estimates, the process is said to have converged; the estimates at convergence are the parameter estimates
- Example of a fit function (maximum likelihood): F_ML = ln|Σ| − ln|S| + tr(S*Σ⁻¹) − p, where S is the sample covariance matrix and Σ is the estimated covariance matrix based on the assumed model

Step 4: Evaluating Fit of SEM
- Relative fit: a variety of structural models are typically fitted to the same observations (data), often as nested models
- Fit indices: many are available, including simple residuals, standard errors, and likelihood-ratio and chi-square tests (Lawley & Maxwell, 1971; Browne, 1985; Browne & Cudeck, 1993)
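The ML fit function quoted above can be sketched in a few lines of Python (the sample matrix S here is a made-up example): F is zero when the model-implied matrix reproduces S exactly, and positive otherwise.

```python
import numpy as np

def f_ml(S, Sigma):
    """ML fit function: F = ln|Sigma| - ln|S| + tr(S*Sigma^-1) - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

S = np.array([[1.0, 0.5],
              [0.5, 1.0]])            # hypothetical sample covariance matrix
print(round(f_ml(S, S), 6))           # perfect reproduction -> 0.0
print(f_ml(S, np.eye(2)) > 0)         # any misfit -> positive F
```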

Step 4: Evaluating Fit of SEM (cont.)
- If we calculate the parameters based on the principles of maximum likelihood estimation (MLE), we obtain a likelihood-ratio statistic (L²) of misfit
- Under standard assumptions (e.g., normality of residuals), L² follows a chi-square distribution with df = N_s − N_p (number of observed statistics minus number of estimated parameters)
- Testing fit: we use L²-type tests to ask, "Should we reject the hypothesis of this model?" Often this gets rephrased as: "Are the observed data consistent with the hypothetical model?" "Is the model plausible?" "Does the model fit?"
- Probability of close fit: probability models based on normal distribution theory are available, e.g., p(perfect fit) and p(close fit)
- These same statistical analyses can be used even when models are complex, e.g., when latent variables, multiple groups, or incomplete data are the focus of study. This is a major bonus of SEM.

Step 4: Evaluating Model Fit
- How good is the model? How well does the model represent the data? How well does the model represent the theory?
- Fit to the data: measures of how well the estimated covariance matrix derived from the model matches the observed covariance matrix
- Fit to the theory: subjective interpretation

Confirmatory Hypothesis Tests
- When restrictions are stated in advance of the estimation, the hypothesis is clear and we can use statistical probability models
- These models have degrees of freedom and are rejectable
- Examine overall fit, standard errors, and residuals
- We do not conclude that the model fits the data, but we can conclude that the model does not fit the data, or that one model fits better than another
- Relative fit: we need to examine most restrictions via the comparison of at least two alternative models

Model Fit Statistics
- χ² (or −2LL), with df = degrees of freedom; null hypothesis: estimated covariance matrix = observed covariance matrix (sensitive to sample size)
- RMSEA: range 0.00 to 1.00; lower values indicate better fit; M. Browne's rule of thumb: RMSEA < .05 indicates good fit
- CFI (Comparative Fit Index), NFI (Normed Fit Index), TLI (Tucker-Lewis Index): range 0.00 to 1.00+; higher values indicate better fit

Nested Hypothesis Tests
- Two alternative models may be nested: parameters are said to be nested when they are included in one model (M0) and can be removed to form the alternative model (M1)
- The hierarchy of restrictions makes it easy to use statistical probability tests
- Under typical assumptions, the difference between two nested models can be evaluated using a chi-square test

Relative Fit of Nested Models
- χ² difference tests (for nested models): Δχ² = χ²(Model B) − χ²(Model A), evaluated on Δdf = df_B − df_A
- Information criteria for non-nested model comparisons (using the same data): AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion); lower values are better
- These should be used in conjunction with judgments about the theoretical interpretation of the models

Evaluating Relative Fit: Relative Fit of Nested Models
- Evaluate fit for Model A: two correlated common factors (each variance fixed at 1.0), measured by Y1-Y3 and Y4-Y6 with unique factors U1-U6
- Add restrictions to construct Model B; evaluate fit for Model B
- Evaluate the difference in fit, Δχ²/Δdf: is the restricted (parsimonious) model of significantly worse fit than the less restrictive (more complex) model, or is this complexity needed?
- Example result: restricted model, χ² = 55, df = 9, RMSEA = .4; less restrictive model, χ² = 11, df = 8, RMSEA = .053; model comparison: Δχ²/Δdf = 44/1, p < .05

Step 4: Evaluating Model Fit (summary)
- How good is the model? How well does the model represent the data? How well does the model represent the theory?
- Examine the relative fit of multiple models; reject those models that fit relatively worse; carry forward those models that fit relatively well

Step 5: Re-evaluation & Extension
- Moving from confirmation to exploration is a philosophical debate
- In most confirmatory analyses, some results suggest alterations of the original concepts (specifying a different theory)
- Often the model is modified because the original model does not fit, and an exploratory phase begins
- Note: if there is a lack of prior directional hypotheses, probability models based on normal distribution theory are no longer available
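The nested comparison on this slide can be carried out with any chi-square routine. A sketch in Python (scipy assumed available), using the slide's Δχ² = 44 on Δdf = 1:

```python
from scipy.stats import chi2

# Chi-square difference test for nested models (values from the slide):
delta_chisq = 44.0   # chi-square(restricted) - chi-square(less restricted)
delta_df = 1         # df(restricted) - df(less restricted)
p = chi2.sf(delta_chisq, delta_df)
print(p < .05)       # True: the restriction significantly worsens fit
```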

A General Latent Variable Framework & Analysis Software (Muthén & Muthén)

Mplus Language I: a one-factor model (squares = observed variables, circle = latent variable, U = unique factors)

TITLE:     Factor Model;
DATA:      FILE = wisc3raw.dat;
VARIABLE:  NAMES = id verb1 verb2 verb4 verb6 perfo1 perfo2 perfo4 perfo6
                   info1 comp1 simi1 voca1 info6 comp6 simi6 voca6
                   momed grad constant;
           USEVARIABLES = info1 comp1 simi1 voca1;
ANALYSIS:  TYPE = MEANSTRUCTURE;
MODEL:     !Factor Loadings
           verb1 BY info1@1 comp1 simi1 voca1;
           !Factor Variance
           verb1*;
           !Mean of Factor
           [verb1@0];
           !Means of Observed
           [info1 comp1 simi1 voca1];
           !Variances of Observed
           info1 comp1 simi1 voca1;
OUTPUT:    SAMPSTAT STANDARDIZED;

Mplus Language II: path diagram elements and their Mplus syntax
- Squares = observed variables: USEVAR = var1 var2;
- Circles = latent variables
- ANALYSIS: TYPE = MEANSTRUCTURE; (! TYPE = BASIC; gives sample statistics; ! TYPE = NOMEANSTRUCTURE; allows a model without means)
- Double-headed arrows = covariances: var1 WITH var2;
- Double-headed arrows = variances: var1*; or factor@1;
- Single-headed arrows = regressions: var1 ON var2; or factor BY var1 var2 var3;
- Triangle = assigned variable = constant (= 1.0) for modeling means: [var1];

Session B: Alternative Structural Equation Models for Change over Two Occasions
Kevin Grimm, University of California, Davis
June 9, 2008

Overview
1. Practical Preliminaries
2. Two-occasion longitudinal data
3. Type-A auto-regression models
4. Type-D difference score models
5. Combining alternative models
6. Summary & Discussion

Practical Preliminaries
- Data set formatting
- Preliminary data examination: correlations (covariances) over time, means over time, longitudinal plots
- Examine for shapes, outliers, possible time bases, etc.

Longitudinal Data Formats
Two common data formats:
- Single record per person (wide-form data): all the data associated with one person appear in a single record (e.g., columns id, adhd2, adhd4, adhd5, adhd7)

- Multiple records per person (long-form data, person-period data, relational data): the data associated with one person appear in multiple records indexed by id and time variables (e.g., columns id, age, read)

Preliminary Examination of Longitudinal Data
- Sample statistics: correlations (covariances) over time (stability coefficients), means over time
- Plots of intraindividual change over time

Sample Statistics in SAS:
*Examining Correlations Across Time;
PROC CORR DATA=wiscraw;
  VAR adhd2 adhd4 adhd5 adhd7;
RUN;
*Examining Means Across Time;
PROC MEANS DATA=wiscraw;
  VAR adhd2 adhd4 adhd5 adhd7;
RUN;

Sample Statistics in SPSS:
*Correlations over time.
CORRELATIONS /VARIABLES= adhd2 adhd4 adhd5 adhd7 /MISSING=PAIRWISE.
*Means over time.
DESCRIPTIVES VARIABLES= adhd2 adhd4 adhd5 adhd7
  /STATISTICS=MEAN STDDEV MIN MAX.

Sample Statistics in Mplus:
TITLE:     Descriptive Sample Stats;
DATA:      FILE = adhd_uncg.dat;
VARIABLE:  NAMES = id girl minority sesyr dadhp7 doddp7
                   adhd_tot2 adhd_tot4 adhd_tot5 adhd_tot7
                   adhd_in2 adhd_in4 adhd_in5 adhd_in7
                   adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
           MISSING = .;
           USEVAR = adhd_tot2 - adhd_tot7;
ANALYSIS:  TYPE = BASIC;
OUTPUT:    SAMPSTAT;

*Note: Data must be in single-record-per-person (wide) format.
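The two formats are mechanical rearrangements of each other. A sketch in pandas with made-up toy scores (the variable names are hypothetical):

```python
import pandas as pd

# Hypothetical wide-format (single record per person) data
wide = pd.DataFrame({"id": [1, 2],
                     "adhd4": [22, 31],
                     "adhd7": [18, 35]})

# wide -> long (multiple records per person, indexed by id and occasion)
long = wide.melt(id_vars="id", var_name="occasion", value_name="adhd")
long = long.sort_values(["id", "occasion"]).reset_index(drop=True)
print(long)

# long -> wide again
back = long.pivot(index="id", columns="occasion", values="adhd").reset_index()
print(back)
```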

Sample Statistics: Longitudinal Correlations & Means
(Correlation matrix, means, and Ns for adhd2, adhd4, adhd5, adhd7; numeric values not recovered from the source.)

Plot of Intraindividual Change in SAS:
PROC GPLOT DATA = temp_long (where = (new_id < 50));
  SYMBOL1 REPEAT=500 I=join V=dot H=.5 W=1 C=black;
  AXIS1 LABEL = (A=90 F=SWISSX H=1.3 'ADHD Total Score')
        ORDER = (0 to 60 by 10) MINOR = none OFFSET = (2);
  AXIS2 LABEL = (F=SWISSX H=1.3 'Age')
        ORDER = (2 to 7 by 1) MINOR = none OFFSET = (2);
  PLOT adhd_t * age = id /NOLEGEND VAXIS=AXIS1 HAXIS=AXIS2;
RUN;
*Note: Data must be in multiple-record-per-person (long) format.

Plot of Intraindividual Change in SPSS (Longitudinal Plot of ADHD Total Score, N = 50):
igraph /x1=var(age) type = scale
  /y=var(adhd_t) type=scale
  /style=var(id)
  /line(mode) key=off style=line interpolate=straight
  /scalerange=var(adhd_t) min=0 max=60.
*Note: Data must be in multiple-record-per-person (long) format.
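The same preliminary statistics are one-liners in pandas. A sketch on simulated (hypothetical) wide-format scores: the over-time correlation matrix gives the stability coefficients, and the column means give the occasion means.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
base = rng.normal(25, 8, 200)                 # stable person-level component
wide = pd.DataFrame({
    "adhd2": base + rng.normal(0, 4, 200),
    "adhd4": base - 2 + rng.normal(0, 4, 200),
    "adhd5": base - 4 + rng.normal(0, 4, 200),
    "adhd7": base - 6 + rng.normal(0, 4, 200),
})

stability = wide.corr()       # correlations (stability coefficients) over time
occasion_means = wide.mean()  # means over time
print(stability.round(2))
print(occasion_means.round(1))
```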

2. Two-Occasion Longitudinal Data

Two-Occasion Data Are Valuable
- Two occasions are the first case of longitudinal data collection
- There are several special properties of repeated-measures data
- Most analyses deal with basic questions of representing change over time
- Different problems seem to suggest different models and methods of analysis

Example of Two-Occasion Data
- Data from the RIGHT (Research Investigating Growth and Health Trajectories) Track Research Project
- Focus on the development and developmental trajectories of early disruptive behavior
- Study participants: N = 43 children, measured at ages 2, 4, 5, and 7 years
- In this illustration, the Attention Deficit Hyperactivity Total Scores from the age 4 and age 7 assessments are used

Summary Statistics from the RIGHT Track Study
(The CORR Procedure, variables adhd_to4 and adhd_to7: simple statistics, including N, mean, standard deviation, sum, minimum, and maximum, plus Pearson correlations with p-values under H0: rho = 0; numeric values not recovered from the source.)

Univariate Histograms of ADHD at Age 4 and Age 7

Quote from Bachman et al. (2002, p. 3):
"There are two approaches to the prediction of change in such analyses, and we use both. The first approach involves computing a change score by subtracting the before measure from the after measure, and then using the change score as the dependent variable. The second approach uses the after measure as the dependent variable and includes the before measure as one of the predictors (i.e., as a covariate). In either case one could say that the earlier score is being controlled, but the means of controlling differ and the results of the analysis also can differ --- sometimes in important ways."

Bivariate Scatterplot: X = Age 4 versus Y = Age 7
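The two approaches in the quote are tightly linked. A sketch on simulated (hypothetical) before/after scores: regressing the change score on the before measure yields a slope exactly one less than the autoregressive slope, so the two analyses differ in framing rather than in information.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
y1 = rng.normal(20.0, 5.0, n)                      # "before" measure
y2 = 5.0 + 0.6 * y1 + rng.normal(0.0, 3.0, n)      # "after" measure

def slope(x, y):
    """Simple regression slope of y on x from sample moments."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

b_auto = slope(y1, y2)        # approach 2: after regressed on before
b_diff = slope(y1, y2 - y1)   # approach 1: change score regressed on before
print(round(b_auto - b_diff, 6))   # always 1.0
```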

Plotting Individual Trajectories (n = 50)

SEM and Two-Occasion Data Analyses
- We can use regular regression or ANOVA programs to get reasonable answers to some change questions
- Alternatively, we can use any standard SEM computer program and any available options for input and output
- SEMs are based on: (a) an algebraic model, (b) a corresponding path diagram, (c) inputting the model to an SEM program, and (d) examining the expectations generated
- Empirical models can be fit using any technique for means and covariances or raw data (e.g., in Mplus)
- SEM provides a framework for dealing with more than two alternative models, including those based on latent variables and those requiring incomplete-data analyses

3. Type-A Auto-Regressive Models for Repeated Measures

Most Common Linear Regression Models
A linear model is expressed for n = 1 to N as
  Y[2]_n = β0 + β1*Y[1]_n + e_n
- β0 is the intercept term: the predicted score of Y[2] when Y[1] = 0
- β1 is the coefficient term: the change in the predicted score of Y[2] for a one-unit change in Y[1]
- e is the residual score: an unobserved and random score which is uncorrelated with Y[1] but forms part of the variance of Y[2]
- The ratio of the variance of e to the variance of Y[2] (σ²_e / σ²_y = 1 − R²) can be a useful index of forecast efficiency
- In our notation, Greek letters are used for estimated parameters

Typical Autoregression Path Model for Two Repeated Measures
Y[1] → Y[2], with intercept β0 carried by the constant, regression β1, and residual e.

SAS/SPSS Autoregression Input Scripts

SAS:
*Type A - Autoregressive Model for Two Occasions;
PROC REG DATA = adhd;
  MODEL adhd_to7 = adhd_to4 / STB;
RUN;

SPSS:
*Autoregressive Model.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT adhd_to7
  /METHOD=ENTER adhd_to4.

Auto-Regression Results (SAS)
(Dependent variable adhd_to7: observations read, used, and missing; analysis-of-variance table; root MSE, R-square, adjusted R-square; and parameter estimates with intercept and adhd_to4 both significant at p < .0001; numeric values not recovered from the source.)

Autoregressive Model in Mplus:
TITLE:     ADHD Autoregressive Model;
DATA:      FILE = adhd_uncg_wide.dat; LISTWISE=ON;
VARIABLE:  NAMES = id girl minority sesyr dadhp7 doddp7
                   adhd_to2 adhd_to4 adhd_to5 adhd_to7
                   adhd_in2 adhd_in4 adhd_in5 adhd_in7
                   adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
           MISSING = .;
           USEVAR = adhd_to4 adhd_to7;
ANALYSIS:  TYPE = MEANSTRUCTURE;
MODEL:     adhd_to7 ON adhd_to4;
           adhd_to4 adhd_to7;
           [adhd_to4 adhd_to7];
OUTPUT:    SAMPSTAT;

Mplus Output: Summary of Analysis
- Number of groups: 1; number of observations: 34
- One observed dependent variable (ADHD_TO7), one observed independent variable (ADHD_TO4), 0 continuous latent variables

Mplus Output: Sample Statistics
(Means, covariances, and correlations for ADHD_TO7 and ADHD_TO4; numeric values not recovered from the source.)

Mplus Output: Tests of Model Fit
- Chi-square test of model fit: value 0.000 with 0 degrees of freedom (fully saturated model)
- CFI = 1.000, TLI = 1.000
- Loglikelihood (H0 and H1 values); information criteria (AIC, BIC, sample-size-adjusted BIC) with 5 free parameters
- RMSEA estimate with 90 percent C.I. and probability RMSEA <= .05

Mplus Output: Model Results
(Two-tailed estimates, SEs, est./SE, and p-values for ADHD_TO7 ON ADHD_TO4, the mean of ADHD_TO4, the intercept of ADHD_TO7, the variance of ADHD_TO4, and the residual variance of ADHD_TO7; numeric values not recovered from the source.)

Results from the Autoregression Model
Graphic result: ADHD[7] = β0 + β1*ADHD[4] + e
Note: this is a fully saturated model, so χ² = 0 with df = 0; an asterisk indicates t = p/se(p) > 1.96.

4. Type-D Latent Difference Score Models for Repeated Measures

Statistical Features of Difference Scores
Using the same repeated-measures scores, the difference scores have useful properties for the means, such as
  μ[2] = μ[1] + μ_D, so μ_D = μ[2] − μ[1]
and also for the variances and covariances:
  σ²[2] = σ²[1] + σ²_D + 2*σ[1,D], so σ²_D = σ²[1] + σ²[2] − 2*σ[1,2] and σ[1,D] = σ[1,2] − σ²[1]
The statistics of the difference scores are a transformation of the statistics of the measured variables, and this is useful.
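The mean and variance identities above hold exactly in any sample. A quick check in Python on simulated (hypothetical) scores:

```python
import numpy as np

rng = np.random.default_rng(3)
y1 = rng.normal(10, 3, 5_000)
y2 = 0.7 * y1 + rng.normal(5, 2, 5_000)
d = y2 - y1                                   # difference score

mean_identity = y2.mean() - y1.mean()         # mu_D = mu_2 - mu_1
var_identity = (np.var(y1) + np.var(y2)
                - 2 * np.cov(y1, y2, ddof=0)[0, 1])  # var_D = var1+var2-2cov12
print(abs(d.mean() - mean_identity) < 1e-8)   # True
print(abs(np.var(d) - var_identity) < 1e-8)   # True
```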

Classic Critique of Difference Scores
- There have been major critiques of the use of difference scores (e.g., Cronbach & Furby, 1970)
- A typical model for a set of observed repeated measures is
    Y[1]_n = y_n + e[1]_n and Y[2]_n = y_n + e[2]_n
  where y is an unobserved true score common to both occasions, and e[t] is an unobserved random error that is independent over occasions
- In this model the true score remains the same, and all changes are based on the random noise
- If this model holds, then the simple difference score can be rewritten as
    D_n = Y[2]_n − Y[1]_n = (y_n + e[2]_n) − (y_n + e[1]_n) = e[2]_n − e[1]_n
- So the variance of the difference score is entirely based on the differences in the random error scores; this also implies that the reliability of the difference score is zero
- For these and other reasons, the use of the simple difference score has been frowned upon in much of developmental research

Classic Resolution of the Difference Critique
- Other researchers (e.g., Nesselroade, 1972, 1974) defined a model for a set of observed repeated measures as
    Y[1]_n = y[1]_n + e[1]_n and Y[2]_n = y[2]_n + e[2]_n
  where y[1] is an unobserved true score for the first occasion, y[2] is an unobserved true score for the second occasion, and e[t] is an unobserved random error
- It is also possible to redefine this model as a gain and write
    Y[1]_n = y[1]_n + e[1]_n and Y[2]_n = y[1]_n + Δy_n + e[2]_n
  where y[1]_n is an unobserved true score for both occasions, Δy is an unobserved true change at the second occasion, and e[t] is an unobserved random error
- However, if this model holds, then the difference score is
    D_n = Y[2]_n − Y[1]_n = (y[2]_n + e[2]_n) − (y[1]_n + e[1]_n) = Δy_n + (e[2]_n − e[1]_n)
- So the variance of the difference score is partly based on the differences in the random error scores, but also partly on the gain in the true score
- The relative size of the true-score gain determines the variance and reliability of the difference; this implies the difference score may be a very good way to consider measuring change, and researchers should consider it carefully, especially latent scores without accumulated errors
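A small simulation (all values hypothetical) contrasts the two models: under the critique's model the difference is pure measurement error, while under the gain model a sizable share of var(D) is true change.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
true1 = rng.normal(0, 1, n)                  # true score at occasion 1
e1 = rng.normal(0, 0.5, n)                   # occasion-specific errors
e2 = rng.normal(0, 0.5, n)

# Critique model: no true change, so D reduces to e2 - e1 (pure error)
d_static = (true1 + e2) - (true1 + e1)
print(np.allclose(d_static, e2 - e1))        # True

# Gain model: D = true gain + error, so var(D) is partly true change
gain = rng.normal(1.0, 1.0, n)               # unobserved true gain
d_gain = (true1 + gain + e2) - (true1 + e1)
share_true = np.var(gain) / np.var(d_gain)   # ~ 1/(1 + .25 + .25) = 2/3
print(round(share_true, 2))
```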

Models with Difference Scores
- Using the same repeated-measures scores, we can write the alternative difference score for any person n as
    Y[2]_n = Y[1]_n + Δy_n
  where Δy is an implied or latent difference score; this can be verified simply by rewriting the model as Δy_n = Y[2]_n − Y[1]_n
- We can also add features of the means and covariances, e.g., Δy_n = μ_d + Δy*_n, with E[Δy*, Δy*] = σ²_d and E[Y[1]*, Δy*] = σ_1d

Difference Score Model in SAS:
*Type D - Difference Score Model for Two Occasions;
DATA diff;
  SET adhd;
  delta = adhd_to7 - adhd_to4;
  constant = 1;
RUN;
PROC REG DATA = diff;
  MODEL delta = constant / NOINT;
RUN;

Difference Score Model in SPSS:
*Computing Difference Score & Difference Score Model.
COMPUTE delta = adhd_to7 - adhd_to4.
COMPUTE constant = 1.
EXECUTE.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /ORIGIN
  /DEPENDENT delta
  /METHOD=ENTER constant.

SAS/SPSS Output
(Dependent variable delta: observations read, used, and missing; analysis-of-variance table; and the parameter estimate for the constant; numeric values not recovered from the source.)

Path Diagram of the Latent Difference Score Model
Y[1] → Y[2] with a fixed unit path, and Δy → Y[2] with a fixed unit path.
Note: Δy is an unobserved variable whose moments (μ_d, σ²_d, σ_1d) are implied by the formula Y[2] = Y[1] + Δy.

Mplus Latent Difference Score

TITLE:     ADHD Difference Model;
DATA:      FILE = adhd_uncg_wide.dat; LISTWISE=ON;
VARIABLE:  NAMES = id girl minority sesyr dadhp7 doddp7
                   adhd_to2 adhd_to4 adhd_to5 adhd_to7
                   adhd_in2 adhd_in4 adhd_in5 adhd_in7
                   adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
           MISSING = .;
           USEVAR = adhd_to4 adhd_to7;
ANALYSIS:  TYPE = MEANSTRUCTURE;
MODEL:     adhd_to7 ON adhd_to4@1;
           adhd_to4 adhd_to7@0;
           [adhd_to4 adhd_to7@0];
           delta BY adhd_to7@1;
           delta;
           [delta];
OUTPUT:    SAMPSTAT;

Mplus Output: Summary of Analysis
- Number of groups: 1; number of observations: 34
- One observed dependent variable (ADHD_TO7), one observed independent variable (ADHD_TO4), one continuous latent variable (DELTA)

Mplus Output: Tests of Model Fit
- Chi-square value 0.000 with 0 degrees of freedom (fully saturated model); CFI = 1.000, TLI = 1.000
- Loglikelihood (H0 and H1), information criteria with 5 free parameters, and RMSEA, as before

Mplus Output: Model Results
(Two-tailed estimates, SEs, est./SE, and p-values for DELTA BY ADHD_TO7, ADHD_TO7 ON ADHD_TO4, ADHD_TO4 WITH DELTA, the means of ADHD_TO4 and DELTA, the intercept of ADHD_TO7, the variances of ADHD_TO4 and DELTA, and the residual variance of ADHD_TO7; numeric values not recovered from the source.)

Results of the Latent Difference Score Model
Y[1] → Y[2] ← Δy (unit paths).
Note: this is a fully saturated model, so χ² = 0 with df = 0; an asterisk indicates t = p/se(p) > 1.96.

Summary of Latent Difference Models
- The use of latent difference scores (LDS) in SEM is based on the same statistical information as the two-occasion data or the calculated difference score model
- This means that variations in the parameters of each change model can be evaluated in the same way: by the difference in goodness-of-fit tests
- However, by avoiding the direct calculation of the difference score, we can now consider models in which we attempt a model-based separation of the errors of measurement from the systematic change
- The previous point will be emphasized in the next set of models, when we measure (a) more variables, (b) more time points, or (c) more of both

5. Combining Features of Alternative Repeated-Measures Models

Interpreting Change From Autoregression
Assume scores over time for multiple variables have been fitted using this form of regression over time:
  Y[2]_n = β0 + β1*Y[1]_n + e_n
1. Rewrite this expression as a residual change:
  (Y[2]_n − β1*Y[1]_n) = β0 + e_n
2. Or rewrite this expression as a direct change:
  (Y[2]_n − Y[1]_n) = β0 + β1*Y[1]_n + e_n − Y[1]_n = β0 + (β1 − 1)*Y[1]_n + e_n = β0 + β*Y[1]_n + z_n
3. Or rewrite this expression as a historical change:
  Y[2]_n = β0 + β1*Y[1]_n + e[2]_n, BUT if Y[1]_n = β0 + β1*Y[0]_n + e[1]_n, then
  (Y[2]_n − Y[1]_n) = β1*(Y[1]_n − Y[0]_n) + (e[2]_n − e[1]_n)

Models with Latent Difference Scores
- Assume we can write the alternative difference score for any person n as Y[2]_n = Y[1]_n + Δy_n, where Δy is an implied or latent difference score
- Suppose we consider features of a dual-change prediction system with
    Δy_n = β0 + β1*Y[1]_n + z_n
- This model of Δy makes it easy to plot the latent difference scores
- To obtain the autoregression from the difference score coefficients, note that the intercepts are equal and the autoregressive slope equals the difference-score slope plus one (equivalently, β_diff = β_auto − 1)
- This is a non-trivial resolution of the fundamental question of the alternative models

A Path Diagram for the Prediction of the Latent Difference Score
Y[1] → Y[2] ← Δy (unit paths), with Δy regressed on Y[1] and residual z.
Note: Δy is an unobserved variable whose moments are implied by the formula Y[2] = Y[1] + Δy AND Δy = β0 + β1*Y[1] + z.

Mplus Latent Difference Score (with prediction of change):

TITLE:     ADHD Difference Model;
DATA:      FILE = adhd_uncg_wide.dat; LISTWISE=ON;
VARIABLE:  NAMES = id girl minority sesyr dadhp7 doddp7
                   adhd_to2 adhd_to4 adhd_to5 adhd_to7
                   adhd_in2 adhd_in4 adhd_in5 adhd_in7
                   adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
           MISSING = .;
           USEVAR = adhd_to4 adhd_to7;
ANALYSIS:  TYPE = MEANSTRUCTURE;
MODEL:     adhd_to7 ON adhd_to4@1;
           adhd_to4 adhd_to7@0;
           [adhd_to4 adhd_to7@0];
           delta BY adhd_to7@1;
           delta;
           [delta];
           delta ON adhd_to4;   !Prediction of Change
OUTPUT:    SAMPSTAT;

Mplus Output: Tests of Model Fit
- Chi-square value 0.000 with 0 degrees of freedom (fully saturated model); CFI = 1.000, TLI = 1.000
- Loglikelihood, information criteria with 5 free parameters, and RMSEA, as before

Mplus Output: Model Results
(Two-tailed estimates, SEs, est./SE, and p-values for DELTA BY ADHD_TO7, DELTA ON ADHD_TO4, ADHD_TO7 ON ADHD_TO4, the mean of ADHD_TO4, the intercepts of ADHD_TO7 and DELTA, the variance of ADHD_TO4, and the residual variances of ADHD_TO7 and DELTA; numeric values not recovered from the source.)

Autoregression vs. Change Estimates
1. We previously fit the autoregression over time with MLE of Y[2]_n = β0 + β1*Y[1]_n + e_n
2. We now use the same data and rewrite this expression as a direct change with MLE of (Y[2]_n − Y[1]_n) = β0 + (β1 − 1)*Y[1]_n + z_n
3. The explained variance involves the same residual variance (σ²_e), but it is compared to the variance at time 2 in the autoregressive model and to the variance of the difference in the latent-difference model

Testing Hypotheses with Difference Scores
- Hypothesis 1: no change has occurred
- Hypothesis 2: ADHD at age 4 is not predictive of change in ADHD from age 4 to age 7

No Change (Mplus):
TITLE:     ADHD Difference Model;
DATA:      FILE = adhd_uncg_wide.dat; LISTWISE=ON;
VARIABLE:  NAMES = id girl minority sesyr dadhp7 doddp7
                   adhd_to2 adhd_to4 adhd_to5 adhd_to7
                   adhd_in2 adhd_in4 adhd_in5 adhd_in7
                   adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
           MISSING = .;
           USEVAR = adhd_to4 adhd_to7;
ANALYSIS:  TYPE = MEANSTRUCTURE;
MODEL:     adhd_to7 ON adhd_to4@1;
           adhd_to4 adhd_to7@0;
           [adhd_to4 adhd_to7@0];
           delta BY adhd_to7@1;
           delta;
           [delta@0];   !No Average Change
           delta WITH adhd_to4;
OUTPUT:    SAMPSTAT;

[Mplus output for the no-change model: chi-square test of model fit (value 6.89 as transcribed), CFI/TLI, loglikelihood, 4 free parameters, AIC, BIC, sample-size adjusted BIC (n* = (n + 2) / 24), and RMSEA with 90 percent C.I.; most numeric values did not survive transcription.]

No Prediction of Change

TITLE: ADHD Difference Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority sesyr dadhp7 doddp7
    adhd_to adhd_to4 adhd_to5 adhd_to7
    adhd_in adhd_in4 adhd_in5 adhd_in7
    adhd_hy adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@0;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4@0;        ! no prediction of change
OUTPUT: SAMPSTAT;

[Mplus output for the no-prediction-of-change model: chi-square test of model fit, CFI/TLI, loglikelihood, 4 free parameters, AIC, BIC, sample-size adjusted BIC, and RMSEA with 90 percent C.I.; the numeric values did not survive transcription.]

6. Summary & Discussion

Summary of Two Occasion Models

Given any two-occasion repeated measures data, we can write many alternative structural change models. It is not easy to distinguish these models by goodness-of-fit tests, because some can be exactly identified, rotated, and fit as well as one another. Our interpretations of change from these models are fundamentally restricted by these initial choices, so choose carefully and match the model to the problem at hand. Measurement error can directly lower the determination of individual differences in changes.

Note: The two models discussed here, auto-regression and difference scores, can be distinguished when more than two (t >= 3) repeated measures are available.

Alternative Indices of Change

Assuming A[t] is the index at a particular time t, we can calculate:
1. The Difference score: A[2] - A[1]
2. The Ratio score: A[2] / A[1]
3. The Change Ratio score: {A[2] - A[1]} / A[2]
4. The Change Ratio score: {A[2] - A[1]} / A[1]
5. The Residual Change score: A[2] - {b0 + b1 A[1]}, with b0 and b1 to be determined from the data

References on Change Issues

Re-considering Lord's paradox? Holland, P.W. & Rubin, D.B. (1983). In H. Wainer & S. Messick (Eds.), Principles of modern psychological measurement (pp. 3 5). Hillsdale, NJ: Erlbaum. (Also see Laird, N., 1983, The American Statistician, 37.)
Additional problems due to regression to the mean? (Nesselroade et al., 1980)
The problems of unreliability of difference scores? (Burr & Nesselroade, 1990)
Alternative change measures in pretest-posttest designs? (Bonate, 2000, Analysis of Pretest-Posttest Designs)
Issues in significance testing? Harlow, L.L., Mulaik, S.A., & Steiger, J.H. (1997). What if there were no significance tests? Hillsdale, NJ: Erlbaum.

References on SEM

Baltes, P.B., Dittmann-Kohli, F., & Kliegl, R. (1986). Reserve capacity of the elderly in aging-sensitive tests of fluid intelligence: Replication and extension. Psychology and Aging, 1(2).
Jöreskog, K.G. & Sörbom, D. (1979).
Advances in factor analysis and structural equation models (J. Magidson, Ed.). Cambridge, MA: Abt Books.
Loehlin, J.C. (1998). Latent variable models: An introduction to factor, path, and structural analysis (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
McArdle, J.J. (1996). Current directions in structural factor analysis. Current Directions in Psychological Science, 5(1), 11-18.
McDonald, R.P. (1985). Factor analysis and related methods. Hillsdale, NJ: Lawrence Erlbaum.
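The five alternative indices of change listed in the summary above can be collected in one small helper. This is an illustrative sketch (the function name and numbers are invented; b0 and b1 stand for the regression constants of the residual-change index):

```python
def change_indices(a1, a2, b0, b1):
    """Alternative indices of change for scores a1 = A[1] and a2 = A[2]."""
    return {
        "difference": a2 - a1,                  # A[2] - A[1]
        "ratio": a2 / a1,                       # A[2] / A[1]
        "change_ratio_t2": (a2 - a1) / a2,      # {A[2] - A[1]} / A[2]
        "change_ratio_t1": (a2 - a1) / a1,      # {A[2] - A[1]} / A[1]
        "residual_change": a2 - (b0 + b1 * a1)  # A[2] - {b0 + b1*A[1]}
    }

idx = change_indices(50.0, 60.0, b0=5.0, b1=1.1)
```

In practice b0 and b1 would be estimated by regressing A[2] on A[1] across persons, as the summary notes; here they are fixed constants for illustration.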

Session C: Group Differences over Two Occasions
Kevin Grimm
University of California, Davis
June 9, 2008

Overview
1. Group Differences in Longitudinal Data
2. Traditional ANCOVA in Two Occasions
3. ANOVA Difference Score Regression
4. Multiple Group Structural Equation Modeling
5. Latent Class Mixture Modeling
6. Summary & Discussion

1. Group Differences in Longitudinal Data

Group differences in change: the alternative models of change are often considered in the context of a group difference. Differences between groups can be considered for any feature of the model (e.g., means, regressions, etc.).

If assignment to groups is based on random selection, then the inference about the source of the subsequent changes is clear: the impact is due to the assignment. The SEM approach can add power to the model.

If assignment to groups is pre-existing or based on non-random selection (e.g., self-selection), then the inference about the source of the changes is often ambiguous: the changes may be due to other features related to the initial selection. The SEM approach does not counteract the selection mechanism, but it can help clarify some of the ambiguity.

[Figure: the traditional concept of group differences on the distribution of a variable, showing overlapping distributions for Group A and Group B.]

Group information in the ADHD data: Gender, coded 0 for male (n = 45) and 1 for female (n = 69).

Questions related to group information:
Do males/females tend to display more symptoms of ADHD?
Are there more/fewer between-person differences in ADHD for males/females?
Do males/females change more in their symptoms of ADHD?
Are males/females more variable in their patterns of change in ADHD?

Descriptive Information on ADHD by Gender

  Statistic                   Overall (N=34)   Females (nf=69)   Males (nm=45)
  ADHD Age 4 Mean (SD)        3.35 (9.39)      .7 (9.58)         4.60 (9.03)
  ADHD Age 7 Mean (SD)        .07 (0.06)       0.68 (9.97)       3.68 (9.95)
  Age 4 / Age 7 Correlation   (values not preserved)

(Some leading digits of the counts and means did not survive transcription.)

A Note on Lord's Paradox

Lord, F.M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.

"A large university is interested in investigating the effects on the students of the diet provided in the university dining halls. Various types of data are gathered. In particular, the weight of each student at the time of his arrival in September and his weight in the following June are recorded."

Lord suggested that Analyst A used ANCOVA to remove the pre-existing differences, and Analyst B used repeated measures ANOVA to examine changes.

The paradox: Analyst A obtained significant group differences but Analyst B did not, and then Lord simply ended the paper.
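Lord's paradox can be reproduced in a few lines: simulate two groups that differ at baseline but do not change, and compare the two analysts' answers. A hedged sketch with invented numbers (student weights, not the ADHD data):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
g = np.repeat([0, 1], n)                           # group code (e.g., two dining halls)
true = np.concatenate([rng.normal(60.0, 5.0, n),   # group 0: lighter on average
                       rng.normal(70.0, 5.0, n)])  # group 1: heavier on average
y1 = true + rng.normal(0.0, 3.0, 2 * n)            # September weight
y2 = true + rng.normal(0.0, 3.0, 2 * n)            # June weight: no systematic change

# Analyst B (repeated measures / change scores): mean change per group is near 0
mean_change = [float(np.mean((y2 - y1)[g == k])) for k in (0, 1)]

# Analyst A (ANCOVA): Y[2] on Y[1] and G shows a sizable "group effect",
# because the within-group slope of Y[2] on Y[1] is less than 1
X = np.column_stack([np.ones(2 * n), y1, g])
b0, b1, b2 = np.linalg.lstsq(X, y2, rcond=None)[0]

assert abs(mean_change[0]) < 1.0 and abs(mean_change[1]) < 1.0
assert b1 < 1.0 and b2 > 1.0
```

Neither analyst is wrong; they answer different questions, which is exactly the ambiguity that non-random group assignment creates.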

2. Traditional ANCOVA in SEM

Multiple Regression ANCOVA: a linear model is expressed (for n = 1 to N) as

  Y[2]_n = b0 + b1 Y[1]_n + b2 G_n + e_n

where G is a binary variable. If G is coded in dummy (0, 1) form we can write

  Y[2]_n {G_n = 0} = b0 + b1 Y[1]_n + b2 * 0 + e_n
  Y[2]_n {G_n = 1} = b0 + b1 Y[1]_n + b2 * 1 + e_n

so b0 is the intercept for the group coded 0, b1 is the slope for the group coded 0, and b2 is the change in the intercept for the group coded 1.

[Path diagram: an auto-regression model with group differences, in which Y[1] and G both predict Y[2], G covaries with Y[1], and e is the residual on Y[2].]

Autoregression Input Script (SAS)

TITLE1 'Structural Equation Models of Change';
TITLE2 'Auto-Regression and Difference Score Models of Change with Group Information';
* Type A with Group - Traditional ANCOVA Model with Repeated Measures;
DATA adhd; SET adhd;
  adhd_by_girl = adhd_to4 * girl;
RUN;
PROC REG DATA = adhd;
  ModA1: MODEL adhd_to7 = adhd_to4 / STB;
  ModA2: MODEL adhd_to7 = adhd_to4 girl / STB;
  ModA3: MODEL adhd_to7 = adhd_to4 girl adhd_by_girl / STB;
RUN;

Autoregression Input Script (SPSS)

* Computing the interaction between gender and ADHD at age 4.
COMPUTE adhd_by_girl = adhd_to4 * girl.
EXECUTE.
* Autoregressive model with group information.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT adhd_to7
  /METHOD=ENTER adhd_to4
  /METHOD=ENTER girl adhd_to4
  /METHOD=ENTER girl adhd_to4 adhd_by_girl.

[SAS PROC REG output for the autoregression model: analysis of variance table, root MSE, R-square, and parameter estimates for the intercept and adhd_to4; most numeric values did not survive transcription.]

[SAS output for the autoregression with an intercept difference: analysis of variance table and parameter estimates for the intercept, adhd_to4, and girl; numeric values did not survive transcription.]

Mplus Autoregression & Group Input

VARIABLE: NAMES = id girl minority sesyr dadhp7 doddp7
    adhd_to adhd_to4 adhd_to5 adhd_to7
    adhd_in adhd_in4 adhd_in5 adhd_in7
    adhd_hy adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: adhd_to7 ON adhd_to4;
  adhd_to4 adhd_to7;        ! free variances
  [adhd_to4 adhd_to7];      ! free intercepts
  adhd_to7 ON girl;
  [girl*.5] (M_girl);       ! mean of the dummy code, labeled M_girl
  girl (V_girl);            ! variance of the dummy code, labeled V_girl
  girl WITH adhd_to4;
MODEL CONSTRAINT:
  v_girl = M_girl * (1 - M_girl);   ! Bernoulli variance constraint
OUTPUT: SAMPSTAT TECH1;
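The MODEL CONSTRAINT line encodes the Bernoulli identity: a 0/1 dummy code with mean p has variance p(1 - p). A quick check in code (the group sizes here are illustrative):

```python
import numpy as np

girl = np.array([0] * 45 + [1] * 69)   # 0/1 dummy code; sizes are illustrative
p = girl.mean()                        # the M_girl quantity
# the population variance of a 0/1 variable equals p * (1 - p) exactly
assert np.isclose(girl.var(), p * (1.0 - p))
```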

[Mplus output: sample means, covariances, and correlations among ADHD_TO7, ADHD_TO4, and GIRL; chi-square test of model fit; CFI/TLI; loglikelihood (8 free parameters); AIC, BIC, and sample-size adjusted BIC; RMSEA with 90 percent C.I.; and MODEL RESULTS giving estimates, standard errors, and two-tailed p-values for ADHD_TO7 ON ADHD_TO4 and GIRL, GIRL WITH ADHD_TO4, and the means, intercepts, variances, and residual variances. Most numeric values did not survive transcription.]

[Path diagram: the autoregression model with a group difference, in which Y[1] and G predict Y[2] with residual e.]

[Figure: predicted regression lines from the autoregression model with group and pre-existing initial differences, one line each for males and females.]

Two Occasion ANCOVA with Interaction

The model for multiple groups can be written as

  Y[2]_n = b0 + b1 Y[1]_n + b2 G_n + b3 (Y[1]_n * G_n) + e_n

where G is a binary variable (yes or no) and the product variable is created as (Y[1]_n * G_n). If G is coded in dummy (0, 1) form we can write

  Y[2]_n {G_n = 0} = b0 + b1 Y[1]_n + b2 * 0 + b3 * 0 + e_n
  Y[2]_n {G_n = 1} = b0 + b1 Y[1]_n + b2 * 1 + b3 Y[1]_n + e_n

b0 is the intercept for the group coded 0; b1 is the slope for the group coded 0; b2 is the change in the intercept for the group coded 1; b3 is the change in the slope for the group coded 1.

[Path diagram: the autoregression model with group differences and interaction, in which Y[1], G, and the product G*Y[1] all predict Y[2].]
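The b0 through b3 interpretation can be verified by fitting the dummy-coded interaction model to simulated data and reading off the group-specific lines. A sketch with invented coefficients, not the session's estimates:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
g = rng.integers(0, 2, n)              # binary group code
y1 = rng.normal(12.0, 4.0, n)
# group 0 line: 3 + 0.8*Y[1]; group 1 line: (3 + 2) + (0.8 - 0.3)*Y[1]
y2 = 3.0 + 0.8 * y1 + g * (2.0 - 0.3 * y1) + rng.normal(0.0, 1.0, n)

X = np.column_stack([np.ones(n), y1, g, y1 * g])
b0, b1, b2, b3 = np.linalg.lstsq(X, y2, rcond=None)[0]

# b0, b1: intercept and slope for group 0
# b0 + b2, b1 + b3: intercept and slope for group 1
assert abs(b0 - 3.0) < 1.0 and abs(b1 - 0.8) < 0.2
assert abs(b2 - 2.0) < 1.5 and abs(b3 - (-0.3)) < 0.2
```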

[SAS output for the autoregression with intercept and slope differences: analysis of variance table and parameter estimates for the intercept, adhd_to4, girl, and adhd_by_girl; numeric values did not survive transcription.]

Mplus Autoregression & Interaction

VARIABLE: NAMES = id girl minority sesyr dadhp7 doddp7
    adhd_to adhd_to4 adhd_to5 adhd_to7
    adhd_in adhd_in4 adhd_in5 adhd_in7
    adhd_hy adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl g_by_adhd;
DEFINE: g_by_adhd = girl * adhd_to4;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: adhd_to7 ON adhd_to4;
  adhd_to4 adhd_to7;
  [adhd_to4 adhd_to7];
  adhd_to7 ON girl;
  [girl*.5] (M_girl);
  girl (V_girl);
  girl WITH adhd_to4;
  ! New additions
  adhd_to7 ON g_by_adhd;
  [g_by_adhd*.5];
  g_by_adhd;
  g_by_adhd WITH girl adhd_to4;
MODEL CONSTRAINT:
  v_girl = M_girl * (1 - M_girl);
OUTPUT: SAMPSTAT TECH1;

[Mplus MODEL RESULTS: estimates, standard errors, and two-tailed p-values for ADHD_TO7 ON ADHD_TO4, GIRL, and G_BY_ADHD; the covariances GIRL WITH ADHD_TO4 and G_BY_ADHD WITH GIRL and ADHD_TO4; and the means, intercepts, variances, and residual variances. Numeric values did not survive transcription.]

[Figure: predicted regression lines for males and females from the interaction model.]

3. Repeated Measures ANOVA and Difference Score Regression

Difference Score ANOVA / ANCOVA

A difference model is expressed (for n = 1 to N) as

  Y[2]_n = Y[1]_n + Dy_n  and  Dy_n = m_d + Dy*_n

Now we add G as a binary variable (yes or no):

  Dy_n = g0 + g1 G_n + d_n

g0 is the intercept (average difference) for the group coded 0, and g1 is the change in the intercept for the group coded 1. We can add an interaction here simply by writing

  Dy_n = g0 + g1 G_n + g2 {Y[1]_n * G_n} + d_n

so g2 is the difference in the slope for the group coded 1.

[Path diagram: a latent difference score model with group differences, in which G predicts the latent difference Dy.]

SAS/SPSS Input for the Difference Score Model

PROC REG DATA = diff;
  ModD0a: MODEL delta = / STB;
  ModD1a: MODEL delta = adhd_to4 / STB;
  ModD1b: MODEL delta = girl / STB;
  ModD2: MODEL delta = adhd_to4 girl / STB;
  ModD3: MODEL delta = adhd_to4 girl adhd_by_girl / STB;
RUN;

* Difference model with group information.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT delta
  /METHOD=ENTER adhd_to4
  /METHOD=ENTER girl
  /METHOD=ENTER girl adhd_to4
  /METHOD=ENTER girl adhd_to4 adhd_by_girl.
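With only the group dummy as predictor, the difference-score regression reproduces the group mean changes exactly, which is what makes g0 readable as the average change for the group coded 0 and g1 as the group difference in average change. A sketch on simulated scores, not the ADHD data:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200
g = np.repeat([0, 1], n)
y1 = rng.normal(10.0, 3.0, 2 * n)
dy = 1.5 - 2.0 * g + rng.normal(0.0, 1.0, 2 * n)   # group 1 changes 2 units less
y2 = y1 + dy

X = np.column_stack([np.ones(2 * n), g])
g0, g1 = np.linalg.lstsq(X, y2 - y1, rcond=None)[0]

# OLS on a dummy code recovers the group means of the difference exactly
assert np.isclose(g0, (y2 - y1)[g == 0].mean())
assert np.isclose(g0 + g1, (y2 - y1)[g == 1].mean())
```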

[Figure: side-by-side distributions of the difference scores for boys (coded 0) and girls (coded 1).]

[SAS output for the group difference score model (dependent variable delta): analysis of variance table and parameter estimates for the intercept, adhd_to4, and girl; numeric values did not survive transcription.]

Mplus Latent Difference Score with Group Information

VARIABLE: NAMES = id girl minority sesyr dadhp7 doddp7
    adhd_to adhd_to4 adhd_to5 adhd_to7
    adhd_in adhd_in4 adhd_in5 adhd_in7
    adhd_hy adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL: adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@0;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4 girl;   ! group prediction of change
OUTPUT: SAMPSTAT;

[Mplus MODEL RESULTS: estimates, standard errors, and two-tailed p-values for DELTA BY ADHD_TO7; DELTA ON ADHD_TO4 and GIRL; ADHD_TO7 ON ADHD_TO4; GIRL WITH ADHD_TO4; and the means, intercepts, variances, and residual variances. Numeric values did not survive transcription.]

Results of the Latent Difference Score Model with Group Differences

[Path diagram of the fitted model with parameter estimates; the numeric values (fragments such as -.36, .96, 3.35, -.58, 4.37 remain) were only partially preserved in transcription.]

[Figure: predicted change in ADHD, plotted separately for males and females.]

Latent Difference Score Results by Group

The latent difference model implies

  m[2] - m[1] = b0 + b1 m[1] + b2 G

Since G is a binary variable, for the two groups:

  m[2] - m[1] {G = 0} = b0 + b1 m[1]
  m[2] - m[1] {G = 1} = b0 + b1 m[1] + b2

Equivalently, the model implies

  m[2] = m[1] + b0 + b1 m[1] + b2 G

so that

  m[2] {G = 0} = m[1] + b0 + b1 m[1]
  m[2] {G = 1} = m[1] + b0 + b1 m[1] + b2

(The slide's numeric predictions by group were only partially preserved in transcription.)

Difference Score, Group & Interaction

[SAS output (dependent variable delta): analysis of variance table and parameter estimates for the intercept, adhd_to4, girl, and adhd_by_girl; numeric values did not survive transcription.]


More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Specification of Rasch-based Measures in Structural Equation Modelling (SEM) Thomas Salzberger www.matildabayclub.net

Specification of Rasch-based Measures in Structural Equation Modelling (SEM) Thomas Salzberger www.matildabayclub.net Specification of Rasch-based Measures in Structural Equation Modelling (SEM) Thomas Salzberger www.matildabayclub.net This document deals with the specification of a latent variable - in the framework

More information

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s Linear Regression Models for Panel Data Using SAS, Stata, LIMDEP, and SPSS * Hun Myoung Park,

More information

Goodness of fit assessment of item response theory models

Goodness of fit assessment of item response theory models Goodness of fit assessment of item response theory models Alberto Maydeu Olivares University of Barcelona Madrid November 1, 014 Outline Introduction Overall goodness of fit testing Two examples Assessing

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Latent Class Regression Part II

Latent Class Regression Part II This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Structural Equation Modelling (SEM)

Structural Equation Modelling (SEM) (SEM) Aims and Objectives By the end of this seminar you should: Have a working knowledge of the principles behind causality. Understand the basic steps to building a Model of the phenomenon of interest.

More information

Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling

Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling William Revelle Department of Psychology Northwestern University Evanston, Illinois USA June, 2014 1 / 20

More information

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

More information

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical

More information

Random effects and nested models with SAS

Random effects and nested models with SAS Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form. One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.

More information

Basics of SEM. 09SEM3a 1

Basics of SEM. 09SEM3a 1 Basics of SEM What is SEM? SEM vs. other approaches Definitions Implied and observed correlations Identification Latent vs. observed variables Exogenous vs. endogenous variables Multiple regression as

More information

Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 and Principal Components Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings

More information

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012 Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 202 ROB CRIBBIE QUANTITATIVE METHODS PROGRAM DEPARTMENT OF PSYCHOLOGY COORDINATOR - STATISTICAL CONSULTING SERVICE COURSE MATERIALS

More information

The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

More information

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

ADVANCED FORECASTING MODELS USING SAS SOFTWARE

ADVANCED FORECASTING MODELS USING SAS SOFTWARE ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 gjha_eco@iari.res.in 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting

More information

Multivariate Analysis. Overview

Multivariate Analysis. Overview Multivariate Analysis Overview Introduction Multivariate thinking Body of thought processes that illuminate the interrelatedness between and within sets of variables. The essence of multivariate thinking

More information

5. Multiple regression

5. Multiple regression 5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

More information

T-test & factor analysis

T-test & factor analysis Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

ANOVA. February 12, 2015

ANOVA. February 12, 2015 ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

More information

Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure

Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure Technical report Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure Table of contents Introduction................................................................ 1 Data preparation

More information

CHAPTER 5: CONSUMERS ATTITUDE TOWARDS ONLINE MARKETING OF INDIAN RAILWAYS

CHAPTER 5: CONSUMERS ATTITUDE TOWARDS ONLINE MARKETING OF INDIAN RAILWAYS CHAPTER 5: CONSUMERS ATTITUDE TOWARDS ONLINE MARKETING OF INDIAN RAILWAYS 5.1 Introduction This chapter presents the findings of research objectives dealing, with consumers attitude towards online marketing

More information

6 Variables: PD MF MA K IAH SBS

6 Variables: PD MF MA K IAH SBS options pageno=min nodate formdlim='-'; title 'Canonical Correlation, Journal of Interpersonal Violence, 10: 354-366.'; data SunitaPatel; infile 'C:\Users\Vati\Documents\StatData\Sunita.dat'; input Group

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

An Empirical Study on the Effects of Software Characteristics on Corporate Performance

An Empirical Study on the Effects of Software Characteristics on Corporate Performance , pp.61-66 http://dx.doi.org/10.14257/astl.2014.48.12 An Empirical Study on the Effects of Software Characteristics on Corporate Moon-Jong Choi 1, Won-Seok Kang 1 and Geun-A Kim 2 1 DGIST, 333 Techno Jungang

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.

More information

Chapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data.

Chapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data. Chapter 15 Mixed Models A flexible approach to correlated data. 15.1 Overview Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects, e.g., students within classrooms,

More information

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

More information

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

More information

10. Analysis of Longitudinal Studies Repeat-measures analysis

10. Analysis of Longitudinal Studies Repeat-measures analysis Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

More information

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

More information

Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

More information

Applied Longitudinal Data Analysis: An Introductory Course

Applied Longitudinal Data Analysis: An Introductory Course Applied Longitudinal Data Analysis: An Introductory Course Emilio Ferrer UC Davis The Risk and Prevention in Education Sciences (RPES) Curry School of Education - UVA August 2005 Acknowledgments Materials

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Mplus Tutorial August 2012

Mplus Tutorial August 2012 August 2012 Mplus for Windows: An Introduction Section 1: Introduction... 3 1.1. About this Document... 3 1.2. Introduction to EFA, CFA, SEM and Mplus... 3 1.3. Accessing Mplus... 3 Section 2: Latent Variable

More information

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group Introduction to Multilevel Modeling Using HLM 6 By ATS Statistical Consulting Group Multilevel data structure Students nested within schools Children nested within families Respondents nested within interviewers

More information