Latent Variable Modeling of Differences and Changes with Longitudinal Data


Annu. Rev. Psychol. First published online as a Review in Advance on September 25, 2008. The Annual Review of Psychology is online at psych.annualreviews.org. This article's doi: /annurev.psych. Copyright © 2009 by Annual Reviews. All rights reserved. /09/ $20.00

John J. McArdle
Department of Psychology, University of Southern California, Los Angeles, California; [email protected]

Key Words: linear structural equations, repeated measures

Abstract

This review considers a common question in data analysis: What is the most useful way to analyze longitudinal repeated measures data? We discuss some contemporary forms of structural equation models (SEMs) based on the inclusion of latent variables. The specific goals of this review are to clarify basic SEM definitions, consider relations to classical models, focus on testable features of the new models, and provide recent references to more complete presentations. A broader goal is to illustrate why so many researchers are enthusiastic about the SEM approach to data analysis. We first outline some classic problems in longitudinal data analysis, consider definitions of differences and changes, and raise issues about measurement errors. We then present several classic SEMs based on the inclusion of invariant common factors and explain why these are so important. This leads to newer SEMs based on latent change scores, and we explain why these are useful.

Contents

INTRODUCTION: LONGITUDINAL DATA AND THE STRUCTURAL EQUATION MODELING APPROACH
  Separating Differences from Changes
  The Structural Equation Modeling Approach
STRUCTURAL EQUATION MODELS FOR REPRESENTING CHANGES
  Auto-Regression Models
  Change Score Models
  Change-Regression Models
STRUCTURAL EQUATION MODELS FOR ADDING GROUP DIFFERENCES
  Group Information as Contrast Codes
  Multiple-Group Latent-Difference Models
  Multiple Group Structural Equation Model Estimation with Incomplete Data
STRUCTURAL EQUATION MODELS FOR INCLUDING LATENT COMMON FACTORS
  Regression with Common Factors
  Latent Changes in Common-Factor Scores
  Questions of Factorial Invariance Over Time
STRUCTURAL EQUATION MODELS USING TIME-SERIES CONCEPTS
  Crossed-Lagged Regression of Factors
  Extending Time-Series Factor Models to Multiple Occasions
  Extending Cross-Lagged Factor Regression to Multiple Occasions
STRUCTURAL EQUATION MODELS USING LATENT-CURVE CONCEPTS
  Latent Growth-Curve Models
  Fitting Latent-Curve Hypotheses
  Considering Multiple Latent Curves
STRUCTURAL EQUATION MODELS USING LATENT-CHANGE CONCEPTS
  Mixing Models for Means and Covariances
  Latent Change Score Models
  Multiple Latent Change Score Models
ADDITIONAL RELEVANT RESEARCH ISSUES
  New Dynamic Structural Equation Model Applications
  Other Promising Directions
  Final Comments

INTRODUCTION: LONGITUDINAL DATA AND THE STRUCTURAL EQUATION MODELING APPROACH

This review describes contemporary methods for longitudinal data analysis. So let us start by considering situations in which individuals (N) from different groups (G) have been measured over several discrete periods of time (T < N) on a repeated set of measurements (Y_m). In an experimental design, this could represent a classical layout wherein we randomize individuals to different conditions and measure everyone at multiple time points on multiple variables (Bock 1975). In an observational study, or longitudinal panel design, we could measure

different demographic groups on multiple occasions over many years (Hsiao 2003). Of course, a variety of classical techniques are used for analyzing such data, including repeated measures analysis of variance, time series analysis, and growth curve analysis (Nesselroade & Baltes 1979). There are also many newer techniques that serve similar purposes (Hedeker & Gibbons 2006, Muller & Stewart 2006, Singer & Willett 2003, Verbeke & Molenberghs 2000, Walls & Schafer 2006).

A brief glance at the past few decades of Annual Review volumes shows the importance placed on statistical models for data analyses, and a few previous reviews have already focused on SEM principles and techniques (Bentler 1980, Bentler & Dudgeon 1996, Bollen 2002, MacCallum & Austin 2000, Tomarken & Waller 2005). These reviews discuss technical aspects of SEM, the nature of latent variables, SEM hypothesis formation and statistical testing, and even the misuse of SEM. Other Annual Review articles provide informative discussions about the key issues in repeated measures analysis (e.g., Collins 2006, Cudeck & Harring 2007, MacKinnon et al. 2007, Maxwell et al. 2008, Raudenbush 2001). The many good general references to SEM (e.g., Kline 2005, McDonald 1985) include several new books specifically about SEM for repeated measures (e.g., Bollen & Curran 2006, Duncan et al. 2006).

In contemporary work, it has become popular to focus on the trajectory over time as the key feature of a repeated measures analysis (e.g., MacCallum et al. 1997; McArdle 1986, 1989; Raudenbush 2001). The trajectory approach has gained popularity, and for the most part, it nicely matches the scientific goals of longitudinal research (Nesselroade & Baltes 1979). Adding something new and useful to this impressive collection is not so easy. In this review, we survey a variety of different ideas about data analysis, but we try to bring these together by our explicit focus on what we term the latent change score model.
It is expected that readers well versed in the techniques of regression analysis and factor analysis will find this review to be elementary reading and probably to be missing many technical details. But this is intentional. To reach a wider audience, we do not present the typical algebraic expressions or computer program SEM scripts, and all the new models are explained using simple plots or path diagrams. Because we survey models from a wide variety of scientific disciplines, where alternative terms are often used for the same mathematical and statistical concepts, we do not include all aspects of these models. Our hope is that our extensive use of path diagrams will help the reader see the common features of these theoretically appealing and practically useful ideas. And we hope this approach will clarify our main recommendation: When thinking about any repeated measures analysis it is best to ask first, what is your model for change?

Separating Differences from Changes

The many semantic conventions and colloquialisms used by psychological researchers do not always precisely match their formal definitions. A first issue here is the distinction between inferences about (a) differences between people and (b) changes within people. We use the plots of fictional scores over time in Figure 1 to clarify this key distinction. In Figure 1a, we have plotted six pairs of X-Y data points to show hypothetical data obtained from six different individuals. To provide a substantive context, we label the Y-axis Alert and we label the X-axis Time of Day. We add a regression line where X → Y (with intercept β_0 and slope β_1). Because this regression line has a negative slope, our substantive interpretation might be that alertness decreases as the day goes on. In more technical terms, the difference between people on predictor X leads to an expected difference in outcome Y.
To avoid causal language, we can express the slope (β_1) as the expected change in Y for a one-unit change in X. Notice we use the word change to describe the difference between persons. Next consider the different data layout of Figure 1b. Here we view the same six pieces

of information as coming from just two people measured at three time points on the same measure (i.e., Y[1], Y[2], and Y[3]). To indicate the same person over time, we can connect the dots with lines, and now see both lines go down as the time-of-day increases. A simple subtraction of any two scores for any person is termed a difference score (e.g., D = Y[2] − Y[1]), and these are changes within a person over occasions. Now we use the word difference to define a change.

Figure 1. Alternative plots of cross-sectional and longitudinal data. (a) Cross-sectional measurements, (b) longitudinal measurements, (c) one longitudinal alternative, and (d) another longitudinal alternative.

The six data points in Figures 1a and 1b are in exactly the same positions, and there is no alteration of the inference about X producing Y. Of course, in Figure 1b we measure fewer people. To anticipate a reasonable statistical question, we can state (without proof) that, given the same number of data points, there is typically a gain in statistical precision by adding more occasions of measurement with the same people; i.e., repeated measures lead to increased power (Bock 1975, Hertzog et al. 2006, Muthén & Curran 1997).

In Figure 1c and 1d, the same six data points are plotted, but we connect these data across two people in a different way. In Figure 1c, it appears as if the two people shift in their position from Time 1 to Time 2: one line goes down while the other line goes up. In Figure 1d, we connect other points, and now both lines appear to be fluctuating between ups and downs. By using the same data points, we can see that many different longitudinal patterns are possible underneath any set of cross-sectional scores. This highlights the key purpose of most longitudinal repeated measures data: to detect differences in the patterns of individual changes.
In theory, we can calculate the change scores and write another regression equation in which change is the dependent variable predicted by some difference between the people. This is an elementary description of the well-known mixed model, which attempts to identify the between-person differences in within-person changes (Nesselroade & Baltes 1979). It is worth noting that seminal statements made by some of the most important leaders of our field strongly advocated the need to avoid change scores (e.g., Cronbach & Furby 1970, Lord 1958). These statements focused primarily on the very real problems of measurement error in the change scores. In contrast, other researchers who investigated these statistical issues emphasized the benefits of using change scores (i.e., Allison 1990, Nesselroade & Cable 1974, Rogosa 1979, Rogosa & Willett 1983). It is not surprising that the appropriate use of change scores remains a conundrum for many researchers.

The Structural Equation Modeling Approach

It is well known that SEMs are used to express a theoretical model in terms of linear and

nonlinear expressions with observed and unobserved variables (Goldberger & Duncan 1973). SEM expressions lead to predictions or expectations for the means, standard deviations, and correlations, and these can then be compared to observed statistics. A series of alternative models, often based on radically different ideas, can be organized in this way and then compared with one another using a variety of goodness-of-fit tests. In the early years of SEM, only a few reliable computer programs were able to carry out these calculations (e.g., ACOVSM, Jöreskog et al. 1971; LISREL, Jöreskog & Sörbom 1979; COSAN, McDonald 1985). Today, many SEM programs exist, ranging from the most flexible (Mplus; Muthén & Muthén 2002), to the most graphic (AMOS; Arbuckle & Wotke 2004), to the least expensive (Mx; Neale et al. 1999). SEM programs are often hard to choose among, and alternative computer scripts can be very helpful (e.g., Ferrer et al. 2004). All SEM programs can carry out the calculations for the models described here, so computer program differences are not highlighted.

Important lessons have been learned from statistical research on the classical tests of mean differences using repeated measures analysis of variance (ANOVA). Research in the early 1970s showed that the most popular ANOVA tests were based on an assumption of an equal variance and an equal correlation over time, a pattern termed compound symmetry (e.g., McCall & Applebaum 1973). Unfortunately, these assumptions seemed highly improbable with real longitudinal data. As it turned out, features of the tests of the mean differences over time were influenced by the adequacy of the covariance structure assumptions. In standard testing of the mean differences, (a) the Type I error rate (e.g., α = 0.05) is inflated if the simple covariance assumptions are not met, but (b) the Type II error (e.g., 1 − power) is inflated if no structure is placed on the covariances.
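The compound symmetry pattern can be written out explicitly. As a sketch for T = 3 occasions, with a single common variance σ² and a single common correlation ρ (both symbols are introduced here for illustration and are not taken from the original), the assumed covariance matrix of the repeated measures is:

```latex
\Sigma =
\sigma^2
\begin{pmatrix}
1    & \rho & \rho \\
\rho & 1    & \rho \\
\rho & \rho & 1
\end{pmatrix}
```

Only two parameters describe all six distinct variances and covariances, which suggests why the pattern is so restrictive for data in which occasions closer in time tend to correlate more highly.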
The suggested correction at the time was to alter the degrees-of-freedom by a coefficient (termed ε) computed from the covariance matrix or to use the unstructured but less-powerful multivariate approach (MANOVA; McCall & Applebaum 1973, O'Brien & Kaiser 1985). At the same time, other researchers were promoting a different kind of data analysis approach, eventually termed SEM (e.g., Jöreskog et al. 1971), wherein the mean and the covariance hypotheses could be considered jointly by what are now termed shared parameters (McArdle et al. 2005).

The approach presented here emphasizes the need for explicit structural hypotheses about means and covariances and for the direct inclusion of latent change scores to express specific developmental hypotheses about individuals and groups (e.g., McArdle & Nesselroade 1994, Nesselroade & Baltes 1979). SEM techniques are used to translate these specific hypotheses into structural expectations for the means and covariances over time so these expectations can be compared with a real set of longitudinal data. SEM path diagrams are used as a shorthand to convey aspects of the required matrix algebra. These diagrams highlight the key parameters we can test as well as the assumptions we cannot test. The path diagrams used here are intentionally more elaborate than are other SEM representations because these diagrams express every algebraic relationship among the scores.

STRUCTURAL EQUATION MODELS FOR REPRESENTING CHANGES

We first consider some of the key longitudinal questions about change using popular models for two occasions of data. The basic techniques described here are used in most of the other SEM examples to follow.

Auto-Regression Models

The first kind of model to be considered here is the familiar auto-regression model, drawn in Figure 2a. We label the two repeated scores as Y[1] and Y[2], and we presume Y[1] precedes Y[2] in time.
This order of events suggests we add a regression (β) where Y[1] is used to predict Y[2] at the later time (i.e., Y[2] is

regressed on Y[1]). As in any regression calculation, we further assume that the unobserved residual term e is uncorrelated with the initial score Y[1].

Figure 2. Alternative structural equation models for two-occasion data. (a) Traditional regression, (b) latent change score, and (c) change regression.

In the traditional path diagram, there are no intercepts, and variables are often assumed to be standardized. But in this diagram, we explicitly draw all model parameters: (a) observed variables Y[1] and Y[2] as squares, (b) the unobserved variable e as a circle, (c) the implied constant of 1 as a triangle, (d) one-headed arrows to represent fixed or group effects (μ_1, β_0, β_1), and (e) two-headed arrows to represent random or individual effects (σ_1², ψ_e², 1). Although these parameters complicate this simple figure, they prove very useful in subsequent model comparisons.

We can now use any SEM program to estimate values for this model. We start by calculating the means and covariances (or average cross-products) formed from observed raw scores. Numerical estimates for all unknown parameters (i.e., Greek letters) are obtained as well as a single index of goodness-of-fit of the model to the data (i.e., the likelihood L). To create a formal test of this model, we need to compare it to an alternative model,

usually with different parameters. The SEM approach proves remarkably flexible here because parameters can be (a) free to vary, (b) fixed at any known value, or (c) set equal to any other parameter. For example, one typical alternative to model 2a is another in which there is no stability over time, and this can be formed by restricting the regression slope to be fixed at zero (β_1 = 0). Under standard regularity conditions (e.g., normality of the residuals), the difference in fit between the two models (L_d = L_a − L_b) is distributed as a chi-square (χ²) variate with one degree-of-freedom (df_d = df_a − df_b). This SEM approach is not novel and it yields the same results we could obtain using any standard linear regression program.

Change Score Models

It is simple to subtract the two scores for each person using the observed data (D = Y[2] − Y[1]), with the results known as gain scores or difference scores. But, as basic as this seems, calculating differences from the raw data is not the most promising route to take here. Instead, the model of Figure 2b is a change score model for the same initial observations. We start with the same data (Y[1] and Y[2]), but we add an unobserved variable labeled Δ. To this we add a set of fixed values (= 1) on the specific arrows so we can mimic the result of a subtraction (Y[2] = 1·Y[1] + 1·Δ). This change score (Δ) is now explicitly defined as the part of the score of Y[2] that is not identical to Y[1]. This change score is not directly measured, so it can be considered as our first latent change score (McArdle & Nesselroade 1994; cf. Bollen 2002).

We can now use SEM software to estimate and test questions about changes directly from the original two-occasion data. The traditional statistical features of the change score are all included as model parameters: the mean of the changes (μ_Δ), the variance of the changes (σ_Δ²), and the covariance of the initial scores with the changes (σ_1Δ).
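Because the change score model fixes both coefficients at one, its expectations follow from elementary rules for sums. A sketch of the implied means and covariances, using the parameter symbols of Figure 2b (this algebra is added for illustration and is not part of the original presentation):

```latex
\begin{aligned}
Y[2] &= Y[1] + \Delta \\
E(Y[1]) &= \mu_1, \qquad E(Y[2]) = \mu_1 + \mu_\Delta \\
\mathrm{Var}(Y[1]) &= \sigma_1^2 \\
\mathrm{Var}(Y[2]) &= \sigma_1^2 + \sigma_\Delta^2 + 2\,\sigma_{1\Delta} \\
\mathrm{Cov}(Y[1], Y[2]) &= \sigma_1^2 + \sigma_{1\Delta}
\end{aligned}
```

The two observed means, two variances, and one covariance are matched one-for-one by the five parameters, so restrictions such as a zero mean change are what create testable expectations.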
For example, we can now test the hypothesis of no mean differences over time by forcing the mean of the difference to be zero (μ_Δ = 0). This model leads to expectations of equal means over time, and the difference in fit is indexed by a chi-square test (χ²). We can use the same model to test hypotheses about individual differences in change (σ_Δ² = 0, σ_1Δ = 0). We do not need to calculate the change scores directly to examine their statistical properties when we use the model of Figure 2b. Instead, we define a latent change score by using fixed unit values; this simple SEM technique also proves valuable in the more-complex models presented below.

The auto-regression model of Figure 2a is fit to observed data that are identical to those of the change score model of Figure 2b, and both models have the same number of parameters and achieve the same fit; i.e., these models are not testable alternatives of one another. Instead, the fundamental difference between auto-regression and change score models is in the way we represent and test hypotheses about the within-person changes. The change statistics are nearly impossible to describe in Figure 2a, but these are explicit parameters of Figure 2b.

Change-Regression Models

Further consideration about the two models discussed above leads to another question in change research: Should we remove the part of the individual change that is related to the initial level? In Figure 2c, we draw a slightly revised version of Figure 2b, in which we transform the covariance (σ_1Δ) into a regression coefficient (δ_1) to estimate a model with a base-free measure of change. This is also a transformation of parameters in Figure 2a (i.e., with δ_0 = β_0, δ_1 = β_1 − 1) and is useful to formalize this simple interpretation.
In addition, the transformation provides one way to deal with the classic problem known as Lord's Paradox (Lord 1967): the difference in results obtained from the regression in Figure 2a and the change model in Figure 2b is avoided if we use the change-regression model in Figure 2c (with group information).
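The relations among the three two-occasion models can be checked numerically. The following is a minimal sketch using simulated scores (the data, the seed, and the helper names `mean` and `ols` are hypothetical, not from the original). It uses ordinary least squares on observed change scores rather than an SEM program, and it assumes a normal model with otherwise unrestricted variances and covariance, under which the likelihood-ratio chi-square for no mean change reduces to a simple closed form.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical two-occasion scores (simulated; not data from the review).
rng = random.Random(2009)
n = 200
y1 = [rng.gauss(50.0, 10.0) for _ in range(n)]
y2 = [5.0 + 0.8 * a + rng.gauss(0.0, 6.0) for a in y1]
d = [b - a for a, b in zip(y1, y2)]  # observed change scores D = Y[2] - Y[1]

beta0, beta1 = ols(y1, y2)   # auto-regression: Y[2] regressed on Y[1]
delta0, delta1 = ols(y1, d)  # change regression: D regressed on Y[1]

# Likelihood-ratio chi-square (1 df) for the hypothesis of no mean change.
d_bar = mean(d)
var_ml = sum((di - d_bar) ** 2 for di in d) / n  # ML variance (divide by n)
chi2 = n * math.log(1.0 + d_bar ** 2 / var_ml)
p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df

print(f"beta0 = {beta0:.3f}, delta0 = {delta0:.3f}")
print(f"beta1 = {beta1:.3f}, delta1 = {delta1:.3f}")
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.4g}")
```

The two regressions are fit to the same information, so the change-regression slope is exactly one less than the auto-regression slope and the intercepts coincide; only the interpretation differs.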

Estimating this change score regression (δ_1) is mainly useful when the changes have not taken place by the time of the initial occasion. This is assured, for example, in an experiment wherein a manipulation occurs between Time 1 and Time 2. In contrast, in observational research, the two occasions may be arbitrary selections from an ongoing process unfolding over time, possibly in different ways for different persons. In this case, the changes may already be apparent at the time of the initial data collection, so this change regression is only an arbitrary transformation. This is a case where SEM regressions yield parameter estimates that may be very difficult to interpret.

STRUCTURAL EQUATION MODELS FOR ADDING GROUP DIFFERENCES

Group differences are important aspects of both experimental and observational longitudinal studies. As is well known, the use of random assignment to groups provides a direct basis for causal inference. In observational studies, a frequent goal is to separate groups that are not following the same process (i.e., heterogeneity). Group information may be considered in many ways, and the techniques described in this section are relevant to all SEMs in the discussion that follows.

Group Information as Contrast Codes

Let us assume that an important difference exists between groups of people (e.g., due to a manipulation, based on gender, linked to high test scores), and we want to examine how these differences impact some outcome. One popular model for this purpose is drawn as Figure 3a. This is a change score path model that also includes the group information as a measured variable (G) using dummy codes (G = 0 or 1) or effect codes (G = −1/2 or +1/2). This typical use of group differences as a coded variable allows standard regression parameters to estimate mean differences between groups.
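The way a coded group variable turns regression parameters into mean differences can be seen in a small example. This is a minimal sketch with made-up change scores for two groups of four people (the values and the names `a0`, `a1`, `e0`, `e1` are illustrative assumptions); it uses ordinary least squares in place of an SEM program.

```python
def mean(xs):
    return sum(xs) / len(xs)


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical change scores for two groups (illustrative values only).
group = [0, 0, 0, 0, 1, 1, 1, 1]  # dummy code G = 0 or 1
change = [1.0, 2.0, 0.0, 1.0, 4.0, 3.0, 5.0, 4.0]

mean_g0 = mean([c for g, c in zip(group, change) if g == 0])
mean_g1 = mean([c for g, c in zip(group, change) if g == 1])

# Dummy codes: intercept is the mean of the group coded 0,
# slope is the difference between the two group means.
a0, a1 = ols(group, change)

# Effect codes (-1/2, +1/2): the slope is unchanged, but the intercept
# becomes the average of the two group means.
effect = [g - 0.5 for g in group]
e0, e1 = ols(effect, change)

print(a0, a1)  # intercept and slope with dummy codes
print(e0, e1)  # intercept and slope with effect codes
```

The choice of code changes what the intercept means but not the estimated group difference, which is one reason the two codings lead to the same test of that difference.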
With dummy codes, these path coefficients represent a 2-by-2 ANOVA with four parameters: (a) an initial mean (β_0 = 1 → Y[1]), (b) a between-group effect (β_1 = G → Y[1]), (c) a within-group effect (α_0 = 1 → Δ), and (d) a within-by-between effect (α_1 = G → Δ). Other aspects of the ANOVA include the variances and covariance of the residuals of the level and change score (ψ_e², ψ_z², ψ_ez). As in any regression formulation of ANOVA among K independent groups, we need K − 1 contrasts to fully represent the mean differences.

A common variation of this model is the use of adjusted change parameters in the analysis of covariance (ANCOVA). This is drawn in the same way as Figure 3a, but with the addition of a continuous variable X as an observed predictor of both the initial level and changes. In ANCOVA, the model parameters are conditional on the expected values of the measured X variable. Many researchers use the term controlled for this form of statistical adjustment, and this is reasonable in some cases. We must recognize that the statistics are under our control but the individuals are not. The ANCOVA interaction term, representing potential differences in the slopes of the covariate between groups, can be introduced in path models using product terms as measured variables (P = G × X). In this way, any ANCOVA can be carried out as an SEM.

Multiple-Group Latent-Difference Models

This previous use of group coding is limiting in a number of ways. The focus of this kind of analysis is on differences in the mean changes over groups, and other forms of group differences in change processes are not typically considered. Different groups of people may have different means, but they may also have different amounts of variability in their changes (σ_Δ²). The SEM approach expands our options for considering aspects of group differences.
Figure 3b represents a latent change score model in which we have assumed there are two independent groups of individuals, perhaps differentiated by an experimental

treatment (e.g., treatment versus controls), a demographic difference (e.g., males versus females), or an observational difference (e.g., high versus low math scores). This organization of the data into groups allows for tests of group differences by using a multiple-group SEM.

Figure 3. Alternative two-occasion structural equation models with group differences. (a) Adding group codes, (b) multiple group model, and (c) incomplete data groups.

In this example, the group means are represented as regressions from the constant within each group (1 → Y[1] = μ_1^(a), μ_1^(b)). A test of the equality of these coefficients is termed invariance over groups (μ_1^(a) = μ_1^(b)), and this can be carried out using the SEM programs. These invariance constraints will result in a misfit (χ²) of the same magnitude as tests of no mean differences between coded groups (β_1 = 0 in Figure 3a). Next, the mean of the changes (μ_Δ^(a), μ_Δ^(b)) can be tested for invariance over groups (μ_Δ^(a) = μ_Δ^(b)) as a test of group-by-time interaction. This multiple-group SEM allows testing the invariance of any model parameter. In this case, we might want to add a test of the equality-of-change variation over groups (σ_Δ^2(a) = σ_Δ^2(b))

to see if there are group differences in the amount of changes (using the χ²). A more complex expression can be formed to test the equality of the coefficient of variation or effect sizes (μ_Δ^(a)/σ_Δ^(a) = μ_Δ^(b)/σ_Δ^(b)). Following a similar logic, we can easily represent and test interactions, even interactions including latent variables, without creating product variables. In the typical MANOVA analyses, we require complete homogeneity of covariance over groups (e.g., Bock 1975, O'Brien & Kaiser 1985), but some SEM alternatives, with less-extreme forms of invariance, may be more realistic and useful.

Multiple Group Structural Equation Model Estimation with Incomplete Data

In practical situations, we often have repeated measures data wherein some individuals are not measured at all occasions. In some designed experiments, we may plan not to measure some of the subjects to estimate the impact of measurement (e.g., incomplete blocks design, Solomon 4-group design). But in most observational studies, some participants drop out after the first occasion, usually for a variety of different reasons. As pointed out above, it is difficult to estimate changes when only one measurement occasion is available. So unless there is a compelling reason to do otherwise, persons who drop out of the study are typically dropped from all subsequent data analysis. This use of complete cases often seems to be the only possible analysis, and we generally view this as a conservative approach that avoids overstating our results.

Recent statistical research has focused on this incomplete data problem and has demonstrated how the previous statements about complete-case analysis are not typically true (Enders 2001, Little & Rubin 2002). In fact, well-intentioned complete-case analyses are likely to yield unintentionally biased results.
A typical indicator of attrition bias due to dropouts is expressed as the mean difference at Time 1 between (a) those who participate at both occasions and (b) those who are not available at the second occasion. When we find mean differences between these groups we have selection bias, and the question becomes, what inferences are now possible?

One of the more popular features of SEM is the ability to deal directly with common problems of incomplete data. Following the well-developed lead of many statisticians (Hsiao 2003, Jöreskog & Sörbom 1979, Little & Rubin 2002), any change model can be written in terms of a sum of misfits (L_g) for multiple groups, where groups are defined as persons with the identical pattern of complete data. In Figure 3c, we present a change-score model for two occasions for one group with complete data (Y[1] and Y[2]) and a second change-score model for the group of individuals who are missing data at Time 2. The difference between the groups is only that Y[2] is observed in group A (drawn as a square) but is unobserved in group B (drawn as a circle; Horn & McArdle 1980, McArdle & Bell 2000). Many alternative estimation techniques are available to deal with these problems (e.g., multiple imputation; Little & Rubin 2002), but the SEM approach is relatively easy to understand (McArdle & Bell 2000, Enders 2001).

If we want to make an inference about all people as if they were from the same population of interest, we must assume invariance of all parameters over all groups (e.g., μ_1^(a) = μ_1^(b), μ_Δ^(a) = μ_Δ^(b), σ_Δ^2(a) = σ_Δ^2(b)). If these invariance assumptions yield a reasonable fit, we may conclude the incomplete data are missing completely at random (MCAR). However, if the multiple-group invariance constraints do not fit well (i.e., a significant χ²), we may conclude that they are missing at random (MAR).
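The sum of misfits over data-pattern groups can be sketched in a general form. Assuming multivariate normality, and treating the symbols below as illustrative notation (K pattern groups, with μ_g and Σ_g denoting the model-implied means and covariances for only those variables observed in group g), one common way to write the overall misfit as minus twice the log-likelihood is:

```latex
L = \sum_{g=1}^{K} L_g, \qquad
L_g = \sum_{i \in g} \Big[ \log \lvert 2\pi \Sigma_g \rvert
      + (y_i - \mu_g)^{\top} \Sigma_g^{-1} (y_i - \mu_g) \Big]
```

In the two-group layout of Figure 3c, the complete-data group would contribute through both Y[1] and Y[2], whereas the group missing Time 2 would contribute only through the mean and variance of Y[1]; the invariance constraints are what tie the two sets of parameters together.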
In either event, requiring invariance of all parameters provides the best estimate of the population parameters of the latent differences as if everyone had continued to participate. Thus, we accept any loss of fit associated with this form of invariance, and we compare alternative models with this misfit as our new baseline. In general, this multiple-group SEM approach uses all available data on any measured

variable, so it is a reasonable starting point for all further change analysis. The inclusion of all the cases, both complete and incomplete, allows us to examine the impact of attrition and possibly to correct for these biases. MAR results represent a convenient starting point, but many more techniques are available for dealing with incomplete data. We should try to measure the reasons why people do not participate, because nonrandom selection can create additional biases (e.g., McArdle et al. 2005, McArdle & Bell 2000, Raudenbush 2001). We are able to analyze all the data collected using these and other incomplete-data techniques.

STRUCTURAL EQUATION MODELS FOR INCLUDING LATENT COMMON FACTORS

The SEMs described above do not attempt to solve the potential problems of compounding measurement error in using change scores. To deal with these problems, we rely on multiple measurements of the same construct within each occasion. With multiple measures, we first examine the hypotheses about common factors (McArdle 2007b, McDonald 1985, Meredith & Horn 2001), and we then expand these common factors into more complete mean and covariance structures.

Regression with Common Factors

The path diagram in Figure 4a represents a structural hypothesis for multivariate observations (squares X[t], Y[t], Z[t]) repeated over two occasions of measurement. Within each occasion, we include a latent variable (circles f[1] and f[2]) with factor loadings (one-headed arrows labeled λ_m). The unique variation for each variable is also included (ψ_m² as double-headed arrows). Following classical factor-analysis theory, unique factor scores are thought to be decomposable into two parts: one part that is specific to the test and represents valid measurement, and a second part that is random error. We assume each unique factor contributes variation at a given time but is independent of other scores within and across occasions (Meredith & Horn 2001).
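The one-factor hypothesis within an occasion can be sketched algebraically. Writing Y_m[t] for the m-th observed variable at occasion t, u_m[t] for its unique factor score, and φ²[t] for the variance of the common factor (u and φ²[t] are labels assumed here for illustration, and means are omitted), the model and its within-occasion expectations are:

```latex
\begin{aligned}
Y_m[t] &= \lambda_m[t]\, f[t] + u_m[t] \\
\mathrm{Var}(Y_m[t]) &= \lambda_m[t]^2\, \phi^2[t] + \psi_m^2[t] \\
\mathrm{Cov}(Y_m[t], Y_k[t]) &= \lambda_m[t]\, \lambda_k[t]\, \phi^2[t], \qquad m \neq k
\end{aligned}
```

Because the unique scores are assumed independent of everything else, all covariation among the variables within an occasion is carried by the common factor.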
Common factors are used to represent the testable hypothesis that a single unobserved variable can account for the covariation among the observed scores within each occasion. We next require the factor loading for each variable to be the same value at all time points: factor loading invariance (λ_m[1] = λ_m[2]). This is a formal way to assert that this factor score has the same substantive meaning at each time of measurement. It is typical in SEM to introduce a regression in which common factors at later times are regressed on common factors at earlier times (β_f). In theory, these factor scores reflect only the common variance, and they do not contain measurement error. Thus, to the degree the invariant common-factor model is correct, this factor score regression represents the stability of only the reliable components of our measures.

Latent Changes in Common-Factor Scores

In cases in which the factor loading invariance restrictions are reasonable, we can write an alternative form of the model. In Figure 4b, we introduce a third latent score (Δf) that represents the latent change between the two common-factor scores. In SEM, we typically do not estimate the factor scores, so we cannot calculate this true change score directly. Instead, we follow the logic of the SEM in Figure 2b and include a set of fixed unit-valued coefficients (= 1), so the second latent factor (f[2]) is defined as a simple sum of the other two (f[1] + Δf). Because the latent change score (Δf) now is part of the model, the model parameters include the variation in latent changes across individuals (φ_Δ²) as well as covariation of change with the initial common factor (φ_1Δ). As in the factor-regression model of Figure 4a, these common factors do not include errors of measurement, and this variance in latent change score is not confounded by errors of measurement. When used in this way, this multivariate SEM avoids the classical problems of using

Figure 4: Alternative two-occasion structural equation models for multivariate data. (a) Common-factor regression, (b) common-factor latent change score, and (c) multiple-common-factors crossed-lagged regression.

inherently unreliable difference scores, and the random errors cannot create regression to the mean (McArdle & Nesselroade 1994, Nesselroade et al. 1980). This SEM also allows us to test hypotheses about mean changes over time in the reliable common-factor scores. By including observed variable means, we can additionally estimate a latent level mean (θ_1) and a latent change score mean (θ_Δ). From this multivariate SEM, we can calculate a nonstandard repeated-measure t-test among the common-factor scores and examine whether the mean of the latent change factor is zero (θ_Δ = θ_2 − θ_1 = 0?). A more complete description of these tests would include intercepts for each variable (ν_m), but these are not drawn here. This new SEM offers a powerful way to answer questions typically asked by both the classical ANOVA and factor analysis techniques. In MANOVA, we estimate the linear combination weights that maximize the mean differences over time using canonical variates, which do not attempt to account for the correlations within time or across residuals. In contrast, the SEM in Figure 4b provides a highly structured approach for the repeated-measures ANOVA question. We are asking whether all the mean changes over time in this set of variables (W[t], X[t], Y[t]) are accounted for by mean changes in the common factors (f[t]). This is often exactly the question we want to answer.

Questions of Factorial Invariance Over Time

The search for factorial invariance over time is viewed by many as an empirical issue (Meredith & Horn 2001). As a first question, we typically ask whether the number of factors is equal over time. Assuming no substantial misfit, we can then ask questions about the invariance of all factor loadings over time: Does Λ[1] = Λ[2]? Other questions of factor equivalence over time can be asked, such as whether the person's unobserved factor scores are equal over time (f[1]_n = f[2]_n).
This is a more difficult question that is examined indirectly by asking if the factor means are equal (θ[1] = θ[2]), if the factor variances are equal (φ[1]² = φ[2]²), and if the factors are perfectly correlated (ρ[1,2] = 1). In repeated measures data, it is also reasonable to add specific covariances for each measurement over time (ψ_mm; not drawn) to remove additional confounds. Further relaxations of the factor invariance model (Figure 4a) can be tested and may fit the data better, but the results may not be easy to interpret. In the absence of the same number of factors, we would need to interpret each factor separately. In the absence of factor loading invariance, we cannot assert the same common factors are measured at each measurement occasion. Although we might be interested in this kind of evidence for qualitative change, it is difficult to go much further in SEM because we do not have analytic tools to compare apples and oranges. Using repeated measures with the same number of factors and invariant factor loadings allows us to say we have repeated constructs. Because factor invariance is both practical and desirable, it seems appropriate to search for a metric invariant model of measurement until such a solution is found.

In Figure 4c, we assume six variables are measured at each of two occasions and are indicators of two factors (g[t] and h[t]). Each pair of factor scores is assumed to be correlated with each other and with the Time 1 factor scores (as drawn here). Here the factor loadings are invariant over time, but the factor pattern within each time is complex. The pattern is simple for the first two variables (U[t] and V[t] load on g[t]) as well as for the last two (Y[t] and Z[t] load on h[t]). But the invariant pattern is more complicated for the middle two variables, which have two loadings each (W[t] and X[t] load on both g[t] and h[t]).
Typically, a variable with multiple loadings does not contribute to the factorial description, but here it is helpful because these multiple loadings are the same over time. So, although not all variables exhibit a simple structure, a complex but invariant factor pattern may end up being more practically useful because it establishes the identity of factors across occasions.
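The two-occasion latent change score model of Figure 4b can be written out compactly. The following is a sketch assembled from the symbols used above, not a reproduction of the author's equations; the person subscript n on each score and the unique score u_m[t] are notation assumed here for illustration, and the last line further assumes equal unique variances over time with no specific covariances (ψ_mm = 0).

```latex
\begin{align}
Y_m[t]_n &= \nu_m + \lambda_m\, f[t]_n + u_m[t]_n, \qquad t = 1, 2, \\
f[2]_n &= f[1]_n + \Delta f_n, \\
E(f[1]) &= \theta_1, \qquad E(\Delta f) = \theta_\Delta = \theta_2 - \theta_1, \\
\operatorname{Var}(f[2]) &= \phi_1^2 + \phi_\Delta^2 + 2\,\phi_{1\Delta}, \\
\operatorname{Var}\bigl(Y_m[2] - Y_m[1]\bigr) &= \lambda_m^2\, \phi_\Delta^2 + 2\,\psi_m^2 .
\end{align}
```

Under these assumptions, the last line shows why an observed difference score is contaminated by unique variance (the term 2ψ_m²), whereas the latent change variance φ_Δ² is not.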

STRUCTURAL EQUATION MODELS USING TIME-SERIES CONCEPTS

Many SEMs for repeated measures data come from the time-series literature (e.g., Browne & Nesselroade 2005, Nesselroade et al. 2001). These models typically do not deal with group averages, or even invariant common factors, but are based solely on time-to-time dependencies indicated by the covariance structures. Here we discuss several popular variations based on time-series regressions among invariant common factors over time.

Crossed-Lagged Regression of Factors

The introduction of multiple constructs within each longitudinal occasion of measurement leads naturally to questions about time-dependent relationships among changes in these factors. A classical SEM for multiple factors over time is based on a latent variable cross-lagged regression model (Gollob & Reichardt 1987, Rogosa 1979, Shadish et al. 2002). In Figure 4c, we assumed that each common factor influences itself over time with lagged autoregressions (β_g and β_h) and that each factor crosses over to influence the other factor at subsequent times (γ_g and γ_h). This basic two-occasion two-factor model is used in an attempt to isolate the pattern of influences across the constructs over time. Indeed, this cross-lagged setup inspired the optimistic label of causal modeling for all SEMs (Bentler 1980; cf. McDonald 1985). However, for proper time-series causal inference, the variances and covariances of the factors (φ_g², φ_h², φ_gh) are required to be equal over time; these restrictions imply the common factors have reached a stationary state or a point of equilibrium. As with most other invariance hypotheses, these tests are fitted to raw score covariances and not merely to correlations (Meredith & Horn 2001). These important tests require a complex set of model constraints, so they are often simply ignored (Browne & Nesselroade 2005).
The lagged coefficients (β_g and β_h) provide information about the general stability within each variable, and the crossed coefficients (γ_g and γ_h) give information about the impact of one factor upon the other. If we can force one of these crossed coefficients to zero without a large misfit (γ_g = 0), then we can say that this factor (g[t]) is not a leading indicator over time of the other factor (h[t+1]). It is also possible to fit a model in which both influences are zero (γ_g = γ_h = 0), and if this fits well, then we can assert that the common factors do not influence one another. Except in rare cases (e.g., dyads), it is not reasonable to examine the exact equality of the processes (γ_g = γ_h) because different common factors are not in the same scale of measurement. A simpler alternative that also needs to be rejected is that only one common factor is needed over all times so there are no crossed effects at all (e.g., Figure 4a). Under this set of assumptions, any significant cross-regression indicates a prediction over time independent of the outcome variable's own history, and this is a classic definition of a causal influence in observational data (e.g., Hsiao 2003). It has recently been pointed out that the longitudinal cross-lagged coefficient provides a reasonable test of a mediation hypothesis, and longitudinal data may be necessary for mediation theory (Cole & Maxwell 2003, MacKinnon et al. 2007). Nevertheless, the main problem with making any such causal assertions from longitudinal data is that they may be wrong, and we might not know it from our model fitting (Shadish et al. 2002). These inferences may be wrong because the variables have not reached equilibrium, or other variables are missing from the model that alter the influences, or these common factors are not invariant, and so on. These are not easy problems to overcome in longitudinal observational data.
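The logic of the two-occasion cross-lagged regression can be illustrated with a small simulation. This is a minimal sketch, not the SEM estimation described above: it assumes the factor scores are observed without error (so ordinary least squares can stand in for the latent regression), and the function names, sample size, and parameter values are hypothetical. Following the text, γ_g is the effect of g at Time 1 on h at Time 2, and γ_h is the effect of h at Time 1 on g at Time 2.

```python
import numpy as np


def simulate_cross_lagged(n, beta_g, beta_h, gamma_g, gamma_h, rho=0.3, seed=0):
    """Simulate two-occasion scores for two factors g and h (hypothetical values)."""
    rng = np.random.default_rng(seed)
    # Time 1 factor scores: unit variances, correlation rho.
    cov1 = np.array([[1.0, rho], [rho, 1.0]])
    g1, h1 = rng.multivariate_normal([0.0, 0.0], cov1, size=n).T
    # Independent disturbances at Time 2.
    z_g = rng.normal(0.0, 0.5, size=n)
    z_h = rng.normal(0.0, 0.5, size=n)
    # Lagged (beta) and crossed (gamma) regressions.
    g2 = beta_g * g1 + gamma_h * h1 + z_g
    h2 = beta_h * h1 + gamma_g * g1 + z_h
    return g1, h1, g2, h2


def ols(y, *predictors):
    """Least-squares coefficients of y on the predictors (no intercept; means are zero)."""
    design = np.column_stack(predictors)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# g does not lead h (gamma_g = 0), but h leads g (gamma_h = 0.3).
g1, h1, g2, h2 = simulate_cross_lagged(
    n=20000, beta_g=0.6, beta_h=0.5, gamma_g=0.0, gamma_h=0.3
)
bg, gh = ols(g2, g1, h1)  # estimates of beta_g and gamma_h
bh, gg = ols(h2, h1, g1)  # estimates of beta_h and gamma_g
print(f"beta_g={bg:.2f} gamma_h={gh:.2f} beta_h={bh:.2f} gamma_g={gg:.2f}")
```

In this idealized setting the crossed coefficient for g is recovered near zero, which corresponds to the statement that g[t] is not a leading indicator of h[t+1]. With real data the factor scores are latent, and the cautions about equilibrium, omitted variables, and invariance discussed above still apply.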
It may be useful to point out that the latent change score model (Figure 4b) can be extended for use with multiple constructs with sets of latent changes (Δg and Δh). These change equations can directly represent the parameters of most interest, but they have the misfortune of appearing far more complicated than

the cross-lagged model, so they are not drawn here. Still, these SEMs can directly represent common questions, such as whether a change in X produces a change in Y, by turning a passive factor covariance (i.e., Δg ↔ Δh) into an active regression of factor changes (i.e., Δg → Δh). We must consider if this question might be best represented using a model with regressions of changes on levels (as in Figure 2c), and we return to this issue below. In either case, it is certainly possible to specify some additional change questions as formal hypotheses. The overall benefit of any crossed-lagged SEM comes when the model alternatives are clear and testable or when they can suggest a need for the collection of additional data. Patterns of causal influences are more complex and are not so easy to test, especially shorter-term feedback loops, different patterns of causal influence at different times, or different influences for subgroups of persons. In many cases, we can use SEM to ensure that these tests are meaningful statements of the hypotheses, but these models certainly cannot deal with all threats to the validity of all time-based causal assertions (see Shadish et al. 2002).

Extending Time-Series Factor Models to Multiple Occasions

We next consider more than two occasions of repeated measures data. Let us assume the common-factor scores at a current time (f[t]) are fully predicted by scores on the same factor at an earlier time (f[t−1]) with the classical inclusion of a disturbance term (z[t]). This type of model is presented in Figure 5a for four time points with invariant factor loadings (Λ) for the observed variables (Y[t]). As in most time-series models, the means are not usually restricted, so neither intercepts nor changes in the means are considered here. In classical time-series analysis, the specific time of observation does not matter, but we do focus on the interval of time (Δt) between observations.
Figure 5: Alternative multiple-occasion structural equation models based on time-series concepts. (a) Quasi-Markov simplex with one common factor and (b) cross-lagged regression over many occasions.

For this reason, we only draw one regression weight (β) for a specific unit of time, and one disturbance variance (ω²), and we assume this is invariant (stationary) over time. This is a highly restricted structure over time: a Markov simplex wherein each covariance is a function of these parameters and the time interval (for occasions j and k, σ_jk = φ_1² β^(k−j)). As a start, we assume that the current scores are based only on immediately past behaviors, and this is a testable hypothesis. In early work on this topic, it was shown how all the errors of measurement could be separately estimated from observed variables with only a few occasions of measurement (i.e., T = 4). In later research, models with more-complete common factors were included (f[t] in a quasi-Markov simplex; Jöreskog & Sörbom 1979). Assuming

the common-factor loadings are invariant over time, we can test a broad hypothesis of equilibrium by asking if there is equality (i.e., stationarity) of the common-factor covariances within each time. This time-series framework leads to a highly restricted covariance structure, so the simple auto-regressive factor model of Figure 5a does not always provide a good fit to the data, and more complexity may be needed (Nesselroade et al. 2001). Many alternatives can be considered at this point, including the introduction of more predictors from earlier times (i.e., using multiple back-shifts or lags f[t−2], f[t−3], etc.), and other concepts about correlations of nearest points (i.e., latent moving average terms; Browne & Nesselroade 2005). However, it is clear that most SEM researchers avoid these issues and simply allow some or all of these parameters to vary, especially the auto-regressions (i.e., β[1], β[2], etc.) and disturbances (i.e., ω[1]², ω[2]², etc.). In real longitudinal data collections, the time period sampled may span different causal systems, and the crossed-lagged coefficients across time may need to vary, but this may lead to a more complex causal interpretation (Gollob & Reichardt 1987). This is another case in which standard SEM programs can be used to estimate parameters outside the bounds of the usual time-series interpretation.

Extending Cross-Lagged Factor Regression to Multiple Occasions

In Figure 5b, we presume each common factor is predicted by itself with lagged regressions (β) and by the other common factor with crossed regressions (γ). This simplified model extends the formal basis of cross-lagged common factors (Figure 4c) to many more occasions (T > 2). As a starting point, the effects over time are only included for factor scores at the immediately preceding time point.
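The stationarity restriction behind the single-factor simplex of Figure 5a can be checked numerically. The sketch below is illustrative only: the function names and parameter values are hypothetical, and it assumes the process starts in equilibrium, which for a first-order auto-regression requires the disturbance variance to satisfy ω² = φ²(1 − β²). It builds the factor covariance matrix once from the closed form φ² β^|k−j| and once by tracing the regression forward in time.

```python
import numpy as np


def simplex_cov_closed_form(phi2, beta, n_occasions):
    """Stationary simplex: covariance of occasions j and k is phi2 * beta**|k - j|."""
    t = np.arange(n_occasions)
    lag = np.abs(t[:, None] - t[None, :])
    return phi2 * beta ** lag


def simplex_cov_by_recursion(phi2, beta, omega2, n_occasions):
    """Covariances implied by f[t] = beta * f[t-1] + z[t] with Var(z[t]) = omega2."""
    cov = np.zeros((n_occasions, n_occasions))
    cov[0, 0] = phi2
    for t in range(1, n_occasions):
        # Variance at time t accumulates the carried-over part plus the disturbance.
        cov[t, t] = beta**2 * cov[t - 1, t - 1] + omega2
        for j in range(t):
            # Covariance with any earlier occasion passes through f[t-1].
            cov[j, t] = cov[t, j] = beta * cov[j, t - 1]
    return cov


phi2, beta, T = 1.0, 0.7, 4
omega2_stationary = phi2 * (1.0 - beta**2)

stationary = simplex_cov_by_recursion(phi2, beta, omega2_stationary, T)
nonstationary = simplex_cov_by_recursion(phi2, beta, 1.0, T)

print(np.allclose(stationary, simplex_cov_closed_form(phi2, beta, T)))
print(np.diag(nonstationary))  # variances drift when omega2 is not the equilibrium value
```

When the disturbance variance departs from its equilibrium value, the factor variances are no longer equal across occasions, which is the kind of misfit the stationarity tests described above are meant to detect.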
This framework allows us to evaluate whether any variable (g[t]) is an outcome both of itself at an earlier time (g[t−1]) and of a different variable (f[t−1]) at an earlier time. As stated above, we may need to consider more complex models that include effects from other time lags (e.g., t−2, t−3). In multivariate time-series analysis, we assume all common factors have reached a state of equilibrium, and we assume the pattern of causal influences is identical over equal distances in time. The invariance constraints of the factor loadings and the cross-lagged coefficients lead to a simplicity that requires empirical evaluation, but these simplifications can be effective when dealing with a lot of data. This multi-occasion cross-lagged factors model provides a rigorous way to evaluate whether the phenomena under study are linked in a stationary (i.e., nonevolving) process. In practice, this model is often applied in observational panel studies without a time-series foundation, and a multitude of additional coefficients are needed for each occasion (e.g., Cillessen & Mayeux 2004). These kinds of longitudinal analyses might isolate causal/control features among the factors, but the resulting effects must be further studied in more rigorously controlled experiments before we could be certain about the true causal influences (Shadish et al. 2002). Of course, this is not to imply that randomization solves all problems: a randomized treatment may affect any model parameter, so group differences in the cross-lagged coefficients may be key (McArdle 2007a).

STRUCTURAL EQUATION MODELS USING LATENT-CURVE CONCEPTS

The popular time-series models do not deal with the group averages over time, but previous SEM research has considered many models that include both means and covariances (Harris 1963, Horn 1972, Horn & McArdle 1980, Jöreskog 1973, Jöreskog & Sörbom 1979).
In a novel and comprehensive approach to this problem, Meredith & Tisak (1990) demonstrated how classical growth-curve models could be represented and fitted using a standard SEM based on restricted common factors for means and covariances. These representations of latent-curve SEMs were critical because they offered a wide range of alternatives to

stationarity and equal-interval assumptions. This new SEM approach quickly spawned methodological and substantive applications (e.g., Bollen & Curran 2006; Duncan et al. 2006; McArdle 1986, 2001).

Latent Growth-Curve Models

One form of the latent-curve model is depicted in Figure 6a. In this model, we assume that each set of observed variables (Y[t]) reflects a set of invariant common factors (f[t]) separated from unique factors. Here, the common factors are organized to have three unobserved or latent scores: (a) a latent intercept or initial level (f_i), (b) a latent slope (f_s) representing the change over time, and (c) a time-specific independent state (z[t]). To indicate the average changes over time, we define a set of group coefficients or basis weights (e.g., slope loadings α[t]) based on time since some event (e.g., time since surgery, time since birth). These model parameters are used to form the shape of the trajectory over occasions. In path diagrams such as Figure 6a, the level and slope are often assumed to be random variables with fixed means (θ_i, θ_s), random variances (φ_i², φ_s²), and a covariance (φ_is). We assume there is a within-time-state variance (ω²) common to all observed measures (Horn 1972, McArdle & Woodcock 1997) and one unique variance (ψ_m²) specific to each measure. For simplicity, these variance terms are assumed to be invariant over time and uncorrelated with all other components, but we recognize these restrictions may not be appropriate for real data. This type of path diagram is a direct translation of the average cross-products matrix algebra used to estimate these models (Grimm & McArdle 2005). Including the basis coefficients (α[t]) in this way means these parameters are shared in all model restrictions.
Specifically, this includes a proportional relationship over time for the means (i.e., μ[t] = θ_i + θ_s α[t]) and the standard deviations (σ[t], including α[t]), and a more complex relationship with the over-time correlations (ρ[t,t+1], including α[t] and α[t+1]). Notice that the mean structure is formed as in an ANOVA, but the covariance structure lies in between the restrictions of ANOVA and the unrestricted MANOVA. This is important because the inclusion of appropriate restrictions on the covariance structure from the latent-curve model increases the statistical power of tests of mean differences (Muller & Stewart 2006, Muthén & Curran 1997).

Figure 6: Alternative multiple-occasion structural equation models based on latent-growth concepts. (a) Latent-curve model for one common factor and (b) bivariate latent-curve model for two common factors.

Fitting Latent-Curve Hypotheses

Different organizations of the basis parameters represent specific hypotheses to be tested.
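The mean and covariance expectations of the latent-curve model can be computed directly from the basis. The following is a hedged sketch for a single measured variable with a unit factor loading; the function name and all numerical values are hypothetical, and the state variance (ω²) and unique variance (ψ²) are simply added to the diagonal under the invariance assumptions stated earlier.

```python
import numpy as np


def latent_curve_moments(alpha, theta_i, theta_s, phi_i2, phi_s2, phi_is,
                         omega2=0.0, psi2=0.0):
    """Expected means and covariances of Y[t] = f_i + alpha[t] * f_s + z[t] + u[t]."""
    alpha = np.asarray(alpha, dtype=float)
    # Loadings of the level (all ones) and the slope (the basis alpha[t]).
    loadings = np.column_stack([np.ones_like(alpha), alpha])
    means = loadings @ np.array([theta_i, theta_s])
    phi = np.array([[phi_i2, phi_is],
                    [phi_is, phi_s2]])
    # Level/slope covariance structure plus time-invariant state and unique variance.
    cov = loadings @ phi @ loadings.T + (omega2 + psi2) * np.eye(len(alpha))
    return means, cov


params = dict(theta_i=10.0, theta_s=2.0, phi_i2=4.0, phi_s2=1.0, phi_is=0.5,
              omega2=0.5, psi2=1.0)

# Basis fixed at zero: a level-only model.
level_means, level_cov = latent_curve_moments(np.zeros(4), **params)

# Basis fixed at t - 1 for t = 1..4: a linear growth curve.
linear_means, linear_cov = latent_curve_moments(np.arange(4), **params)

print(level_means, linear_means)
print(np.diag(linear_cov))
```

With the zero basis, every occasion has the same mean, the same variance, and the same covariance with every other occasion. With the linear basis, the means fall on a straight line while the variances and covariances change with α[t], which is the shared-parameter structure described above.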

For example, if the basis is set to zero (α[t] = 0), this eliminates the slope impacts and produces a level-only model with equal means and a compound symmetry structure. If we fix the basis to be the specific time of measurement (α[t] = t − 1), we can represent a straight-line or linear growth curve with a more complex shared parameter structure. Other popular nonlinear models include polynomial models (quadratic, cubic) and exponential forms (e.g., Ghisletta & McArdle 2001, Grimm et al. 2007). Although not as popular, we can also estimate latent-basis coefficients as we do any other set of factor loadings; because these are essentially factors of time, this leads to an estimate of an optimal shape for the group curve and individual differences (McArdle 1986, 1989; McArdle & Bell 2000; Meredith & Tisak 1990). This latent-curve model is often expanded into what is popularly known as a multilevel, hierarchical, or random coefficient form. From an SEM perspective, we simply add a group-regression model that follows our use of group coding described above (Figure 3a). Here the predictors are group codes (G) or covariates (X) and the outcomes are the latent levels (f_i) and latent slopes (f_s). Because these outcomes are latent levels and latent slopes, this is termed a second-level equation. There are some minor points of disagreement about exactly which random-coefficients models can and cannot be fit using standard SEM software (Cudeck & Harring 2007), but newer SEM software offers an effective way to deal with most practical problems (e.g., Ferrer et al. 2004). We can compare the latent-curve model to standard ANOVA approaches. MANOVA makes no explicit provision for the structure of the covariances or even for uncorrelated residuals, so the otherwise comparable MANOVA tests (of linearity, etc.) require far more parameters and can be expected to yield far less power.
This is not to suggest that SEM is best used with very small samples, but it does suggest that estimating a minimal number of parameters is a powerful idea. It is also well known that the standard MANOVA equations can be difficult to use with incomplete data (Bock 1975, Hedeker & Gibbons 2006). The additional requirement of homogeneity of the covariances over groups, a test often ignored in practice, may be more realistic if this test is formed as an SEM with latent-curve invariance over groups.

Considering Multiple Latent Curves

Assuming we have two common factors measured at multiple times, we can fit what appears to be an entirely different model: a model based on multiple latent curves (McArdle 1989). One popular version of this model is displayed in Figure 6b. In this model, we assume that each series is based on its own latent curve, with unlabelled arrows but different shapes (α_f[t], α_g[t]) and with different parameters for the respective levels and slopes. However, the new information in this bivariate model comes from the cross-covariances of the levels and slopes. This model has recently become a popular way to represent a parallel-growth process (Bollen & Curran 2006, Duncan et al. 2006, Singer & Willett 2003). Of interest here is the correlation of the two latent slopes (φ_fs,gs); this is an error-free index of simultaneous changes across different variables (f_s and g_s). Given other restrictions (φ_fi,fs = 0, φ_gi,gs = 0), we can test the hypothesis of no connection in changes among the factor scores (φ_fs,gs = 0; Hertzog et al. 2006). A problem of inference emerges when the direct test of correlated slopes is interpreted as the test of a dynamic impact (e.g., as in McArdle 1989; cf. MacCallum et al. 1997). The correlation of latent changes across different variables does not change over time, and it does not represent a directional dynamic hypothesis.
In the hope of obtaining time-dependent dynamic information, some researchers have tried to substitute a cross-lagged regression of the latent slopes on the latent levels (e.g., Bollen & Curran 2006, Singer & Willett 2003, Snyder et al. 2003). Although this model offers new latent-variable parameters, the only reasonable situation for this change regression comes when the latent levels are known to

precede the latent slopes. That is, this levels-to-slopes model may be useful in experimental situations when there is a similar starting point for all subjects, but these same model parameters may be quite arbitrary with most observational data. A related form of the latent-curve model with widespread usage in epidemiology and biostatistics is the time-varying covariate model adapted from work on survival analysis regression (Cox 1972). When applied in SEM (e.g., Bollen & Curran 2006), one of the variables (X[t]) is thought to be responsible for some part of the curvature of the other (Y[t]), so its influence is removed from the outcome scores within each time (or at lagged times). These time-varying covariate models are relatively easy to implement using existing computer software (e.g., Mplus, MIXED), so they are rapidly growing in popularity. Although covariate adjustment may be needed, this time-varying covariate approach is designed to remove all impacts, and this approach may not tell us much about the dynamic interplay among variables.

STRUCTURAL EQUATION MODELS USING LATENT-CHANGE CONCEPTS

In any data analysis problem where multiple constructs have been measured at multiple occasions, we need to consider the importance of causal sequences and determinants of changes (Nesselroade & Baltes 1979). The goal of evaluating time-based sequences, especially when things are changing, is one of the main reasons for collecting longitudinal repeated-measure data in the first place. We have pointed out above the benefits of the classical models, but we have also seen that each is limited to specific forms of dynamic inference. Of course, the statistical evaluation of dynamic sequences is not an easy problem, and these problems have puzzled researchers for decades. We describe below how the prior SEMs lead directly to new SEMs that can provide a more flexible framework for causal-dynamic questions.
Mixing Models for Means and Covariances

The time-series and latent-curve models discussed in the previous two sections are not identical, but they can be fit to the same repeated measures data. The distinguishing feature of time-series factor models (Figure 5a) is the use of time of measurement as a guide to organize the predictive regressions, i.e., moving forward in time. In contrast, in the latent-curve model (Figure 6a), we use the data at any time to define group curves and individual differences around a trajectory, so time-to-time predictions are not essential. Typically, these models do not use the same parameters, so they cannot be directly compared using standard goodness-of-fit tests. This has led some researchers to use both types of models with the same data, and the use of a multiple-model strategy often seems sensible in practice (e.g., Cillessen & Mayeux 2004). Other researchers have tried to combine aspects of these models. In recent statistical work, ANOVA researchers have recognized that when the standard models do not fit well enough, a variety of built-in covariance assumptions can be added that are not strictly connected to the hypotheses about the means (e.g., AR[t]; for details, see Muller & Stewart 2006). A similar strategy was initially suggested for SEM (Jöreskog 1973), and it became easy to follow this lead and simply paste the two diagrams (in Figure 5a and Figure 6b) together as a composite model in the hope of achieving better fit (e.g., Curran & Bollen 2001, Horn & McArdle 1980). This composite strategy should end up with a better-fitting model, but the estimated parameters may still only be interpreted as separate parts. A different way to approach this problem is to examine the specific theory generating the expectations. One common feature of contemporary repeated measures SEMs is that we are defining a trajectory over time (or an integral) in the scores, and the changes are implied using some difference (or differential) operator.
However, if we look back to the classic literature

on growth-curve analysis, the derivative (difference) was typically defined at the start, and this model of change then led to the expected integral (or trajectory) for the outcome of interest (Boker 2001, McArdle & Nesselroade 2003). Repeated measures SEMs can now be considered in this same way.

Latent Change Score Models

Figure 7: Alternative multiple-occasion structural equation models based on latent change concepts. (a) Latent change score model for one common factor and (b) bivariate latent change score model for two common factors.

Figure 7a is a path diagram based on this concept of multiple latent change scores over time. Once again, we start with the separation of common from unique factors using invariant factor loadings (Λ). For single variables, this definition is similar to separating the latent or true score from the random error of measurement. We next follow the latent change score concept and consider each common-factor score (f[t]) to be the sum of the immediately previous factor score (f[t−1]) and some unobserved or latent change score (Δf[t]). If we then repeat this process for each time point, we add a layer of (T − 1) new latent change scores to the model. This approach is a natural generalization of the previous models in which difference scores are included as unobserved variables (Figures 2b, 4b, 5b). Figure 7a includes latent change scores (Δf[t]) at each occasion (after Time 1), and we assume these latent variables are equidistant in time (Δt = 1) even if the observed scores are not. That is, the observed data may be unbalanced (Hamagami & McArdle 2000, 2007). This definition of an equal-interval latent time scale is nontrivial because it allows us to eliminate the time lag (Δt) from all equations. The use of many fixed unit coefficients, a deceptively simple algebraic device, allows us to start with a change equation and then define any trajectory equation.
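The step from a change equation to a trajectory equation can be made explicit. This is a sketch in the notation used above; the person subscript n and the unique score vector u[t] are assumed here for illustration.

```latex
\begin{align}
f[t]_n &= f[t-1]_n + \Delta f[t]_n, \qquad t = 2, \dots, T, \\
\Rightarrow \quad f[t]_n &= f[1]_n + \sum_{k=2}^{t} \Delta f[k]_n, \\
Y[t]_n &= \Lambda\, f[t]_n + u[t]_n .
\end{align}
```

Whatever model is then specified for the latent changes Δf[k], the fixed unit coefficients carry it forward into every later occasion through the sum in the second line.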
That is, we do not directly define auto-regressions (β[t]) or slope coefficients (α[t]). Instead, we directly define the model of change and indirectly create over-time expectations from the accumulation of latent changes among latent variables. This latent change equation produces many unusual-looking path diagrams (Figure 7a; see Collins & Sayer 2001), but because we have included all model parameters, the standard path-tracing rules of expectations remain intact. In these tracings, any change that occurs earlier accumulates and is expressed in the later occasions. The first term in the accumulation process may be traced in the diagram by starting at the first change score (Δf[2]). This change does not affect the prior score (f[1]), but it does influence the second time directly (by the fixed 1), and it is an indirect part of all the other latent common factors through the sequence of fixed-unit values (from f[t] to f[t+1]). Similarly, the next change score (Δf[3]) does not

affect prior times, but it is a part of all future times. This sequence is used to form a set of expectations for the means, variances, and covariances over time, and these potentially complex expectations are automatically generated by any standard SEM software (Grimm & McArdle 2005; Hamagami & McArdle 2000, 2007; McArdle 2001). This approach to latent change scores can represent all difference and change concepts from the models discussed above. For example, the latent intercept term (f_i) has effects along the one-headed arrows (from f[t] to f[t+1]), so the intercept mean (θ_i) and variance (φ_i²) are part of the expected value of every time point. Next we add a latent slope score (f_s) with loadings (α[t]) and with a mean (θ_s). The latent slope is not connected to the first factor score (f[1]), but it affects the changes (Δf[t]), and this influence is accumulated over subsequent time points. We can also include a prediction of the latent change score (Δf[t]) as a linear function (β) of the factor score at the previous time (f[t−1]), plus a state residual (z[t]), and these effects are multiplied over time. Because of the common-factor model separation, the error variances (ψ_m²) are assumed to be constant over time and are not part of this accumulation. The resulting model in Figure 7a is termed a dual-change score model. In this expression of change, we permit both a systematic constant change (α) from the linear slope and a systematic proportional change (β) over time. As a simple start, these change coefficients (α, β) can be considered invariant over time (e.g., ergodic). The invariance of dynamic parameters does not mean the expectations are constant (the latent scores can grow and change), but it does mean that the expectations accumulate in a systematic fashion.
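The accumulation implied by the dual-change score model can be traced for the expected means. The sketch below assumes the change equation Δf[t] = α f_s + β f[t−1] + z[t] with a zero-mean residual, so that the residual drops out of the expectations; the function names and parameter values are hypothetical. It compares the occasion-by-occasion accumulation with the closed-form solution of the same first-order difference equation.

```python
import numpy as np


def dual_change_means(theta_i, theta_s, alpha, beta, n_occasions):
    """Expected factor means built by accumulating expected latent changes."""
    means = np.empty(n_occasions)
    means[0] = theta_i
    for t in range(1, n_occasions):
        # Expected change: constant part (alpha * theta_s) plus proportional part.
        expected_change = alpha * theta_s + beta * means[t - 1]
        means[t] = means[t - 1] + expected_change
    return means


def dual_change_closed_form(theta_i, theta_s, alpha, beta, n_occasions):
    """Closed-form solution of the same recursion (requires beta != 0)."""
    steps = np.arange(n_occasions)
    growth = (1.0 + beta) ** steps
    return growth * theta_i + alpha * theta_s * (growth - 1.0) / beta


# Negative proportional change: the curve levels off at -alpha * theta_s / beta = 20.
curve = dual_change_means(theta_i=10.0, theta_s=4.0, alpha=1.0, beta=-0.2, n_occasions=60)
closed = dual_change_closed_form(theta_i=10.0, theta_s=4.0, alpha=1.0, beta=-0.2, n_occasions=60)

# With beta = 0 the same accumulation reduces to a straight line.
line = dual_change_means(theta_i=10.0, theta_s=4.0, alpha=1.0, beta=0.0, n_occasions=5)

print(curve[:5], curve[-1])
print(line)
```

Although the change equation is linear with time-invariant coefficients, the accumulated trajectory is an exponential-type curve whenever β is nonzero, and it reduces to linear growth when β = 0.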
This simple linear-difference model with multiple control parameters leads to a nonlinear growth trajectory from the accumulation of latent changes and comes from a family of curves based on linear and exponential trajectories (Ghisletta & McArdle 2001, Grimm et al. 2007, Hamagami & McArdle 2007).

Multiple Latent Change Score Models

Figure 7b is a latent change score model for two common factors. Here we draw the dual-change model for each set of observed variables (Y[t] and X[t]) in terms of their common factors (f[t] and g[t]). We assume the sets of observed scores are measured over a defined interval of time and the latent variables are defined over an equal interval of time (Δt = 1), and we add layers of latent difference scores (Δf[t] and Δg[t]). This model includes the use of fixed-unit values (unlabeled arrows) to define pairs of latent changes (Δf[t] and Δg[t]), and equality (invariance) constraints over time within a factor (for the α, β, and γ parameters) to simplify estimation and identification. Most critically, in this model a coupling parameter (γ_f) represents the time-dependent effect of one construct (g[t−1]) on the subsequent change in the other (Δf[t]). We can include both directions (γ_f and γ_g) and consider many different SEMs for multiple latent changes. This new change model subsumes all aspects of the previous models as special cases to be tested. We can fit the standard cross-lagged factor model (Figure 5b) by eliminating the latent intercepts and the latent slopes. The standard cross-lagged models do not allow for systematic growth components, but this is now a testable feature of this change model. To obtain a bivariate latent curve (Figure 6b), we eliminate both the autoregressive (β_f = β_g = 0) and coupling parameters (γ_f = γ_g = 0).
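The role of the coupling parameters can be seen in the expected means of the bivariate model. This is a minimal sketch under stated assumptions: residuals have zero mean, the slope loadings are fixed at one so that each constant change equals the slope mean, and the function name and numerical values are hypothetical. Following the text, γ_f carries g[t−1] into Δf[t], and γ_g carries f[t−1] into Δg[t].

```python
import numpy as np


def bivariate_dual_change_means(start, slope, beta, gamma, n_occasions):
    """Expected means of two coupled factors; all inputs are ordered (f, g)."""
    start = np.asarray(start, dtype=float)
    slope = np.asarray(slope, dtype=float)
    # Diagonal: proportional change within a factor; off-diagonal: coupling.
    dynamics = np.array([[beta[0], gamma[0]],
                         [gamma[1], beta[1]]])
    means = np.empty((n_occasions, 2))
    means[0] = start
    for t in range(1, n_occasions):
        expected_change = slope + dynamics @ means[t - 1]
        means[t] = means[t - 1] + expected_change
    return means


# Special case: no proportional change and no coupling gives two linear curves.
parallel = bivariate_dual_change_means(
    start=[10.0, 5.0], slope=[2.0, 1.0], beta=[0.0, 0.0], gamma=[0.0, 0.0], n_occasions=5
)

# Coupled case: g leads changes in f (gamma_f = 0.1), but f does not lead g.
coupled = bivariate_dual_change_means(
    start=[10.0, 5.0], slope=[2.0, 1.0], beta=[-0.2, -0.1], gamma=[0.1, 0.0], n_occasions=5
)

# Same model with a different starting level on g only.
shifted = bivariate_dual_change_means(
    start=[10.0, 9.0], slope=[2.0, 1.0], beta=[-0.2, -0.1], gamma=[0.1, 0.0], n_occasions=5
)

print(parallel)
print(coupled[:, 0], shifted[:, 0])
```

Changing the starting level of g alters the later trajectory of f only because γ_f is nonzero; the trajectory of g is unaffected by f in this example, which is the asymmetry a leading-indicator hypothesis describes.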
The bivariate latent growth models may represent parallel latent processes, even including regressions of slopes on levels, but they do not allow for cross-lagged dynamic coupling of the key factors over time, so such a simple model may not capture the systematic changes or fit the data. The inclusion of latent changes in a bivariate model allows a variety of dynamic models to be tested using the standard SEM statistical approach. These bivariate trajectories can be complex, but they are automatically created as a linear accumulation of first differences for each variable by standard SEM programs. These bivariate latent change score models can be expanded in many other ways that lead to more-complex nonlinear-trajectory equations (e.g., nonhomogeneous equations; Hamagami & McArdle 2000, 2007). These dynamic models can easily be extended to multivariate form to estimate the time-dependent interplay among multiple factors (Ghisletta & Lindenberger 2005, McArdle et al. 2001). Additional dynamic features will be possible to estimate with more common factors and more occasions of measurement.

ADDITIONAL RELEVANT RESEARCH ISSUES

We have discussed a variety of new options for repeated measures data analysis from an SEM perspective. We have tried to show how SEMs permit researchers to make very specific hypotheses about longitudinal data and then to use traditional multivariate statistical tests about mean and covariance structures to form test statistics. In areas where the a priori hypotheses are straightforward, using SEM is relatively easy. But when the hypotheses are more complex and flexibility is needed, SEM may prove even more useful. In this final section, we discuss some new applications and some additional but overlooked topics, and we offer some concluding thoughts.

New Dynamic Structural Equation Model Applications

This review does not emphasize substantive applications, but each model discussed here has been used to analyze real data. Latent cross-lagged models were widely used in the 1970s and 1980s, and latent-curve applications increased rapidly during the 1990s and 2000s. The bivariate dual-change model has only been available a short time (post 2000), but many researchers have already applied this dynamic SEM to their substantive problems. In our own recent applications, we have used dynamic SEM to investigate the lead-lag relationships in (a) Wechsler Intelligence Scale for Children (McArdle 2001), (b) Wechsler Adult Intelligence Scale factors changing over adulthood (McArdle et al. 2001), (c) anti-social behaviors and reading achievement in the National Longitudinal Study of Youth cohort (McArdle & Hamagami 2001), (d) cognitive dynamics in longitudinal twin data (McArdle & Hamagami 2003), (e) experimental impacts of cognitive training (McArdle 2007a), and (f) brain changes in lateral ventricle size (LVS) and the Wechsler Memory Scale (WMS) (McArdle et al. 2004).

Some aspects of our final study illustrate the substantive utility of dynamic modeling. These brain-behavior data were collected at two occasions over a seven-year longitudinal period for people of a wide range of ages (30 to 90). As shown in Figure 8a, we organized and plotted the raw data over age-at-testing. The two plots on the left-hand side show the two-point raw data connected by small lines (over seven years); the age coverage per person is sparse at best. To minimize selection biases, we also included persons who only participated once (shown as circles). We initially fit models to each variable separately using a latent change score model (Figure 7a) over age, and the results are given in the two plots on the right-hand side. From these analyses, we found that the brain changes (ΔLVS[t]) were linearly increasing, so the trajectories were increasing exponentially over age, probably reflecting shrinkage of related neural structures. At the same time, we found that the memory changes (ΔWMS[t]) had both constant and proportional changes and the trajectories were decreasing over age, indicating rapid losses of memory in the oldest ages. Although it seems reasonable to make statements about the likely causes and effects at this point, this raises the classic problem of the ecological fallacy: from the prior trajectories, we do not know if the persons who are increasing the most in lateral ventricle size are also the same persons who are subsequently losing their memory abilities.
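The ecological fallacy raised here can be made concrete with a small, deliberately artificial example. The numbers below are hypothetical and are not data from the study: both scenarios have identical average changes in LVS and WMS, so separate group-level trajectories would look the same, yet the person-level association between the two changes differs sharply.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical change scores for five people.
lvs_change = [1.0, 2.0, 3.0, 4.0, 5.0]
# Scenario A: the people whose ventricles grow most also lose the most memory.
wms_change_a = [-1.0, -2.0, -3.0, -4.0, -5.0]
# Scenario B: the same memory losses, but assigned to different people.
wms_change_b = [-3.0, -5.0, -1.0, -2.0, -4.0]

if __name__ == "__main__":
    print("mean WMS change:", mean(wms_change_a), mean(wms_change_b))
    print("r(LVS change, WMS change), scenario A:", pearson_r(lvs_change, wms_change_a))
    print("r(LVS change, WMS change), scenario B:", pearson_r(lvs_change, wms_change_b))
```

Only a model that links the two series within persons can distinguish scenarios like these, which is the motivation for joining the series in a bivariate analysis.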
However, we joined the two series, and from the bivariate change model (Figure 7b) analysis, we estimated a coupling coefficient from LVS → ΔWMS (γ_WMS = 0.2) that dominated the coupling of WMS → ΔLVS (γ_LVS = 0). This new dynamic SEM evidence suggested the brain changes were a leading indicator in time of the memory changes and not the other way around. To summarize the bivariate dynamic results, we also expressed the final model coefficients using a statistical vector field, shown in Figure 8b. Each arrow in this figure starts at a pair of scores, representing the bivariate starting point, and the direction of the arrow illustrates the changes expected over the next time point in the set of scores. This figure illustrates one of the major outcomes of a true score dynamic model: the pulling of memory changes downward by the increasing size of the lateral ventricles. For many knowledgeable scientists, this illustrates an obvious dynamic relationship (i.e., brain → Δbehavior), but ours is an empirical observation. This pictorial description sets the stage for further work in simplifying the dynamic concepts and the corresponding multivariate results.

Figure 8. Results from bivariate latent change score analysis (from McArdle et al. 2004). (a) Aging data and latent-curve score expectations for lateral ventricle size (LVS) and Wechsler Memory Scores (WMS) and (b) statistical vector field result of expected changes in LVS and WMS over time.

Research led by others has used similar bivariate latent change score models to examine different lead-lag relationships, including (a) personality disorders and changes (Hamagami et al. 2000), (b) perceptual speed and knowledge changes in older ages (Ghisletta & Lindenberger 2003, 2005), (c) specific cognitive abilities and achievement using the Woodcock-Johnson scales (Ferrer & McArdle 2004), (d) social participation and perceptual speed in aging (Lövdén et al. 2005), (e) physical activity and cognitive declines (Ghisletta et al. 2006), (f) reading and cognition over all ages (Ferrer et al. 2007), and (g) forgiveness and psychological adjustment (Orth et al. 2008).

Other Promising Directions

Many topics that are relevant to longitudinal SEM and multivariate data analysis have not been considered here. In practice, we need to know how to best select measures, score or scale the measures, deal with different (unbalanced) intervals of time, design how many occasions and participants are needed, deal with outliers, choose appropriate estimation techniques, and select indices of goodness-of-fit. All of these problems have new and elegant statistical solutions and are worthy of more detailed discussions. New forms of multiple group and incomplete data approaches can be used with the dynamic models described here.
By using a multilevel approach, we can also effectively analyze cases in which each person has different amounts of longitudinal data (i.e., unbalanced data or incomplete data), and some of the new SEM programs make this an easy task. These possibilities lead directly to the revival of practical experimental options based on incomplete data (e.g., McArdle & Woodcock 1997). Incomplete data models have also been used to describe the potential benefits of a mixture of age-based and time-based models using only two time points of data collection, an accelerated longitudinal design (Duncan et al. 2006, McArdle & Bell 2000, Raudenbush 2001).

Factorial invariance, especially factor-loading invariance, was considered as a major requirement for any longitudinal analysis. As presented here, the goal of measurement invariance needs to be achieved before we can consider any SEMs of latent changes. Although only scale-level data are considered here, practical problems with measurement invariance at the scale level may indicate more basic measurement problems at the item level (McDonald 1985). Using incomplete data principles, item invariance is possible even if the items are originally used in the context of different scales (Grimm et al. 2007, McArdle & Nesselroade 2003). There are many techniques for linkage across measurement scales with sparse longitudinal data, and invariant item-response models may be very useful for these purposes (McArdle et al. 2002, McArdle & Nesselroade 2003).

A great deal of interest has focused on another fundamental repeated measures problem: the separation of latent groupings of people with different patterns of changes. Recent theoretical work on latent mixture models has been carefully developed (McLachlan & Peel 2000) and can be used for this purpose. In these analyses, the distribution of the latent parameters is assumed to come from a mixture of two or more overlapping distributions or latent classes.
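The mixture assumption just described can be written compactly. The expression below is a sketch in notation that is not in the original: f_n collects the latent scores of person n (for example, intercept and slope), C is the number of latent classes, π_c are class proportions, and each class is assumed here to be multivariate normal with its own mean vector θ_c and covariance matrix Φ_c.

```latex
p(\mathbf{f}_n) \;=\; \sum_{c=1}^{C} \pi_c \,
  \mathcal{N}\!\left(\mathbf{f}_n \mid \boldsymbol{\theta}_c, \boldsymbol{\Phi}_c\right),
\qquad \pi_c \ge 0, \qquad \sum_{c=1}^{C} \pi_c = 1 .
```

With C = 1 this reduces to the single-population model used throughout the earlier sections, so the number of classes is itself a hypothesis to be examined.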
Mixture modeling of this variety is a recent addition to some SEM programs (e.g., Mplus; see Grimm et al. 2007). This interesting concept of a latent grouping seems very reasonable from a substantive point of view, but critics of these techniques point out the possibility of finding multiple latent classes when only one actually exists (e.g., Bauer & Curran 2003). Still, as with any multivariate cluster analysis result, our hope is that latent mixtures of group changes will be treated as an exploratory result that can be useful in guiding subsequent research.

Finally, there are many other elegant statistical models for longitudinal data. There have been several important breakthroughs in work on dynamic modeling of continuous time data using SEM software (e.g., Boker 2001, Chow et al. 2007, Montfort et al. 2007, Oud & Jansen 2000). These differential models offer many more dynamic possibilities, and this is increasingly important when large numbers of time points of data are collected (T > N). Other repeated-measures SEMs are based on the logic of partitioning variance components across multiple modalities (Kenny & Zautra 2001, Kroonenberg & Oort 2003, Steyer et al. 2001). These models decompose factorial influences into orthogonal common and specific components, with an emphasis on separating trait factors from state factors. These models have interesting interpretations, and they may be useful when combined with other SEMs described here.

Final Comments

The statistical training for most psychologists is still based on the classical logic of ANOVA, multiple regression, and factor analysis techniques. A common theme of all this training is that statistical techniques are calculations based on specific assumptions, and the scientific inferences are limited by the accuracy of our statistical assumptions. This way of thinking about data analysis allows us to face new problems and is essential as the problems become more complex. This review has attempted to highlight what has been learned from applying newer forms of SEM thinking to repeated-measures data designs. In some cases, advances are obvious, and in other cases, cautions have been suggested. From a broad perspective, a central conclusion of this review is that any repeated measures analysis should not start by asking, "What is your data collection design?" or "What computer program can you use?" The corollary conclusion is that all repeated measures analyses should start with the question, "What is your model for change?"

DISCLOSURE STATEMENT

The author is not aware of any biases that might be perceived as affecting the objectivity of this review, and he has no financial interest in any specific SEM computer program.

ACKNOWLEDGMENTS

The work described here has been supported since 1980 by the National Institute on Aging (grant #AG-07137). I am especially grateful for the collaboration of my close friend and colleague, John R. Nesselroade. This research was also helped by the support of many others, including the editors, and Steven Boker, Emilio Ferrer, Paolo Ghisletta, Kevin Grimm, Kelly Kadlec, Carol Prescott, John Prindle, Dick Woodcock, and Yan Zhou.

LITERATURE CITED

Allison PD. Change scores as dependent variables in regression analysis. In Sociological Methodology 1990, ed. CC Clogg, pp San Francisco, CA: Jossey-Bass
Arbuckle JL, Wothke W. AMOS 5.0 User's Guide. Chicago: Smallwaters
Bauer DJ, Curran PJ. Distributional assumptions of growth mixture models: implications for overextraction of latent trajectory classes. Psychol. Methods 8(3):
Bentler PM. Multivariate analysis with latent variables: causal modeling. Annu. Rev. Psychol. 31:
Bentler PM, Dudgeon P. Covariance structure analysis: statistical practice, theory, and directions. Annu. Rev. Psychol. 47:

Bock RD. Multivariate Statistical Methods in Behavioral Research. New York: McGraw-Hill
Boker SM. Differential structural equation modeling of intraindividual variability. See Collins & Sayer 2001, pp
Bollen KA. Latent variables in psychology and the social sciences. Annu. Rev. Psychol. 53:
Bollen K, Curran PJ. Latent Curve Models: A Structural Equation Perspective. New York: Wiley
Browne M, Nesselroade JR. Representing psychological processes with dynamic factor models: some promising uses and extensions of autoregressive moving average time series models. In Contemporary Advances in Psychometrics, ed. A Madeau, JJ McArdle, pp Mahwah, NJ: Erlbaum
Chow S-M, Ferrer E, Nesselroade JR. An unscented Kalman filter approach for the estimation of nonlinear dynamic systems models. Multivariate Behav. Res. 42(2):
Cillessen A, Mayeux L. From censure to reinforcement: developmental changes in the association between aggression and social status. Child Dev. 75:
Cole DA, Maxwell SE. Testing mediational models with longitudinal data: questions and tips in using structural equation models. J. Abnorm. Psychol. 112:
Collins LM. Analysis of longitudinal data: the integration of theoretical model, temporal design, and statistical model. Annu. Rev. Psychol. 57:
Collins LM, Sayer A. New Methods for the Analysis of Change. Washington, DC: Am. Psychol. Assoc. Press
Cox DR. Regression models and life-tables. J. R. Stat. Soc. B 34:
Cronbach LJ, Furby L. How we should measure change, or should we? Psychol. Bull. 74:68–80
Cudeck R, Harring JR. Analysis of nonlinear patterns of change with random coefficient models. Annu. Rev. Psychol. 58:
Curran PJ, Bollen K. The best of both worlds: combining autoregressive and latent curve models. See Collins & Sayer 2001, pp
Duncan TE, Duncan SC, Strycker LA, Li F. An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues, and Applications. Mahwah, NJ: Erlbaum. 2nd ed.
Enders CK. A primer on maximum likelihood algorithms for use with missing data. Struct. Equation Model. 8:
Ferrer E, Hamagami F, McArdle JJ. Modeling latent growth curves with incomplete data using different types of structural equation modeling and multilevel software. Struct. Equation Model. 11(3):
Ferrer E, McArdle JJ. An experimental analysis of dynamic hypotheses about cognitive abilities and achievement from childhood to early adulthood. Dev. Psychol. 40:
Ferrer E, McArdle JJ, Shaywitz BA, Holahan JM, Marchione K, Shaywitz SE. Longitudinal models of developmental dynamics between reading and cognition from childhood to adolescence. Dev. Psychol. 43:
Ghisletta P, Bickel J-F, Lövdén M. Does activity engagement protect against cognitive decline in old age? Methodological and analytical considerations. J. Gerontol. B Psychol. Sci. 61:
Ghisletta P, Lindenberger U. Age-based structural dynamics between perceptual speed and knowledge in the Berlin Aging Study: direct evidence for ability dedifferentiation in old age. Psychol. Aging 18(4):
Ghisletta P, Lindenberger U. Exploring the structural dynamics of the link between sensory and cognitive functioning in old age: longitudinal evidence from the Berlin Aging Study. Intelligence 33:
Ghisletta P, McArdle JJ. Latent growth curve analyses of the development of height. Struct. Equation Model. 8(4):
Goldberger AS, Duncan OD. Structural Equation Models in the Social Sciences. New York: Seminar Press
Gollob HF, Reichardt CS. Taking account of time lags in causal models. Child Dev. 58:80–92
Grimm KJ, Hamagami F, McArdle JJ. Nonlinear growth models in research on cognitive aging. See Montfort et al. 2007, pp
Grimm KJ, McArdle JJ. A note on the computer generation of mean and covariance expectations in latent growth curve analysis. In Multi-Level Issues in Strategy and Methods, ed. F Danserau, FJ Yammarino, pp New York: Elsevier

Hamagami F, McArdle JJ. Advanced studies of individual differences linear dynamic models for longitudinal data analysis. In New Developments and Techniques in Structural Equations Modeling, ed. G Marcoulides, R Schumacker, pp Mahwah, NJ: Erlbaum
Hamagami F, McArdle JJ. Dynamic extensions of latent difference score models. In Quantitative Methods in Contemporary Psychology, ed. SM Boker, ML Wegner, pp Mahwah, NJ: Erlbaum
Hamagami F, McArdle JJ, Cohen P. Bivariate dynamic systems analyses based on a latent difference score approach for personality disorder ratings. In Temperament and Personality Development Across the Life Span, ed. VJ Molfese, DL Molfese, pp Mahwah, NJ: Erlbaum
Harris CW, ed. Problems in Measuring Change. Madison: Univ. Wisc. Press
Hedeker D, Gibbons R. Longitudinal Data Analysis. New York: Wiley
Hertzog C, Lindenberger U, Ghisletta P, Oertzen TV. On the power of multivariate latent growth curve models to detect correlated change. Psychol. Methods 11(3):
Horn JL. State, trait, and change dimensions of intelligence. Br. J. Math. Stat. Psychol. 42(2):
Horn JL, McArdle JJ. Perspectives on Mathematical and Statistical Model Building (MASMOB) in research on aging. In Aging in the 1980s: Psychological Issues, ed. L Poon, pp Washington, DC: Am. Psychol. Assoc.
Hsiao C. Analysis of Panel Data. London: Cambridge Univ. Press. 2nd ed.
Jöreskog KG, Sörbom D. Advances in Factor Analysis and Structural Equation Models. Cambridge, MA: Abt Books
Jöreskog KG, van Thillo M, Gruvaeus GT. ACOVSM: a general computer program for analysis of covariance structures including generalized MANOVA. Res. bull., Educ. Test. Serv., Princeton, NJ
Kenny DA, Zautra A. The trait-state model for longitudinal data. See Collins & Sayer 2001, pp
Kline R. Principles and Practices in Structural Equation Modeling. New York: Guilford
Kroonenberg PM, Oort FJ. Three-mode analysis of multimode covariance matrices. Br. J. Math. Stat. Psychol. 56(2):
Little RJA, Rubin DB. Statistical Analysis with Missing Data. New York: Wiley. 2nd ed.
Lord F. Further problems in the measurement of growth. Educ. Psychol. Meas. 18:
Lord F. A paradox in the interpretation of group comparisons. Psychol. Bull. 68(5):304–5
Lövdén M, Ghisletta P, Lindenberger U. Social participation attenuates decline in perceptual speed in old and very old age. Psychol. Aging 20:
MacCallum RC, Austin JT. Applications of structural equation modeling in psychological research. Annu. Rev. Psychol. 51:
MacCallum RC, Kim C, Malarkey WB, Kiecolt-Glaser JK. Studying multivariate change using multilevel models and latent curve models. Multivariate Behav. Res. 32:
MacKinnon DP, Fairchild AJ, Fritz MS. Mediation analysis. Annu. Rev. Psychol. 58:
Maxwell SE, Kelley K, Rausch JR. Sample size planning for statistical power and accuracy in parameter estimation. Annu. Rev. Psychol. 59:
McArdle JJ. Latent variable growth within behavior genetic models. Behav. Genet. 16(1):
McArdle JJ. Structural modeling experiments using multiple growth functions. In Learning and Individual Differences: Abilities, Motivation, and Methodology, ed. P Ackerman, R Kanfer, R Cudeck, pp Hillsdale, NJ: Erlbaum
McArdle JJ. A latent difference score approach to longitudinal dynamic structural analyses. In Structural Equation Modeling: Present and Future, ed. R Cudeck, S du Toit, D Sorbom, pp Lincolnwood, IL: Sci. Softw. Int.
McArdle JJ. 2007a. Dynamic structural equation modeling in longitudinal experimental studies. See Montfort et al. 2007, pp
McArdle JJ. 2007b. Five steps in the structural factor analysis of longitudinal data. In Factor Analysis at 100 Years, ed. R Cudeck, R MacCallum, pp Mahwah, NJ: Erlbaum
McArdle JJ, Bell RQ. An introduction to latent growth curve models for developmental data analysis. In Modeling Longitudinal and Multiple-Group Data: Practical Issues, Applied Approaches, and Scientific Examples, ed. TD Little, KU Schnabel, J Baumert, pp Mahwah, NJ: Erlbaum

McArdle JJ, Grimm K, Hamagami F, Bowles R, Meredith W. A dynamic structural equation analysis of vocabulary abilities over the life-span. Presented at annu. meet. Soc. Multivariate Exp. Psychol., Charlottesville, VA
McArdle JJ, Hamagami F. Linear dynamic analyses of incomplete longitudinal data. See Collins & Sayer 2001, pp
McArdle JJ, Hamagami F. Structural equation models for evaluating dynamic concepts within longitudinal twin analyses. Behav. Genet. 33(2):
McArdle JJ, Hamagami F, Jones K, Jolesz F, Kikinis R, et al. Structural modeling of dynamic changes in memory and brain structure using longitudinal data from the normative aging study. J. Gerontol. Psychol. Sci. 59B(6):P
McArdle JJ, Hamagami F, Meredith W, Bradway KP. Modeling the dynamic hypotheses of Gf-Gc theory using longitudinal life-span data. Learn. Individ. Differences 12:53–79
McArdle JJ, Nesselroade JR. Structuring data to study development and change. In Life-Span Developmental Psychology: Methodological Innovations, ed. SH Cohen, HW Reese, pp Hillsdale, NJ: Erlbaum
McArdle JJ, Nesselroade JR. Growth curve analyses in contemporary psychological research. In Comprehensive Handbook of Psychology, Volume Two: Research Methods in Psychology, ed. J Schinka, W Velicer, pp New York: Pergamon
McArdle JJ, Small BJ, Backman L, Fratiglioni L. Longitudinal models of growth and survival applied to the early detection of Alzheimer's disease. J. Geriatr. Psychiatry Neurol. 18(4):
McArdle JJ, Woodcock JR. Expanding test-retest designs to include developmental time-lag components. Psychol. Methods 2(4):
McCall RB, Applebaum MI. Bias in the analysis of repeated measures designs: some alternative approaches. Child Dev. 44:
McDonald RP. Factor Analysis and Related Methods. Hillsdale, NJ: Erlbaum
McLachlan G, Peel D. Finite Mixture Models. New York: Wiley
Meredith W, Horn JL. The role of factorial invariance in measuring growth and change. See Collins & Sayer 2001, pp
Meredith W, Tisak J. Latent curve analysis. Psychometrika 55:
Montfort K, Oud H, Satorra A. Longitudinal Models in the Behavioural and Related Sciences. Mahwah, NJ: Erlbaum
Muller KE, Stewart PW. Linear Model Theory. New York: Wiley
Muthén BO, Curran P. General longitudinal modeling of individual differences in experimental designs: a latent variable framework for analysis and power estimation. Psychol. Methods 2:
Muthén LK, Muthén BO. Mplus, the Comprehensive Modeling Program for Applied Researchers User's Guide. Los Angeles, CA: Muthen & Muthen
Neale MC, Boker SM, Xie G, Maes HH. Mx Statistical Modeling. Unpubl. program manual, VA Inst. Psychiatr. Behav. Genet., Med. Coll. VA, VA Commonwealth Univ., Richmond, VA. 5th ed.
Nesselroade JR, Baltes PB. Longitudinal Research in the Study of Behavior and Development. New York: Academic
Nesselroade JR, Cable DG. Sometimes it's okay to factor difference scores: the separation of state and trait anxiety. Multivariate Behav. Res. 9:
Nesselroade JJ, McArdle JJ, Aggen SH, Meyers J. Dynamic factor analysis models for multivariate time series analysis. In Modeling Individual Variability with Repeated Measures Data: Advances and Techniques, ed. DM Moskowitz, SL Hershberger, pp Mahwah, NJ: Erlbaum
Nesselroade JR, Stigler SM, Baltes PB. Regression toward the mean and the study of change. Psychol. Bull. 88(3):
O'Brien RG, Kaiser MK. MANOVA method for analyzing repeated measures designs: an extensive primer. Psychol. Bull. 97(2):
Orth U, Berking M, Walker N, Meier LL, Znoj H. Forgiveness and psychological adjustment following interpersonal transgressions: a longitudinal analysis. J. Res. Personal. 42:
Oud JHL, Jansen RARG. Continuous time state space modeling of panel data by means of SEM. Psychometrika 65:

Raudenbush SW. Comparing personal trajectories and drawing causal inferences from longitudinal data. Annu. Rev. Psychol. 52:
Rogosa D. Causal models in longitudinal research: rationale, formulation, and interpretation. In Longitudinal Research in the Study of Behavior and Development, ed. JR Nesselroade, PB Baltes, pp New York: Academic
Rogosa D, Willett. Demonstrating the reliability of the difference score in the measurement of change. J. Educ. Meas. 20(4):
Singer JD, Willett J. Applied Longitudinal Data Analysis. New York: Oxford Univ. Press
Shadish W, Cook TD, Campbell DT. Experimental and Quasi-Experimental Design for Generalized Causal Inference. Boston, MA: Houghton-Mifflin
Snyder J, Brooker M, Patrick MR, Snyder A, Schrepferman L, Stoolmiller M. Observed peer victimization during early elementary school: continuity, growth, and relation to risk for child antisocial and depressive behavior. Child Dev. 74(6):
Steyer R, Partchev I, Shanahan MJ. Modeling true intraindividual change in structural equation models: the case of poverty and children's psychological adjustment. In Modeling Longitudinal and Multiple-Group Data: Practical Issues, Applied Approaches, and Scientific Examples, ed. TD Little, KU Schnabel, J Baumert, pp Mahwah, NJ: Erlbaum
Tomarken AJ, Waller NJ. Structural equation modeling: strengths, limitations, and misconceptions. Annu. Rev. Clin. Psychol. 1:31–65
Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York: Springer
Walls TA, Schafer JL. Models of Intensive Longitudinal Data. New York: Oxford Univ. Press