Craig K. Enders Arizona State University Department of Psychology


2 Topic
Missing Data Patterns and Missing Data Mechanisms
Traditional Missing Data Techniques
Maximum Likelihood Estimation and Missing Data Handling
Maximum Likelihood Missing Data Handling in Mplus
Data Analysis Example: Means, Standard Deviations, and Correlations
Data Analysis Example: ANCOVA
Data Analysis Example: Repeated Measures ANOVA
Dealing With Nonnormal Missing Data
Data Analysis Example: Confirmatory Factor Analysis
Incorporating Auxiliary Variables Into Maximum Likelihood Analyses
Data Analysis Example: Scale Score Analysis With Auxiliary Variables

3 Topic
Multiple Imputation: The Imputation Phase
Assessing the Convergence of MCMC
Multiple Imputation: The Analysis and Pooling Phase
Multiple Imputation in Mplus
Data Analysis Example: Means, Standard Deviations, and Correlations
Data Analysis Example: ANCOVA
Data Analysis Example: Confirmatory Factor Analysis
Data Analysis Example: Scale Score Analysis
Data Analysis Example: Multilevel Model
Multiple Imputation in SPSS
Multiple Imputation in SAS
Planned Missing Data Designs

4 Discuss missing data theory and assumptions
Briefly review traditional analysis approaches
Introduce modern missing data handling methods: maximum likelihood (ML) estimation and multiple imputation (MI)
Illustrate software applications of ML and MI
Introduce planned missing data designs

"Routine implementation of these new methods of addressing missing data [maximum likelihood and multiple imputation] will be one of the major changes in research over the next decade"
Steve West, former Editor of Psychological Methods, quoted in APA's Monitor on Psychology (2002, Vol. 33, p. 70)
The number of ML and MI applications in behavioral science journals has increased dramatically in recent years

See Applied Missing Data Analysis (Guilford Press) for additional information
All data sets and analysis examples from the book are available for download

5 The missing data pattern describes the configuration of the observed and missing values in a data set
The pattern describes the location of the holes in the data but says nothing about why the data are missing
The missing data mechanism describes how an individual's propensity for missing data is related to other variables, if at all
The likelihood of a missing value on Y may be related to other variables in the data or to the would-be values of Y; it is also possible that the propensity for missing data is unrelated to other variables

A general pattern occurs when missing values are haphazardly dispersed throughout the data matrix
Despite the seemingly random pattern, missingness may be systematic
ML and MI are flexible approaches that can handle a general missing data pattern
[Figure: general missing data pattern for Y1, Y2, Y3, Y4]

Planned missing data designs introduce intentional missing values (e.g., to reduce respondent burden and maximize resources)
E.g., a four-wave longitudinal study where each case provides data at three of the four waves
[Figure: planned missing data pattern for Y1, Y2, Y3, Y4]

Theoretical foundations of modern missing data analyses were described by Rubin (1976)
Missing data mechanisms:
  Missing completely at random (MCAR)
  Missing at random (MAR)
  Missing not at random (MNAR)
Mechanisms describe how the propensity for a missing value on a variable Y relates to the data, if at all

6 For a variable Y, there are potentially two scores:
  The value of Y
  A binary variable R that denotes whether Y is observed (e.g., R = 0 if Y is observed, R = 1 if Y is missing)
Similarly, there are two sets of parameter estimates:
  The parameters of substantive interest (e.g., means, standard deviations, correlations)
  Nuisance parameters that describe the propensity for missing data (e.g., logistic regression coefficients)
Sometimes we only need to estimate the substantive parameters; other times we must estimate both the substantive and the nuisance parameters. The missing data mechanism dictates this

Missing completely at random (MCAR)
The probability of missing data on Y is unrelated to other measured variables and is unrelated to the values of Y itself
This is what researchers think of as haphazard or "flip of a coin" missingness
MCAR is strict because it says that the probability of missing data is unrelated to anything in the data
The observed data are a simple random sample of the hypothetically complete data set

Example: employees complete an IQ test during a job interview
Supervisors rate job performance after 6 months
Performance ratings are missing for no particular reason (e.g., maternity leave, spouse relocates, supervisor quits)
[Table: IQ scores, hypothetical (complete) job ratings, and MCAR job ratings]

MCAR is the only testable mechanism
If MCAR holds, cases with missing values should be no different from the cases with complete data, on average
To test, form an indicator variable that denotes missingness (e.g., 0 = complete, 1 = missing)
Compare the indicator groups on other variables, e.g., using an independent t test or Cohen's d effect size
Multivariate extensions of this approach exist (e.g., Little's MCAR test)
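The indicator-group comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workshop's own code: the 20 IQ scores and ratings are hypothetical values invented for the example, and `cohens_d` and `independent_t` are helper names introduced here.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference based on the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


def independent_t(group_a, group_b):
    """Pooled-variance independent-samples t statistic."""
    n_a, n_b = len(group_a), len(group_b)
    return cohens_d(group_a, group_b) / math.sqrt(1 / n_a + 1 / n_b)


# Hypothetical scores (not the workshop data); None marks a missing job rating.
iq = [82, 88, 90, 93, 95, 97, 99, 100, 101, 103,
      104, 106, 108, 110, 112, 115, 118, 121, 125, 130]
ratings = [9, None, 8, 10, 7, 11, None, 9, 12, 10,
           11, None, 13, 9, None, 12, 14, 11, None, 15]

# Missingness indicator: 0 = complete, 1 = missing.
r = [1 if rating is None else 0 for rating in ratings]

# Split the fully observed variable (IQ) by the indicator.
iq_complete = [x for x, flag in zip(iq, r) if flag == 0]
iq_missing = [x for x, flag in zip(iq, r) if flag == 1]

d = cohens_d(iq_complete, iq_missing)
t = independent_t(iq_complete, iq_missing)
print(f"IQ mean, complete cases: {statistics.mean(iq_complete):.2f}")
print(f"IQ mean, missing cases:  {statistics.mean(iq_missing):.2f}")
print(f"Cohen's d = {d:.2f}, t = {t:.2f}")
```

A small standardized difference is consistent with MCAR, whereas a large one is evidence against it. The comparison can only speak to MCAR; it cannot separate MAR from MNAR.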

7 Compare the IQ means of the complete cases and the missing cases
Small differences between the two groups suggest haphazard missingness, so MCAR is plausible
[Table: IQ scores, MCAR job ratings, and the missingness indicator]

Missing at random (MAR)
Confusing terminology, because missingness is systematic
MAR means that the probability of missing data on Y is related to some other measured variable but is unrelated to the would-be values of Y itself
After controlling for other variables, there is no association between the propensity for missing data on Y and the would-be values of Y
ML and MI require the MAR assumption

Example: prospective employees complete an IQ test during a job interview
IQ is a selection measure; the company does not hire applicants in the lower quartile
Supervisors rate job performance after 6 months
[Table: IQ scores, hypothetical (complete) job ratings, and MAR job ratings]

Missing not at random (MNAR)
The probability of missing data on Y is related to the values of Y itself
This is the most problematic mechanism and can cause substantial bias
It requires specialized analysis procedures (e.g., selection models, pattern mixture models)
MNAR can also occur when the cause of missingness is a measured variable that is omitted from the analysis

8 Example: employees complete an IQ test during a job interview
Supervisors rate job performance after 6 months
The company terminates employees for poor performance prior to their evaluation
[Table: IQ scores, hypothetical (complete) job ratings, and MNAR job ratings]
[Figure: path diagrams of the MCAR, MAR, and MNAR mechanisms involving IQ, JP, Z, and R]

It is impossible to empirically differentiate MAR from MNAR
Proving that the probability of missingness on Y is related (or unrelated) to Y requires the would-be values of Y
It is only possible to provide evidence against MCAR
Mean differences between the complete and the incomplete cases could be MAR or MNAR
MAR is ultimately an untestable assumption

Compare the IQ means of the complete cases and the missing cases
Large differences between the two groups suggest systematic missingness
MCAR is not plausible; the mechanism is MAR or MNAR
[Table: IQ scores, MAR job ratings, and the missingness indicator]

9 Compare the IQ means of the complete cases and the missing cases
Large differences between the two groups suggest systematic missingness
MCAR is not plausible; the mechanism is MAR or MNAR
[Table: IQ scores, MNAR job ratings, and the missingness indicator]

Mechanisms serve as assumptions for a missing data analysis
ML and MI assume MAR
When MAR (or MCAR) holds, we can estimate the parameters of substantive interest without worrying about the parameters that dictate missingness
MNAR analyses (e.g., selection models, pattern mixture models) require a submodel that explains why the data are missing
MNAR analyses are difficult and do not necessarily perform better than MAR-based analyses

ML and MI require the MAR assumption
MAR is not automatically satisfied just because the causes/correlates of missing data are measured variables
MAR is satisfied on an analysis-by-analysis basis
The correlates of missingness must be part of the statistical analysis or part of the imputation routine

Example: researchers use simple regression to examine the association between self-esteem and risky sexual behavior in teens
Only participants 16 years of age or older fill out the sexual behavior questionnaire
MAR is only satisfied if age is included in the regression model
Excluding age can produce an MNAR mechanism and biased parameter estimates
The correlation between age and sexual behavior dictates this bias

10 Researchers rarely know why data are missing
At best, measured variables may be correlates of the true reasons for missingness
An inclusive analysis strategy incorporates auxiliary variables that are (a) correlates of the incomplete variable or (b) correlates of missingness
An inclusive strategy improves the chances of satisfying the MAR assumption
[Figure: standard regression model (MNAR), with esteem predicting risky sex]
[Figure: auxiliary variable regression model (MAR), with esteem predicting risky sex and age included as an auxiliary variable]

Common traditional approaches, all of which can lead to substantial bias:
  Listwise deletion
  Pairwise deletion
  Mean imputation
  Regression imputation

11 20 prospective employees take an IQ test during a job interview
The company uses IQ as a selection measure and does not hire applicants who score in the lower quartile
A supervisor rates job performance following a 6-month probationary period
Performance ratings are missing for the applicants who were never hired
[Table: IQ scores with the true (complete) and MAR job performance ratings]

Listwise deletion eliminates all cases with missing values, resulting in a complete data set
Pairwise deletion eliminates cases on an analysis-by-analysis basis
Discarding data reduces power
Deletion methods assume an MCAR mechanism and will yield bias under MAR or MNAR
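A small simulation can make the deletion bias concrete. The sketch below is an illustration under assumed values (standardized scores, a population correlation of .50, a sample of 5,000, and a fixed random seed), none of which come from the workshop materials. Performance ratings are deleted for cases in the lower IQ quartile, mirroring the hiring example, and the listwise mean is compared with the complete-data mean.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
n, rho = 5000, 0.5         # assumed sample size and population correlation

# Standardized IQ (x) and job performance (y) with population correlation rho.
iq = [rng.gauss(0, 1) for _ in range(n)]
perf = [rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for x in iq]

# MAR selection: performance is observed only when IQ is at or above
# the lower-quartile cutoff.
cutoff = sorted(iq)[n // 4]
observed_perf = [y for x, y in zip(iq, perf) if x >= cutoff]

complete_mean = statistics.mean(perf)
listwise_mean = statistics.mean(observed_perf)
print(f"Complete-data performance mean: {complete_mean:.3f}")
print(f"Listwise deletion mean:         {listwise_mean:.3f}")
print(f"Cases retained: {len(observed_perf)} of {n}")
```

Because low-IQ cases also tend to have lower performance ratings, discarding them pushes the listwise mean upward, which is the bias that deletion methods incur under MAR.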

12 [Figure: scatterplot of job performance by IQ showing the cases excluded by listwise deletion]

Mean imputation replaces (i.e., imputes) missing values on Y with the average of the available Y scores
Replacement values pile up at the mean, restricting variability
Estimates are severely biased under any missing data mechanism
This is the worst possible option
[Figure: scatterplot of job performance by IQ; the filled-in cases have R^2 = 0]

Regression imputation uses regression equations to predict the incomplete variables from the complete variables
Substituting observed data as X variables in the regression equation generates predicted values for the missing scores
Imputed values fall directly on a regression surface, reducing variation in the data
Measures of variation and association are biased

13 [Figure: scatterplot of job performance by IQ; the filled-in cases have R^2 = 1]
The regression imputation equation is $\widehat{JobPerf} = B_0 + B_1(IQ)$

Stochastic regression imputation is the same as regression imputation, but adds a normal residual term to each predicted value
Stochastic regression is the only traditional approach that assumes MAR
This is the best of the bunch, but standard errors are too small
This method is equivalent to multiple imputation with a single filled-in data set
[Figure: residual distributions around the predicted scores from $\widehat{JobPerf} = B_0 + B_1(IQ)$]
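The three single-imputation approaches can be compared side by side. The following sketch is a hedged illustration, not the workshop's procedure: the simulated standardized scores, the correlation of .50, the sample size of 2,000, the seed, and the helper `fit_line` are all assumptions made for the example. Ratings are missing for the lower IQ quartile (MAR).

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares intercept, slope, and residual SD."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_sd = (sum(e ** 2 for e in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, resid_sd


rng = random.Random(7)  # fixed seed for reproducibility
n, rho = 2000, 0.5      # assumed sample size and correlation
iq = [rng.gauss(0, 1) for _ in range(n)]
perf_true = [rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for x in iq]

# MAR: the rating is missing (None) when IQ falls in the lower quartile.
cutoff = sorted(iq)[n // 4]
perf = [y if x >= cutoff else None for x, y in zip(iq, perf_true)]

# Estimate the imputation regression from the complete cases.
obs_x = [x for x, y in zip(iq, perf) if y is not None]
obs_y = [y for y in perf if y is not None]
b0, b1, resid_sd = fit_line(obs_x, obs_y)
mean_y = statistics.mean(obs_y)

# Mean imputation: every missing score gets the observed mean.
mean_imputed = [mean_y if y is None else y for y in perf]
# Regression imputation: missing scores fall exactly on the regression line.
reg_imputed = [b0 + b1 * x if y is None else y for x, y in zip(iq, perf)]
# Stochastic regression: predicted value plus a random normal residual.
stoch_imputed = [
    b0 + b1 * x + rng.gauss(0, resid_sd) if y is None else y
    for x, y in zip(iq, perf)
]

for label, scores in [("True data", perf_true), ("Mean imputation", mean_imputed),
                      ("Regression imputation", reg_imputed),
                      ("Stochastic regression", stoch_imputed)]:
    print(f"{label:22s} M = {statistics.mean(scores):6.3f}  "
          f"SD = {statistics.stdev(scores):.3f}")
```

In runs like this one, mean imputation shrinks the standard deviation the most, regression imputation shrinks it less, and the added residuals in stochastic regression restore the lost variability.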

14 [Figure: scatterplot of job performance by IQ; the filled-in cases include random residual terms]

Analysis Method          IQ M (SD)      Performance M (SD)   IQ-Perf Correlation
Complete Data            100 (14.13)    (2.68)               .54
Listwise Deletion        (9.70)         (2.71)               .44
Mean Imputation          100 (14.13)    (1.87)               .21
Regression Imputation    100 (14.13)    (2.43)               .72
Stochastic Regression    100 (14.13)    (2.74)

1000 random samples of N = 250 from a bivariate normal distribution with 50% missing data on Y
Average parameter estimates from the 1000 data sets
[Table: true values and average estimates from LD, AMI, RI, and SRI for $\mu_X$, $\mu_Y$, $\sigma^2_X$, $\sigma^2_Y$, $\sigma_{XY}$, and $r_{XY}$]

15 1000 random samples of N = 250 from a bivariate normal distribution with 50% missing data on Y
Average parameter estimates from the 1000 data sets
[Table: true values and average estimates from LD, AMI, RI, and SRI for $\mu_X$, $\mu_Y$, $\sigma^2_X$, $\sigma^2_Y$, $\sigma_{XY}$, and $r_{XY}$]

Maximum likelihood (ML) identifies the population parameter values that are most consistent with the raw data
A likelihood (or log likelihood) function quantifies the discrepancy or fit of the data to the parameters
The multivariate normal distribution is the starting point for ML estimation with continuous variables

The height of the multivariate normal distribution:
$$L_i = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\left[-.5\,(y_i - \mu)^T \Sigma^{-1} (y_i - \mu)\right]$$
The quadratic form in the exponent is key
Plugging scores into $y_i$ yields a likelihood, $L_i$
$L_i$ is the relative probability of an individual's Y values, given the parameter estimates in $\mu$ and $\Sigma$

16 The likelihood value is largely driven by the squared z score in the exponent
$$L_i = [\text{scaling factor}]\, e^{-.5(z^2)}$$
Smaller z scores reflect a better fit to the parameters, in the sense that the score is closer to the mean
Small z scores = high likelihood = high probability = good fit

Scores for two cases:
  Case 1: Y1 = 0, Y2 = -.5
  Case 2: Y1 = -1, Y2 = -1.5
Case 1 has the higher likelihood value
This case is closer to the parameter values and thus has better fit to µ = 0
[Figure: bivariate normal distribution with µ = 0 showing L1 and L2]

Scores for two cases:
  Case 1: Y1 = 0, Y2 = -.5
  Case 2: Y1 = -1, Y2 = -1.5
Case 2 now has the higher likelihood value
This case is closer to the parameter values and has better fit to µ = -1
[Figure: bivariate normal distribution with µ = -1 showing L1 and L2]

Likelihoods are very small numbers; taking the natural log makes the math a bit more tractable
$$\log L_i = -\frac{p}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(y_i - \mu)^T \Sigma^{-1} (y_i - \mu)$$
The log likelihood still quantifies the relative probability of a set of scores, but on a logarithmic scale

17 Scores for two cases:
  Case 1: Y1 = 0, Y2 = -.5
  Case 2: Y1 = -1, Y2 = -1.5
Case 1 has the higher log likelihood (i.e., relative probability) value
[Figure: bivariate normal distribution showing L1 = .164 and L2 = .064 with the corresponding log likelihood values]

The log likelihood for an entire sample is the sum of the individual log likelihoods
$$\log L = \sum_{i=1}^{N} \log L_i$$
The log likelihood summarizes the fit of a sample to a normal distribution with a particular mean vector and covariance matrix
ML uses the log likelihood to audition and choose among different plausible parameter values

A sample of IQ scores from 20 job applicants
Use ML to estimate the population mean
Estimation strategy:
  Compute the sample log likelihood for different values of µ
  Identify the mean value that produces the highest log likelihood (i.e., best fit to the data; highest probability of producing the sample data)
[Table: ID, IQ score, and casewise log likelihood for the 20 applicants]
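The likelihood values for the two cases can be checked numerically. The sketch below assumes unit variances and a correlation of .60 between Y1 and Y2; these values are not stated on the slides and were chosen here because they reproduce the reported L1 = .164 and L2 = .064 when µ = 0. The function name is introduced for this example.

```python
import math


def bivariate_normal_likelihood(y1, y2, mu1, mu2, var1, var2, cov12):
    """Height of the bivariate normal density at (y1, y2)."""
    det = var1 * var2 - cov12 ** 2
    d1, d2 = y1 - mu1, y2 - mu2
    # Squared z score (Mahalanobis distance): d' * inverse(Sigma) * d
    squared_z = (var2 * d1 ** 2 - 2 * cov12 * d1 * d2 + var1 * d2 ** 2) / det
    scaling = 1.0 / (2 * math.pi * math.sqrt(det))
    return scaling * math.exp(-0.5 * squared_z)


# Assumed covariance matrix: unit variances, correlation .60 (not from the slides).
VAR1, VAR2, COV12 = 1.0, 1.0, 0.6
case1 = (0.0, -0.5)
case2 = (-1.0, -1.5)

for mu in (0.0, -1.0):
    L1 = bivariate_normal_likelihood(*case1, mu, mu, VAR1, VAR2, COV12)
    L2 = bivariate_normal_likelihood(*case2, mu, mu, VAR1, VAR2, COV12)
    print(f"mu = {mu:4.1f}: L1 = {L1:.3f} (log = {math.log(L1):.3f}), "
          f"L2 = {L2:.3f} (log = {math.log(L2):.3f})")
```

With µ = 0 the first case has the larger likelihood, and with µ = -1 the ordering reverses, which matches the pattern described for the two cases.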

18 [Tables: ID, IQ score, and casewise log likelihood for the 20 applicants at several candidate values of µ, with the sample log likelihood for each candidate]
µ = 100 produced the highest log likelihood (i.e., relative probability)
µ = 100 has the highest probability of producing this sample of 20 cases
µ = 100 is the maximum likelihood estimate (MLE)
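The "audition" strategy amounts to a grid search over candidate means. Here is a minimal sketch, assuming a univariate normal model with the standard deviation held fixed at its sample value; the 20 IQ scores are hypothetical numbers constructed to average 100 and are not the workshop data.

```python
import math
import statistics


def sample_log_likelihood(scores, mu, sigma):
    """Sum of the casewise univariate normal log likelihoods."""
    return sum(
        -0.5 * math.log(2 * math.pi) - math.log(sigma)
        - 0.5 * ((y - mu) / sigma) ** 2
        for y in scores
    )


# Hypothetical IQ scores for 20 applicants (constructed so the mean is 100).
iq_scores = [78, 83, 86, 88, 90, 92, 94, 96, 98, 99,
             101, 102, 104, 106, 108, 110, 112, 115, 118, 120]
sigma = statistics.pstdev(iq_scores)  # hold the SD fixed at its ML value

# Audition candidate means from 90 to 110 and keep the best-fitting one.
candidates = range(90, 111)
log_likelihoods = {mu: sample_log_likelihood(iq_scores, mu, sigma) for mu in candidates}
best_mu = max(log_likelihoods, key=log_likelihoods.get)

for mu in (98, 99, 100, 101, 102):
    print(f"mu = {mu}: logL = {log_likelihoods[mu]:.3f}")
print(f"Maximum likelihood estimate on the grid: {best_mu}")
```

For a normal model the log likelihood is a quadratic function of µ that peaks at the sample mean, so the grid search lands on the same value that the usual formula for the mean would give.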

19 The log likelihood function describes how the sample log likelihood changes between values of µ = 90 and 110
µ = 100 maximizes the log likelihood
[Figure: log likelihood plotted against the population mean]

The complete data log likelihood
$$\log L_i = -\frac{p}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(y_i - \mu)^T \Sigma^{-1} (y_i - \mu)$$
The missing data log likelihood (also called FIML, for full information maximum likelihood)
$$\log L_i = -\frac{p_i}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma_i| - \frac{1}{2}(y_i - \mu_i)^T \Sigma_i^{-1} (y_i - \mu_i)$$

The missing data log likelihood has an i subscript on the parameter matrices, $\mu$ and $\Sigma$
The size and content of these matrices can vary across cases depending on which variables are observed and missing
The squared z score that determines each case's likelihood (i.e., fit) is computed using the parameters for which a case has data
Consequently, ML uses all available data to estimate parameters; some cases contribute more information than others

An analysis with three variables: Y1, Y2, and Y3
The squared z score for the complete cases is based on all parameters
$$\log L_i = K_i - \frac{1}{2}\log\left|\begin{bmatrix}\sigma_{11} & \sigma_{12} & \sigma_{13}\\ \sigma_{21} & \sigma_{22} & \sigma_{23}\\ \sigma_{31} & \sigma_{32} & \sigma_{33}\end{bmatrix}\right| - \frac{1}{2}\left(\begin{bmatrix}y_1\\ y_2\\ y_3\end{bmatrix} - \begin{bmatrix}\mu_1\\ \mu_2\\ \mu_3\end{bmatrix}\right)^T \begin{bmatrix}\sigma_{11} & \sigma_{12} & \sigma_{13}\\ \sigma_{21} & \sigma_{22} & \sigma_{23}\\ \sigma_{31} & \sigma_{32} & \sigma_{33}\end{bmatrix}^{-1} \left(\begin{bmatrix}y_1\\ y_2\\ y_3\end{bmatrix} - \begin{bmatrix}\mu_1\\ \mu_2\\ \mu_3\end{bmatrix}\right)$$
The final term is the squared z score, and $K_i = -\frac{p_i}{2}\log(2\pi)$ is the constant term

20 Cases with missing values on Y2 would have the following log likelihood
$$\log L_i = K_i - \frac{1}{2}\log\left|\begin{bmatrix}\sigma_{11} & \sigma_{13}\\ \sigma_{31} & \sigma_{33}\end{bmatrix}\right| - \frac{1}{2}\left(\begin{bmatrix}y_1\\ y_3\end{bmatrix} - \begin{bmatrix}\mu_1\\ \mu_3\end{bmatrix}\right)^T \begin{bmatrix}\sigma_{11} & \sigma_{13}\\ \sigma_{31} & \sigma_{33}\end{bmatrix}^{-1} \left(\begin{bmatrix}y_1\\ y_3\end{bmatrix} - \begin{bmatrix}\mu_1\\ \mu_3\end{bmatrix}\right)$$
The squared z score is based on the observed data and the corresponding parameter estimates for Y1 and Y3

ML does not fill in the missing values
ML uses the observed data to search for the parameters that yield the highest log likelihood (i.e., best fit to the observed data)
Including the incomplete cases steers estimation toward a more accurate answer
ML effectively borrows information from the observed data to estimate the parameters of the incomplete variables

[Table: IQ scores and job performance ratings for the 20 cases]
Estimate the mean job performance rating
Deleting the incomplete cases produced an average that differs from the true value (µ = 10.35)
Including the IQ scores from the five incomplete cases should improve estimation
The normal distribution is the key to understanding how ML missing data handling works
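The casewise missing data log likelihood can be sketched directly: for each case, keep only the elements of the mean vector and covariance matrix that correspond to observed variables. The code below is an illustrative sketch rather than how any particular program implements FIML; the parameter values and the three example cases are hypothetical, and the helper names are introduced here.

```python
import math


def submatrix(matrix, idx):
    """Keep only the rows and columns listed in idx."""
    return [[matrix[i][j] for j in idx] for i in idx]


def solve_and_logdet(a, b):
    """Solve a x = b by Gaussian elimination and return (x, log|a|).

    Assumes a is symmetric positive definite, so every pivot is positive
    and no row swaps are needed.
    """
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    log_det = 0.0
    for col in range(n):
        pivot = aug[col][col]
        log_det += math.log(pivot)
        for row in range(col + 1, n):
            factor = aug[row][col] / pivot
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x, log_det


def casewise_log_likelihood(y, mu, sigma):
    """Normal log likelihood for one case, using only its observed variables."""
    observed = [i for i, value in enumerate(y) if value is not None]
    deviations = [y[i] - mu[i] for i in observed]
    sigma_i = submatrix(sigma, observed)          # Sigma_i for this case
    weighted, log_det = solve_and_logdet(sigma_i, deviations)
    squared_z = sum(d * w for d, w in zip(deviations, weighted))
    p_i = len(observed)
    return -0.5 * p_i * math.log(2 * math.pi) - 0.5 * log_det - 0.5 * squared_z


# Hypothetical parameter values for Y1, Y2, Y3.
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.6, 0.3],
         [0.6, 1.0, 0.4],
         [0.3, 0.4, 1.0]]

# Hypothetical cases: complete, missing Y2, and missing Y2 and Y3.
cases = [[0.0, -0.5, 0.2], [-1.0, None, -1.5], [0.4, None, None]]
sample_log_l = sum(casewise_log_likelihood(y, mu, sigma) for y in cases)
for y in cases:
    print(y, round(casewise_log_likelihood(y, mu, sigma), 3))
print("Sample log likelihood:", round(sample_log_l, 3))
```

An optimizer would repeatedly adjust `mu` and `sigma` to maximize the summed value; no missing score is ever filled in along the way.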

21 ML assumes that IQ and job performance ratings are normally distributed
The normal distribution effectively constrains the plausible range of missing values
For a given IQ value, some job performance ratings are more plausible than others
Consider the incomplete cases with IQ = 85 and IQ = 78
The IQ scores provide information about the missing performance ratings
[Figure: bivariate normal distribution of job performance and IQ; the most likely performance rating is about 9, given that IQ = 85]

22 Start from the average produced by the 15 complete cases
A case with an IQ score of 85 would likely have a performance rating of approximately 9
Based on this information, the job performance mean would be adjusted downward to account for the plausible (but missing) performance rating
This adjustment is based solely on the observed IQ value; no imputation is necessary
[Figure: bivariate normal distribution of job performance and IQ; the most likely performance rating is about 8.2, given that IQ = 78]

Start from the average produced by the 15 complete cases
A case with an IQ score of 78 would likely have a performance rating of approximately 8.2
Based on this information, the job performance mean would be adjusted downward to account for the plausible (but missing) performance rating
Again, this adjustment is based solely on the observed IQ value; no imputation is necessary

23 Including the five incomplete cases adjusts the parameter values in a way that closely resembles the complete-data results

Analysis Method     IQ M (SD)      Job Perf M (SD)   IQ-Perf Correlation
Complete Data       100 (14.13)    (2.68)            .54
ML Missing Data     (13.77)        (2.87)

1000 random samples of N = 250 from a bivariate normal distribution with 50% missing data on Y
Average parameter estimates from the 1000 data sets
[Table: true values and average estimates from LD and ML for $\mu_X$, $\mu_Y$, $\sigma^2_X$, $\sigma^2_Y$, and $\sigma_{XY}$]
ML standard errors were much smaller

24 Data set containing scores from 480 employees on eight work-related variables
Variables: age, gender, job tenure, IQ, psychological well-being, job satisfaction, job performance, and turnover intentions
33% of the cases have missing well-being scores, and 33% have missing satisfaction scores
The mechanism is MCAR because the data are missing by design
X contains complete variables (e.g., gender, IQ, etc.)
To reduce costs, 33% of the well-being and job satisfaction scores were intentionally never collected
[Figure: planned missing data pattern for X, well-being, and job satisfaction]

Multiple regression model that predicts job performance from psychological well-being and job satisfaction
$$jobperf = B_0 + B_1(wbeing) + B_2(jobsat) + \varepsilon$$
[Figure: path diagram with well-being and job satisfaction predicting job performance]

25 The basic Mplus commands: TITLE, DATA, VARIABLE, ANALYSIS, MODEL, MODEL TEST, OUTPUT
Variable names must be 8 characters or fewer
! denotes a comment line that the program ignores
Commands end with :
Subcommands end with ;
Command lines must be less than 80 characters in length; wrap commands to the next line as needed
Capitalization doesn't matter

The TITLE command (optional) prints a title on the output file

  TITLE:
    ! The title command is optional;
    mplus multiple regression program;

The DATA command points Mplus to the location of the text data on the local drive
Free format text files end in .dat or .txt and should include a placeholder for missing values

  DATA:
    ! Location of the data file;
    file = c:\amda Data\employee.dat ;

26 Omit the file path when the data file and the Mplus syntax file are located in the same folder
The VARIABLE command lists the order of the variables, selects variables for analysis, and gives the missing value code

  DATA:
    ! Location of the data file;
    file = employee.dat;
  VARIABLE:
    ! Information about the contents of the data file;
    names = id age tenure female wbeing jobsat jobperf turnover iq;
    usevariables = wbeing jobsat jobperf;
    missing = all (-99);

ANALYSIS specifies the estimator and other estimation details

  ANALYSIS:
    ! Specify the estimator;
    estimator = ml;

The MODEL command specifies the analysis
With complete data, you can use a bare-bones specification
Mplus automatically estimates most of the necessary parameters (e.g., variances, means)

  MODEL:
    ! Regression model; on means "regressed on";
    jobperf on wbeing jobsat;

27 With missing data on the predictor variables, it is necessary to specify the variances and covariances of the IVs
This ensures that cases with missing predictor scores are included in the analysis

  MODEL:
    jobperf on wbeing jobsat;   ! Regression;
    wbeing jobsat;              ! Variances of IVs;
    wbeing with jobsat;         ! Covariance between IVs;

In ML analyses, Wald chi-square statistics are routinely used to test a set of parameters for significance
For a single parameter estimate $\hat{\theta}$ and hypothesized value $\theta_0$ (typically 0), the Wald statistic is
$$\frac{(\hat{\theta} - \theta_0)^2}{SE^2}$$
The Wald test is the ML analog of an F statistic in OLS regression
With multiple parameters, the Wald test is expressed in matrices

To perform the Wald omnibus test, attach labels to the parameters of interest in the MODEL command
Among other things, MODEL TEST generates a Wald test for the specified hypotheses

  MODEL:
    ! (b1) and (b2) are labels needed for Wald test;
    jobperf on wbeing (b1);
    jobperf on jobsat (b2);
  MODEL TEST:
    ! Wald test that both coefficients = 0;
    ! b1 and b2 are user-supplied labels from MODEL;
    b1 = 0;
    b2 = 0;
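The arithmetic behind the Wald test is simple enough to sketch by hand. The functions below are an illustration with hypothetical inputs, not Mplus output: the single-parameter version squares the estimate divided by its standard error, and the two-parameter version uses the 2 x 2 sampling covariance matrix of the estimates (an input assumed to be available from the fitted model). The p-value shortcuts rely on known closed forms for chi-square distributions with 1 and 2 degrees of freedom.

```python
import math


def wald_test_single(estimate, standard_error, null_value=0.0):
    """Wald chi-square (1 df) for one parameter and its p-value."""
    wald = ((estimate - null_value) / standard_error) ** 2
    # Upper tail of a 1-df chi-square: P(Z^2 > w) = erfc(sqrt(w / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value


def wald_test_two(estimates, cov):
    """Wald chi-square (2 df) that both parameters equal zero.

    estimates: (b1, b2); cov: 2 x 2 sampling covariance matrix of the estimates.
    """
    b1, b2 = estimates
    (v11, v12), (v21, v22) = cov
    det = v11 * v22 - v12 * v21
    # Quadratic form b' * inverse(V) * b written out for the 2 x 2 case.
    wald = (v22 * b1 ** 2 - (v12 + v21) * b1 * b2 + v11 * b2 ** 2) / det
    # Upper tail of a 2-df chi-square has the closed form exp(-w / 2).
    p_value = math.exp(-wald / 2.0)
    return wald, p_value


# Hypothetical estimates and standard errors (not taken from the Mplus output).
w1, p1 = wald_test_single(estimate=0.40, standard_error=0.15)
w2, p2 = wald_test_two((0.40, 0.25), [[0.0225, 0.0050], [0.0050, 0.0400]])
print(f"Single-parameter Wald = {w1:.3f}, p = {p1:.4f}")
print(f"Two-parameter Wald    = {w2:.3f}, p = {p2:.4f}")
```

The single-parameter statistic is just the square of the z test that Mplus reports as Est./S.E., so the two tests give the same p-value.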

28 The OUTPUT command specifies additional information that appears in the Mplus output file

  OUTPUT:
    ! standardized gives beta weights and R-square;
    ! sampstat gives ML descriptives;
    ! patterns prints missing data patterns;
    standardized sampstat patterns;

The complete program:

  DATA:
    file = employee.dat;
  VARIABLE:
    names = id age tenure female wbeing jobsat jobperf turnover iq;
    usevariables = wbeing jobsat jobperf;
    missing = all (-99);
  ANALYSIS:
    estimator = ml;
  MODEL:
    jobperf on wbeing (b1);
    jobperf on jobsat (b2);
    wbeing jobsat;
    wbeing with jobsat;
  MODEL TEST:
    b1 = 0;
    b2 = 0;
  OUTPUT:
    standardized sampstat patterns;

[Output: SUMMARY OF MISSING DATA PATTERNS (x = not missing) and MISSING DATA PATTERN FREQUENCIES for JOBPERF, WBEING, and JOBSAT]
The covariance coverage matrix gives the proportion of complete cases on each variable or variable pair
[Output: PROPORTION OF DATA PRESENT, covariance coverage for JOBPERF, WBEING, and JOBSAT]

29 [Output: ESTIMATED SAMPLE STATISTICS, with means, covariances, and correlations for JOBPERF, WBEING, and JOBSAT]

The Wald statistic (a chi-square with 2 degrees of freedom) is akin to the omnibus F test in OLS regression
[Output: Wald Test of Parameter Constraints, with Value, Degrees of Freedom (2), and P-Value]
Considered as a set, the two predictors explain significant variation in the dependent variable

[Output: MODEL RESULTS, with unstandardized coefficients (Estimate), standard errors (S.E.), z tests (Est./S.E.), and two-tailed p-values for JOBPERF ON WBEING and JOBSAT, WBEING WITH JOBSAT, the means and variances of WBEING and JOBSAT, and the intercept and residual variance of JOBPERF]

30 [Output: STANDARDIZED MODEL RESULTS (STDYX standardization), with beta weights for JOBPERF ON WBEING and JOBSAT and the R-SQUARE for JOBPERF]

Data set containing scores from 480 employees on eight work-related variables
Variables: age, gender, job tenure, IQ, psychological well-being, job satisfaction, job performance, and turnover intentions
Analysis: obtain ML descriptive statistics for all quantitative variables (gender and turnover intentions are dummy codes)

31
  DATA:
    file = employee.dat;
  VARIABLE:
    names = id age tenure female wbeing jobsat jobperf turnover iq;
    usevariables = age tenure wbeing jobsat jobperf iq;
    missing = all (-99);
  ANALYSIS:
    estimator = ml;
  MODEL:
    [age tenure wbeing jobsat jobperf iq];   ! Means;
    age tenure wbeing jobsat jobperf iq;     ! Variances;
    age tenure wbeing jobsat jobperf iq with
      age tenure wbeing jobsat jobperf iq;   ! Covariances;
  OUTPUT:
    standardized;

[Output: MODEL RESULTS, with covariances (the WITH statements), means, and variances for AGE, TENURE, WBEING, JOBSAT, JOBPERF, and IQ, each with Estimate, S.E., Est./S.E., and two-tailed P-Value]

32 [Output: STANDARDIZED MODEL RESULTS (STDYX standardization), with correlations among AGE, TENURE, WBEING, JOBSAT, JOBPERF, and IQ]

Data set containing scores from 480 employees on eight work-related variables
Variables: age, gender, job tenure, IQ, psychological well-being, job satisfaction, job performance, and turnover intentions
Analysis: compare job performance means between employees who do and do not intend to quit in the next six months (the TURNOVER variable), while controlling for well-being, job satisfaction, and job tenure

33 Multiple regression provides a straightforward mechanism for estimating ANOVA models from between-group designs
TURNOVER is dummy coded (0 = intend to stay, 1 = intend to quit in the next 6 months)
$$jobperf = B_0 + B_1(wbeing) + B_2(jobsat) + B_3(tenure) + B_4(turnover) + \varepsilon$$
Consistent with ANCOVA models, the three covariates are centered at their grand means

  DATA:
    file = employee.dat;
  VARIABLE:
    names = id age tenure female wbeing jobsat jobperf turnover iq;
    usevariables = jobperf tenure wbeing jobsat turnover;
    missing = all (-99);
    centering = grandmean(tenure wbeing jobsat);
  ANALYSIS:
    estimator = ml;
  MODEL:
    jobperf on tenure wbeing jobsat turnover;
    wbeing jobsat;                            ! Incomplete predictors;
    tenure wbeing jobsat turnover with
      tenure wbeing jobsat turnover;          ! Covariances among IVs;
  OUTPUT:
    standardized sampstat;

[Output: MODEL RESULTS, with unstandardized estimates, standard errors, z tests, and two-tailed p-values for JOBPERF ON TENURE, WBEING, JOBSAT, and TURNOVER, and the JOBPERF intercept]
Because the covariates are centered at their means, the intercept estimate (B0 = 6.217) represents the adjusted mean for the group of employees who intend to stay on the job (TURNOVER = 0)
The employees who intend to quit in the next six months (TURNOVER = 1) have a significantly lower job performance mean (B4 = -.645, p < .001)

34 The STDY section standardizes only the dependent variable
The estimate for the dummy variable predictor can be interpreted as a Cohen's d effect size (i.e., the adjusted means differ by .24 of a standard deviation unit)
[Output: STDY standardization results for JOBPERF ON TENURE, WBEING, JOBSAT, and TURNOVER, and the R-SQUARE for JOBPERF]

Repeated measures data set containing six yearly assessments of antisocial behavior from 2000 children
Variables: gender (0 = male, 1 = female), six antisocial behavior scores
Analysis: compare change in the antisocial behavior averages across the six assessments

35 The Wald chi-square statistic can serve the same purpose as the omnibus F test from ANOVA
The hypothesis for the Wald test specifies that the means are equal across time (i.e., the null hypothesis in ANOVA)
The MODEL TEST command can implement the equality constraints on the means
Unlike a standard repeated measures ANOVA, the subsequent analysis does not impose a covariance structure on the data (e.g., compound symmetry, sphericity)

  DATA:
    file = antisocial.dat;
  VARIABLE:
    names = female anti1 anti2 anti3 anti4 anti5 anti6;
    usevariables = anti1 - anti6;
    missing = all (-99);
  ANALYSIS:
    estimator = ml;
  MODEL:
    [anti1 - anti6] (ybar1 - ybar6);     ! Means with labels;
    anti1 - anti6 (var1 - var6);         ! Variances with labels;
    anti1 - anti6 with anti1 - anti6;    ! Covariances;
  MODEL TEST:
    ! All means set equal;
    ybar1 = ybar2;
    ybar2 = ybar3;
    ybar3 = ybar4;
    ybar4 = ybar5;
    ybar5 = ybar6;
  OUTPUT:
    sampstat patterns;

[Output: SUMMARY OF MISSING DATA PATTERNS (x = not missing) and MISSING DATA PATTERN FREQUENCIES for ANTI1 through ANTI6]
The covariance coverage matrix gives the proportion of complete cases on each variable or variable pair
[Output: PROPORTION OF DATA PRESENT, covariance coverage for ANTI1 through ANTI6]

36 The Wald statistic (a chi-square with 5 degrees of freedom) is akin to the omnibus F test in ANOVA
[Output: Wald Test of Parameter Constraints, with Value, Degrees of Freedom (5), and P-Value]
The significant chi-square indicates that the null hypothesis of equal means is not supported
[Output: MODEL RESULTS, with ML means and variances for ANTI1 through ANTI6, each with Estimate, S.E., Est./S.E., and two-tailed P-Value]

Among other things, the MODEL CONSTRAINT command can compute new parameters from existing estimates
For example, the command can compute a standardized mean difference effect size (e.g., Cohen's d)
Use the parameter labels to program the following equation:
  d = (ybar1 - ybar6) / sqrt(var1)

37 Syntax

  MODEL CONSTRAINT:
    new (d);
    d = (ybar1 - ybar6)/sqrt(var1);

[Output: New/Additional Parameters, with Estimate, S.E., Est./S.E., and two-tailed P-Value for D]

With nonnormal data, point estimates are relatively accurate
As kurtosis increases relative to the normal curve:
  Standard errors become too small
  Likelihood ratio tests become too large (i.e., Type I errors)
As kurtosis decreases relative to the normal curve:
  Standard errors become too large
  Likelihood ratio tests become too small (i.e., Type II errors)

Corrective procedures for standard errors:
  Robust (i.e., sandwich estimator) standard errors
  Naïve bootstrapping
Corrective procedures for the likelihood ratio test:
  Rescaled test statistic (i.e., Satorra-Bentler chi-square)
  Bollen-Stine bootstrap
Procedures are available in some SEM programs

38 Robust standard errors rescale the standard errors (up or down) according to the degree of nonnormality in the data
The usual ML standard error is multiplied by a correction term that accounts for outlier scores (or lack thereof)
Robust standard errors are useful because they can be used to correct the Wald test
Implementing robust standard errors has no impact on the estimation routine or the resulting parameter estimates

The naïve bootstrap:
  Treat the sample data as a miniature population and draw B samples (e.g., 1000) of size N with replacement
  Perform the statistical analysis on each bootstrap sample
  Save the parameter estimates from each analysis
  Treat the parameter estimates as data points and compute the standard deviation of each parameter
  The standard deviation of the estimates is the bootstrap standard error
[Figure: sample data (N by k) resampled into bootstrap samples 1 through 1000 (each N by k); each bootstrap sample yields a parameter estimate, and the standard deviation of the empirical sampling distribution of the 1000 parameter values is the S.E.]
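The naïve bootstrap steps listed above translate almost line for line into code. The following sketch assumes complete data and uses the sample mean as the statistic; the 20 scores are hypothetical, and `bootstrap_se` is a helper name introduced for this example rather than part of any package.

```python
import math
import random
import statistics


def bootstrap_se(data, statistic, n_boot=1000, seed=0):
    """Naive bootstrap standard error of a statistic."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Draw a sample of size N with replacement from the observed data.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # Run the analysis on the bootstrap sample and save the estimate.
        estimates.append(statistic(sample))
    # The SD of the saved estimates is the bootstrap standard error.
    return statistics.stdev(estimates)


# Hypothetical scores (not the workshop data).
scores = [78, 83, 86, 88, 90, 92, 94, 96, 98, 99,
          101, 102, 104, 106, 108, 110, 112, 115, 118, 120]

se_boot = bootstrap_se(scores, statistics.mean, n_boot=1000, seed=0)
se_formula = statistics.stdev(scores) / math.sqrt(len(scores))
print(f"Bootstrap SE of the mean: {se_boot:.3f}")
print(f"Formula-based SE:         {se_formula:.3f}")
```

For the mean, the bootstrap value should land close to the familiar SD divided by the square root of N. The payoff comes with statistics that lack a trustworthy formula under nonnormality, where `statistic` can be swapped for any function of the resampled data.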

39 Bivariate data analysis
  N = 500
  X has skewness = 0 and kurtosis = -1
  Y has skewness = 0 and kurtosis = 4
  A normal distribution has skewness and kurtosis = 0

Robust standard errors

  ANALYSIS:
    ! MLR specifies robust ML;
    estimator = mlr;

Naïve bootstrap standard errors

  ANALYSIS:
    ! 2000 samples; (standard) = naïve bootstrap;
    bootstrap = 2000 (standard);

As is often the case, the bootstrap and robust procedures produced similar standard errors.

[Table: standard SE, robust SE, and bootstrap SE for the parameters μX, μY, σ²X, σXY, and σ²Y; numeric values did not survive transcription]

The likelihood ratio test does not follow a chi-square distribution when the data are nonnormal. Two solutions:
  Rescale the test statistic up or down so that it approximates the correct sampling distribution (i.e., Satorra-Bentler)
  Leave the sample statistic intact, but use the bootstrap procedure to generate a new sampling distribution, from which a p-value is generated

The likelihood ratio bootstrap (i.e., Bollen-Stine bootstrap) differs somewhat from the naïve bootstrap.

40 Questionnaire data from a study of eating disorder risk in a sample of 500 college-aged women

Variables: body mass index (BMI), 7 questionnaire items measuring body dissatisfaction, 6 questionnaire items measuring eating disorder risk, and a binary indicator of past sexual abuse history (0 = no abuse history, 1 = abuse history). All questionnaire items are measured on a 7-point Likert scale.

Analysis: fit a one-factor CFA model to the six eating disorder risk items.

By definition, Likert scales violate the ML normality assumption (a normal distribution requires continuous variables). The questionnaire items have asymmetric distributions, with positive skewness (S ranging between .50 and 1.00) and kurtosis (K ranging between .20 and 1.00). An appropriate analysis should implement corrective procedures for nonnormal data (e.g., robust standard errors, the bootstrap).

[Figure: path diagram of a one-factor model in which the latent factor Eating Disorder Risk has six indicators, EDR1 to EDR6, with residuals e1 to e6]

41
  TITLE: CFA with first factor loading constrained to 1;
  DATA: file = eatingrisk.dat;
  VARIABLE: names = abuse bmi bds1 - bds7 edr1 - edr6;
    usevariables = edr1 - edr6;
    missing = all (-99);
  ANALYSIS:
    ! mlr = robust maximum likelihood;
    estimator = mlr;
  MODEL: edrisk by edr1 - edr6;
  OUTPUT: sampstat standardized patterns;

By default, Mplus constrains the factor loading of the first indicator (EDR1) to a value of 1 for identification. It is also possible to estimate all loadings and constrain the latent factor's variance to 1. Placing @1 next to the factor name constrains its variance to 1. Placing an * after the loadings instructs Mplus to estimate them.

  TITLE: CFA with factor variance constrained to 1;
  DATA: file = eatingrisk.dat;
  VARIABLE: names = abuse bmi bds1 - bds7 edr1 - edr6;
    usevariables = edr1 - edr6;
    missing = all (-99);
  ANALYSIS:
    ! mlr = robust maximum likelihood;
    estimator = mlr;
  MODEL:
    ! * = estimate all loadings;
    edrisk by edr1 - edr6*;
    ! Constrain factor variance to 1;
    edrisk@1;
  OUTPUT: sampstat standardized patterns;

SUMMARY OF MISSING DATA PATTERNS

[Mplus output: MISSING DATA PATTERNS (x = not missing) for eight patterns. EDR1, EDR4, and EDR6 are observed in every pattern, while EDR2, EDR3, and EDR5 are each observed in four of the eight patterns. The column alignment and the MISSING DATA PATTERN FREQUENCIES did not survive transcription.]

42 The covariance coverage matrix gives the proportion of complete cases on each variable or variable pair.

[Mplus output: PROPORTION OF DATA PRESENT, the covariance coverage matrix for EDR1 to EDR6; numeric values did not survive transcription]

MODEL RESULTS

[Mplus output: unstandardized estimates, robust standard errors, z tests (Est./S.E.), and two-tailed p-values for the six EDRISK BY loadings, the six indicator intercepts, the EDRISK factor variance, and the six residual variances; numeric values did not survive transcription]
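To make the definition of covariance coverage concrete, the following Python sketch computes the matrix from a small data set. It is an illustration rather than the Mplus computation; the function name and the data are hypothetical, and -99 is used as the missing value code only to match the `missing = all (-99)` setting in the example input.

```python
def covariance_coverage(rows, missing=-99):
    """Return a k-by-k matrix of covariance coverage values.

    Diagonal entry [i][i] is the proportion of cases observed on variable i.
    Off-diagonal entry [i][j] is the proportion of cases observed on both
    variable i and variable j.
    """
    n = len(rows)
    k = len(rows[0])
    coverage = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            both_present = sum(
                1 for row in rows if row[i] != missing and row[j] != missing
            )
            coverage[i][j] = both_present / n
    return coverage

# Hypothetical data: 4 cases, 3 variables, -99 marks a missing score
data = [
    [3, 4, 2],
    [5, -99, 1],
    [2, 6, -99],
    [4, -99, -99],
]
coverage = covariance_coverage(data)
for row in coverage:
    print(row)
```

In this toy data set the first variable is complete, the other two are each observed for half of the cases, and only one case in four has both of them.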

43 STANDARDIZED MODEL RESULTS

[Mplus output: STDYX standardization, listing the estimate, standard error, Est./S.E., and two-tailed p-value for the six EDRISK BY loadings; numeric values did not survive transcription]

All loadings are positive and statistically significant, with all standardized values exceeding .60.

Measurement intercepts are a byproduct of missing data handling. Mplus constrains the latent mean to 0, making the measurement intercepts equivalent to the variable means.

The MLR estimator gives a rescaled chi-square (i.e., a Satorra-Bentler-type correction).

  Chi-Square Test of Model Fit
    Value 9.440*
    Degrees of Freedom 9
    P-Value
    Scaling Correction Factor for MLR

  (The p-value and scaling correction factor did not survive transcription.)

  * The chi-square value for MLM, MLMV, MLR, ULSMV, WLSM and WLSMV cannot be used for chi-square difference testing in the regular way. MLM, MLR and WLSM chi-square difference testing is described on the Mplus website. MLMV, WLSMV, and ULSMV difference testing is done using the DIFFTEST option.

  TITLE: CFA with first factor loading constrained to 1;
  DATA: file = eatingrisk.dat;
  VARIABLE: names = abuse bmi bds1 - bds7 edr1 - edr6;
    usevariables = edr1 - edr6;
    missing = all (-99);
  ANALYSIS: estimator = ml;
    ! (residual) gives Bollen-Stine bootstrap;
    bootstrap = 2000 (residual);
  MODEL: edrisk by edr1 - edr6;
  OUTPUT: sampstat standardized patterns;
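The Mplus note on difference testing refers to a hand calculation that uses the scaling correction factors. The following Python sketch shows the commonly used scaled difference computation. It assumes the scaled chi-square values, degrees of freedom, and scaling correction factors are read from the output of two nested models; the function name and all numbers are hypothetical.

```python
def scaled_chi_square_difference(t0, df0, c0, t1, df1, c1):
    """Scaled chi-square difference test for nested models.

    t0, df0, c0 : scaled chi-square, degrees of freedom, and scaling
                  correction factor for the nested (more restricted) model
    t1, df1, c1 : the same quantities for the comparison (less restricted) model
    Returns (scaled difference statistic, difference in degrees of freedom).
    """
    df_diff = df0 - df1
    if df_diff <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test
    cd = (df0 * c0 - df1 * c1) / df_diff
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    # Multiplying each scaled statistic by its correction factor recovers the
    # uncorrected ML chi-square; the difference is then rescaled by cd
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df_diff

# Hypothetical values, for illustration only
trd, df_diff = scaled_chi_square_difference(
    t0=20.0, df0=10, c0=1.2,
    t1=10.0, df1=9, c1=1.1,
)
print(trd, df_diff)
```

The resulting statistic is referred to a chi-square distribution with `df_diff` degrees of freedom.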

44
  Estimator ML
  Information matrix OBSERVED
  Maximum number of iterations 1000
  Convergence criterion 0.500D-04
  Maximum number of steepest descent iterations 20
  Maximum number of iterations for H1 (value not preserved)
  Convergence criterion for H1 (value not preserved)
  Number of bootstrap draws
    Requested 2000
    Completed 2000

MODEL RESULTS

[Mplus output: unstandardized estimates, bootstrap standard errors, z tests (Est./S.E.), and two-tailed p-values for the six EDRISK BY loadings; numeric values did not survive transcription]

The Bollen-Stine bootstrap gives the standard chi-square test and p-value along with a bootstrap p-value from an empirical bootstrap sampling distribution.

[Mplus output: Chi-Square Test of Model Fit, with 9 degrees of freedom; the test value, p-value, and bootstrap p-value did not survive transcription]

The positive kurtosis should cause normal-theory standard errors to be too small. Both the robust and bootstrap standard errors corrected this downward bias (point estimates are identical in all analyses).

[Table: standard ML, robust, and bootstrap standard errors for the loadings of EDR1 to EDR6. All three entries for EDR1 are N/A; the remaining numeric values did not survive transcription.]
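The bootstrap p-value can be thought of as the share of the empirical bootstrap sampling distribution that is at least as large as the observed test statistic. The Python sketch below assumes the bootstrap chi-square values have already been collected from samples generated under the null model; the function name and all values are hypothetical, and this is not a description of the exact Mplus computation.

```python
def bootstrap_p_value(observed_statistic, bootstrap_statistics):
    """Proportion of bootstrap test statistics >= the observed statistic."""
    if not bootstrap_statistics:
        raise ValueError("at least one bootstrap statistic is required")
    n_extreme = sum(
        1 for value in bootstrap_statistics if value >= observed_statistic
    )
    return n_extreme / len(bootstrap_statistics)

# Hypothetical values, for illustration only (a real run would use, e.g., 2000 draws)
boot_chi_squares = [2.0, 5.0, 9.5, 12.0, 7.0, 10.0, 3.0, 8.0, 11.0, 4.0]
p_boot = bootstrap_p_value(9.0, boot_chi_squares)
print(p_boot)  # 0.4
```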

45 ML assumes an MAR mechanism where the propensity for missing data on a variable Y is related to other variables, but not to the would-be values of Y itself. Measuring the causes or correlates of missingness does not automatically satisfy MAR. The correlates of missingness must be part of the statistical analysis, even if they are not of substantive interest.

Auxiliary variables (AVs) are ancillary variables that are not of substantive interest. They are included in the analysis to reduce bias and/or improve power. Good AVs are correlates of missingness (i.e., potential causes of missing data) or correlates of the incomplete analysis variables.

Consider an educational study that examines the change in self-report behavioral problems.

AVs that correlate with reasons for missingness:
  Socioeconomic status
  Student mobility (e.g., a survey question asking how likely the participant is to move)
  Standardized test scores

AVs that correlate with self-report behavior scores:
  Disciplinary referrals, absenteeism, juvenile justice incidents
  Parental supervision
  Quality of home environment

46 A study examines a number of health-related behaviors (e.g., smoking, drinking, sexual activity) in teens. The risky sexual behavior questionnaire is only administered to participants above the age of 15. The substantive analysis is a regression model that uses self-esteem to predict risky sexual behavior. To satisfy MAR, age must be in the model, even though it is not of substantive interest.

The model below satisfies MAR, but it is an undesirable solution.

[Figure: path diagram in which Esteem and Age both predict Risk, which has a residual term]

The esteem slope becomes a partial regression coefficient. Including AVs should not affect the substantive interpretation of the model parameters.

Graham (2003) outlined two approaches for incorporating auxiliary variables into an ML analysis. The saturated correlates model transmits information from the auxiliary variables to the analysis variables via a series of correlations. Importantly, the model does not alter the substantive interpretation of the parameter estimates. The saturated correlates model is easy to implement in SEM.

Rules for specifying the saturated correlates model with manifest (i.e., measured or observed) variables. Correlate an AV with:
  a) Manifest predictor variables
  b) Other auxiliary variables
  c) The residual terms of any outcome variables

47 [Figure: path diagram of the regression of Y on X, with auxiliary variables AV1 and AV2]

The AUXILIARY subcommand automatically implements the saturated correlates model. The MODEL section does not mention the AVs.

  VARIABLE: names = x y av1 av2;
    usevariables = x y;
    missing = all (-99);
    auxiliary = (m) av1 av2;

Alternatively, implement the AVs manually via the MODEL command. Omit the AUXILIARY subcommand with manual specification.

  MODEL:
    ! Regression model parameters;
    y on x;
    x y;
    [x y];
    ! AV model correlations;
    av1 with av2;
    x y with av1 av2;

Rules for specifying the saturated correlates model with latent variables. Correlate an AV with:
  a) Manifest predictor variables
  b) Other auxiliary variables
  c) The residual terms of any manifest indicator variables

The AVs should never correlate with a latent variable or its residual term.
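The saturated correlates rules are mechanical enough to generate the manual WITH statements programmatically. The following Python sketch is a hypothetical helper, not an Mplus feature; it only builds text in the same form as the manual specification shown in the slides. It relies on the fact that a WITH statement involving an outcome or indicator refers to that variable's residual term.

```python
def saturated_correlates(manifest_variables, auxiliaries):
    """Build Mplus WITH statements for the saturated correlates model.

    manifest_variables : manifest predictors, outcomes, and/or indicators
                         (never latent variable names)
    auxiliaries        : auxiliary variable names
    """
    statements = []
    # Rule b: correlate every auxiliary variable with every other one
    for index, av in enumerate(auxiliaries[:-1]):
        others = " ".join(auxiliaries[index + 1:])
        statements.append(f"{av} with {others};")
    # Rules a and c: correlate the AVs with the manifest predictors and with
    # the residuals of the manifest outcomes or indicators
    if manifest_variables and auxiliaries:
        left = " ".join(manifest_variables)
        right = " ".join(auxiliaries)
        statements.append(f"{left} with {right};")
    return statements

# Regression of y on x with two auxiliary variables
for line in saturated_correlates(["x", "y"], ["av1", "av2"]):
    print(line)
# av1 with av2;
# x y with av1 av2;
```

For a latent variable model, pass only the manifest indicators (e.g., x1 to x3 and y1 to y3), since the AVs should not correlate with the latent variables themselves.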

48 [Figure: path diagram of a latent variable regression of Y on X, with indicators x1 to x3 and y1 to y3 and auxiliary variables AV1 and AV2]

Using the AUXILIARY subcommand:

  VARIABLE: names = x1 x2 x3 y1 y2 y3 av1 av2;
    usevariables = x1 - y3;
    missing = all (-99);
    auxiliary = (m) av1 av2;
  MODEL:
    ! Latent variable regression model parameters;
    x by x1 x2 x3;
    y by y1 y2 y3;
    y on x;

Using manual specification:

  VARIABLE: names = x1 x2 x3 y1 y2 y3 av1 av2;
    usevariables = x1 - y3 av1 av2;
    missing = all (-99);
  MODEL:
    ! Latent variable regression model parameters;
    x by x1 x2 x3;
    y by y1 y2 y3;
    y on x;
    ! AV model correlations;
    av1 with av2;
    x1 x2 x3 y1 y2 y3 with av1 av2;

[Figure: path diagram with manifest variables X1, Y1, and Y2 and auxiliary variables AV1 and AV2]

49 Using the AUXILIARY subcommand:

  VARIABLE: names = x1 y1 y2 av1 av2;
    usevariables = x1 - y2;
    missing = all (-99);
    auxiliary = (m) av1 av2;
  MODEL:
    ! Path model parameters;
    y1 on x1;
    y2 on y1;
    x1;

Using manual specification:

  VARIABLE: names = x1 y1 y2 av1 av2;
    usevariables = x1 - y2 av1 av2;
    missing = all (-99);
  MODEL:
    ! Path model parameters;
    y1 on x1;
    y2 on y1;
    x1;
    ! AV model correlations;
    av1 with av2;
    x1 y1 y2 with av1 av2;

[Figure: path diagram of a two-factor model with correlated latent variables X and Y, indicators x1 to x3 and y1 to y3, and auxiliary variables AV1 and AV2]

  VARIABLE: names = x1 x2 x3 y1 y2 y3 av1 av2;
    usevariables = x1 - y3;
    missing = all (-99);
    auxiliary = (m) av1 av2;
  MODEL:
    ! Factor model parameters;
    x by x1 x2 x3;
    y by y1 y2 y3;
    x with y;