Simple Second Order Chi-Square Correction



Similar documents
Comparison of Estimation Methods for Complex Survey Data Analysis

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

CHAPTER 4 EXAMPLES: EXPLORATORY FACTOR ANALYSIS

CHAPTER 13 EXAMPLES: SPECIAL FEATURES

lavaan: an R package for structural equation modeling

Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Goodness of fit assessment of item response theory models

Statistical Models in R

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

Factors affecting online sales

Confirmatory factor analysis in MPlus

Chapter 23. Two Categorical Variables: The Chi-Square Test

Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study

THE PSYCHOMETRIC PROPERTIES OF THE AGRICULTURAL HAZARDOUS OCCUPATIONS ORDER CERTIFICATION TRAINING PROGRAM WRITTEN EXAMINATIONS

Overview of Factor Analysis

Least Squares Estimation

Simple Linear Regression Inference

From the help desk: Bootstrapped standard errors

Elements of statistics (MATH0487-1)

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

1 Overview and background

Poisson Models for Count Data

Chapter 6: Multivariate Cointegration Analysis

Multiple Linear Regression

12: Analysis of Variance. Introduction

The Latent Variable Growth Model In Practice. Individual Development Over Time

Fairfield Public Schools

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Life Table Analysis using Weighted Survey Data

Mplus Tutorial August 2012

ESTIMATION OF THE EFFECTIVE DEGREES OF FREEDOM IN T-TYPE TESTS FOR COMPLEX DATA

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Introduction to General and Generalized Linear Models

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Applications of Structural Equation Modeling in Social Sciences Research

2 Sample t-test (unequal sample sizes and unequal variances)

Multivariate Normal Distribution

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Confidence Intervals for Cp

Note 2 to Computer class: Standard mis-specification tests

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén Table Of Contents

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: SIMPLIS Syntax Files

Didacticiel - Études de cas

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Survival Analysis of Left Truncated Income Protection Insurance Data. [March 29, 2012]

APPLICATIONS AND MODELING WITH QUADRATIC EQUATIONS

Performing Unit Root Tests in EViews. Unit Root Testing

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

The Best of Both Worlds:

Example: Boats and Manatees

Longitudinal Meta-analysis

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: PRELIS User s Guide

Lecture 3: Linear methods for classification

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

Rens van de Schoot a b, Peter Lugtig a & Joop Hox a a Department of Methods and Statistics, Utrecht

Is it statistically significant? The chi-square test

by the matrix A results in a vector which is a reflection of the given

Weight of Evidence Module

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Does Marketing Channel Satisfaction Moderate the Association Between Alternative Attractiveness and Exiting? An Application of Ping's Technique

Statistical Models in R

Testing Research and Statistical Hypotheses

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

The effect of varying the number of response alternatives in rating scales: Experimental evidence from intra-individual effects

Multivariate normal distribution and testing for means (see MKB Ch 3)

Evaluating the Fit of Structural Equation Models: Tests of Significance and Descriptive Goodness-of-Fit Measures

Confidence Intervals for the Difference Between Two Means

Final Exam Practice Problem Answers

Linear Models for Continuous Data

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

[This document contains corrections to a few typos that were found on the version available through the journal s web page]

How To Understand Multivariate Models

UNDERSTANDING THE TWO-WAY ANOVA

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

Directions for using SPSS

Multivariate Analysis of Variance (MANOVA): I. Theory

Correlation and Simple Linear Regression

CHAPTER IV FINDINGS AND CONCURRENT DISCUSSIONS

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Tutorial 5: Hypothesis Testing

How To Calculate A Multiiperiod Probability Of Default

The Bonferonni and Šidák Corrections for Multiple Comparisons

Statistics in Retail Finance. Chapter 2: Statistical models of default

xtmixed & denominator degrees of freedom: myth or magic

Dimensionality Reduction: Principal Components Analysis

The correlation coefficient

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

Department of Economics

Transcription:

Simple Second Order Chi-Square Correction Tihomir Asparouhov and Bengt Muthén May 3, 2010 1

1 Introduction In this note we describe the second order correction for the chi-square statistic implemented in Mplus Version 6 with the estimators WLSMV/ULSMV and MLMV. Prior to Version 6 the second order correction in Mplus for the chisquare statistics is based on a Satterthwaite (1941) type correction, see also Satorra and Bentler(1994) and Muthén et al. (1997). This type of correction scales appropriately the chi-square statistic and estimates the optimal degrees of freedom to obtain the best approximation possible. Here we introduce an alternative to the Satterthwaite style correction which has the advantage that it does not need to estimate the degrees of freedom and instead uses the usual degrees of freedom, which is simply the difference between the number of parameters in the two models. In Mplus Version 6 both the new second order correction as well as the Satterthwaite second order correction can now be performed. The new second order correction is now the Mplus default and the Satterthwaite correction can be obtained using the command Satterthwaite=ON. In Section 2 we describe the three types of scaled test statistics available in Mplus Version 6 with the weighted least squares estimation. In Section 3 we present the results of a simulation study to evaluate the performance of the three scaled test statistics. In Section 4 we also include a brief discussions on how the new second order correction is constructed for the Mplus DIFFTEST command used to test two nested models. In Section 5 we discuss the construction of modification indices for scaled chi-square statistics. 2 Scaled Chi-Square Statistics Consider the weighted least squares estimation in Mplus and denote by F the fit function used in the estimation. Denote by T the minimal value of F obtained in the estimation. The value T is used to construct the test statistics. The asymptotic distribution of T is not a chi-square distribution. This distribution can be represented as a mixture (weighted sum) of chisquare distributions of 1 degree of freedom D w i χ 2 (1) (1) d=1 2

where the weights w i are the eigenvalues of a certain matrix M, see Satorra and Bentler (2001). In the above expression D is the difference between the number of parameters in the unrestricted model and the number of parameters in the estimated model. Since it is not as straight forward to compute P-values for distribution (1) we use approximations based on the chi-square distribution. The first order approximation is to use instead of T the statistic T 1 = T D D = T wi T r(m) and assume that T 1 has a chi-square distribution with D degrees of freedom. By definition E(T 1 ) = D, i.e., using the scale factor D/ w i results in the fact that the distribution of T 1 and the chi-square distribution used as an approximation have the same mean. The second order approximation by Satterthwaite (1941) is designed to match not only the mean of the test statistic distribution with the mean of the chi-square distribution but also the variance. This is done by estimating the degrees of freedom ˆD, see Muthén (1998-2004) and Satorra and Bentler (1994), and using the test statistic T 2 = T ˆD ˆD = T wi T r(m) (2) where the degrees of freedom ˆD is estimated as the integer closest to Note now that V ar(t 2 ) = E(T 2 ) = (T r(m)) 2 T r(m 2 ). (3) ˆD wi E(T ) = ˆD wi wi = ˆD ˆD 2 ( w i ) V ar(t ) = 2 ˆD2 w 2 ˆD 2 2 (T r(m)) 2 i = 2 T r(m) 2 ˆD. (T r(m)) 2 Thus the mean and the variance of T 2 match those of the chi-square distribution with ˆD degrees of freedom used as an approximation. In Muthén et al. (1997) it is shown in a simulation study that the second order correction T 2 performs much better than the first order correction 3

T 1. Here we propose a new second order correction statistic T 3 which has approximately a chi-square distribution with D degrees of freedom, i.e., the degrees of freedom is unchanged. We will also show with simulations that the new second order approximation statistic T 3 performs as well as T 2. To match the mean and the variance of the chi-square distribution with D degrees of freedom in the construction of T 3 we use not just a scale factor but also a shift parameter, that is T 3 = at + b where a and b are chosen so that E(T 3 ) = D and V ar(t 3 ) = 2D. Essentially to obtain a and b we solve the following system of linear equations D = E(T 3 ) = ae(t ) + b = a T r(m) + b 2D = V ar(t 3 ) = a 2 var(t ) = 2a 2 T r(m 2 ). The solution of that system is given by D a = T r(m 2 ) and Thus the proposed T 3 is T 3 = T b = D D T r(m) 2 T r(m 2 ). D T r(m 2 ) + D D T r(m) 2 T r(m 2 ). (4) Because of the above construction T 3 has the same mean and variance as the chi-square distribution with D degrees of freedom. 3 Simulation Study We replicate the simulation study conducted in Muthén et al. (1997) and we include the new test statistics T 3. The goal of the simulation study is to evaluate the type I error of the test statistics. The data is generated according 4

Table 1: Rejection rates under correct null hypothesis Estimator WLSM WLSMV WLSMV Satterthwaite - ON OFF Test Statistic T 1 T 2 T 3 Table 1 16% 4.8% 5.2% Table 3 25.2% 9.2% 10% Table 5 17.6% 7.6% 8.6% Table 7 14.4% 7.2% 7% Table 9 10.2% 6% 6.4% Table 11 10% 6.2% 6.6% Table 13 23% 10.6% 11.2% Table 15 12.6% 6% 6.8% to a model and analyzed according to the same model. Each test statistic is computed and the model is rejected if the test statistic is larger than the 95% percentile of the corresponding chi-square distribution, i.e., if the P-value is less than 5%. In Table 1 we report the percentage of rejected models, i.e., the rejection rates for each of the three test statistics. Rejection rates near 5% indicate correct performance and higher values indicate inflated type I error which is undesirable. All of the simulation studies from Muthén et al. (1997) are conducted. A detailed account for these simulations can be obtained in Muthén et al. (1997). In the table below we use the Table numbering from Muthén et al. (1997) to refer to a particular Montecarlo setup, i.e., Table 1 refers to the simulation study presented in Muthén et al. (1997) Table 1. The results in Table 1 show that T 1 indeed has inflated rejection rates and that both T 2 and T 3 perform substantially better than T 1. The difference between the rejection rates of T 2 and T 3 is negligible and both statistics perform quite well in all cases. 5

4 T 3 Style Second Order Correction for Chi- Square Difference Testing Consider the case when there are two nested models A and B and we need to test the more restricted model A against the less restricted model B using a second order adjusted test statistic. In Mplus this is done using the DIFFTEST command, see Asparouhov and Muthén (2006). Suppose that the fit functions for the two models are T A and T B. We use T = T A T B as the base of the test statistic. The distribution of T is not a chi-square distribution but it is a distribution such as the one given in (1). Prior to Version 6 the second order adjustment of T in Mplus is computed using Satterthwaite s approximation as described in (2) and (3) for a particular matrix M given in formula (9) in Asparouhov and Muthén (2006). To construct a T 3 style difference we use again formula (4) using the corresponding matrix M. The advantage of T 3 over T 2 is again that the degrees of freedom for this chi-square approximation matches the difference in the number of parameters of the two nested models. 5 Modification Indices for Scaled Chi-Square Statistics Let F be a fit function such as the log-likelihood, the log-likelihood difference between a restricted and unrestricted model or a weighted least squares fit function. In all of these cases F is used to construct a chi-square test of fit. Modification indices is a technique developed by Sörbom (1989) that allows us to obtain an approximate estimate for the change in the chi-square test when an additional parameter is estimated in the model. This is usually done for a large number of additional parameters and only using the first and the second order derivatives of F. When the test statistic is a scaled test statistic, it is obtained by using a formula of this form af +b where a and b are estimated quantities and F is the fit function value after the minimization. Using Sörbom (1989) method we can again make an approximate estimate for the change in the fit function value using first and second order derivatives of F. Let s assume that a and b are approximately constants or that they will 6

not change dramatically when an additional parameter is estimated. This is not a unreasonable assumption. Practical experience has shown that this is the case in most situations. Usually a and b are functions of averages of eigenvalues. When an additional parameter is estimated, essentially that amounts to eliminating a corresponding eigenvalue. These averages however will not be affected dramatically by removing a single eigenvalue. Under the assumption of approximately constant a and b we can use as the modification index the approximate change in F multiplied by a. This is the method used in Mplus for computing modification indices for all scaled chi-square tests of fit. In certain situations the assumption of approximately constant a and b may not be appropriate, such as when the degrees of freedom is small. The modification indices technique, even for unscaled chi-square statistics, does not provide a precise value nor does it guarantee a small enough error in the estimation and therefore its use should be limited to an exploratory phase in the model building rather than used as evidence for model fit. 7

References [1] Asparouhov, T & Muthén, B. (2006) Robust Chi Square Difference Testing with Mean and Variance Adjusted Test Statistics. Mplus Web Notes: No. 10. http://statmodel.com/download/webnotes/webnote10.pdf [2] Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. [3] Muthén, B.O. (1998-2004). Mplus Technical Appendices. Los Angeles, CA: Muthén & Muthén Copyright 1998-2004 Muthén & Muthén Program Copyright 1998-2004 Muthén & Muthén Version 3. [4] Satorra, A., and Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye and C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research. 399-419. Thousand Oaks, CA: Sage. [5] Satorra, A., and Bentler, P. M. (2001). A Scaled Difference Chi-square Test Statistic for Moment Structure Analysis. Psychometrika 66, 507-514. [6] Satterthwaite, F. E. (1941). Synthesis of variance. Psychometrika, 6, 309-316. [7] Sörbom, D. (1989) Model modification. Psychometrika, 54, 371-384. 8