Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling

Similar documents
Using R and the psych package to find ω

Installing R and the psych package

How to do a factor analysis with the psych package

Rens van de Schoot a b, Peter Lugtig a & Joop Hox a a Department of Methods and Statistics, Utrecht

lavaan: an R package for structural equation modeling

Using the psych package to generate and test structural models

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

Using the psych package to generate and test structural models

Introduction to Structural Equation Modeling (SEM) Day 4: November 29, 2012

Goodness of fit assessment of item response theory models

Factors affecting online sales

Psychology 205: Research Methods in Psychology

Common factor analysis

T-test & factor analysis

Multivariate Normal Distribution

Overview of Factor Analysis

Factor Analysis using R

The lavaan tutorial. Yves Rosseel Department of Data Analysis Ghent University (Belgium) June 28, 2016

Factor Analysis Example: SAS program (in blue) and output (in black) interleaved with comments (in red)

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Simple Linear Regression Inference

SPSS and AMOS. Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong

Factor Analysis. Chapter 420. Introduction

Final Exam Practice Problem Answers

Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Factorial Invariance in Student Ratings of Instruction

August 2012 EXAMINATIONS Solution Part I

Regression Analysis: A Complete Example

An introduction to using Microsoft Excel for quantitative data analysis

Package EstCRM. July 13, 2015

Factor Analysis - 2 nd TUTORIAL

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Electronic Thesis and Dissertations UCLA

Applications of Structural Equation Modeling in Social Sciences Research

Richard E. Zinbarg northwestern university, the family institute at northwestern university. William Revelle northwestern university

Specification of Rasch-based Measures in Structural Equation Modelling (SEM) Thomas Salzberger

Elements of statistics (MATH0487-1)

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format:

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Additional sources Compilation of sources:

Exploratory Factor Analysis

I n d i a n a U n i v e r s i t y U n i v e r s i t y I n f o r m a t i o n T e c h n o l o g y S e r v i c e s

Reporting Statistics in Psychology

NCSS Statistical Software

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

5.2 Customers Types for Grocery Shopping Scenario

Regression step-by-step using Microsoft Excel

APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY

Multiple Linear Regression

4. There are no dependent variables specified... Instead, the model is: VAR 1. Or, in terms of basic measurement theory, we could model it as:

Random effects and nested models with SAS

NCSS Statistical Software

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

Descriptive Statistics

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Correlation and Simple Linear Regression

The Latent Variable Growth Model In Practice. Individual Development Over Time

Multivariate Analysis. Overview

Dongfeng Li. Autumn 2010

An Empirical Study on the Effects of Software Characteristics on Corporate Performance

5. Linear Regression

THE UNIVERSITY OF CHICAGO, Booth School of Business Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Homework Assignment #2

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Basic Statistical and Modeling Procedures Using SAS

2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) F 2 X 4 U 4

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Chapter 7 Section 1 Homework Set A

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

An overview of the psych package

THE PSYCHOMETRIC PROPERTIES OF THE AGRICULTURAL HAZARDOUS OCCUPATIONS ORDER CERTIFICATION TRAINING PROGRAM WRITTEN EXAMINATIONS

Introduction to Path Analysis

Chapter 6: Multivariate Cointegration Analysis

Chapter 7: Simple linear regression Learning Objectives

Confirmatory factor analysis in MPlus

Imputing Missing Data using SAS

Simple Second Order Chi-Square Correction

An empirical study of factor analysis on M & A performance of listed companies of Chinese pharmaceutical industry

Experimental Design for Influential Factors of Rates on Massive Open Online Courses

Evaluating Goodness-of-Fit Indexes for Testing Measurement Invariance

General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

How to report the percentage of explained common variance in exploratory factor analysis

An introduction to. Principal Component Analysis & Factor Analysis. Using SPSS 19 and R (psych package) Robin Beaumont robin@organplayers.co.

Factor Analysis and Structural equation modelling

Craig K. Enders Arizona State University Department of Psychology

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Lecture Notes Module 1

THE RELATIONSHIPS BETWEEN CLIENT AND CONSULTANT OBJECTIVES IN IT PROJECTS

Transcription:

Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling William Revelle Department of Psychology Northwestern University Evanston, Illinois USA June, 2014 1 / 20

Outline 1 The Problem 2 Answers Get the data and describe it Exploratory Factor Analysis Factor extension Confirmatory analysis of structure 3 References 2 / 20

The problem 1 Given the data set at http://personality-project.org/ r/datasets/psychometrics.prob2.txt Do basic descriptive statistics Find the basic correlation matrix 2 Exploratory Factor analysis How many factors? What are they? 3 Factor Extension Factor the first 5 variables Extend to the last 3 4 Do this as a confirmatory model With sem With lavaan 3 / 20

Get the data and describe it Read and describe #Give the file name (location/path) > fn <-"http://personality-project.org/r/datasets/psychometrics.prob2.txt" #Read in the data > dataset <- read.table(fn,header=true) # Do basic descriptive statistics > describe(dataset) vars n mean sd median trimmed mad min max range skew kurtosis se ID 1 1000 500.50 288.82 500.50 500.50 370.65 1.0 1000.00 999.00 0.00-1.20 9.13 GREV 2 1000 499.77 106.11 497.50 498.75 106.01 138.0 873.00 735.00 0.09-0.07 3.36 GREQ 3 1000 500.53 103.85 498.00 498.51 105.26 191.0 914.00 723.00 0.22 0.08 3.28 GREA 4 1000 498.13 100.45 495.00 498.67 99.33 207.0 848.00 641.00-0.02-0.06 3.18 Ach 5 1000 49.93 9.84 50.00 49.88 10.38 16.0 79.00 63.00 0.00 0.02 0.31 Anx 6 1000 50.32 9.91 50.00 50.43 10.38 14.0 78.00 64.00-0.14 0.14 0.31 Prelim 7 1000 10.03 1.06 10.00 10.02 1.48 7.0 13.00 6.00-0.02-0.01 0.03 GPA 8 1000 4.00 0.50 4.02 4.01 0.53 2.5 5.38 2.88-0.07-0.29 0.02 MA 9 1000 3.00 0.49 3.00 3.00 0.44 1.4 4.50 3.10-0.07-0.09 0.02 > 4 / 20

Get the data and describe it The correlation matrix > R <- lowercor(dataset) ID GREV GREQ GREA Ach Anx Prelm GPA MA ID 1.00 GREV -0.01 1.00 GREQ 0.00 0.73 1.00 GREA -0.01 0.64 0.60 1.00 Ach 0.00 0.01 0.01 0.45 1.00 Anx -0.01 0.01 0.01-0.39-0.56 1.00 Prelim 0.02 0.43 0.38 0.57 0.30-0.23 1.00 GPA 0.00 0.42 0.37 0.52 0.28-0.22 0.42 1.00 MA -0.01 0.32 0.29 0.45 0.26-0.22 0.36 0.31 1.00 5 / 20

Get the data and describe it Drop the ID field and redo the analysis > my.data <- dataset[-1] > R <-lowercor(my.data) GREV GREQ GREA Ach Anx Prelm GPA MA GREV 1.00 GREQ 0.73 1.00 GREA 0.64 0.60 1.00 Ach 0.01 0.01 0.45 1.00 Anx 0.01 0.01-0.39-0.56 1.00 Prelim 0.43 0.38 0.57 0.30-0.23 1.00 GPA 0.42 0.37 0.52 0.28-0.22 0.42 1.00 MA 0.32 0.29 0.45 0.26-0.22 0.36 0.31 1.00 6 / 20

Exploratory Factor Analysis How many factors? > nfactors(my.data) Number of factors Call: vss(x = x, n = n, rotate = rotate, diagonal = diagonal, fm = fm, n.obs = n.obs, plot = FALSE, title = title) VSS complexity 1 achieves a maximimum of 0.74 with 2 factors VSS complexity 2 achieves a maximimum of 0.88 with 2 factors The Velicer MAP achieves a minimum of 0.06 with 2 factors Empirical BIC achieves a minimum of -71.29 with 2 factors Sample Size adjusted BIC achieves a minimum of -23.74 with 3 factors Statistics by number of factors vss1 vss2 map dof chisq prob sqresid fit RMSEA BIC SABIC complex ech 1 0.73 0.00 0.071 20 9.7e+02 3.7e-192 4.5 0.73 0.218 829 892.6 1.0 1.1e 2 0.74 0.88 0.056 13 2.6e+01 1.9e-02 2.0 0.88 0.031-64 -23.0 1.4 1.9e 3 0.54 0.82 0.101 7 2.4e+00 9.4e-01 1.7 0.90 0.000-46 -23.7 1.8 7.4e 4 0.53 0.81 0.179 2 4.6e-02 9.8e-01 1.7 0.90 0.000-14 -7.4 1.9 2.3e 5 0.46 0.68 0.344-2 1.3e-02 NA 1.6 0.91 NA NA NA 2.1 8.2e 6 0.44 0.68 0.516-5 1.7e-09 NA 1.4 0.92 NA NA NA 2.1 1.8e 7 0.44 0.68 1.000-7 0.0e+00 NA 1.4 0.92 NA NA NA 2.1 3.0e 8 0.44 0.68 NA -8 0.0e+00 NA 1.4 0.92 NA NA NA 2.1 3.0e 7 / 20

Exploratory Factor Analysis What are the two factors? > f2 <- fa(my.data,2) > f2 Factor Analysis using method = minres Call: fa(r = my.data, nfactors = 2) Standardized loadings (pattern matrix) based upon correlation matrix MR1 MR2 h2 u2 com GREV 0.91-0.14 0.79 0.21 1.0 GREQ 0.84-0.13 0.67 0.33 1.0 GREA 0.70 0.46 0.84 0.16 1.7 Ach -0.06 0.81 0.63 0.37 1.0 Anx 0.07-0.71 0.48 0.52 1.0 Prelim 0.47 0.31 0.39 0.61 1.7 GPA 0.45 0.27 0.33 0.67 1.6 MA 0.35 0.29 0.25 0.75 1.9 MR1 MR2 SS loadings 2.65 1.73 Proportion Var 0.33 0.22 Cumulative Var 0.33 0.55 Proportion Explained 0.60 0.40 Cumulative Proportion 0.60 1.00 8 / 20

Exploratory Factor Analysis two factors (continued) With factor correlations of MR1 MR2 MR1 1.00 0.23 MR2 0.23 1.00 Mean item complexity = 1.4 Test of the hypothesis that 2 factors are sufficient. The degrees of freedom for the null model are 28 and the objective function was 3.32 with Chi Square of 3304.93 The degrees of freedom for the model are 13 and the objective function was 0.03 The root mean square of the residuals (RMSR) is 0.02 The df corrected root mean square of the residuals is 0.03 The harmonic number of observations is 1000 with the empirical chi square 18.51 with prob < 0.14 The total number of observations was 1000 with MLE Chi Square = 25.56 with prob < 0.019 Tucker Lewis Index of factoring reliability = 0.992 RMSEA index = 0.031 and the 90 % confidence intervals are 0.012 0.049 BIC = -64.24 Fit based upon off diagonal values = 1 Measures of factor score adequacy MR1 MR2 Correlation of scores with factors 0.95 0.90 Multiple R square of scores with factors 0.91 0.82 Minimum correlation of possible factor scores 0.81 0.64 > 9 / 20

Exploratory Factor Analysis Show the two factor solution > fa.diagram(f2,simple=false) Factor Analysis GREV GREQ 0.9 GREA Prelim GPA 0.8 0.7 0.5 0.4 0.3 0.5 0.3 MR1 MA Ach 0.8-0.7 MR2 10 / 20

Exploratory Factor Analysis What about a three factor solution? > f3 <- fa(my.data,3) > f3 Factor Analysis using method = minres Call: fa(r = my.data, nfactors = 3) Standardized loadings (pattern matrix) based upon correlation matrix MR1 MR2 MR3 h2 u2 com GREV 0.85-0.10 0.07 0.79 0.21 1.0 GREQ 0.85-0.05-0.03 0.68 0.32 1.0 GREA 0.61 0.44 0.14 0.84 0.16 1.9 Ach -0.10 0.75 0.10 0.63 0.37 1.1 Anx 0.01-0.74 0.05 0.50 0.50 1.0 Prelim -0.02-0.04 0.76 0.53 0.47 1.0 GPA 0.20 0.11 0.39 0.36 0.64 1.7 MA 0.14 0.15 0.32 0.26 0.74 1.8 MR1 MR2 MR3 SS loadings 2.05 1.44 1.10 Proportion Var 0.26 0.18 0.14 Cumulative Var 0.26 0.44 0.57 Proportion Explained 0.45 0.31 0.24 Cumulative Proportion 0.45 0.76 1.00 With factor correlations of MR1 MR2 MR3 MR1 1.00 0.13 0.68 MR2 0.13 1.00 0.54 MR3 0.68 0.54 1.00 11 / 20

Exploratory Factor Analysis More 3 factor output With factor correlations of MR1 MR2 MR3 MR1 1.00 0.13 0.68 MR2 0.13 1.00 0.54 MR3 0.68 0.54 1.00 Mean item complexity = 1.3 Test of the hypothesis that 3 factors are sufficient. The degrees of freedom for the null model are 28 and the objective function was 3.32 with Chi Square of The degrees of freedom for the model are 7 and the objective function was 0 The root mean square of the residuals (RMSR) is 0 The df corrected root mean square of the residuals is 0.01 The harmonic number of observations is 1000 with the empirical chi square 0.74 with prob < 1 The total number of observations was 1000 with MLE Chi Square = 2.38 with prob < 0.94 Tucker Lewis Index of factoring reliability = 1.006 RMSEA index = 0 and the 90 % confidence intervals are NA 0.01 BIC = -45.98 Fit based upon off diagonal values = 1 Measures of factor score adequacy MR1 MR2 MR3 Correlation of scores with factors 0.95 0.90 0.89 Multiple R square of scores with factors 0.90 0.81 0.79 Minimum correlation of possible factor scores 0.79 0.62 0.57 12 / 20

Exploratory Factor Analysis Show the three factor model fa.diagram(f3,simple=false) Factor Analysis GREV 0.9 GREQ GREA 0.8 0.6 MR1 Ach Anx 0.4 0.7-0.7 MR2 0.7 Prelim GPA 0.8 0.4 0.3 MR3 0.5 13 / 20

Factor extension Factor extension 1 Originally developed for the problem of new variables being added (Dwyer, 1937; Mosier, 1938; Horn, 1973) Find the correlations of the new variables with the old factors Do this by finding the factor score weights of the old variables on the factors Then find the correlations of the new variables with those scores Don t actually have to calculate the scores to do this 2 Can be used if we want to keep the original factor structure and extend it to new variables fa.extend and fa.extension will do this fa.extend will do the factor analysis on the old, and then merge output with the new fa.extension takes the original factor analysis and the correlation of original with new and finds just the loadings on the new. 14 / 20

Factor extension fa.extend > f2e <- fa.extend(my.data,2,ov=1:5,ev=6:8) > f2e Factor Analysis using method = minres Call: fa.extend(r = my.data, nfactors = 2, ov = 1:5, ev = 6:8) Standardized loadings (pattern matrix) based upon correlation matrix MR1 MR2 h2 u2 com GREV 0.90-0.11 0.79 0.21 1.0 GREQ 0.84-0.10 0.68 0.32 1.0 GREA 0.69 0.49 0.84 0.16 1.8 Ach -0.06 0.80 0.63 0.37 1.0 Anx 0.06-0.71 0.49 0.51 1.0 Prelim 0.46 0.31 0.37 0.63 1.8 GPA 0.44 0.28 0.32 0.68 1.7 MA 0.34 0.29 0.24 0.76 1.9 MR1 MR2 SS loadings 2.60 1.75 Proportion Var 0.33 0.22 Cumulative Var 0.33 0.54 Proportion Explained 0.60 0.40 Cumulative Proportion 0.60 1.00 With factor correlations of MR1 MR2 MR1 1.00 0.19 MR2 0.19 1.00 15 / 20

Factor extension fa.extension (not showing the fa of the first 5 variables) > f2o <- fa(my.data[1:5],2) > f2e <- fa.extension(cor(my.data[1:5],my.data[6:8]), f2o) > f2e Call: fa.extension(roe = cor(my.data[1:5], my.data[6:8]), fo = f2o) Standardized loadings (pattern matrix) based upon correlation matrix MR1 MR2 h2 u2 Prelim 0.46 0.31 0.37 0.63 GPA 0.44 0.28 0.32 0.68 MA 0.34 0.29 0.24 0.76 MR1 MR2 SS loadings 0.59 0.33 Proportion Var 0.20 0.11 Cumulative Var 0.20 0.31 Proportion Explained 0.64 0.36 Cumulative Proportion 0.64 1.00 MR1 MR2 MR1 1.00 0.19 MR2 0.19 1.00 16 / 20

Factor extension Factor analysis and factor extension fa.diagram(f2o,fe.result=f2e,simple=false,cut=.2) Factor analysis and extension GREV 0.9 GREQ 0.8 MR1 0.5 Prelim 0.7 0.4 GREA 0.3 GPA 0.5 0.3 Ach 0.8 MR2 0.3 MA -0.7 Anx 17 / 20

Confirmatory analysis of structure Three very good ways to do confirmatory analysis 1 sem by Fox, Nie & Byrnes (2013) Uses RAM path notation Can get some help from psych (see the psych-for-sem vignette) 2 lavaan by Rosseel (2012) Somewhat easier to use 3 OpenMx by Boker, Neale, Maes, Wilde, Spiegel, Brick, Spies, Estabrook, Kenny, Bates, Mehta & Fox (2011) Most powerful package 18 / 20

Confirmatory analysis of structure Test a model with lavaan z.data <- data.frame(scale(my.data) )#standardize m1.model <- 'ability =~ GREV + GREQ + GREA motive =~ GREA + Ach + Anx perform =~ Prelim + GPA + MA' fit <- sem(m1.model,data=z.data, auto.var=true, auto.fix.first=true, auto.cov.lv.x=true) summary(fit) lavaan (0.5-16) converged normally after 22 iterations Number of observations 1000 Estimator ML Minimum Function Test Statistic 306.074 Degrees of freedom 19 P-value (Chi-square) 0.000 Parameter estimates: Information Standard Errors Expected Standard Estimate Std.err Z-value P(> z ) 19 / 20

Confirmatory analysis of structure More lavaan Latent variables: ability =~ GREV 1.000 GREQ 0.881 0.024 36.266 0.000 GREA 0.784 0.023 33.433 0.000 motive =~ GREA 1.000 Ach 1.048 0.027 38.579 0.000 Anx -0.931 0.028-33.036 0.000 perform =~ Prelim 1.000 GPA 0.781 0.029 26.653 0.000 MA 0.671 0.030 22.076 0.000 Covariances: ability ~~ motive 0.025 0.034 0.742 0.458 perform 0.621 0.024 26.268 0.000 motive ~~ perform 0.716 0.022 31.955 0.000 Variances: GREV 0.185 0.019 GREQ 0.339 0.021 GREA 0.123 0.017 Ach 0.421 0.027 Anx 0.543 0.029 Prelim 0.516 0.032 GPA 0.620 0.032 MA 0.719 0.035 ability 1.000 motive 1.000 20 / 20

Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R., Kenny, S., Bates, T., Mehta, P., & Fox, J. (2011). Openmx: An open source extended structural equation modeling framework. Psychometrika, 76(2), 306 317. Dwyer, P. S. (1937). The determination of the factor loadings of a given test from the known factor loadings of other tests. Psychometrika, 2(3), 173 178. Fox, J., Nie, Z., & Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3. Horn, J. L. (1973). On extension analysis and its relation to correlations between variables and factor scores. Multivariate Behavioral Research, 8(4), 477 489. Mosier, C. (1938). A note on Dwyer: The determination of the factor loadings of a given test. Psychometrika, 3(4), 297 299. Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1 36. 20 / 20