Factor Analysis
Marieke E. Timmerman
Heymans Institute for Psychology, DPMG
m.e.timmerman@rug.nl

Goals and uses of FA
Type of data:
  scores of n subjects on a (large) number (j) of variables
  among the j variables, no distinction between predictors (independent variables) and criterion (dependent variable(s))
Goal of FA:
  summarize a large number of variables into a (much) smaller number of factors, without losing too much information
  relate observed variables to latent variable(s)
Types of FA
                 Common FA (CFA) (Lisrel)    Component Analysis (CA)
  Exploratory    Exploratory CFA             Principal Component Analysis
  Confirmatory   Confirmatory CFA            Multiple Group Method

Empirical Example: PANAS (positive and negative affect schedule)
In general, to what extent do you feel... (e.g., interested, distressed, excited)
Number of subjects in sample: 12 (very few, so generalizability to the population is questionable)
Component Analysis
Start: scores of n subjects on j variables, n > j
Goal: summarize the j variables into a smaller number (q) of components
A component is a new variable constructed from the j observed variables
All n subjects get a score on each component (component score)

Components: weighted sum variables of the j variables
Example
  variables: jittery (J), distressed (D), excited (E), interested (I)
  component: component 1 (C1)
  C1 = 0.23 J + -0.12 D + 0.89 E + 0.95 I
  0.23 is the weight for variable J
Component score of a subject with J = 2, D = 1, E = 1, I = 2
  C1   = 0.23*J + -0.12*D + 0.89*E + 0.95*I
  3.13 = 0.23*2 + -0.12*1 + 0.89*1 + 0.95*2

Loading matrix
Correlations between variables and component(s)
                correlation with C1
  jittery          -.036
  distressed       -.332
  excited           .908
  interested        .860
Interpretation C1: arousal (or something similar)
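The weighted-sum computation of the component score can be sketched in a few lines of Python. This is a minimal illustration only: the weights and the subject's scores are the ones from the slide, while the function name `component_score` is an assumption made for this example.

```python
# Weights for component 1 (C1), taken from the slide.
weights = {"J": 0.23, "D": -0.12, "E": 0.89, "I": 0.95}


def component_score(scores, weights):
    """Return the weighted sum of a subject's variable scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Example subject: jittery = 2, distressed = 1, excited = 1, interested = 2.
subject = {"J": 2, "D": 1, "E": 1, "I": 2}
c1 = component_score(subject, weights)
print(round(c1, 2))  # 3.13
```

Every subject gets a component score in the same way, so applying `component_score` to all n subjects yields the new variable C1.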
Quality of summary
Variance explained (compare regression)
Computed from the squared correlations between component and variables
                correlation with C1    (correlation with C1)^2
                (absolute value)
  jittery          0.036                  0.001
  distressed       0.332                  0.110
  excited          0.908                  0.824
  interested       0.860                  0.740
  sum                                     1.675
Variance explained: (1.675/4)*100% = 41.9%
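As a hedged sketch of the computation above, the percentage of variance explained by one component is the mean of the squared loadings times 100. The loadings are the C1 correlations from the slides; the function name is hypothetical.

```python
# Correlations (loadings) of the four variables with component 1, from the slides.
loadings_c1 = {
    "jittery": -0.036,
    "distressed": -0.332,
    "excited": 0.908,
    "interested": 0.860,
}


def percent_variance_explained(loadings):
    """Sum the squared loadings and divide by the number of variables."""
    squared = [r ** 2 for r in loadings.values()]
    return 100.0 * sum(squared) / len(squared)


pct = percent_variance_explained(loadings_c1)
print(round(pct, 1))  # 41.9
```

Dividing by the number of variables is appropriate here because the variables are standardized (each has variance 1), so the total variance equals the number of variables.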
Principal Components (PCs)
PCs are optimal summarizers of the data
Weights for the variables are such that the PCs explain as much variance as possible
The 1st PC explains most variance, the 2nd PC explains a little less variance, ..., the last PC explains the least amount of variance
Weights for the PCs are found using a computer
Component Matrix (SPSS terminology; more often: loading matrix)
Correlations between variables and component(s)
                Component 1    Component 2
  jittery          -.573           .509
  distressed       -.780           .409
  upset            -.823           .462
  afraid           -.672           .548
  scared           -.590           .630
  inspired          .771           .528
  excited           .529           .693
  determined        .906           .295
  interested        .771           .411
  enthusiast        .816           .455
Extraction Method: Principal Component Analysis. 2 components extracted.
Interpretation of loadings
Component Matrix (loadings < -0.50 or > 0.50 are marked with *)
                      Component 1    Component 2
  jittery                -.573*          .509*
  distressed             -.780*          .409
  upset                  -.823*          .462
  afraid                 -.672*          .548*
  scared                 -.590*          .630*
  inspired                .771*          .528*
  excited                 .529*          .693*
  determined              .906*          .295
  interested              .771*          .411
  enthusiast              .816*          .455
  Variance explained     53.68%         25.59%
Extraction Method: Principal Component Analysis. 2 components extracted.
Component 1: contrast between positive and negative items
Component 2: general level
Total variance explained: 53.68% + 25.59% = 79.27%
(Note: summing the variance explained is only allowed with orthogonal (= uncorrelated) components)

Loadings in a graph
[Figure: loadings of the ten items plotted against component 1 (contrast) and component 2 (general level); afraid, scared, upset, jittery and distressed form one cluster, excited, inspired, interested, determined and enthusiast the other]
Alternative graph...
[Figure: the same loadings plotted against rotated axes, component 1 (positive affect) and component 2' (negative affect); upset, afraid, scared, distressed and jittery lie along the negative affect axis, excited, inspired, interested, enthusiast and determined along the positive affect axis]
Rotated Component Matrix
                Component 1    Component 2
  JITTERY       -7.13E-02         .763
  DISTRESS        -.291           .831
  UPSET           -.285           .899
  AFRAID          -.117           .859
  SCARED        -2.08E-04         .863
  INSPIRED         .924          -.141
  EXCITED          .860           .145
  DETERMIN         .863          -.403
  INTEREST         .844          -.227
  ENTHUSIA         .907          -.225
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

Rotation
Alternative representation of the loadings to achieve a better interpretation
Total amount of variance explained remains equal...
...but the variance explained per component changes
Infinite number of rotations possible
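The two rotation properties listed above can be checked numerically. The sketch below multiplies the unrotated loadings by the component transformation matrix that SPSS reports for this example (both taken from the slides) and compares the variance explained before and after. The function names are assumptions made for this illustration, and small differences from the SPSS output arise because the published loadings are rounded to three decimals.

```python
# Unrotated loadings from the slides: variable -> (component 1, component 2).
unrotated = {
    "jittery": (-0.573, 0.509), "distressed": (-0.780, 0.409),
    "upset": (-0.823, 0.462), "afraid": (-0.672, 0.548),
    "scared": (-0.590, 0.630), "inspired": (0.771, 0.528),
    "excited": (0.529, 0.693), "determined": (0.906, 0.295),
    "interested": (0.771, 0.411), "enthusiast": (0.816, 0.455),
}

# Component transformation matrix reported by SPSS for the varimax rotation.
T = ((0.730, -0.683),
     (0.683, 0.730))


def rotate(loadings, t):
    """Post-multiply every row of loadings by the 2x2 matrix t."""
    return {
        name: (a * t[0][0] + b * t[1][0], a * t[0][1] + b * t[1][1])
        for name, (a, b) in loadings.items()
    }


def variance_per_component(loadings):
    """Sum of squared loadings for each of the two components."""
    return tuple(sum(row[k] ** 2 for row in loadings.values()) for k in range(2))


rotated = rotate(unrotated, T)
before = variance_per_component(unrotated)
after = variance_per_component(rotated)

print([round(v, 2) for v in before], round(sum(before), 2))
print([round(v, 2) for v in after], round(sum(after), 2))
```

The totals agree up to rounding, while the split over the two components shifts from roughly 5.37 and 2.56 to roughly 4.05 and 3.87.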
Orthogonal (uncorrelated) versus oblique components
Orthogonal: component scores uncorrelated
Oblique: component scores correlated
Interpretation: are anxiety and depression correlated or not?

Orthogonal case: r(comp_1, comp_2) = 0
Orthogonal versus oblique rotation
Options for rotation in SPSS
  Orthogonal: Varimax, Quartimax, Equamax
  Oblique: Oblimin, Promax

Choice of number of components
Goal: balance between a good description and parsimony
Criteria:
  Eigenvalues > 1 (default SPSS): each retained component explains more than (1/J) x 100% of the variance
  Scree test: use the number of components before the scree in the scree plot
  Content
Eigenvalue is proportional to % variance explained
[Figure: Scree Plot, eigenvalue plotted against component number (1 to 10), with the scree indicated]
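A minimal sketch of the eigenvalue criterion, using the ten eigenvalues reported in the SPSS output of this example. It assumes a correlation matrix was analysed, so that the eigenvalues sum to the number of variables J; the function names are hypothetical.

```python
# Eigenvalues from the SPSS "Total Variance Explained" table of the PANAS example.
eigenvalues = [5.368, 2.559, 0.750, 0.477, 0.351,
               0.268, 0.161, 0.04329, 0.01679, 0.005098]


def percent_variance(eigs):
    """Convert eigenvalues to % variance explained (eigenvalue / J * 100)."""
    j = len(eigs)  # for a correlation matrix the eigenvalues sum to J
    return [100.0 * e / j for e in eigs]


def kaiser_count(eigs):
    """Number of components with eigenvalue > 1 (the SPSS default criterion)."""
    return sum(1 for e in eigs if e > 1.0)


pct = percent_variance(eigenvalues)
n_keep = kaiser_count(eigenvalues)
print([round(p, 2) for p in pct[:2]], n_keep)  # [53.68, 25.59] 2
```

An eigenvalue of exactly 1 corresponds to (1/J) x 100% of the variance, which is what a single standardized variable contributes on its own.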
SPSS output

Communalities
                Initial    Extraction
  JITTERY        1.000        .587
  DISTRESS       1.000        .775
  UPSET          1.000        .890
  AFRAID         1.000        .751
  SCARED         1.000        .745
  INSPIRED       1.000        .874
  EXCITED        1.000        .760
  DETERMIN       1.000        .907
  INTEREST       1.000        .764
  ENTHUSIA       1.000        .873
Extraction Method: Principal Component Analysis.
Extraction column: variance explained per variable

Total Variance Explained
             Initial Eigenvalues                   Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
  Component  Total      % of Variance  Cumulative %  Total   % of Variance  Cumulative %    Total   % of Variance  Cumulative %
  1          5.368      53.681          53.681       5.368   53.681         53.681          4.058   40.579         40.579
  2          2.559      25.592          79.273       2.559   25.592         79.273          3.869   38.694         79.273
  3           .750       7.502          86.775
  4           .477       4.770          91.546
  5           .351       3.514          95.059
  6           .268       2.677          97.736
  7           .161       1.612          99.348
  8          4.329E-02    .433          99.781
  9          1.679E-02    .168          99.949
  10         5.098E-03   5.098E-02     100.000
Extraction Method: Principal Component Analysis.
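The extraction communalities can be recovered from the loadings: with orthogonal components, the variance explained per variable is the sum of its squared loadings. A minimal sketch, assuming the two-component loadings from the slides and a hypothetical helper name:

```python
# Unrotated loadings from the slides: variable -> (component 1, component 2).
loadings = {
    "JITTERY": (-0.573, 0.509), "DISTRESS": (-0.780, 0.409),
    "UPSET": (-0.823, 0.462), "AFRAID": (-0.672, 0.548),
    "SCARED": (-0.590, 0.630), "INSPIRED": (0.771, 0.528),
    "EXCITED": (0.529, 0.693), "DETERMIN": (0.906, 0.295),
    "INTEREST": (0.771, 0.411), "ENTHUSIA": (0.816, 0.455),
}


def communalities(loads):
    """Communality per variable: sum of squared loadings over the components."""
    return {name: sum(x ** 2 for x in row) for name, row in loads.items()}


h2 = communalities(loadings)
for name, value in h2.items():
    print(f"{name:<10}{value:.3f}")
```

The printed values match the Extraction column up to rounding of the published loadings. Summing them over all variables gives the total variance explained by the two components.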
Component Matrix and Rotated Component Matrix
                Component Matrix (a)         Rotated Component Matrix (b)
                1          2                 1            2
  JITTERY      -.573       .509             -7.13E-02      .763
  DISTRESS     -.780       .409              -.291         .831
  UPSET        -.823       .462              -.285         .899
  AFRAID       -.672       .548              -.117         .859
  SCARED       -.590       .630             -2.08E-04      .863
  INSPIRED      .771       .528               .924        -.141
  EXCITED       .529       .693               .860         .145
  DETERMIN      .906       .295               .863        -.403
  INTEREST      .771       .411               .844        -.227
  ENTHUSIA      .816       .455               .907        -.225
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. 2 components extracted.
b. Rotation converged in 3 iterations.

Component Transformation Matrix
  Component     1        2
  1            .730    -.683
  2            .683     .730
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

[Figure: Component Plot in Rotated Space, Component 1 against Component 2; upset, distress, afraid, scared and jittery load on Component 2, excited, inspired, interest, enthusia and determin on Component 1]
Use of PCA for confirmatory purposes
PCA yields the correct (expected) solution: no problem
PCA yields an incorrect solution: more components and/or a different rotation might yield the desired solution
CFA versus CA
X = C + U, with
  X: observed variable
  C: common part of the variable
  U: unique part of the variable (structural + measurement error)
CFA: a factor is a linear combination of C
CA: a component (factor) is a linear combination of X

Advantages of CFA over CA:
  The distinction between common and unique parts is theoretically elegant
  The model of CFA is more realistic
Problem with CFA (and not with CA):
  The distinction between C and U can be made in many different ways, hence many different model estimates
Estimation of the CFA model: not X = C + U, but Var(X) = Var(C) + Var(U), with
  Var(C): common variance = communality
  Var(U): unique variance
Different estimation methods for Var(U) (e.g., maximum likelihood, unweighted least squares)
Factors are estimated from the (co)variances of the common part of the variables
PANAS: Exploratory FA

Communalities (a)
                Initial
  JITTERY        .910
  DISTRESS       .952
  UPSET          .969
  AFRAID         .893
  SCARED         .973
  INSPIRED       .946
  EXCITED        .880
  DETERMIN       .985
  INTEREST       .883
  ENTHUSIA       .988
Extraction Method: Maximum Likelihood.
a. One or more communality estimates greater than 1 were encountered during iterations. The resulting solution should be interpreted with caution.

Var(X) = Var(C) + Var(U)
  Var(C): communality
  Var(U): unique variance
  Var(X) = 1 (analysis of the correlation matrix)
Var(U) < 0: highly suspect estimates!

Total Variance Explained
            Initial Eigenvalues                      Rotation Sums of Squared Loadings
            (observed data; same as PCA)             (common part of data)
  Factor    Total      % of Variance  Cumulative %   Total   % of Variance  Cumulative %
  1         5.368      53.681          53.681        3.834   38.341         38.341
  2         2.559      25.592          79.273        3.529   35.288         73.629
  3          .750       7.502          86.775
  4          .477       4.770          91.546
  5          .351       3.514          95.059
  6          .268       2.677          97.736
  7          .161       1.612          99.348
  8         4.329E-02    .433          99.781
  9         1.679E-02    .168          99.949
  10        5.098E-03   5.098E-02     100.000
Extraction Method: Maximum Likelihood.
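Because Var(X) = 1 when a correlation matrix is analysed, the unique variance follows directly as 1 minus the communality, and a communality above 1 implies a negative Var(U). The sketch below applies this to the initial communalities listed above. The function names and the value 1.02 are hypothetical and serve only to illustrate the check.

```python
# Initial communalities from the SPSS output (maximum likelihood extraction).
initial_communalities = {
    "JITTERY": 0.910, "DISTRESS": 0.952, "UPSET": 0.969, "AFRAID": 0.893,
    "SCARED": 0.973, "INSPIRED": 0.946, "EXCITED": 0.880, "DETERMIN": 0.985,
    "INTEREST": 0.883, "ENTHUSIA": 0.988,
}


def unique_variance(communality, total_variance=1.0):
    """Var(U) = Var(X) - Var(C); Var(X) = 1 for standardized variables."""
    return total_variance - communality


def suspect_variables(communalities):
    """Names of variables whose estimated unique variance is negative."""
    return [name for name, c in communalities.items() if unique_variance(c) < 0]


print(round(unique_variance(initial_communalities["JITTERY"]), 3))  # 0.09
print(suspect_variables(initial_communalities))                     # []
# Hypothetical communality above 1, as can occur during the iterations:
print(suspect_variables({"EXAMPLE": 1.02}))                         # ['EXAMPLE']
```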
Goodness-of-fit Test
  Chi-Square    df    Sig.
  32.497        26    .177
H0: model fits in the population
H1: model does not fit in the population
Here: H0 not rejected. Seems nice, but:
  Sensitive to sample size (with small n, p-values are usually large)
  Moreover: is H0 ever exactly true in the population?
Alternative fit measures: RMSEA, AIC, BIC, ...

Rotated Factor Matrix
                Factor 1       Factor 2
  JITTERY      -7.75E-02         .720
  DISTRESS       -.257           .862
  UPSET          -.262           .954
  AFRAID         -.147           .731
  SCARED       -3.17E-02         .705
  INSPIRED        .903          -.168
  EXCITED         .824           .161
  DETERMIN        .857          -.372
  INTEREST        .804          -.249
  ENTHUSIA        .892          -.267
Extraction Method: Maximum Likelihood. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
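The significance value in the goodness-of-fit table is the upper-tail probability of a chi-square distribution with 26 degrees of freedom at 32.497. As a hedged sketch, it can be reproduced with the standard library alone by using the closed-form expression that holds for even degrees of freedom. The function name is an assumption for this example; a statistics package would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(32.497, 26)
print(round(p_value, 3))  # 0.177
```

With p = .177 above the conventional .05 level, H0 is not rejected, in line with the slide.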
Measurement models
[Diagram: latent variable T with arrows to the observed items X_1, X_2, ..., X_n; each item has its own measurement error]
with
  X_i: observed score on item i
  T: score on the latent variable
Confirmatory common factor analysis
Example model:
[Diagram: factors F1 and F2 with observed variables X1, X2, X3 and X4; each X_i has its own error term (Error 1 to Error 4)]

Loadings:
        F1    F2
  X1    x     0
  X2    x     0
  X3    0     x
  X4    0     x
x: estimated
0: fixed at 0
Assessment of model fit: Chi2-test, fit measures such as RMSEA, SRMR
Software
Packages: SPSS, CEFA*, Lisrel/AMOS/EQS/Mx*, Mplus, Testfact, OPLM*
Methods: IRT (limited), reliability measures, PCA **, MGM, exploratory CFA, confirmatory CFA
*Free
**Using a macro, to be downloaded from http://www.gmw.rug.nl/~sda/