Factor Rotations in Factor Analyses.



Similar documents
Partial Least Squares (PLS) Regression.

Partial Least Square Regression PLS-Regression

Introduction to Principal Components and FactorAnalysis

Metric Multidimensional Scaling (MDS): Analyzing Distance Matrices

Common factor analysis

The Method of Least Squares

Component Ordering in Independent Component Analysis Based on Data Power

Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models

Multiple Correspondence Analysis

The Kendall Rank Correlation Coefficient

4. There are no dependent variables specified... Instead, the model is: VAR 1. Or, in terms of basic measurement theory, we could model it as:

PRINCIPAL COMPONENT ANALYSIS

Overview of Factor Analysis

Factor Analysis. Chapter 420. Introduction

Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

Topic 10: Factor Analysis

Review Jeopardy. Blue vs. Orange. Review Jeopardy

1 Overview. Fisher s Least Significant Difference (LSD) Test. Lynne J. Williams Hervé Abdi

The Bonferonni and Šidák Corrections for Multiple Comparisons

Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA

Inner Product Spaces and Orthogonality

Factor analysis. Angela Montanari

Multivariate Analysis (Slides 13)

Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

How to report the percentage of explained common variance in exploratory factor analysis

Factor Analysis. Advanced Financial Accounting II Åbo Akademi School of Business

Statistics in Psychosocial Research Lecture 8 Factor Analysis I. Lecturer: Elizabeth Garrett-Mayer

Dimensionality Reduction: Principal Components Analysis

1 Overview and background

Factor Analysis. Sample StatFolio: factor analysis.sgp

How To Run Factor Analysis

What is Rotating in Exploratory Factor Analysis?

Soft Clustering with Projections: PCA, ICA, and Laplacian

α = u v. In other words, Orthogonal Projection

Nonlinear Iterative Partial Least Squares Method

A Brief Introduction to Factor Analysis

Exploratory Factor Analysis

Math 550 Notes. Chapter 7. Jesse Crawford. Department of Mathematics Tarleton State University. Fall 2010

Choosing the Right Type of Rotation in PCA and EFA James Dean Brown (University of Hawai i at Manoa)

2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) F 2 X 4 U 4

x = + x 2 + x

Similarity and Diagonalization. Similar Matrices

Linear Algebra Review. Vectors

Principal component analysis (PCA) is probably the

Statistics for Business Decision Making

Math 215 HW #6 Solutions

Data Mining: Algorithms and Applications Matrix Math Review

What Are Principal Components Analysis and Exploratory Factor Analysis?

Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA

Factor Analysis Example: SAS program (in blue) and output (in black) interleaved with comments (in red)

Exploratory Factor Analysis

Psychology 7291, Multivariate Analysis, Spring SAS PROC FACTOR: Suggestions on Use

T : Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari :

A metrics suite for JUnit test code: a multiple case study on open source software

Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003

1 VECTOR SPACES AND SUBSPACES

Categorical Data Visualization and Clustering Using Subjective Factors

Factor Analysis: Statnotes, from North Carolina State University, Public Administration Program. Factor Analysis

Multivariate Analysis of Variance (MANOVA)

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Using Principal Components Analysis in Program Evaluation: Some Practical Considerations

Orthogonal Diagonalization of Symmetric Matrices

521493S Computer Graphics. Exercise 2 & course schedule change

A Beginner s Guide to Factor Analysis: Focusing on Exploratory Factor Analysis

Exploratory Factor Analysis

Similar matrices and Jordan form

Chapter 6. Orthogonality

Practical Considerations for Using Exploratory Factor Analysis in Educational Research

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Subspace Analysis and Optimization for AAM Based Face Alignment

Multiple factor analysis: principal component analysis for multitable and multiblock data sets

Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

Issues in Information Systems Volume 15, Issue II, pp , 2014

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

Least-Squares Intersection of Lines

MAT188H1S Lec0101 Burbulla

Section 1.1. Introduction to R n

To do a factor analysis, we need to select an extraction method and a rotation method. Hit the Extraction button to specify your extraction method.

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

LINEAR ALGEBRA W W L CHEN

Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig

DATA VERIFICATION IN ETL PROCESSES

Multiple regression - Matrices

x1 x 2 x 3 y 1 y 2 y 3 x 1 y 2 x 2 y 1 0.

Adding vectors We can do arithmetic with vectors. We ll start with vector addition and related operations. Suppose you have two vectors

THREE DIMENSIONAL GEOMETRY

Introduction to Matrix Algebra

Text Analytics (Text Mining)

Vector Math Computer Graphics Scott D. Anderson

Inner product. Definition of inner product

ME 115(b): Solution to Homework #1

1 Introduction to Matrices

How To Cluster

FACTOR ANALYSIS NASC

SALEM COMMUNITY COLLEGE Carneys Point, New Jersey COURSE SYLLABUS COVER SHEET. Action Taken (Please Check One) New Course Initiated

Transcription:

Factor Rotations in Factor Analyses. Hervé Abdi 1 The University of Texas at Dallas Introduction The different methods of factor analysis first extract a set a factors from a data set. These factors are almost always orthogonal and are ordered according to the proportion of the variance of the original data that these factors explain. In general, only a (small) subset of factors is kept for further consideration and the remaining factors are considered as either irrelevant or nonexistent (i.e., they are assumed to reflect measurement error or noise). In order to make the interpretation of the factors that are considered relevant, the first selection step is generally followed by a rotation of the factors that were retained. Two main types of rotation are used: orthogonal when the new axes are also orthogonal to each other, and oblique when the new axes are not required to be orthogonal to each other. Because the rotations are always performed in a subspace (the so-called factor space), the new axes will always explain less variance than the original factors (which are computed to be optimal), but obviously the part of variance explained by the total subspace after rotation is the same as it was before rotation (only the partition of the variance has changed). Because the rotated axes are not defined according to a statistical criterion, their raison d être is to facilitate the interpretation. In this article, I illustrate the rotation procedures using the loadings of variables analyzed with principal component analysis (the so-called R-mode), but the methods described here are valid also for other types of analysis and when analyzing the subjects scores (the so-called Q-mode). Before proceeding further, it is important to stress that because the rotations always take place in a subspace (i.e., the space of the retained factors), the choice of this subspace strongly influences the result of the rotation. Therefore, in the practice of rotation in factor analysis, it is strongly recommended to try several sizes for the subspace of the retained factors in order to assess the robustness of the interpretation of the rotation. Notations 1 In: Lewis-Beck M., Bryman, A., Futing T. (Eds.) (2003). Encyclopedia of Social Sciences Research Methods. Thousand Oaks (CA): Sage. Address correspondence to Hervé Abdi Program in Cognition and Neurosciences, MS: Gr.4.1, The University of Texas at Dallas, Richardson, TX 75083 0688, USA E-mail: herve@utdallas.edu http://www.utdallas.edu/ herve 1

To make the discussion more concrete, I will describe rotations within the principal component analysis (pca) framework. Pca starts with a data matrix denoted Y with I rows and J columns, where each row represents a unit (in general subjects) described by J measurements which are almost always expressed as Z-scores. The data matrix is then decomposed into scores (for the subjects) or components and loadings for the variables (the loadings are, in general, correlations between the original variables and the components extracted by the analysis). Formally, pca is equivalent to the singular value decomposition of the data matrix as: Y = P Q T (1) with the constraints that Q T Q = P T P = I (where I is the identity matrix), and being a diagonal matrix with the so-called singular values on the diagonal. The orthogonal matrix Q is the matrix of loadings (or projections of the original variables onto the components), where one row stands for one of the original variables and one column for one of the new factors. In general, the subjects score matrix is obtained as F = P (some authors use P as the scores). Rotating: When and why? Most of the rationale for rotating factors comes from Thurstone (1947) and Cattell (1978) who defended its use because this procedure simplifies the factor structure and therefore makes its interpretation easier and more reliable (i.e., easier to replicate with different data samples). Thurstone suggested five criteria to identify a simple structure. According to these criteria, still often reported in the literature, a matrix of loadings (where the rows correspond to the original variables and the columns to the factors) is simple if: 1. each row contains at least one zero; 2. for each column, there are at least as many zeros as there are columns (i.e., number of factors kept); 3. for any pair of factors, there are some variables with zero loadings on one factor and large loadings on the other factor; 4. for any pair of factors, there is a sizable proportion of zero loadings; 5. for any pair of factors, there is only a small number of large loadings. Rotations of factors can (and used to) be done graphically, but are mostly obtained analytically and necessitate to specify mathematically the notion of simple structure in order to implement it with a computer program. Orthogonal rotation An orthogonal rotation is specified by a rotation matrix denoted R, where the rows stand for the original factors and the columns for the new (rotated) factors. At the intersection of row m and column n we have the cosine of the 2

angle between the original axis and the new one: r m,n = cos θ m,n. For example the rotation illustrated in Figure 1 will be characterized by the following matrix: [ ] [ ] cos θ1,1 cos θ R = 1,2 cos θ1,1 sin θ = 1,1, (2) cos θ 2,1 cos θ 2,2 sinθ 1,1 cos θ 1,1 with a value of θ 1,1 = 15 degrees. A rotation matrix has the important property of being orthonormal because it corresponds to a matrix of direction cosines and therefore R T R = I. y 2 θ 2,2 x 2 θ 1,2 y 1 θ 1,1 θ 2,1 x 1 Figure 1: An orthogonal rotation in 2 dimensions. The angle of rotation between an old axis m and a new axis n is denoted by θ m,n. Varimax Varimax, which was developed by Kaiser (1958), is indubitably the most popular rotation method by far. For varimax a simple solution means that each factor has a small number of large loadings and a large number of zero (or small) loadings. This simplifies the interpretation because, after a varimax rotation, each original variable tends to be associated with one (or a small number) of factors, and each factor represents only a small number of variables. In addition, the factors can often be interpreted from the opposition of few variables with positive loadings to few variables with negative loadings. Formally varimax searches for a rotation (i.e., a linear combination) of the original factors such that the variance of the loadings is maximized, which amounts to maximizing V = ( q 2 j,l q 2 j,l) 2, (3) 3

Table 1: An (artificial) example for pca and rotation. Five wines are described by seven variables. For For Hedonic meat dessert Price Sugar Alcohol Acidity Wine 1 14 7 8 7 7 13 7 Wine 2 10 7 6 4 3 14 7 Wine 3 8 5 5 10 5 12 5 Wine 4 2 4 7 16 7 11 3 Wine 5 6 2 4 13 3 10 3 Table 2: Wine example: Original loadings of the seven variables on the first two components. For For Hedonic meat dessert Price Sugar Alcohol Acidity Factor 1 0.3965 0.4454 0.2646 0.4160 0.0485 0.4385 0.4547 Factor 2 0.1149 0.1090 0.5854 0.3111 0.7245 0.0555 0.0865 with q 2 j,l being the squared loading of the jth variable on the l factor, and q2 j,l being the mean of the squared loadings (for computational stability, each of the loadings matrix is generally scaled to length one prior to the minimization procedure). Other orthogonal rotations There are several other methods for orthogonal rotation such as the quartimax rotation, which minimizes the number of factors needed to explain each variable, and the equimax rotation which is a compromise between varimax and quartimax. Other methods exist, but none approaches varimax in popularity. An example of orthogonal varimax rotation To illustrate the procedure for a varimax rotation, suppose that we have 5 wines described by the average rating of a set of experts on their hedonic dimension, how much the wine goes with dessert, how much the wine goes with meat; each wine is also described by its price, its sugar and alcohol content, and its acidity. The data are given in Table 1. A pca of this table extracts four factors (with eigenvalues of 4.7627, 1.8101, 0.3527, and 0.0744, respectively), and a 2 factor solution (corresponding to the components with an eigenvalue larger than unity) explaining 94% of the variance is kept for rotation. The loadings are given in Table 2. The varimax rotation procedure applied to the table of loadings gives a clockwise rotation of 15 degrees (corresponding to a cosine of.97). This gives the new set of rotated factors shown in Table 3. The rotation procedure is illustrated in Figures 2 to 4. In this example, the improvement in the simplicity of the 4

Table 3: Wine example: Loadings, after varimax rotation, of the seven variables on the first two components. For For Hedonic meat dessert Price Sugar Alcohol Acidity Factor 1 0.4125 0.4057 0.1147 0.4790 0.1286 0.4389 0.4620 Factor 2 0.0153 0.2138 0.6321 0.2010 0.7146 0.0525 0.0264 interpretation is somewhat marginal, maybe because the factorial structure of such a small data set is already very simple. The first dimension remains linked to price and the second dimension now appears more clearly as the dimension of sweetness. x 2 Hedonic Acidity Alcohol For meat x 1 Price For dessert Sugar Figure 2: Pca: Original loadings of the seven variables for the wine example. Oblique Rotations In oblique rotations the new axes are free to take any position in the factor space, but the degree of correlation allowed among factors is, in general, small because two highly correlated factors are better interpreted as only one factor. Oblique rotations, therefore, relax the orthogonality constraint in order to gain simplicity in the interpretation. They were strongly recommended by Thurstone, but are used more rarely than their orthogonal counterparts. 5

x 2 y 2 Hedonic Acidity Alcohol For meat x 1 θ = 15 o Price y 1 For dessert Sugar Figure 3: Pca: The loading of the seven variables for the wine example showing the original axes and the new (rotated) axes derived from varimax. Table 4: Wine example. Promax oblique rotation solution: correlation of the seven variables on the first two components. For For Hedonic meat dessert Price Sugar Alcohol Acidity Factor 1 0.8783 0.9433 0.4653 0.9562 0.0271 0.9583 0.9988 Factor 2 0.1559 0.4756 0.9393 0.0768 0.9507 0.2628 0.2360 For oblique rotations, the promax rotation has the advantage of being fast and conceptually simple. Its name derives from procrustean rotation because it tries to fit a target matrix which has a simple structure. It necessitates two steps. The first step defines the target matrix, almost always obtained as the result of a varimax rotation whose entries are raised to some power (typically between 2 and 4) in order to force the structure of the loadings to become bipolar. The second step is obtained by computing a least square fit from the varimax solution to the target matrix. Even though, in principle, the results of oblique rotations could be presented graphically they are almost always interpreted by looking at the correlations between the rotated axis and the original variables. These correlations are interpreted as loadings. For the wine example, using as target the varimax solution raised to the 4th power, for a promax rotation gives the loadings shown in Table 4. From these 6

y 2 Hedonic Acidity Alcohol y 1 For meat Price For dessert Sugar Figure 4: Pca: The loadings after varimax rotation of the seven variables for the wine example. loadings one can find that, the first dimension once again isolates the price (positively correlated to the factor) and opposes it to the variables: alcohol, acidity, going with meat, and hedonic component. The second axis shows only a small correlation with the first one (r =.01). It corresponds to the sweetness/going with dessert factor. Once again, the interpretation is quite similar to the original pca because the small size of the data set favors a simple structure in the solution. Oblique rotations are much less popular than their orthogonal counterparts, but this trend may change as new techniques, such as independent component analysis, are developed. This recent technique, originally created in the domain of signal processing and neural networks, derives directly from the data matrix an oblique solution that maximizes statistical independence. *References [1] Hyvärinen, A., Karhunen, J., & Oja, E. (2001). Independent component analysis. New York: Wiley. [2] Kim, J.O. & Mueller, C.W. (1978). Factor analysis: statistical methods and practical issues. Thousand Oaks (CA): Sage Publications. [3] Kline, P. (1994). An easy guide to factor analysis. Thousand Oaks (CA): Sage Publications. 7

[4] Mulaik, S.A. (1972). The foundations of factor analysis. New York: McGraw- Hill. [5] Nunnally, J.C. & Berstein, I.H. (1994). Psychometric theory (3rd Edition). New York: McGraw-Hill. [6] Reyment, R.A. & Jöreskog, K.G. (1993). Applied factor analysis in the natural sciences. Cambridge: Cambridge University Press. [7] Rummel, R.J. (1970). Applied factor analysis. Evanston: Northwestern University Press. 8