Exploratory Factor Analysis

Definition

Exploratory factor analysis (EFA) is a procedure for learning the extent to which k observed variables might measure m abstract variables, wherein m is less than k. In EFA, we indirectly measure non-observable behavior by taking measures on multiple observed behaviors. Conceptually, in using EFA we can assume either nominalist or realist constructs, yet most applications of EFA in the social sciences assume realist constructs.

Assumptions

1. Typically, realism rather than nominalism: Abstract variables are real in their consequences.
2. Normally distributed observed variables.
3. Continuous-level data.
4. Linear relationships among the observed variables.
5. Content validity of the items used to measure an abstract concept.
6. $E(e_i) = 0$ (random error).
7. All observed variables are influenced by all factors (see: model specification in CFA).
8. A sample size greater than 30 (more is better).

Terminology (lots of synonyms): Factor = Abstract Concept = Abstract Construct = Latent Variable = Eigenvector.

Comparison of Exploratory Factor Analysis and OLS Regression

In OLS regression, we seek to predict a point, a value of a dependent variable (y) from the value of an independent variable (x). The diagram below indicates the value of y expected from a given value of x. The error represents the extent to which we fail in predicting y from x.

In EFA, we seek to predict a vector that best describes a relationship between the items used to measure the vector. The diagram below indicates the value of the vector F, expected from the correlation of $X_1$ and $X_2$. The error represents the extent to which we fail in predicting the vector from the correlation of $X_1$ and $X_2$. EFA assumes that $X_1$ and $X_2$ are linearly dependent, based upon their relationship to some underlying (i.e., abstract, latent) variable (i.e., construct, concept).

In OLS regression, we solve the (standardized) equation: $Y = X\beta + \varepsilon$, where:
Y is a vector of dependent variables,
$\beta$ is a vector of parameter estimates,
X is a vector of independent variables,
$\varepsilon$ is a vector of errors.

In EFA, we solve the (standardized) equation: $X = \lambda F + \varepsilon$, where:
X is a vector of k observed variables,
$\lambda$ is a vector of k parameter estimates,
F is a vector of m factors (abstract concepts, latent variables),
$\varepsilon$ is a vector of k errors.

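The EFA Model

Consider this simple model that consists of a single factor with two observed variables:

[Path diagram: F points to $X_1$ with loading $\lambda_1$ and to $X_2$ with loading $\lambda_2$; each $X_i$ also receives an error term $\varepsilon_i$ with coefficient $\theta_i$.]

Note: When we address the topic of confirmatory factor analysis, we will designate abstract concepts with the Greek letters $\xi$ and $\eta$. Because most literature on EFA uses the designation F, we will use it in this lecture.

We have two equations to solve:
$X_1 = \lambda_1 F + \theta_1 \varepsilon_1$
$X_2 = \lambda_2 F + \theta_2 \varepsilon_2$

1. $\mathrm{var}(X_i) = E(X_i - \bar{X})^2$. Note: for standardized variables, the mean of X = 0.
2. Thus, $\mathrm{var}(X_i) = E(X_i^2)$.
3. $X_i = \lambda_i F + \theta_i \varepsilon_i$.
4. $\mathrm{var}(X_i) = E(\lambda_i F + \theta_i \varepsilon_i)^2$.
5. $\mathrm{var}(X_i) = \lambda_i^2 E[F^2] + \theta_i^2 E[\varepsilon_i^2] + 2\lambda_i\theta_i E[F\varepsilon_i]$.
6. $\mathrm{var}(X_i) = \lambda_i^2 \mathrm{var}(F) + \theta_i^2 \mathrm{var}(\varepsilon_i) + 2\lambda_i\theta_i \mathrm{cov}(F, \varepsilon_i)$.

Assume:
1. $\mathrm{cov}(F, \varepsilon_i) = 0$ (i.e., random errors in measurement).
2. $\mathrm{var}(F) = 1$ (i.e., standardized measure of F, or ontologically, "the construct has a unit value").
3. $\mathrm{var}(\varepsilon_i) = 1$ (i.e., standardized measure of $\varepsilon$, or ontologically, "the construct has a unit value").

Therefore:
1. $\mathrm{var}(X_i) = \lambda_i^2 + \theta_i^2 = 1$ (i.e., X is a standardized variable).
2. $\mathrm{cov}(F, X_i) = \lambda_i \mathrm{var}(F) + \theta_i \mathrm{cov}(F, \varepsilon_i) = \lambda_i$, because $\mathrm{var}(F) = 1$ and $\mathrm{cov}(F, \varepsilon_i) = 0$.
3. Then, for standardized variables, $\lambda_i = r_{F,X_i}$ (i.e., the correlation of F and $X_i$).
4. Example: $\mathrm{cov}(X_1, X_2) = \lambda_1\lambda_2 \mathrm{var}(F) = \lambda_1\lambda_2 = r_{X_1,X_2}$ (i.e., the correlation of $X_1$ and $X_2$).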
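The derivation above can be checked numerically. The following is a minimal sketch, assuming two hypothetical standardized loadings (0.8 and 0.6, chosen only for illustration) and the notation $\lambda_i$ for loadings and $\theta_i$ for error coefficients. It builds the covariance matrix implied by the one-factor model and reads off the item variances and the correlation between the two items.

```python
import numpy as np

# Hypothetical standardized loadings for two items on one factor (illustrative values).
lam = np.array([0.8, 0.6])

# Error coefficients implied by var(X_i) = lam_i^2 + theta_i^2 = 1.
theta = np.sqrt(1.0 - lam**2)

# Model-implied covariance matrix with var(F) = 1, var(e_i) = 1, cov(F, e_i) = 0:
# lam * lam' for the common part plus diag(theta^2) for the error part.
implied = np.outer(lam, lam) + np.diag(theta**2)

item_variances = np.diag(implied)   # each should equal 1 (standardized items)
r_x1_x2 = implied[0, 1]             # should equal lam_1 * lam_2

print("item variances:", item_variances)
print("correlation of X1 and X2:", r_x1_x2)
```

With these values the off-diagonal element is 0.8 * 0.6 = 0.48, the product of the two loadings, which is the result stated in step 4 of the derivation.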

Summary:
1. The parameter estimate (i.e., "factor loading"), $\lambda_i = r_{F,X_i}$ (i.e., for principal components factor analysis, this parameter is identical to the standardized regression coefficient $\beta$).
2. The product of two factor loadings for two variables caused by the same factor (i.e., factorial complexity = 1) is equal to the correlation between the two observed variables.
3. The "communality" or item reliability of $X_i$ is equal to $\lambda_i^2$.
4. In principal components exploratory factor analysis, the communality of $X_i$ is identical in concept to the coefficient of determination (R-square) in OLS regression analysis.

[Note: Later, we will discuss various forms of EFA. Principal components EFA relies upon the unweighted correlation matrix among the observed variables, and therefore is analogous to OLS regression analysis with a known number of factors.]

Estimating the EFA Model

1. $X_i$ is caused by $F_m$, where m = the number of factors.
2. F causes $X_i$, where i = 1, ..., k and k = the number of items that are caused by F.
3. $X_i = \lambda_i F_m + \theta_i \varepsilon_i$.
4. To solve this equation, we need to measure F.
5. Our approach:
a. We know $X_i$ (the observed variable).
b. We will estimate $\lambda_i$ and use this estimate to determine $\theta_i$ [i.e., $\lambda_i^2 + \theta_i^2 = 1$].
6. Because $X_i$ can be caused by m factors, EFA becomes an exercise in determining the number of factors that cause $X_i$ and the parameter estimates ($\lambda_i$) of each F on each $X_i$.

Determining the Number of Factors That Affect Each Observed Variable

A factor is an abstract concept. In a realist (vs. nominalist) sense, this concept "causes" observable behavior in the same manner that the length of a table top "causes" the ruler to measure its longest dimension as its length.
If one were to measure the longest dimension of a table top twice, and the table top did not change in its dimensions between the two measurements of it, and the measurements were taken carefully, and the measuring instrument (i.e., the ruler) were stable and consistent rather than wiggly and wobbly, then the two measurements should equal one another exactly. Similarly, if one were to measure self-esteem twice using, for example, the Rosenberg Self-Esteem Scale, and self-esteem did not change between the two measurements of it, and the measurements were taken carefully, and all ten items in the Rosenberg Self-Esteem Scale had equal content validity, and the Rosenberg Self-Esteem Scale itself was a stable and consistent measuring instrument, then people should respond equally to all ten items on the scale (taking into account that half the items are worded in reverse conceptual order). This result should occur because one's self-esteem "causes" one to respond accordingly to the items on the Rosenberg Self-Esteem Scale.

In mathematical terms, if the above conditions for measuring self-esteem are met, then the matrix of responses for the ten items on the scale should have a rank of 1, wherein the figures shown in columns 2-10 should be identical to those found in column 1 (assuming the items define the columns and the cases define the rows). That is, once we know a person's response to the first question in the Rosenberg Self-Esteem Scale, then we know the person's responses to the remaining nine items. Conceptually, given that each item on the scale is intended equally to reflect self-esteem, then this outcome is exactly what we would expect to observe. Thus, the ten items on the Rosenberg Self-Esteem Scale would represent a single, abstract concept (i.e., factor): self-esteem.
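The rank argument can be illustrated with simulated data. This is a minimal sketch, assuming hypothetical responses on a 1-5 scale for 50 cases and ten items (none of these numbers come from the Rosenberg scale itself). When every item simply repeats the same underlying score, the response matrix has rank 1; once a small amount of random response error is added, the rank jumps to the number of items.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_items = 50, 10

# Hypothetical "true" score for each case on a 1-5 response scale.
true_score = rng.integers(1, 6, size=n_cases).astype(float)

# Ideal case: every item reflects the true score exactly, so all columns are identical.
ideal = np.tile(true_score[:, None], (1, n_items))
rank_ideal = np.linalg.matrix_rank(ideal)

# Practical case: each response carries a little random error.
noisy = ideal + rng.normal(scale=0.5, size=ideal.shape)
rank_noisy = np.linalg.matrix_rank(noisy)

print("rank with perfectly consistent responses:", rank_ideal)
print("rank with random response error:", rank_noisy)
```

The contrast shows why a deterministic rank count is of limited use with real responses: any amount of error makes the matrix full rank even though one concept still drives most of the variation.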

With this conceptual and mathematical logic in mind, we know we can determine the number of factors affecting responses to the i = 1, ..., k items by calculating the rank of the matrix of responses to the observed variables (i.e., X) because rank less than k indicates singularity in the matrix (i.e., at least two columns are measuring the same thing). This approach is logically consistent, but it fails in practice because 1) not all items in a scale have equal content validity in reflecting the abstract concept and 2) people do not necessarily behave in a logically consistent manner. Therefore, to determine the number of factors causing responses to a set of observed variables, we need a measure of linear dependency that is probabilistic rather than deterministic.

Consider the relationship between the rank and determinant of a matrix for a system of two linear equations, wherein the rows and columns provide unique information.

$2x + 3y = 13$
$4x + 5y = 23$

$\begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 13 \\ 23 \end{bmatrix}$

Solve for x, y:
1. $2x = 13 - 3y$
2. $x = 13/2 - (3/2)y$
3. $4(13/2 - (3/2)y) + 5y = 23$
4. $26 - 6y + 5y = 23$
5. $y = 3$
6. $x = 13/2 - 9/2 = 2$.

Now, consider the relationship between the rank and determinant of a matrix for a system of two linear equations, wherein the rows and columns do not provide unique information. That is, note that the second equation is identical to 2 * the first equation.

$2x + 6y = 22$
$4x + 12y = 44$

$\begin{bmatrix} 2 & 6 \\ 4 & 12 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 22 \\ 44 \end{bmatrix}$

Solve for x, y:
1. $2x = 22 - 6y$
2. $x = 11 - 3y$
3. $4(11 - 3y) + 12y = 44$
4. $44 - 12y + 12y = 44$
5. $44 = 44$.

Result: Because of the linear dependence between row 1 and row 2 of the matrix, we cannot find a unique solution for x and y.
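The two systems above can be compared numerically. This is a minimal sketch using NumPy (the variable names are illustrative): it solves the first system and reports the rank and determinant of both coefficient matrices.

```python
import numpy as np

# First system: rows provide unique information.
A1 = np.array([[2.0, 3.0],
               [4.0, 5.0]])
b1 = np.array([13.0, 23.0])

# Second system: row 2 is exactly 2 * row 1.
A2 = np.array([[2.0, 6.0],
               [4.0, 12.0]])

solution = np.linalg.solve(A1, b1)   # unique solution for x and y

rank1 = np.linalg.matrix_rank(A1)
rank2 = np.linalg.matrix_rank(A2)
det1 = np.linalg.det(A1)             # 2*5 - 3*4 = -2
det2 = np.linalg.det(A2)             # 2*12 - 6*4 = 0

print("x, y =", solution)
print("rank and determinant of first matrix:", rank1, det1)
print("rank and determinant of second matrix:", rank2, det2)
```

The first matrix has full rank and a nonzero determinant, so x = 2 and y = 3 are recovered. The second matrix has rank 1 and a determinant of zero, which is the numerical signature of the linear dependence between its rows.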

Consider the rank of the second matrix:

$\begin{bmatrix} 2 & 6 \\ 4 & 12 \end{bmatrix}$ multiply Row 1 by 1/2: $\begin{bmatrix} 1 & 3 \\ 4 & 12 \end{bmatrix}$ multiply Row 1 by -4 and add to Row 2: $\begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix}$

The rank of this matrix equals 1. Thus, if a matrix has a perfect linear dependence, then its rank is less than k (the number of rows and columns). So, we can determine the number of factors by calculating the rank of the matrix, but this procedure requires perfect linear dependence, a result that is highly unlikely to occur in practice.

Consider the definition of an eigenvector: X is an eigenvector of a matrix A if there exists a scalar $\lambda$, such that $AX = \lambda X$. That is, an eigenvector is a representation of linear dependence in a square matrix. To find the eigenvector(s) of a matrix, we solve for X:

1. $AX = \lambda X$.
2. $AX - \lambda X = 0$.
3. However, it is impossible to subtract a scalar from a matrix. It is possible, however, to subtract a scalar from the diagonal of a matrix. So, we insert "1" into the equation in the form of the Identity matrix.
4. $(A - \lambda I)X = 0$.
5. Let $B = (A - \lambda I)$, such that $BX = 0$.
6. Note: To solve this equation, we will need to calculate the inverse of B. Not all matrices have an inverse. If a matrix has a rank less than k, then the matrix does not have an inverse. Also, if a matrix has a rank less than k, then the determinant of the matrix = 0.
7. If $BX = 0$, and B has an inverse, then $X = B^{-1}0 = 0$. This is the trivial solution: for that value of $\lambda$ the matrix A has no eigenvector, meaning no indication of linear dependence.
8. Thus, X is an eigenvector of A if and only if B does not have an inverse.
9. If B does not have an inverse, then it has Det = 0 (and therefore perfect linear dependence).
10. So, X is an eigenvector of A, if and only if: $\mathrm{Det}(A - \lambda I) = 0$ [i.e., the characteristic equation].

Unlike the rank of a matrix, which is deterministic, the determinant of a matrix is probabilistic, ranging in value from minus infinity to plus infinity. Therefore, the determinant of a matrix can be used to indicate the degree of linear dependence in a square matrix.
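For a 2 x 2 matrix the characteristic equation can be written out and solved directly. The sketch below is illustrative only: it uses the singular matrix from the rank example, forms $\mathrm{Det}(A - \lambda I) = \lambda^2 - \mathrm{trace}(A)\lambda + \mathrm{Det}(A)$, finds its roots, and compares them with the eigenvalues returned by NumPy.

```python
import numpy as np

A = np.array([[2.0, 6.0],
              [4.0, 12.0]])

# For a 2 x 2 matrix: Det(A - lambda*I) = lambda^2 - trace(A)*lambda + Det(A).
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.sort(np.roots(coeffs).real)

# Eigenvalues computed by a library routine, for comparison.
eigvals = np.sort(np.linalg.eigvals(A).real)

# Each root should make (A - lambda*I) singular, i.e. give a determinant of zero.
residuals = [np.linalg.det(A - lam * np.eye(2)) for lam in roots]

print("roots of the characteristic equation:", roots)
print("eigenvalues from numpy:", eigvals)
print("Det(A - lambda*I) at each root:", residuals)
```

One root is 0, which reflects the perfect linear dependence between the rows; the other root (14) equals the trace, because the eigenvalues of a matrix always sum to its trace.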
Thus, the solution to estimating the EFA equation is to establish a criterion of linear dependence by which to deem a matrix as containing one or more eigenvectors (i.e., factors). The approach is to solve for $\lambda$, which is called the eigenvalue of the matrix. Hand-written notes attached to this course packet describe the Power Method and Gram-Schmidt Algorithm as procedures for estimating $\lambda$, wherein the Power Method is a logically correct but impractical approach and the Gram-Schmidt Algorithm is the approach used in statistical analysis packages. An example of the matrix algebra used by the Gram-Schmidt Algorithm is attached to the course packet.
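For readers without the hand-written notes, the general idea of the Power Method can be sketched as follows. This is a textbook version, not a transcription of the attached notes; the function name `power_method`, the stopping tolerance, and the 3 x 3 correlation matrix are all assumptions made for illustration. The method repeatedly multiplies a trial vector by the matrix and rescales it, which pulls the vector toward the eigenvector with the largest eigenvalue.

```python
import numpy as np

def power_method(A, n_iter=500, tol=1e-12):
    """Estimate the largest eigenvalue of a symmetric matrix and its eigenvector."""
    x = np.ones(A.shape[0])
    x = x / np.linalg.norm(x)            # start from a unit-length trial vector
    eigenvalue = 0.0
    for _ in range(n_iter):
        y = A @ x                        # multiply by the matrix
        x_new = y / np.linalg.norm(y)    # rescale to unit length
        new_value = x_new @ A @ x_new    # Rayleigh quotient estimate of lambda
        converged = abs(new_value - eigenvalue) < tol
        x, eigenvalue = x_new, new_value
        if converged:
            break
    return eigenvalue, x

# Hypothetical correlation matrix implied by loadings of .8, .7, and .6 on one factor.
R = np.array([[1.00, 0.56, 0.48],
              [0.56, 1.00, 0.42],
              [0.48, 0.42, 1.00]])

largest, vector = power_method(R)
print("largest eigenvalue:", largest)
print("associated unit-length eigenvector:", vector)
```

The method returns only one eigenvalue at a time, and further eigenvalues require removing the part of the matrix already explained, which is one reason it is considered impractical for routine use.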

Calculation of $\Lambda$

After determining the number of factors in a matrix, the next step in estimating the EFA equation is to calculate the parameters in $\Lambda$ (discussed in detail below).

Summary

Determining the number of factors underlying a matrix of observed variables involves calculating the extent to which the matrix contains linear dependency. The rank of a matrix indicates perfect linear dependency, which is unlikely to occur in practice. The determinant of the equation for an eigenvector (i.e., wherein an eigenvector represents a factor) is probabilistic. Thus, we can calculate the determinant associated with an eigenvector to infer the presence of a factor. We achieve this goal by establishing a decision criterion by which to deem a matrix as containing one or more linear dependencies. We will discuss a mathematical logic for establishing this criterion later in this course. For principal components EFA, we will set this criterion as equal to 1. If an eigenvector has an associated eigenvalue of 1 or greater, then we will state that this vector represents an underlying abstract construct.

The number of eigenvectors in a matrix of k columns and rows is equal to k. Thus, the Gram-Schmidt Algorithm will calculate k eigenvalues for a matrix of size k. The calculation of eigenvalues is a "zero-sum" game in that the degree of linear dependency calculated for one eigenvector reduces the size of the eigenvalue for the next vector, and so on. In principal components EFA, for example, the sum of eigenvalues is equal to k.

Indeterminacy and Establishing a Scale

Unfortunately, the calculation of eigenvectors from eigenvalues is indeterminate because of the linear dependence(s) in X. Consider this matrix:

$A = \begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix}$

The eigenvalues of A are -1 and 5.

Solve for X: $(A - \lambda I)X = 0$ at $\lambda_1 = -1$.
1. $(A - (-1)I)X = 0$.
2. The vector X is: $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$
3. Then: $\left( \begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right) \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
4. So, $\begin{bmatrix} 2 & 2 \\ 4 & 4 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, or:
$2X_1 + 2X_2 = 0$
$4X_1 + 4X_2 = 0$
The second equation is twice the first, so these equations cannot be solved for unique values of $X_1$ and $X_2$!

5. To solve the equations, one of the values in the X matrix must be set to a value.
6. Let $X_2 = 1$, which indicates a "unit vector," or if you will, "The vector has the value of itself." This process is called "setting the scale" for the equation.
7. If $X_2 = 1$, then, where $\lambda_1 = -1$: $2X_1 + 2 = 0$, so $X_1 = -1$.
8. Solve for X: $(A - \lambda I)X = 0$ at $\lambda_2 = 5$:
$\left( \begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix} + (-5) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right) \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, or:
$-4X_1 + 2X_2 = 0$
$4X_1 - 2X_2 = 0$
($X_2$ is set to 1.) So, $X_1 = .5$.

The equation can be solved, but only if one of the elements of the vector is set to a value of 1. Therefore, the matrix of factor loadings is arbitrary because the eigenvectors are arbitrary.

The Philosophy of the Social Sciences

In the social sciences we measure variables that have no mass and therefore cannot be directly observed with the senses. At the same time, the social sciences are conducted under the same rules of theory development and testing as those used in the physical and life sciences. There are no exceptions or exemptions in science. If the social sciences must operate under the same rules of theory development and testing as required of all sciences, yet without the opportunity to observe phenomena through the senses (or extensions of them, such as microscopes, telescopes, and such), then some concession must be made. The concession made is the indeterminacy of measuring abstract concepts. Social sciences must assume that the abstract vector has some fixed length. Typically, this fixed length is set to 1. The result of this concession is that to some extent, all measures of abstract concepts are arbitrary.

Indeterminacy in deriving eigenvalues

1. Ontology: Must make a claim about reality. Realism: Abstract concepts are real in their consequences. Abstract concepts "exist," and this existence is equal to itself = 1.
2. Epistemology: Cannot measure something that has no concrete existence. $X = \Lambda F + \varepsilon$
a. Known: X, which is the vector of observed variables.
b. We do not know the number of F or the scores on F.
We use the GS algorithm to determine eigenvalues for each eigenvector in R (the correlation matrix). An eigenvalue is the extent to which one eigenvector is correlated with another eigenvector. If an eigenvector "stands alone," then its eigenvalue will equal 1; if it "to some extent represents an association with another eigenvector," then its eigenvalue will be greater than 1. If the eigenvalue is $\geq 1$, then we claim that we have determined the existence of an abstract variable.

c. An eigenvalue is the extent to which an eigenvector must be "altered" to reduce the determinant of R to (near) zero, wherein the lower the determinant the greater the "singularity" of R, and the greater the extent to which we identify the existence of an abstract variable. Characteristic Equation: $\mathrm{Det}(A - \lambda I) = 0$. Consider the matrix:

$\begin{bmatrix} 1 & 8 \\ 2 & 15 \end{bmatrix}$

Row 2 is nearly the double of Row 1. Setting the determinant to zero will "remove" Row 2, and thereby show singularity. If we "remove" Row 2, then we are "removing" much of the informational value of Row 1 as well. Thus, $\lambda$ will be higher than one, indicating the existence of an abstract variable that affects both rows.
d. We cannot solve the characteristic equation for an eigenvector unless we reduce the indeterminacy in the system of equations defined by A. One of the vectors of A must be set to a constant. Thus, ontologically, we have "set the scale" of our abstract variable to equal a constant (= 1). Note: In CFA, we can set the scale by setting one of the elements of $\Lambda$ to 1.

Calculation of Factor Loadings

Procedures Other Than Maximum Likelihood

The calculation of the factor loadings (i.e., the $\Lambda$ matrix) is:

[factor loadings] = [eigenvectors] * [eigenvalues]$^{1/2}$

That is, the factor loadings equal the reliability of the item in predicting the factor.

Maximum Likelihood Factor Analysis

For ML factor analysis the factor loadings (A) are estimated as: $R = AA' + U^2$, where R = the correlation matrix, and $U^2$ = 1 - the item reliability (i.e., communality). Maximum likelihood EFA calculates weights for each element in the $\Lambda$ matrix, wherein these weights represent the communality of each observed variable and where observed variables with higher communality are given more weight.

Consider the SAS output for the example labeled "Kim and Mueller: Tables 4-5, Figure 5 (http://www.soc.iastate.edu/sapp/soc512efa.html)."
Note that the SAS output provides a variance explained by each factor, which equals the sum of the squared estimates for each observed variable on a factor. Thus, the unweighted variance explained by Factor 1 equals $.8^2 + .7^2 + .6^2 + .0^2 + .0^2 = 1.49$. The SAS output also provides the weights for each variable, which reflect the communality of each observed variable and where this communality has been further enhanced to the extent that its reliability is stronger than the reliability of the other observed variables. These weights are shown in the table labeled "Final Communality Estimates and Variable Weights." Therefore, the weighted variance explained by Factor 1 equals $(.8^2 * 2.78) + (.7^2 * 1.96) + (.6^2 * 3.57) + (.0^2 * 2.78) + (.0^2 * 1.56) = 4.02$.

See: Harman, Harry H. 1976. Modern Factor Analysis, Third Edition. Chicago: The University of Chicago Press. Pp. 200-216.

Principal Components EFA and OLS Regression

After calculating the factor scores, one can regress each observed variable on these scores to reproduce exactly the $\Lambda$ matrix. The R-square for the OLS regression will equal the item reliability (i.e., communality) of the observed variable.

Factor Scales [Scores]

Once the EFA equation has been estimated, one can calculate scores on an abstract variable. The most common procedures are to calculate either the sum or the mean of responses to the observed variables caused by the factor. For example, to calculate a score on self-esteem, wherein EFA showed that the ten items on the Rosenberg Self-Esteem Scale are caused by a single abstract concept, one might add responses to the ten items on the scale. I recommend calculating the mean score across the ten items to retain the same measurement response scale as the one used for the ten observed variables.

Other approaches to calculating factor scales account for varying item reliabilities in representing the abstract construct.

Regression Method

This method assumes that the observed variables represent the population of variables affected by the abstract concept (i.e., perfect content validity).

$\hat{F} = X(R^{-1}\Lambda)$, where:
$\hat{F}$ is the estimated score on the abstract variable,
X is the matrix of standardized scores on the observed variables,
$\Lambda$ is the matrix of parameter estimates of the effect of F on X,
$R^{-1}$ is the inverse of the correlation matrix.

Recall that in OLS regression we estimate the equation: $Y = X\beta + \varepsilon$. We assume that the errors are random and uncorrelated with Y or X. Thus, in OLS regression, we solve for $\beta$:

$\hat{\beta} = (X'X)^{-1}X'Y$

Similarly, in principal components factor analysis, we estimate the equation: $X = \Lambda F + \varepsilon$. We assume that the errors are random and uncorrelated with X or F. Thus, in principal components factor analysis, we solve for $\Lambda$ (with the cases as the rows of X and F):

$\hat{\Lambda}' = (F'F)^{-1}F'X$

Solving for F yields the equation shown above: $\hat{F} = X(R^{-1}\Lambda)$

See Gorsuch, pages 261-262, formula 12.1.6. See Harman, pages 368-369, formula 16.21.

Least Squares Method

This method assumes that the observed variables represent a sample from the population of variables affected by the abstract concept (i.e., imperfect content validity).

$\hat{F} = X\Lambda(\Lambda'\Lambda)^{-1}$, where:
$\hat{F}$ is the estimated score on the abstract variable,
X is the matrix of standardized scores on the observed variables,
$\Lambda$ is the matrix of parameter estimates of the effect of F on X.

Bartlett's Criterion

This method gives more weight to observed variables with higher item reliability (i.e., imperfect content validity).

$\hat{F} = XU^{-2}\Lambda(\Lambda'U^{-2}\Lambda)^{-1}$, where:
$\hat{F}$ is the estimated score on the abstract variable,
X is the matrix of standardized scores on the observed variables,
$\Lambda$ is the matrix of parameter estimates of the effect of F on X,
$U^2$ is the diagonal matrix of 1 minus the item reliabilities.
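The three scoring formulas can be compared on simulated data. The sketch below rests on several assumptions that are not part of the notes: the data are generated from one factor with hypothetical loadings of .8, .7, and .6; the loadings used for scoring are principal components loadings (first eigenvector times the square root of its eigenvalue); and the regression formula is applied in the conformable order $X R^{-1} \Lambda$.

```python
import numpy as np

rng = np.random.default_rng(42)
n_cases = 300
true_loadings = np.array([0.8, 0.7, 0.6])   # hypothetical values for the simulation

# Simulate three items caused by one standardized factor plus random error.
factor = rng.standard_normal(n_cases)
errors = rng.standard_normal((n_cases, 3))
raw = factor[:, None] * true_loadings + errors * np.sqrt(1.0 - true_loadings**2)

# Standardize the observed variables and compute their correlation matrix.
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)
R = np.corrcoef(X, rowvar=False)

# Principal components loadings for a single factor.
eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
v = eigvecs[:, -1]
if v.sum() < 0:                             # the sign of an eigenvector is arbitrary
    v = -v
Lam = (v * np.sqrt(eigvals[-1]))[:, None]   # k x 1 loading matrix

R_inv = np.linalg.inv(R)
U2_inv = np.diag(1.0 / (1.0 - Lam[:, 0]**2))   # inverse of 1 minus item reliability

f_regression = X @ (R_inv @ Lam)
f_least_squares = X @ Lam @ np.linalg.inv(Lam.T @ Lam)
f_bartlett = X @ U2_inv @ Lam @ np.linalg.inv(Lam.T @ U2_inv @ Lam)

r_reg_bartlett = np.corrcoef(f_regression[:, 0], f_bartlett[:, 0])[0, 1]
print("correlation of regression and Bartlett scores:", r_reg_bartlett)
```

With principal components loadings and a single factor, the regression and least squares scores coincide, because the loading vector is an eigenvector of R. Bartlett's scores differ because they weight the more reliable items more heavily.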

Evaluation of Factor Scales

1. Factor scales can be correlated with one another even if the factors are orthogonal.
2. Correlations among oblique factor scales do not necessarily equal the correlations among the oblique factors.
3. A factor scale is said to be univocal if its partial correlation with other factors = 0.
4. Factor scales include two indeterminacies: 1) they are based upon indeterminate parameter estimates, 2) they do not account for unique error variance in F.

Reliability of Factor Scales

$\rho_{\hat{F}} = \left[ \mathrm{var}(\hat{F}) - \sum_i (1 - h_i^2) w_i^2 \right] / \mathrm{var}(\hat{F})$, where:
$\rho_{\hat{F}}$ (the symbol rho) is the reliability of the factor scale,
$w_i = \Lambda'(R^{-1})$,
$\mathrm{var}(\hat{F})$ = the correlation matrix, with all elements weighted by $w_i$.

Extraction Procedures in EFA

Various forms of EFA are defined, wherein these forms rely upon various assumptions about the nature of social reality. These forms and assumptions are described below.

All forms of EFA rely upon the same algorithm to calculate eigenvalues: the Gram-Schmidt Algorithm (also: QR and QL algorithms). Therefore, the various forms of EFA differ only in the matrix evaluated by the GS Algorithm.

The Gram-Schmidt Algorithm calculates k eigenvalues associated with k eigenvectors for a square matrix (i.e., the correlation matrix or some weighted version of it). The various forms of EFA, therefore, are defined solely by their treatment of the matrix of correlations among the observed variables, prior to this matrix being evaluated using the GS Algorithm.

Principal Components

Characteristic equation: $\mathrm{Det}(R - \lambda I) = 0$, where R is the correlation matrix among the observed variables (i.e., the X matrix) with 1's on the diagonal.

This is the "least squares" approach. Indeed, once the factor structure (i.e., number of factors and loadings of each X on each factor) is calculated, the scores on X and F can be input into OLS regression analysis to reproduce exactly the loading and error matrices.

Principal components is the procedure most often applied in EFA. The criterion used to deem an eigenvector as a factor is an eigenvalue of 1 or greater.

Principal Axis; Common Factor

Characteristic equation: $\mathrm{Det}(R_1 - \lambda I) = 0$, where $R_1$ is the correlation matrix among the observed variables (i.e., the X matrix) with the item reliabilities (i.e., communalities) on the diagonal.

The principal axis (or common factor) form of EFA assumes that the items in X will vary in their content validity as indicators of F. Therefore, the input matrix is weighted to account for differing item reliabilities among the items in X.

Conducting principal axis EFA requires initial estimates of the item reliabilities. Recall that item reliability equals the coefficient of determination (R-square) for the item as one observed outcome of the abstract concept. Therefore, prior communalities (i.e., item reliabilities) can be estimated through a series of OLS regression equations. Consider a factor structure with a single factor and three observed variables. Prior communalities for each $X_i$ are estimated as the R-square statistic for the regression of each $X_i$ on the remaining elements in X.

$X_1 = \beta_2 X_2 + \beta_3 X_3 + e$ ($R^2$ = prior communality for $X_1$).
$X_2 = \beta_1 X_1 + \beta_3 X_3 + e$ ($R^2$ = prior communality for $X_2$).
$X_3 = \beta_1 X_1 + \beta_2 X_2 + e$ ($R^2$ = prior communality for $X_3$).

Principal axis EFA is not often used. The criterion used to deem an eigenvector as a factor is an eigenvalue of 0 or greater.

Maximum Likelihood

Characteristic equation: $\mathrm{Det}(R_2 - \lambda I) = 0$, where $R_2$ is the correlation matrix among the observed variables (i.e., the X matrix) with weighted item reliabilities (i.e., communalities) on the diagonal. Observed variables with more reliability are given more weight.

$R_2 = U^{-1}(R - U^2)U^{-1}$: the correlation matrix, less $U^2$, divided by the square root of 1 minus the prior communalities (i.e., by U).

Maximum likelihood EFA assumes that the items in X will vary in their content validity as indicators of F. Therefore, the input matrix is weighted to account for differing item reliabilities among the items in X. The ML procedure calculates prior communalities in the same manner as is done for the principal axis procedure.

The ML procedure is commonly used in EFA, especially when one assumes significant correlations among multiple factors. The criterion used to deem an eigenvector as a factor is an eigenvalue of 0 or greater.

Alpha

Characteristic equation: $\mathrm{Det}(R_3 - \lambda I) = 0$, where $R_3$ is the correlation matrix among the observed variables (i.e., the X matrix) with weighted item reliabilities (i.e., communalities) on the diagonal. Observed variables with less reliability are given more weight (see: Correction for attenuation).

$R_3 = H^{-1}(R - U^2)H^{-1}$: the correlation matrix, less $U^2$, divided by the square root of the prior communalities (i.e., by H), wherein $U^2 + H^2 = 1$.

Alpha EFA assumes that the items in X will vary in their content validity as indicators of F. Therefore, the input matrix is weighted to account for differing item reliabilities among the items in X, but giving more weight to items with less reliability. The alpha procedure calculates prior communalities in the same manner as is done for the principal axis procedure.

I do not recall seeing a peer-reviewed publication that used alpha EFA. The criterion used to deem an eigenvector as a factor is an eigenvalue of 0 or greater.

Image

Characteristic equation: $\mathrm{Det}(R_4 - \lambda I) = 0$, where $R_4$ is the correlation matrix among the observed variables (i.e., the X matrix) with weighted item reliabilities (i.e., communalities) on the diagonal. Prior communalities are adjusted to reflect that they are derived from a sample of the population.

$R_4 = (R - S^2)R^{-1}(R - S^2)$: the correlation matrix divided by the square of the correlation matrix, subtracting the variances of the observed variables from the diagonal. $S^2$ = the diagonal matrix of the variances of the observed variables.

The image procedure calculates prior communalities in the same manner as is done for the principal axis procedure.

I do not recall seeing a peer-reviewed publication that used image EFA. The criterion used to deem an eigenvector as a factor is an eigenvalue of 0 or greater.

Unweighted Least Squares

Characteristic equation: $\mathrm{Det}(R - \lambda I) = 0$, where R is the correlation matrix among the observed variables (i.e., the X matrix) with 1's on the diagonal.

This approach differs from principal components in that it uses an iterative procedure to calculate the factor loadings, as compared with the procedure shown below.

I do not recall seeing a peer-reviewed publication that used unweighted least squares EFA. The criterion used to deem an eigenvector as a factor is an eigenvalue of 1 or greater.

Generalized Least Squares

Characteristic equation: $\mathrm{Det}(R - \lambda I) = 0$, where R is the correlation matrix among the observed variables (i.e., the X matrix) with 1's on the diagonal.

This approach differs from principal components in that it relies upon a direct estimation of the factor loadings, as compared with the procedure shown below.

I do not recall seeing a peer-reviewed publication that used generalized least squares EFA. The criterion used to deem an eigenvector as a factor is an eigenvalue of 1 or greater.
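Two of the extraction procedures described above can be sketched in a few lines. The example is illustrative only: the 4 x 4 correlation matrix is hypothetical, NumPy's `eigh` routine stands in for whatever eigenvalue routine a statistical package uses, and the prior communalities use the identity that the R-square from regressing one item on all the others equals 1 minus the reciprocal of the matching diagonal element of $R^{-1}$.

```python
import numpy as np

# Hypothetical correlation matrix: items 1-2 share one factor, items 3-4 share another.
R = np.array([[1.00, 0.56, 0.00, 0.00],
              [0.56, 1.00, 0.00, 0.00],
              [0.00, 0.00, 1.00, 0.48],
              [0.00, 0.00, 0.48, 1.00]])
k = R.shape[0]

# Principal components: eigen-decomposition of R with 1's on the diagonal.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_factors_pc = int(np.sum(eigvals >= 1.0))        # eigenvalue-of-1 criterion
loadings = eigvecs[:, :n_factors_pc] * np.sqrt(eigvals[:n_factors_pc])
communalities = np.sum(loadings**2, axis=1)       # item reliabilities

# Principal axis: replace the diagonal with prior communalities (R-square of each
# item regressed on the remaining items).
prior = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R1 = R.copy()
np.fill_diagonal(R1, prior)
eigvals_pa = np.sort(np.linalg.eigvalsh(R1))[::-1]
n_factors_pa = int(np.sum(eigvals_pa >= 0.0))     # eigenvalue-of-0 criterion

print("eigenvalues of R:", eigvals, "sum =", eigvals.sum())
print("factors retained (principal components):", n_factors_pc)
print("loadings:\n", loadings)
print("prior communalities:", prior)
print("factors retained (principal axis):", n_factors_pa)
```

The eigenvalues of R sum to k = 4, as described for principal components EFA, and both criteria retain two factors for this matrix. The signs and unit length of the eigenvectors returned by the routine are themselves a scale-setting convention, so the signs of the loadings are arbitrary.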

The Gram-Schmidt (QR and QL) Algorithm

As noted in the attached paper by Yanovsky, the QR-decomposition (also called the QR factorization) of a matrix is a decomposition of the matrix into an orthogonal matrix and a triangular matrix.

Note: In this algorithm, the number of rows in the correlation matrix is referenced with the letter k (rather than the letter m, which is used in the notes above).

1. Define the magnitude of X = $\lVert X \rVert$, which is the length of X. $\lVert X \rVert = [x_1^2 + x_2^2 + \cdots + x_k^2]^{1/2}$
2. Two or more vectors are orthonormal (orthogonal with unit length) if they all have a length of 1 and are uncorrelated with one another (cosine $\theta$ = 0).
3. Consider two sets of orthogonal vectors: $\{x_1, x_2, x_3\}$ and $\{q_1, q_2, q_3\}$, where the set q is a linear combination of the set x (i.e., q is the same vector, rotated).
4. If the set q is a linear combination of the set x, then q and x have the same eigenvalues.
5. Thus, by creating successive sets of q, the QR algorithm can iteratively arrive at the set of eigenvalues describing x.
6. The QR and QL algorithms are identical, except that the QL uses the lower rather than upper half of the correlation matrix. Thus, if one conducts EFA on the same data using two different statistical software packages, wherein one uses the QR and the other uses the QL algorithm, then the parameter estimates will be identical but lined up under different columns (i.e., factors).

Steps in the Gram-Schmidt (QR and QL) Algorithm

1. Calculate $r_{kk} = [\langle x_k, x_k \rangle]^{1/2}$, which is the length of $x_k$.
2. Set $q_k = (1 / r_{kk}) x_k$ (i.e., Kaiser normalization of the vector X).
3. Calculate $r_{kj} = \langle x_j, q_k \rangle$, wherein q = x rotated.
4. Replace $x_j$ by $x_j - r_{kj} q_k$ (i.e., determine the eigenvalues of q).

Rotation

The Gram-Schmidt Algorithm projects the k eigenvectors within a space of k dimensions. These initial vectors can be difficult to interpret.
The purpose of rotation is to find a simpler and more easily interpretable pattern matrix by retaining the number of factors and the final communalities of each of the observed variables in X. Rotation assumes either orthogonal axes (90° angle, indicating no correlation among the factors) or oblique axes (angles other than 90°, indicating correlations among the factors).
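Before turning to the approaches to rotation, the four Gram-Schmidt steps listed earlier can be sketched in code. This is an illustrative version, not the routine used by any particular package: the function names `gram_schmidt_qr` and `qr_eigenvalues`, the fixed number of iterations, and the 3 x 3 correlation matrix are assumptions. The first function applies the four steps column by column to produce an orthogonal matrix Q and an upper triangular matrix of the $r_{kj}$ values; the second repeats the decomposition on the product of the triangular matrix and Q, which keeps the eigenvalues unchanged while driving the matrix toward a diagonal form.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Decompose A into Q (orthonormal columns) and an upper triangular matrix."""
    X = np.array(A, dtype=float)
    n_rows, n_cols = X.shape
    Q = np.zeros((n_rows, n_cols))
    Rm = np.zeros((n_cols, n_cols))
    for k in range(n_cols):
        Rm[k, k] = np.sqrt(X[:, k] @ X[:, k])        # step 1: length of x_k
        Q[:, k] = X[:, k] / Rm[k, k]                 # step 2: scale x_k to unit length
        for j in range(k + 1, n_cols):
            Rm[k, j] = X[:, j] @ Q[:, k]             # step 3: projection of x_j on q_k
            X[:, j] = X[:, j] - Rm[k, j] * Q[:, k]   # step 4: remove that projection
    return Q, Rm

def qr_eigenvalues(A, n_iter=200):
    """Unshifted QR iteration: repeatedly factor the matrix and multiply in reverse order."""
    Ak = np.array(A, dtype=float)
    for _ in range(n_iter):
        Q, Rm = gram_schmidt_qr(Ak)
        Ak = Rm @ Q                                  # same eigenvalues as the previous Ak
    return np.sort(np.diag(Ak))[::-1]

# Hypothetical correlation matrix implied by loadings of .8, .7, and .6 on one factor.
R = np.array([[1.00, 0.56, 0.48],
              [0.56, 1.00, 0.42],
              [0.48, 0.42, 1.00]])

Q, Rm = gram_schmidt_qr(R)
eigenvalues = qr_eigenvalues(R)
print("Q'Q (should be the identity matrix):\n", Q.T @ Q)
print("eigenvalues from QR iteration:", eigenvalues)
```

Production routines add refinements such as shifts and a preliminary reduction to tridiagonal form to speed convergence, so this sketch shows the logic of the iteration rather than how a package implements it.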

There are three approaches to rotation.

Graphic (not commonly used).

Orthogonal: Rotate the axes by visual inspection of the vectors.

Oblique:
1. Establish a reference axis that is perpendicular to a "primary" axis (the vector with the largest eigenvalue).
2. Plot the second vector.
3. Measure $\theta$, the angle between $F_1$ and $F_2$.
4. Cosine $\theta$ = the correlation between $F_1$ and $F_2$.

Rotation to a Target Matrix (not commonly used).

1. Specify a pattern matrix (rotated factor pattern) of interest.
2. Rotate the eigenvectors to this matrix.
3. Use hypothesis testing to determine the extent to which the pattern matrix equals the theoretically derived target matrix.

Analytic (commonly used).

Orthogonal:
1. Varimax (most commonly used): maximize the squared factor loadings by columns of the factor pattern. That is, maximize the interpretability of the factors.
2. Quartimax (not often used): maximize the squared factor loadings by rows of the factor pattern. That is, maximize the interpretability of the observed variables.
3. See also: Equimax, Biquartimax.

Oblique:
1. Minimize errors in estimating $\theta$, the angle between $F_1$ and $F_2$.
2. See: Harris-Kaiser (used in SAS), direct oblimin (used in SPSS), Quartimin, Covarimin, Bivarimin, Oblimax, and Maxplane.

Normalization

After rotation from oblique procedures, the resulting vectors are no longer of unit length. Normalization (see: Kaiser Normalization) resets the vectors to a standardized length of 1.
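As an illustration of analytic orthogonal rotation, the following sketch implements a commonly used iterative form of varimax based on the singular value decomposition. It is not taken from SAS or SPSS, it omits Kaiser normalization, and the function names and the loading matrix are hypothetical. The starting loadings are a simple two-factor pattern deliberately rotated by 30 degrees so that every item loads on both factors.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns the rotated loadings and the rotation matrix."""
    p, m = loadings.shape
    T = np.eye(m)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ T
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated**3 - rotated @ np.diag(np.sum(rotated**2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        T = u @ vt                           # closest orthogonal matrix
        current = s.sum()
        if previous != 0.0 and current < previous * (1.0 + tol):
            break
        previous = current
    return loadings @ T, T

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (by columns)."""
    return float(np.sum(np.var(loadings**2, axis=0)))

# Hypothetical simple structure, then rotated 30 degrees to make it hard to interpret.
simple = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.6]])
angle = np.deg2rad(30.0)
spin = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ spin

rotated, T = varimax(unrotated)
crit_before = varimax_criterion(unrotated)
crit_after = varimax_criterion(rotated)

print("unrotated loadings:\n", np.round(unrotated, 3))
print("varimax-rotated loadings:\n", np.round(rotated, 3))
print("criterion before and after:", crit_before, crit_after)
```

The rotation leaves the communality of each item (the row sum of squared loadings) unchanged, which is the sense in which rotation retains the final communalities while changing only how the explained variance is distributed across the factors.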