Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev. 30E00500 Quantitative Empirical Research, Spring 2016
Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings and Communalities Properties of the Model Rotation Factor Scores Tutorial, guidelines and rules of thumb 2
Factor Analysis The aim of Factor Analysis is to find hidden (latent) variables which explain the correlations among the observed variables. Examples: A firm's image The sales aptitude of salespersons Performance of the firm Resistance to high-tech innovations 3
When to use FA? Exploratory / descriptive analysis: Learning the internal structure of the dataset: What are the main dimensions of the data? Helps to visualize multivariate data in lower-dimensional pictures Data reduction technique: If you have a high-dimensional dataset, is there a way to capture the essential information using a smaller number of variables? Can also be considered a preprocessing step for techniques which are sensitive to multicollinearity (e.g. regression) 4
Two Approaches Exploratory Factor Analysis: Initially, a factor structure that explains the correlations among the variables is searched for without any a priori theory What are the underlying processes that could have produced correlations among the variables? Note: there are no readily available criteria against which to test the solution Confirmatory Factor Analysis: Factor structure is assumed to be known or hypothesized a priori Are the correlations among variables consistent with a hypothesized factor structure? Often performed through structural equation modeling 5
Part I: Basic concepts and examples 6
History and Example Originally developed by Spearman (1904) to explain student performance in various courses. Suppose a student's test scores (M: Mathematics, P: Physics, C: Chemistry, E: English, H: History, and F: French) depend on The student's general intelligence I and The student's aptitude for a given course 7
Illustration of Factor Analysis [path diagram: the latent common factor Intelligence points to each observed variable (Math, Physics, Chemistry, English, History, French), and each observed variable also receives its own unique factor A_M, A_P, A_C, A_E, A_H, A_F] 8
Example: 1-factor model For example, as follows:
M = .80I + A_M
P = .70I + A_P
C = .90I + A_C
E = .60I + A_E
H = .50I + A_H
F = .65I + A_F
A_M, A_P, A_C, A_E, A_H, and A_F stand for special aptitudes (Specific Factors). The coefficients .8, .7, .9, .6, .5, and .65 are called Factor Loadings. The variables M, P, C, E, H, and F are indicators or measures of I (Common Factor). 9
Assumptions: Means of the variables (indicators), the common factor I, and the unique factors are zero Variances of the variables (indicators), and the common factor I, are one Correlations between the common factor I and the unique factors are zero, and Correlations among the unique factors are zero 10
From Assumptions
Variances of Variables (Ex. M):
Var(M) = Var(0.8I + A_M) = 0.8² Var(I) + Var(A_M) = 0.8² + Var(A_M)
Covariance of Variables (Ex. M and H):
Cov(M, H) = Cov(0.8I + A_M, 0.5I + A_H)
= 0.8*0.5 Cov(I, I) + 0.8 Cov(I, A_H) + 0.5 Cov(A_M, I) + Cov(A_M, A_H)
= 0.8*0.5 = 0.40
Generally, the variables are standardized in factor analysis, so Var(M) = 1 and covariances equal correlations 11
Variance Decomposition The total variance of any indicator is decomposed into two components: Variance in common with I (the Communality of the indicator) Variance in common with the specific factor Example: Var(M) = 0.8² + Var(A_M) 12
Variance decomposition (cont.) Source: Hair et al. (2010) [diagram: in principal component analysis the diagonal value of the matrix is unity (1) and the total variance (common plus specific and error) is analyzed; in common factor analysis the diagonal value is the communality, so only the common variance is extracted and the specific and error variance is not used] Hair et al. (2010): Multivariate Data Analysis, Pearson Education 13
Example: 2-Factor Model [path diagram: two common factors with factor loadings pointing to the observed variables, each variable also receiving its own specific factor] 14
Example: 2-Factor Model
M = .80Q + .20V + A_M
P = .70Q + .30V + A_P
C = .60Q + .30V + A_C
E = .20Q + .80V + A_E
H = .15Q + .82V + A_H
F = .25Q + .85V + A_F 15
Covariances for 2-Factor Model
Variances of Variables (Ex. M):
Var(M) = Var(0.8Q + 0.2V + A_M) = 0.8² Var(Q) + 0.2² Var(V) + Var(A_M) = 0.8² + 0.2² + Var(A_M)
Covariance of Variables (Ex. M and H):
Cov(M, H) = Cov(0.8Q + 0.2V + A_M, 0.15Q + 0.82V + A_H)
= 0.8*0.15 Cov(Q, Q) + 0.2*0.82 Cov(V, V) = 0.8*0.15 + 0.2*0.82 = 0.284 16
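To make the bookkeeping concrete, here is a minimal numpy sketch (not part of the original slides) that builds the implied correlation matrix Σ = ΛΛ' + Ψ from the 2-factor loadings above, assuming standardized variables so that each unique variance is 1 minus the communality:

```python
# Minimal sketch: reproduce the implied correlation matrix of the 2-factor example.
import numpy as np

# Rows: M, P, C, E, H, F; columns: common factors Q and V (loadings from the slide)
Lambda = np.array([
    [0.80, 0.20],
    [0.70, 0.30],
    [0.60, 0.30],
    [0.20, 0.80],
    [0.15, 0.82],
    [0.25, 0.85],
])

communality = (Lambda ** 2).sum(axis=1)   # variance shared with Q and V
Psi = np.diag(1.0 - communality)          # unique variances for standardized variables
Sigma = Lambda @ Lambda.T + Psi           # implied correlation matrix

print(round(Sigma[0, 4], 3))              # Corr(M, H) = 0.8*0.15 + 0.2*0.82 = 0.284
```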
Part II: Factor Model 17
Objectives of Factor Analysis To identify the smallest number of common factors that best explain the correlations among variables To estimate loadings and communalities To identify, via rotation, the most plausible factor structure To estimate factor scores, when desired 18
Some Theory: Factor Model Consider a p-dimensional random vector x ~ (µ, Σ). An m-factor model: x = Λf + ε + µ, where Λ (p × m) is a matrix of factor loadings, and f (m × 1) and ε (p × 1) are random vectors. The elements of vector f are the common factors and the elements of ε are the unique factors. 19
Factor Equation Factor Equation: Σ = ΛΛ' + Ψ, where Σ is the covariance matrix of the variables x, Λ is the loading matrix, and Ψ is a diagonal matrix containing the unique variances 20
Factor Equation (cont.) The communalities: the diagonal of Σ − Ψ (= ΛΛ') The covariances (correlations) between the variables and the factors are given by: E((x − µ) f^T) = E((Λf + ε) f^T) = ΛE(f f^T) + E(ε f^T) = Λ 21
Interpretation of the Common Factors Loadings: the covariances (correlations) between the variables and the factors Eigenvalues: the variance explained by each factor 22
Factor Indeterminacy Factor indeterminacy due to rotation Factor indeterminacy due to the communality estimation problem 23
Factor Indeterminacy (cont.) Factor indeterminacy due to rotation. Consider
M = .667Q − .484V + A_M
P = .680Q − .343V + A_P
C = .615Q − .267V + A_C
E = .741Q + .361V + A_E
H = .725Q + .412V + A_H
F = .812Q + .355V + A_F
These alternative loadings provide the same total communalities and uniquenesses as the previously presented solution. Even the correlation matrices are identical. (Note that the two common factors are assumed to be uncorrelated.) 24
Factor Indeterminacy (cont.) Factor indeterminacy due to the communality estimation problem: To estimate the loadings, the communalities are needed, and To estimate the communalities, the loadings are needed (!) 25
Factor Analysis Techniques (Tabachnick and Fidell: Using Multivariate Statistics) Principal Component Factoring (PCF): The initial estimates of the communalities for all variables are equal to one (= Principal Component Analysis) Principal Axis Factoring (PAF): Principal components analyze total variance, whereas FA analyzes covariance (communality) An attempt is made to estimate the communalities: Explain each variable with the other variables and use the coefficient of multiple determination (squared multiple correlation) as an initial estimate for the communality Find the communalities through an iterative process (see the sketch below) 26
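As an illustration of the iterative process described for PAF, here is a rough numpy sketch; the function name, iteration limit, and convergence tolerance are my own choices, not from the slides:

```python
# Rough sketch of iterated principal axis factoring (assumes R is a correlation
# matrix and the number of factors has been chosen in advance).
import numpy as np

def principal_axis_factoring(R, n_factors, n_iter=100, tol=1e-6):
    # Initial communality estimates: squared multiple correlation of each
    # variable with all the others, SMC_i = 1 - 1 / (R^-1)_ii
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)              # communalities on the diagonal
        eigval, eigvec = np.linalg.eigh(R_reduced)   # eigenvalues in ascending order
        idx = np.argsort(eigval)[::-1][:n_factors]   # keep the largest n_factors
        loadings = eigvec[:, idx] * np.sqrt(np.clip(eigval[idx], 0, None))
        h2_new = (loadings ** 2).sum(axis=1)         # updated communalities
        if np.max(np.abs(h2_new - h2)) < tol:
            h2 = h2_new
            break
        h2 = h2_new
    return loadings, h2
```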
Image Factor Extraction Uses the correlation matrix of predicted variables, where each variable is predicted from the others via multiple regression A compromise between PCA and principal axis factoring: like PCA, it provides a mathematically unique solution because there are fixed values on the positive diagonal; like PAF, the values on the diagonal are communalities with unique error variability excluded Loadings represent covariances between variables and factors rather than correlations Maximum Likelihood Factor Extraction Population estimates for the factor loadings are calculated that have the greatest probability of yielding a sample with the observed correlation matrix 27
Unweighted Least Squares Factoring Minimizes the squared differences between the observed and reproduced correlation matrices Only off-diagonal differences are considered; communalities are derived from the solution rather than estimated as part of it A special case of principal factors, where communalities are estimated after the solution Generalized (Weighted) Least Squares Factoring Variables that have substantial shared variance with other variables get higher weights than variables with large unique variance Alpha Factoring Interest is in discovering which common factors are found consistently when repeated samples of variables are taken from a population of variables 28
Summary of extraction procedures (Source: Tabachnick and Fidell: Using Multivariate Statistics)
Principal components. Goal: maximize variance extracted by orthogonal components. Special features: mathematically determined; the solution mixes common, unique, and error variance into components.
Principal factors. Goal: maximize variance extracted by orthogonal factors. Special features: estimates communalities in an attempt to eliminate unique and error variance from the variables.
Image factoring. Goal: provides an empirical factor analysis. Special features: uses variances based on the multiple regression of a variable with all other variables as communalities. 29
Summary of extraction procedures (cont.) (Source: Tabachnick and Fidell: Using Multivariate Statistics)
Maximum likelihood factoring. Goal: estimate factor loadings for the population that maximize the likelihood of sampling the observed correlation matrix. Special features: has a significance test for factors; useful for confirmatory factor analysis.
Alpha factoring. Goal: maximize the generalizability of orthogonal factors.
Unweighted least squares. Goal: minimize squared residual correlations.
Generalized least squares. Goal: weights variables by shared variance before minimizing squared residual correlations. 30
Part III: Rotations 31
Two Classes of Rotational Approaches Orthogonal = axes are maintained at 90 degrees. Oblique = axes are not maintained at 90 degrees. 32
Orthogonal Factor Rotation (Source: Hair et al., 2010) [plot: variables V1 to V5 located in the plane of the unrotated Factors I and II (axes from -1.0 to +1.0); the rotated Factor I and Factor II axes are turned toward the variable clusters while remaining at 90 degrees]
Oblique Factor Rotation (Source: Hair et al., 2010) [same plot of variables V1 to V5 in the plane of the unrotated Factors I and II, comparing orthogonally rotated axes with obliquely rotated Factor I and Factor II axes that are allowed to pass closer to the variable clusters]
Orthogonal Rotation Identify an orthogonal transformation matrix C such that: Λ* = ΛC and Σ = Λ*Λ*' + Ψ, where C^T C = I Remember the connection to the factor indeterminacy problem (!) 35
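A quick numpy check (illustrative only, not from the slides) of why rotation causes indeterminacy: any orthogonal C leaves the reproduced covariance ΛΛ' unchanged, so the factor equation cannot distinguish Λ from Λ* = ΛC:

```python
# Sketch: rotational invariance of the reproduced covariance matrix.
import numpy as np

rng = np.random.default_rng(0)
Lambda = rng.normal(size=(6, 2))              # arbitrary p x m loading matrix
C, _ = np.linalg.qr(rng.normal(size=(2, 2)))  # random orthogonal matrix, C'C = I
Lambda_star = Lambda @ C                      # rotated loadings

print(np.allclose(Lambda @ Lambda.T, Lambda_star @ Lambda_star.T))  # True
```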
Varimax Rotation Find a factor structure in which each variable loads highly on one and only one factor (i.e. to simplify the columns of the loading matrix) That is, for any given factor j, the criterion V_j is the variance of the squared loadings (the communalities contributed by factor j) across variables, and the total criterion V is the sum of the V_j over factors (see the formulas below) 36
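The formulas the slide refers to are not reproduced in this export; for reference, the standard (raw) varimax criterion from the rotation literature is, in my notation:

```latex
% Variance of the squared loadings within factor j (raw, unnormalized varimax):
V_j = \frac{1}{p}\sum_{i=1}^{p}\lambda_{ij}^{4}
      - \left(\frac{1}{p}\sum_{i=1}^{p}\lambda_{ij}^{2}\right)^{2},
\qquad
% Total criterion maximized over orthogonal rotations C:
V = \sum_{j=1}^{m} V_j .
```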
Varimax Rotation (cont.) Find the orthogonal matrix C that maximizes V (equivalently, the sum over factors of the variances of the squared loadings), subject to the constraint that the communality of each variable remains the same. 37
Quartimax Rotation Purpose: To simplify the rows of the loading matrix, i.e. to obtain a pattern of loadings such that: All the variables have a fairly high loading on one factor Each variable should have a high loading on one other factor and near-zero loadings on the remaining factors The quartimax rotation is most appropriate in the presence of a general factor 38
Quartimax Rotation (cont.) For any variable i, the variance of its communalities (i.e. of its squared loadings) is computed across factors; the total criterion Q is then the sum of these variances over all variables (the formulas are written out after the next slide) 39
Quartimax Rotation (cont.) The quartimax rotation is obtained by finding the orthogonal matrix C such that Q is maximized, subject to the condition that the communality of each variable remains the same Varimax is often preferred over quartimax, since it leads to a cleaner separation of factors and tends to be more invariant when a different subset of variables is analyzed 40
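The quartimax formulas referred to on the previous two slides are likewise missing from the export; the standard criterion from the literature is, in my notation:

```latex
% Variance of the squared loadings within variable i:
Q_i = \frac{1}{m}\sum_{j=1}^{m}\lambda_{ij}^{4}
      - \left(\frac{1}{m}\sum_{j=1}^{m}\lambda_{ij}^{2}\right)^{2},
\qquad
Q = \sum_{i=1}^{p} Q_i .
% Because each communality h_i^2 = \sum_j \lambda_{ij}^2 is unchanged by an
% orthogonal rotation, maximizing Q is equivalent to maximizing
\sum_{i=1}^{p}\sum_{j=1}^{m}\lambda_{ij}^{4}.
```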
Oblique Rotations The factors are allowed to be correlated: oblique rotations offer a continuous range of correlations between factors The degree of correlation between factors is determined by the delta parameter δ: δ = 0: solutions can be fairly highly correlated δ < 0: solutions become increasingly orthogonal (at about δ = −4 the solution is orthogonal) δ close to 1: leads to very highly correlated solutions Note: Although delta affects the size of the correlation, the maximum correlation at a given value depends on the dataset 41
Commonly Used Oblique Rotations Promax: Orthogonal factors are rotated to an oblique position (orthogonal loadings are raised to powers, usually 2, 4 or 6, to drive small and moderate loadings to zero while larger loadings are reduced) Direct Oblimin: Simplifies factors by minimizing the sum of cross-products of squared loadings in the pattern matrix Values of δ > 0 produce highly correlated factors → careful consideration needed when deciding the number of factors (!) Helps to cope with situations encountered in practice? Note: factor loadings obtained after oblique rotations no longer represent correlations between factors and observed variables 42
Terminology in Oblique Rotations Factor correlation matrix = Correlations between factors (standardized factor scores) after rotation Pattern matrix = Regression-like weights representing the unique contribution of each factor to the variance in the variable (comparable to loadings matrix when having orthogonal factors) Structure matrix = Correlations between variables and correlated factors (given by the product of the pattern matrix and the factor correlation matrix) 43
Methods for Obtaining Factor Scores Thomson's (1951) regression estimates Assumes the factor scores to be random The assumption is appropriate when we are interested in the general structure (different samples consisting of different individuals) Bartlett's estimates Assumes the factor scores to be deterministic Assumes normality and that the loadings and uniquenesses are known Anderson-Rubin estimates No clear favorite method; each has its advantages and disadvantages 44
Factor Scores via Multiple Regression Estimate: E{f_ij} = β_1j x_i1 + … + β_pj x_ip In matrix form: E{F} = XB, and for standardized variables: F = ZB Hence (n−1)⁻¹ Z'F = (n−1)⁻¹ Z'ZB, i.e. Λ = RB, so B = R⁻¹Λ, since (n−1)⁻¹ Z'F = Λ and (n−1)⁻¹ Z'Z = R 45
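A minimal numpy sketch of these regression (Thomson) scores, with illustrative names; it assumes a data matrix X (n x p) and an estimated loading matrix:

```python
# Sketch: regression factor scores, B = R^-1 Lambda and F = Z B as derived above.
import numpy as np

def regression_factor_scores(X, loadings):
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize the variables
    R = np.corrcoef(Z, rowvar=False)                  # sample correlation matrix
    B = np.linalg.solve(R, loadings)                  # B = R^-1 Lambda
    return Z @ B                                      # n x m matrix of factor scores
```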
FA in Practice Analysis of the HBAT consumer survey results Form groups of 1 to 3 people Tutorial 46
Conceptual Issues The basic assumption is that an underlying structure exists in the set of variables The presence of correlated variables and detected factors does not guarantee relevance, even if the statistical requirements are met Ensuring conceptual validity remains the responsibility of the researcher Remember: Do not mix dependent and independent variables in a single factor analysis if the objective is to study dependence relationships using the derived factors 47
Conceptual Issues (cont.) Ensure that the sample is homogeneous with respect to the underlying factor structure If the sample has multiple internal groups with unique characteristics, it may be inappropriate to apply factor analysis on the pooled data If different groups are expected, separate factor analysis should be performed for each group Compare group specific results to the combined sample 48
Sample Size and Missing Data Correlations estimated from small samples tend to be less reliable Minimum sample size should be 50 observations Sample must have more observations than variables Strive to maximize the number of obs / variable (desired ratio is 5:1) Recommendations by Comrey and Lee (1992): Sample size 100 = poor, 200 = fair, 300 = good, 500 = very good Missing values: If cases are missing values in a nonrandom pattern or if sample size is too small, estimation is needed Beware of using estimation procedures (e.g. regression) that are likely to overfit data and cause correlations to be too high 49
Factorability of Correlation Matrix A factorable correlation matrix should include several sizeable correlations (e.g. Bartlett's test of sphericity) If no correlation exceeds .30, use of FA is questionable Warning: high bivariate correlations do not necessarily ensure the existence of factors → Examine partial correlations or anti-image correlations (negatives of partial correlations) If factors are present, high bivariate correlations become very low partial correlations 50
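For concreteness, here is a sketch of Bartlett's test of sphericity using the usual chi-square approximation; the helper name and arguments are illustrative, and the formula is the standard one from the literature rather than from the slides:

```python
# Sketch: Bartlett's test of sphericity (H0: the correlation matrix is an identity).
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    p = R.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2.0
    p_value = chi2.sf(statistic, df)
    return statistic, p_value  # small p-value: correlations differ from zero, FA may be sensible
```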
Partial Correlation Partial correlation between variables X and Y given a set of n controlling variables Z = {Z_1, …, Z_n} is the correlation coefficient ρ_X,Y·Z, where relatedness due to the controlling variables is taken into account In practice: the partial correlation is computed as the bivariate correlation between the residuals from the linear regressions X ~ Z and Y ~ Z [Venn diagram of the variances of X, Y and Z with overlap regions a, b, c, d: ρ²_X,Y·Z = a / (a + d)] 51
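A small numpy sketch of the residual-based computation just described (variable and function names are illustrative):

```python
# Sketch: partial correlation of x and y given Z, via regression residuals.
import numpy as np

def partial_correlation(x, y, Z):
    Z1 = np.column_stack([np.ones(len(x)), Z])            # add an intercept column
    rx = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]   # residuals of x ~ Z
    ry = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]   # residuals of y ~ Z
    return np.corrcoef(rx, ry)[0, 1]                      # correlation of residuals
```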
Geometrical Interpretation of Partial Correlation [figure: the residual vectors from the regressions of X and Y on Z; the partial correlation is the cosine of the angle between these residual vectors] Source: Wikipedia 52
Other Practical Issues Normality When FA is used in an exploratory manner to summarize relationships, distributional assumptions are not in force Normality enhances the solution (but is not necessary) Linearity Multivariate normality implies linear relationships between pairs of variables Analysis is degraded when linearity fails (note: correlation measures linear relationship) Absence of multicollinearity and singularity Some degree of multicollinearity is desirable, but extreme multicollinearity or singularity is an issue (check for eigenvalues close to zero or a zero determinant of the correlation matrix) 53
Outliers Among Cases and Variables Screening for outliers among cases: The factor solution may be sensitive to outlying cases Screening for outliers among variables: A variable with a low squared multiple correlation with all other variables and low correlations with all important factors is an outlier among the variables Outlying variables are often ignored in the current FA, or the researcher may consider adding more related variables in a further study Note: Factors defined by just one or two variables are not stable (or "real"). If the variance accounted for by such a factor is high enough, it may be interpreted with caution or ignored. 54
Choosing and Evaluating a Solution Number and nature of factors How many reliable and interpretable factors are there in the data set? What is the meaning of the factors? How are they interpreted? Importance of solutions / factors How much variance in a dataset is accounted for by the factors? Which factors account for the most variance? Testing theory in FA How well does the obtained solution fit an expected factor solution? Estimating scores on factors How do the subjects score on the factors? 55
Appendix: 56
Preliminary Considerations Assume that x = (x_1, …, x_p)' is a vector of p random variables Population mean: µ = E(x) Population covariance: Σ = E[(x − µ)(x − µ)'] Correlation between the i-th and j-th variable: ρ_ij = σ_ij / (σ_i σ_j) 57
Preliminary Considerations Variance of a linear combination a'x of the p variables: Var(a'x) = a'Σa Generalized Variance: |Σ| (the determinant of Σ) Total Variation: tr Σ 58
Preliminary Considerations Illustration: Combining Uncorrelated Variables With standardized variables (s_1² = s_2² = 1), r_12 = 0, and weights satisfying a_1² + a_2² = 1:
(a_1 a_2) [1 0; 0 1] (a_1, a_2)' = a_1² + a_2² = 1, because r_12 = 0 59
Preliminary Considerations Illustration: Combining Correlated Variables With standardized variables (s_1² = s_2² = 1), r_12 > 0, and weights satisfying a_1² + a_2² = 1:
(a_1 a_2) [1 r_12; r_21 1] (a_1, a_2)' = a_1² + a_2² + 2 a_1 a_2 r_12 > 1, when r_12 > 0 60
Some Theory: Factor Model Consider a p-dimensional random vector x ~ (µ, Σ). An m-factor model: x = Λf + ε + µ, where Λ (p × m) is a matrix of factor loadings, and f (m × 1) and ε (p × 1) are random vectors. The elements of vector f are the common factors and the elements of ε are the unique factors. 61
Assumptions
E(f) = 0 & Cov(f) = I
E(ε) = 0 & Cov(ε_i, ε_j) = 0, i ≠ j
Cov(f, ε) = 0
Cov(ε) = Ψ = diag(ψ_11, ψ_22, …, ψ_pp)
Thus
Σ = E((x − µ)(x − µ)^T) = E((Λf + ε)(Λf + ε)^T)
= E(Λf(Λf)^T) + E(ε ε^T) + E(Λf ε^T) + E(ε (Λf)^T)
= ΛE(f f^T)Λ^T + E(ε ε^T) + ΛE(f ε^T) + E(ε f^T)Λ^T
= ΛΛ^T + Ψ 62
Notation (Label: Name, Size, Description)
Λ: Factor loading matrix (or pattern matrix in oblique methods), p x m: matrix of regression-like weights used to estimate the unique contribution of each factor to the variance in a variable
x: Vector of variables, p x 1: observed random variables
Σ: Covariance or correlation matrix, p x p: covariances or correlations between variables
µ: Expected values of variables, p x 1: expected values of the observed random variables
f: Common factors, m x 1: vector of common factors
ε: Unique factors, p x 1: vector of variable-specific unique factors
Ψ: Covariance of unique factors, p x p: covariance matrix for the unique factors
C: Rotation matrix, m x m: transformation matrix used to produce the rotated loading matrix 63
Factor Equation Factor Equation: Σ = ΛΛ' + Ψ, where Σ is the covariance matrix of the variables x, Λ is the loading matrix, and Ψ is a diagonal matrix containing the unique variances 64
Factor Equation (cont.) The communalities: the diagonal of Σ − Ψ (= ΛΛ') The covariances (correlations) between the variables and the factors are given by: E((x − µ) f^T) = E((Λf + ε) f^T) = ΛE(f f^T) + E(ε f^T) = Λ 65
Solving the Factor Equation How to solve Σ − Ψ = ΛΛ^T? Use the Spectral Decomposition Theorem: Any symmetric matrix A (p × p) can be written as A = ΓΘΓ^T, where Θ is a diagonal matrix of the eigenvalues of A, and Γ is an orthogonal matrix whose columns are standardized eigenvectors. 66
Therefore Provided Ψ is known, we may write: Σ − Ψ = ΓΘΓ^T = (ΓΘ^{1/2})(Θ^{1/2}Γ^T) Assume the first k eigenvalues satisfy θ_i > 0, i = 1, 2, …, k; then we may write λ_i = θ_i^{1/2} γ_i Thus Λ = Γ_1 Θ_1^{1/2}, where Γ_1 = (γ_1, …, γ_k) and Θ_1 = diag(θ_1, …, θ_k) is k x k 67
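A minimal numpy sketch of this construction, assuming Σ and Ψ are given and k factors are retained (the function name is illustrative):

```python
# Sketch: loadings from the spectral decomposition of the reduced covariance Sigma - Psi.
import numpy as np

def loadings_from_reduced_covariance(Sigma, Psi, k):
    theta, gamma = np.linalg.eigh(Sigma - Psi)   # eigenvalues in ascending order
    order = np.argsort(theta)[::-1][:k]          # indices of the k largest eigenvalues
    theta_1, gamma_1 = theta[order], gamma[:, order]
    return gamma_1 * np.sqrt(np.clip(theta_1, 0, None))   # Lambda = Gamma_1 Theta_1^(1/2), p x k
```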
Thank you! 68