Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016


2 Agenda
- Brief History and Introductory Example
- Factor Model
- Factor Equation
- Estimation of Loadings and Communalities
- Properties of the Model
- Rotation
- Factor Scores
- Tutorial, guidelines and rules of thumb
3 Factor Analysis
The aim of Factor Analysis is to find hidden (latent) variables that explain the correlations among the observed variables.
Examples:
- A firm's image
- The sales aptitude of salespersons
- Performance of the firm
- Resistance to high-tech innovations
4 When to use FA?
Exploratory / descriptive analysis:
- Learning the internal structure of the dataset: what are the main dimensions of the data?
- Helps to visualize multivariate data in lower-dimensional pictures
Data reduction technique:
- If you have a high-dimensional dataset, is there a way to capture the essential information using a smaller number of variables?
- Can also be considered a preprocessing step for techniques that are sensitive to multicollinearity (e.g. regression)
5 Two Approaches
Exploratory Factor Analysis
- Initially, a factor structure that explains the correlations among variables is searched for without any a priori theory
- What are the underlying processes that could have produced the correlations among the variables?
- Note: there are no readily available criteria against which to test the solution
Confirmatory Factor Analysis
- Factor structure is assumed to be known or hypothesized a priori
- Are the correlations among variables consistent with a hypothesized factor structure?
- Often performed through structural equation modeling
6 Part I: Basic concepts and examples
7 History and Example
- Originally developed by Spearman (1904) to explain student performance in various courses.
- Suppose the students' test scores (M: Mathematics, P: Physics, C: Chemistry, E: English, H: History and F: French) depend on:
  - the student's general intelligence I, and
  - the student's aptitude for a given course
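8 Illustration of Factor Analysis
[Figure: path diagram. The latent variable (common factor) Intelligence points to the observed variables Math., Physics, Chem., English, History and French; each observed variable also receives its own unique factor A_M, A_P, A_C, A_E, A_H, A_F.]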
9 Example: 1-factor model
For example, as follows:
M = .80 I + A_m
P = .70 I + A_p
C = .90 I + A_c
E = .60 I + A_e
H = .50 I + A_h
F = .65 I + A_f
- A_m, A_p, A_c, A_e, A_h and A_f stand for special aptitudes (Specific Factors).
- The coefficients .80, .70, .90, .60, .50 and .65 are called Factor Loadings.
- The variables M, P, C, E, H and F are indicators or measures of I (Common Factor).
10 Assumptions
- Means of the variables (indicators), the common factor I, and the unique factors are zero
- Variances of the variables (indicators) and the common factor I are one
- Correlations between the common factor I and the unique factors are zero
- Correlations among the unique factors are zero
11 From Assumptions
Variances of variables (e.g. M):
Var(M) = Var(0.8 I + A_m) = 0.64 Var(I) + Var(A_m) = 0.64 + Var(A_m)
Covariance of variables (e.g. M and H):
Cov(M, H) = Cov(0.8 I + A_m, 0.5 I + A_h)
= 0.8*0.5 Cov(I, I) + 0.8 Cov(I, A_h) + 0.5 Cov(A_m, I) + Cov(A_m, A_h)
= 0.8*0.5 = 0.40
Generally, the variables are standardized in factor analysis, so Var(M) = 1 and Cov = Corr.
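The arithmetic on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the original slides.

```python
import numpy as np

# Loadings of M, P, C, E, H, F on the common factor I (1-factor example).
loadings = np.array([0.80, 0.70, 0.90, 0.60, 0.50, 0.65])

# With Var(I) = 1 and uncorrelated unique factors,
# Cov(X_i, X_j) = l_i * l_j for i != j.
implied_corr = np.outer(loadings, loadings)

# Standardized indicators have unit variance, so Var(A_i) = 1 - l_i^2.
unique_var = 1.0 - loadings ** 2
np.fill_diagonal(implied_corr, 1.0)

corr_m_h = implied_corr[0, 4]   # Corr(M, H) = 0.8 * 0.5
var_a_m = unique_var[0]         # Var(A_m) = 1 - 0.64
print(round(corr_m_h, 2), round(var_a_m, 2))
```

The off-diagonal entries of `implied_corr` are the correlations the 1-factor model implies for every pair of courses.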
12 Variance Decomposition
The total variance of any indicator is decomposed into two components:
- Variance in common with I (communality of the indicator)
- Variance in common with the specific factor
Example: Var(M) = 0.64 + Var(A_m)
13 Variance decomposition (cont.)
Diagonal value | Variance
Unity (1) | Total variance (common + specific and error), all extracted
Communality | Common variance extracted; specific and error variance not used
Source: Hair et al. (2010): Multivariate Data Analysis, Pearson Education
14 Example: 2-Factor Model
[Figure: path diagram labelled with factor loadings, common factors and specific factors.]
15 Example: 2-Factor Model
M = .80 Q + .20 V + A_m
P = .70 Q + .30 V + A_p
C = .60 Q + .30 V + A_c
E = .20 Q + .80 V + A_e
H = .15 Q + .82 V + A_h
F = .25 Q + .85 V + A_f
16 Covariances for 2-Factor Model
Variances of variables (e.g. M):
Var(M) = Var(0.8 Q + 0.2 V + A_m) = 0.64 Var(Q) + 0.04 Var(V) + Var(A_m) = 0.68 + Var(A_m)
Covariance of variables (e.g. M and H):
Cov(M, H) = Cov(0.8 Q + 0.2 V + A_m, 0.15 Q + 0.82 V + A_h)
= 0.8*0.15 Cov(Q, Q) + 0.2*0.82 Cov(V, V)
= 0.8*0.15 + 0.2*0.82 = 0.284
17 Part II: Factor Model
18 Objectives of Factor Analysis
- To identify the smallest number of common factors that best explain the correlations among variables
- To estimate loadings and communalities
- To identify, via rotation, the most plausible factor structure
- To estimate factor scores, when desired
19 Some Theory: Factor Model
Consider a p-dimensional random vector x ~ (µ, Σ).
An m-factor model: x = Λf + ε + µ, where Λ = Λ(p × m) is a matrix of factor loadings, and f = f(m × 1) and ε = ε(p × 1) are random vectors.
The elements of f are common factors and the elements of ε are unique factors.
20 Factor Equation
Σ = ΛΛ^T + Ψ, where
- Σ is the covariance matrix of the variables x
- Λ is the loading matrix
- Ψ is a diagonal matrix containing the unique variances
21 Factor Equation (cont.)
The communalities are the diagonal elements of Σ − Ψ.
The covariances (correlations) between the variables and the factors are given by:
E((x − µ) f^T) = E((Λf + ε) f^T) = Λ E(f f^T) + E(ε f^T) = Λ
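As a rough numerical illustration of the factor equation and of Cov(x, f) = Λ, the sketch below builds Σ from the 2-factor loadings of the earlier example and compares the loadings with covariances estimated from simulated data. The normal distributions, the sample size and the seed are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Loadings of M, P, C, E, H, F on the common factors Q and V (2-factor example).
Lambda = np.array([[0.80, 0.20], [0.70, 0.30], [0.60, 0.30],
                   [0.20, 0.80], [0.15, 0.82], [0.25, 0.85]])

# Communalities are the row sums of squared loadings; with standardized
# variables the unique variances are one minus the communalities.
communalities = (Lambda ** 2).sum(axis=1)
Psi = np.diag(1.0 - communalities)
Sigma = Lambda @ Lambda.T + Psi          # factor equation

# Simulate x = Lambda f + eps with Cov(f) = I and Cov(eps) = Psi (mu = 0).
n = 200_000
f = rng.standard_normal((n, 2))
eps = rng.standard_normal((n, 6)) * np.sqrt(np.diag(Psi))
x = f @ Lambda.T + eps

# Empirical covariance between variables and factors; should be close to Lambda.
cov_xf = x.T @ f / n
print(np.round(cov_xf, 2))
```

Because the indicators are standardized, the diagonal of `Sigma` is one and the diagonal of `Sigma - Psi` returns the communalities.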
22 Interpretation of the Common Factors
- Loadings: the covariances between the variables and the factors
- Eigenvalues: the variance explained by each factor
23 Factor Indeterminacy
- Factor indeterminacy due to rotation
- Factor indeterminacy due to the estimation of communality problem
24 Factor Indeterminacy (cont.)
Factor indeterminacy due to rotation. Consider:
M = .667 Q − .484 V + A_m
P = .680 Q − .343 V + A_p
C = .615 Q − .267 V + A_c
E = .741 Q + .361 V + A_e
H = .725 Q + .412 V + A_h
F = .812 Q + .355 V + A_f
- These alternative loadings provide the same total communalities and uniquenesses as the previously presented solution
- Even the correlation matrices are identical (note that the two common factors are assumed to be uncorrelated)
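The claim can be checked directly: both loading matrices should reproduce the same common part ΛΛ^T up to the three-decimal rounding of the alternative loadings. A minimal sketch, assuming NumPy; the 30-degree rotation is an arbitrary choice used only to show that any orthogonal rotation has the same effect.

```python
import numpy as np

# Original 2-factor loadings and the alternative loadings from this slide.
L_orig = np.array([[0.80, 0.20], [0.70, 0.30], [0.60, 0.30],
                   [0.20, 0.80], [0.15, 0.82], [0.25, 0.85]])
L_alt = np.array([[0.667, -0.484], [0.680, -0.343], [0.615, -0.267],
                  [0.741, 0.361], [0.725, 0.412], [0.812, 0.355]])

# Common part of the correlation matrix implied by each solution.
common_orig = L_orig @ L_orig.T
common_alt = L_alt @ L_alt.T
max_gap = np.abs(common_orig - common_alt).max()   # only rounding error remains

# Any orthogonal matrix C gives loadings L C with exactly the same L L^T.
angle = np.deg2rad(30.0)                            # arbitrary illustrative angle
C = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
L_rot = L_orig @ C
print(max_gap, np.allclose(L_rot @ L_rot.T, common_orig))
```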
25 Factor Indeterminacy (cont.)
Factor indeterminacy due to the estimation of communality problem:
- To estimate loadings, the communalities are needed, and
- To estimate communalities, the loadings are needed (!)
26 Factor Analysis Techniques (Tabachnick and Fidell: Using Multivariate Statistics)
Principal Component Factoring (PCF)
- The initial estimates of the communalities for all variables are equal to one (= Principal Component Analysis)
Principal Axis Factoring (PAF)
- Principal components analyze total variance, whereas FA analyzes covariance (communality)
- An attempt is made to estimate the communalities:
  - Explain each variable with the other variables and use the squared multiple correlation (coefficient of multiple determination) as an initial estimate of the communality
  - Find the communalities through an iterative process
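The iterative idea behind principal axis factoring can be sketched as follows. This is an illustrative implementation under simple assumptions (squared multiple correlations as starting values, a fixed number of factors, a plain convergence check), not the exact routine of any statistics package; the function name is hypothetical.

```python
import numpy as np

def principal_axis_factoring(R, n_factors, max_iter=200, tol=1e-10):
    """Illustrative PAF: iterate between communalities and loadings."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlation of each variable
    # with all the others, 1 - 1 / diag(R^-1).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)              # communalities on the diagonal
        eigval, eigvec = np.linalg.eigh(reduced)
        top = np.argsort(eigval)[::-1][:n_factors]
        loadings = eigvec[:, top] * np.sqrt(np.clip(eigval[top], 0.0, None))
        h2_new = (loadings ** 2).sum(axis=1)       # updated communalities
        converged = np.abs(h2_new - h2).max() < tol
        h2 = h2_new
        if converged:
            break
    return loadings, h2

# Correlation matrix implied by the 1-factor example.
true_loadings = np.array([0.80, 0.70, 0.90, 0.60, 0.50, 0.65])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

est_loadings, est_h2 = principal_axis_factoring(R, n_factors=1)
print(np.round(est_h2, 3))
```

Setting the starting communalities to one and stopping after the first pass would correspond to principal component factoring.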
27 Image Factor Extraction
- Uses the correlation matrix of predicted variables, where each variable is predicted using the others via multiple regression
- A compromise between PCA and principal axis factoring
- Like PCA, provides a mathematically unique solution because there are fixed values in the positive diagonal
- Like PAF, the values in the diagonal are communalities with unique error variability excluded
- Loadings represent covariances between variables and factors rather than correlations
Maximum Likelihood Factor Extraction
- Population estimates for factor loadings are calculated which have the greatest probability of yielding a sample with the observed correlation matrix
28 Unweighted Least Squares Factoring
- Minimizes squared differences between the observed and reproduced correlation matrices
- Only off-diagonal differences are considered; communalities are derived from the solution rather than estimated as a part of it
- Special case of principal factors, where communalities are estimated after the solution
Generalized (Weighted) Least Squares Factoring
- Variables that have substantial shared variance with other variables get higher weights than variables with large unique variance
Alpha Factoring
- Interest is in discovering which common factors are found consistently when repeated samples of variables are taken from a population of variables
29 Summary of extraction procedures
Technique | Goal of analysis | Special features
Principal components | Maximize variance extracted by orthogonal components | Mathematically determined, solution mixes common, unique, and error variance into components
Principal factors | Maximize variance extracted by orthogonal factors | Estimates communalities to attempt to eliminate unique and error variance from variables
Image factoring | Provides an empirical factor analysis | Uses variances based on multiple regression of a variable with other variables as communalities
Source: Tabachnick and Fidell: Using Multivariate Statistics
30 Summary of extraction procedures (cont.)
Technique | Goal of analysis | Special features
Maximum likelihood factoring | Estimate factor loadings for population that maximize the likelihood of sampling the observed correlation matrix | Has significance test for factors; useful for confirmatory factor analysis
Alpha factoring | Maximize the generalizability of orthogonal factors |
Unweighted least squares | Minimize squared residual correlations |
Generalized least squares | Weights variables by shared variance before minimizing squared residual correlations |
Source: Tabachnick and Fidell: Using Multivariate Statistics
31 Part III: Rotations
32 Two Classes of Rotational Approaches
- Orthogonal = axes are maintained at 90 degrees.
- Oblique = axes are not maintained at 90 degrees.
33 Orthogonal Factor Rotation
[Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II, with the orthogonally rotated axes Rotated Factor I and Rotated Factor II. Source: Hair et al. (2010)]
34 Oblique Factor Rotation
[Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II, comparing the axes of the orthogonal rotation (Factor I, Factor II) with those of the oblique rotation (Factor I, Factor II). Source: Hair et al. (2010)]
35 Orthogonal Rotation
Identify an orthogonal transformation matrix C such that Λ* = ΛC and Σ = Λ*Λ*^T + Ψ, where C^T C = I.
Remember the connection to the factor indeterminacy problem (!)
36 Varimax Rotation
Find a factor structure in which each variable loads highly on one and only one factor (i.e. simplify the columns of the loading matrix).
That is, for any given factor j,
V_j = (1/p) Σ_i λ_ij^4 − ((1/p) Σ_i λ_ij^2)^2
is the variance of the communalities (squared loadings) of the p variables within factor j.
Total variance: V = Σ_j V_j
37 Varimax Rotation (cont.)
Find the orthogonal matrix C such that it maximizes V, which is equivalent to maximizing
p Σ_j Σ_i λ_ij^4 − Σ_j (Σ_i λ_ij^2)^2
subject to the constraint that the communality of each variable remains the same.
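A compact way to experiment with this criterion is the widely used SVD-based iteration for raw varimax (no Kaiser normalization). The sketch below is an illustration under those assumptions, not the procedure of any particular package; the function names are hypothetical and the starting loadings are the alternative 2-factor loadings from the factor indeterminacy slide.

```python
import numpy as np

def varimax_criterion(L):
    # Sum over factors of the variance of the squared loadings (the V above).
    return (L ** 2).var(axis=0).sum()

def varimax(L, max_iter=100, tol=1e-8):
    """Raw varimax via the common SVD-based iteration (illustrative)."""
    p, m = L.shape
    C = np.eye(m)                      # orthogonal rotation matrix
    last = 0.0
    for _ in range(max_iter):
        LC = L @ C
        # Gradient-like target matrix of the varimax criterion.
        target = L.T @ (LC ** 3 - LC @ np.diag((LC ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(target)
        C = u @ vt                     # closest orthogonal matrix
        if s.sum() < last * (1.0 + tol):
            break
        last = s.sum()
    return L @ C, C

L_start = np.array([[0.667, -0.484], [0.680, -0.343], [0.615, -0.267],
                    [0.741, 0.361], [0.725, 0.412], [0.812, 0.355]])
L_rotated, C = varimax(L_start)

print(varimax_criterion(L_start), varimax_criterion(L_rotated))
```

The rotation leaves each variable's communality (row sum of squared loadings) unchanged, which is the constraint stated on the slide.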
38 Quartimax Rotation
Purpose: to simplify the rows of the loading matrix, i.e. to obtain a pattern of loadings such that:
- All the variables have a fairly high loading on one factor
- Each variable should have a high loading on one other factor and near-zero loadings on the remaining factors
The quartimax rotation will be most appropriate in the presence of a general factor
39 Quartimax Rotation (cont.)
For any variable i, the variance of the communalities (i.e. squares of the loadings) across the m factors is given by
Q_i = (1/m) Σ_j λ_ij^4 − ((1/m) Σ_j λ_ij^2)^2
Then the total variance of all the variables is Q = Σ_i Q_i
40 Quartimax Rotation (cont.)
Quartimax rotation is obtained by finding the orthogonal matrix C such that Q is maximized. This problem can be reduced to the following form:
maximize Σ_i Σ_j λ_ij^4
subject to the condition that the communality of each variable remains the same.
Varimax is often preferred over quartimax, since it leads to a cleaner separation of factors and tends to be more invariant when a different subset of variables is analyzed
41 Oblique Rotations
The factors are allowed to be correlated: oblique rotations offer a continuous range of correlations between factors.
The degree of correlation between factors is determined by the delta variable δ:
- δ = 0: solutions are fairly highly correlated
- δ < 0: solutions are increasingly orthogonal; at about δ = −4 the solution is orthogonal
- δ ~ 1: leads to very highly correlated solutions
Note: although delta affects the size of the correlation, the maximum correlation at a given value depends on the dataset
42 Commonly Used Oblique Rotations
Promax:
- Orthogonal factors are rotated to an oblique position (orthogonal loadings are raised to powers, usually 2, 4 or 6, to drive small and moderate loadings to zero while larger loadings are reduced)
Direct Oblimin:
- Simplifies factors by minimizing the sum of cross-products of squared loadings in the pattern matrix
- Values of δ > 0 produce highly correlated factors, so careful consideration is needed when deciding the number of factors (!)
Helps to cope with situations encountered in practice?
Note: factor loadings obtained after oblique rotations no longer represent correlations between factors and observed variables
43 Terminology in Oblique Rotations
- Factor correlation matrix = correlations between factors (standardized factor scores) after rotation
- Pattern matrix = regression-like weights representing the unique contribution of each factor to the variance in the variable (comparable to the loadings matrix when having orthogonal factors)
- Structure matrix = correlations between variables and correlated factors (given by the product of the pattern matrix and the factor correlation matrix)
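The relation between the three matrices can be illustrated in a few lines. The pattern weights and the factor correlation of 0.4 below are made-up numbers chosen only for the illustration.

```python
import numpy as np

# Hypothetical pattern matrix (4 variables, 2 oblique factors).
pattern = np.array([[0.7, 0.1],
                    [0.6, 0.0],
                    [0.1, 0.8],
                    [0.0, 0.7]])

# Hypothetical factor correlation matrix after an oblique rotation.
factor_corr = np.array([[1.0, 0.4],
                        [0.4, 1.0]])

# Structure matrix = pattern matrix times factor correlation matrix.
structure = pattern @ factor_corr
print(structure)

# With uncorrelated factors the structure and pattern matrices coincide.
structure_orthogonal = pattern @ np.eye(2)
```

Note how the second variable, with a pattern weight of zero on the second factor, still correlates with it through the factor correlation.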
44 Methods for Obtaining Factor Scores
Thomson's (1951) regression estimates
- Assumes the factor scores to be random
- The assumption is appropriate when we are interested in the general structure (different samples consisting of different individuals)
Bartlett's estimates
- Assumes the factor scores to be deterministic
- Assumes normality and that loadings and uniquenesses are known
Anderson-Rubin estimates
No clear favorite method; each has its advantages and disadvantages
45 Factor Scores via Multiple Regression
Estimate: E{f_ij} = β_1j x_i1 + ... + β_pj x_ip
In matrix form: E{F} = XB, and for standardized variables: F = ZB
Hence (n−1)^{-1} Z^T F = (n−1)^{-1} Z^T Z B, so Λ = RB and B = R^{-1} Λ,
since (n−1)^{-1} Z^T F = Λ and (n−1)^{-1} Z^T Z = R
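A small simulation shows the regression weights B = R^{-1} Λ in use. It is a sketch under stated assumptions: data are simulated from the 1-factor example with normal factors, and the true loadings are treated as known, whereas in practice they would be estimated first.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-factor example; loadings are assumed known for this illustration.
loadings = np.array([0.80, 0.70, 0.90, 0.60, 0.50, 0.65]).reshape(-1, 1)
unique_sd = np.sqrt(1.0 - loadings.ravel() ** 2)

n = 5000
f_true = rng.standard_normal((n, 1))
X = f_true @ loadings.T + rng.standard_normal((n, 6)) * unique_sd

# Standardize the observed variables and form the correlation matrix R.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = Z.T @ Z / (n - 1)

# Regression weights B = R^-1 Lambda and estimated factor scores F = Z B.
B = np.linalg.solve(R, loadings)
F_hat = Z @ B

score_corr = np.corrcoef(F_hat.ravel(), f_true.ravel())[0, 1]
print(round(score_corr, 3))
```

Using `np.linalg.solve` avoids forming R^{-1} explicitly, which is usually the numerically safer choice.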
46 FA in Practice
- Analysis of the HBAT's consumer survey results
- Form groups of 1 to 3 people
- Tutorial
47 Conceptual Issues
- The basic assumption is that an underlying structure exists in the set of variables
- The presence of correlated variables and detected factors does not guarantee relevance, even if statistical requirements are met
- Ensuring conceptual validity remains the responsibility of the researcher
- Remember: do not mix dependent and independent variables in a single factor analysis if the objective is to study dependence relationships using derived factors
48 Conceptual Issues (cont.)
- Ensure that the sample is homogeneous with respect to the underlying factor structure
- If the sample has multiple internal groups with unique characteristics, it may be inappropriate to apply factor analysis to the pooled data
- If different groups are expected, a separate factor analysis should be performed for each group
- Compare group-specific results to the combined sample
49 Sample Size and Missing Data
- Correlations estimated from small samples tend to be less reliable
- Minimum sample size should be 50 observations
- Sample must have more observations than variables
- Strive to maximize the number of observations per variable (desired ratio is 5:1)
- Recommendations by Comrey and Lee (1992): sample size 100 = poor, 200 = fair, 300 = good, 500 = very good
Missing values:
- If cases are missing values in a nonrandom pattern or if the sample size is too small, estimation is needed
- Beware of using estimation procedures (e.g. regression) that are likely to overfit the data and cause correlations to be too high
50 Factorability of Correlation Matrix
- A factorable correlation matrix should include several sizeable correlations (e.g. Bartlett's test of sphericity)
- If no correlation exceeds .30, use of FA is questionable
- Warning: high bivariate correlations do not necessarily ensure the existence of factors → examine partial correlations or anti-image correlations (negatives of partial correlations)
- If factors are present, high bivariate correlations become very low partial correlations
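Bartlett's test of sphericity is mentioned above without its formula. The sketch below uses the standard chi-square approximation, χ² = −(n − 1 − (2p + 5)/6) ln|R| with p(p − 1)/2 degrees of freedom; the function name and the sample size of 200 are illustrative assumptions, and the p-value lookup is left out.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)          # log of the determinant of R
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof

# Correlation matrix implied by the 1-factor example, hypothetical n = 200.
loadings = np.array([0.80, 0.70, 0.90, 0.60, 0.50, 0.65])
R = np.outer(loadings, loadings)
np.fill_diagonal(R, 1.0)

stat, dof = bartlett_sphericity(R, n=200)
stat_identity, _ = bartlett_sphericity(np.eye(6), n=200)
print(round(stat, 1), dof)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom suggests the correlation matrix is not an identity matrix, which is a necessary but not sufficient condition for factorability.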
51 Partial Correlation
- The partial correlation between variables X and Y given a set of n controlling variables Z = {Z_1, ..., Z_n} is the correlation coefficient ρ_XY·Z, where relatedness due to the controlling variables is taken into account
- In practice: the partial correlation is computed as the bivariate correlation between the residuals from the linear regressions X ~ Z and Y ~ Z
[Figure: Venn diagram of X, Y and Z with regions a, b, c and d; ρ²_XY·Z = a / (a + d)]
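The residual-based recipe can be sketched as follows, assuming NumPy. The simulated data, in which X and Y are related only through Z, are an assumption chosen to make the effect visible; the function name is hypothetical.

```python
import numpy as np

def partial_corr(x, y, Z):
    """Correlation between the residuals of x ~ Z and y ~ Z (with intercept)."""
    design = np.column_stack([np.ones(len(x)), Z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

rng = np.random.default_rng(2)
n = 20_000
z = rng.standard_normal(n)
x = z + rng.standard_normal(n)     # X depends on Z plus its own noise
y = z + rng.standard_normal(n)     # Y depends on Z plus its own noise

bivariate = np.corrcoef(x, y)[0, 1]
partial = partial_corr(x, y, z)
print(round(bivariate, 2), round(partial, 2))
```

Here the sizeable bivariate correlation shrinks towards zero once Z is controlled for, which is the pattern expected when a common factor drives the correlation.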
52 Geometrical Interpretation of Partial Correlation
[Figure: residuals from regressions. Source: Wikipedia]
53 Other Practical Issues
Normality
- When FA is used in an exploratory manner to summarize relationships, assumptions on distributions are not in force
- Normality enhances the solution (but is not necessary)
Linearity
- Multivariate normality implies linear relationships between pairs of variables
- Analysis is degraded when linearity fails (note: correlation measures linear relationship)
Absence of multicollinearity and singularity
- Some degree of multicollinearity is desirable, but extreme multicollinearity or singularity is an issue (check for eigenvalues close to zero or a zero determinant of the correlation matrix)
54 Outliers Among Cases and Variables
Screening for outliers among cases
- Factor solution may be sensitive to outlying cases
Screening for outliers among variables
- A variable with a low squared multiple correlation with all other variables and a low correlation with all important factors is an outlier among the variables
- Outlying variables are often ignored in the current FA, or the researcher may consider adding more related variables in a further study
Note: factors defined by just one or two variables are not stable (or "real"). If the variance accounted for by such a factor is high enough, it may be interpreted with caution or ignored.
55 Choosing and Evaluating a Solution
Number and nature of factors
- How many reliable and interpretable factors are there in the data set?
- What is the meaning of the factors? How are they interpreted?
Importance of solutions / factors
- How much variance in a dataset is accounted for by the factors?
- Which factors account for the most variance?
Testing theory in FA
- How well does the obtained solution fit an expected factor solution?
Estimating scores on factors
- How do the subjects score on the factors?
56 Appendix
57 Preliminary Considerations
Assume that x = (x_1, ..., x_p)^T is a vector of p random variables
- Population mean: µ = E(x)
- Population covariance: Σ = E((x − µ)(x − µ)^T)
- Correlation between the ith and jth variable: ρ_ij = σ_ij / (σ_ii σ_jj)^{1/2}
58 Preliminary Considerations
- Variance of a linear combination of p variables: Var(a^T x) = a^T Σ a
- Generalized variance: |Σ|
- Total variation: tr Σ
59 Preliminary Considerations
Illustration: Combining Uncorrelated Variables
(a_1 a_2) [1 0; 0 1] (a_1 a_2)^T = a_1^2 + a_2^2 = 1, because r_12 = 0
[Figure: two standardized variables with s_1^2 = 1 and s_2^2 = 1 combined with weights a_1 and a_2; r_12 = 0, a_1^2 + a_2^2 = 1]
60 Preliminary Considerations
Illustration: Combining Correlated Variables
(a_1 a_2) [1 r_12; r_21 1] (a_1 a_2)^T = a_1^2 + a_2^2 + 2 a_1 a_2 r_12 > 1, when r_12 > 0 (for a_1, a_2 > 0)
[Figure: two standardized variables with s_1^2 = 1 and s_2^2 = 1 combined with weights a_1 and a_2; r_12 > 0, a_1^2 + a_2^2 = 1]
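Both illustrations are instances of the quadratic form a^T R a. A minimal numerical check, assuming NumPy; the equal weights and the correlation of 0.5 are arbitrary illustrative values.

```python
import numpy as np

def combination_variance(a, R):
    # Variance of the linear combination a^T x for standardized variables
    # with correlation matrix R.
    return float(a @ R @ a)

# Equal weights with a_1^2 + a_2^2 = 1.
a = np.array([1.0, 1.0]) / np.sqrt(2.0)

R_uncorrelated = np.array([[1.0, 0.0], [0.0, 1.0]])   # r_12 = 0
R_correlated = np.array([[1.0, 0.5], [0.5, 1.0]])     # r_12 = 0.5 (illustrative)

var_uncorrelated = combination_variance(a, R_uncorrelated)
var_correlated = combination_variance(a, R_correlated)
print(var_uncorrelated, var_correlated)
```

With positive correlation and positive weights the combination carries more variance than either standardized variable alone, which is what makes correlated variables compressible into fewer components.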
61 Some Theory: Factor Model
Consider a p-dimensional random vector x ~ (µ, Σ).
An m-factor model: x = Λf + ε + µ, where Λ = Λ(p × m) is a matrix of factor loadings, and f = f(m × 1) and ε = ε(p × 1) are random vectors.
The elements of f are common factors and the elements of ε are unique factors.
62 Assumptions
- E(f) = 0 and Cov(f) = I
- E(ε) = 0 and Cov(ε_i, ε_j) = 0 for i ≠ j
- Cov(f, ε) = 0
- Cov(ε) = Ψ = diag(ψ_11, ψ_22, ..., ψ_pp)
Thus
Σ = E((x − µ)(x − µ)^T) = E((Λf + ε)(Λf + ε)^T)
= E(Λf(Λf)^T) + E(εε^T) + E(Λfε^T) + E(ε(Λf)^T)
= ΛE(ff^T)Λ^T + E(εε^T) + ΛE(fε^T) + E(εf^T)Λ^T
= ΛΛ^T + Ψ
63 Notation
Label | Name | Size | Description
Λ | Factor loading matrix (or pattern matrix in oblique methods) | p × m | Matrix of regression-like weights used to estimate the unique contribution of each factor to the variance in a variable
x | Vector of variables | p × 1 | Observed random variables
Σ | Covariance or correlation matrix | p × p | Covariances or correlations between variables
µ | Expected values of variables | p × 1 | Expected values of observed random variables
f | Common factors | m × 1 | Vector of common factors
ε | Unique factors | p × 1 | Vector of variable-specific unique factors
Ψ | Covariance of unique factors | p × p | Covariance matrix for unique factors
C | Rotation matrix | m × m | Transformation matrix to produce rotated loading matrix
64 Factor Equation
Σ = ΛΛ^T + Ψ, where
- Σ is the covariance matrix of the variables x
- Λ is the loading matrix
- Ψ is a diagonal matrix containing the unique variances
65 Factor Equation (cont.)
The communalities are the diagonal elements of Σ − Ψ.
The covariances (correlations) between the variables and the factors are given by:
E((x − µ) f^T) = E((Λf + ε) f^T) = Λ E(f f^T) + E(ε f^T) = Λ
66 Solving the Factor Equation
How to solve Σ − Ψ = ΛΛ^T?
Use the Spectral Decomposition Theorem: any symmetric matrix A (p × p) can be written as A = ΓΘΓ^T, where Θ is a diagonal matrix of the eigenvalues of A and Γ is an orthogonal matrix whose columns are standardized eigenvectors.
67 Therefore
Provided Ψ is known, we may write: Σ − Ψ = ΓΘΓ^T = (ΓΘ^{1/2})(Θ^{1/2}Γ^T)
Assume the first k eigenvalues θ_i > 0, i = 1, 2, ..., k; then we may write λ_i = θ_i^{1/2} γ_i
Thus Λ = Γ_1 Θ_1^{1/2}, where Γ_1 is p × k (the first k eigenvectors) and Θ_1 is k × k
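The construction Λ = Γ_1 Θ_1^{1/2} can be tried out numerically. This sketch assumes Ψ is known exactly (taken from the 2-factor example) and uses a small threshold, chosen here only for illustration, to decide which eigenvalues count as positive.

```python
import numpy as np

# Build Sigma from the 2-factor example with known unique variances Psi.
Lambda = np.array([[0.80, 0.20], [0.70, 0.30], [0.60, 0.30],
                   [0.20, 0.80], [0.15, 0.82], [0.25, 0.85]])
Psi = np.diag(1.0 - (Lambda ** 2).sum(axis=1))
Sigma = Lambda @ Lambda.T + Psi

# Spectral decomposition of the symmetric matrix Sigma - Psi.
theta, Gamma = np.linalg.eigh(Sigma - Psi)
order = np.argsort(theta)[::-1]                 # sort eigenvalues in descending order
theta, Gamma = theta[order], Gamma[:, order]

# Keep the k strictly positive eigenvalues (illustrative threshold).
k = int((theta > 1e-10).sum())
Lambda_hat = Gamma[:, :k] * np.sqrt(theta[:k])  # Gamma_1 Theta_1^{1/2}

print(k)
print(np.round(Lambda_hat, 3))
```

`Lambda_hat` generally differs from `Lambda` by an orthogonal rotation, but it reproduces the same ΛΛ^T, which is the rotational indeterminacy discussed earlier.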
68 Thank you!
More informationDimensionality Reduction: Principal Components Analysis
Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely
More information3. The Multivariate Normal Distribution
3. The Multivariate Normal Distribution 3.1 Introduction A generalization of the familiar bell shaped normal density to several dimensions plays a fundamental role in multivariate analysis While real data
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More informationChapter 6. Orthogonality
6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be
More informationLeastSquares Intersection of Lines
LeastSquares Intersection of Lines Johannes Traa  UIUC 2013 This writeup derives the leastsquares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationLecture 7: Factor Analysis. Laura McAvinue School of Psychology Trinity College Dublin
Lecture 7: Factor Analysis Laura McAvinue School of Psychology Trinity College Dublin The Relationship between Variables Previous lectures Correlation Measure of strength of association between two variables
More informationRegression III: Advanced Methods
Lecture 5: Linear leastsquares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression
More informationFitting Subjectspecific Curves to Grouped Longitudinal Data
Fitting Subjectspecific Curves to Grouped Longitudinal Data Djeundje, Viani HeriotWatt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK Email: vad5@hw.ac.uk Currie,
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationData analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
More informationOverview of Factor Analysis
Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 354870348 Phone: (205) 3484431 Fax: (205) 3488648 August 1,
More informationFactor Analysis Example: SAS program (in blue) and output (in black) interleaved with comments (in red)
Factor Analysis Example: SAS program (in blue) and output (in black) interleaved with comments (in red) The following DATA procedure is to read input data. This will create a SAS dataset named CORRMATR
More informationLinear Algebra Review. Vectors
Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length
More informationSTA 4107/5107. Chapter 3
STA 4107/5107 Chapter 3 Factor Analysis 1 Key Terms Please review and learn these terms. 2 What is Factor Analysis? Factor analysis is an interdependence technique (see chapter 1) that primarily uses metric
More informationFactor Analysis: Statnotes, from North Carolina State University, Public Administration Program. Factor Analysis
Factor Analysis Overview Factor analysis is used to uncover the latent structure (dimensions) of a set of variables. It reduces attribute space from a larger number of variables to a smaller number of
More informationA Introduction to Matrix Algebra and Principal Components Analysis
A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra
More informationDISCRIMINANT FUNCTION ANALYSIS (DA)
DISCRIMINANT FUNCTION ANALYSIS (DA) John Poulsen and Aaron French Key words: assumptions, further reading, computations, standardized coefficents, structure matrix, tests of signficance Introduction Discriminant
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationPsychology 7291, Multivariate Analysis, Spring 2003. SAS PROC FACTOR: Suggestions on Use
: Suggestions on Use Background: Factor analysis requires several arbitrary decisions. The choices you make are the options that you must insert in the following SAS statements: PROC FACTOR METHOD=????
More informationMultidimensional data and factorial methods
Multidimensional data and factorial methods Bidimensional data x 5 4 3 4 X 3 6 X 3 5 4 3 3 3 4 5 6 x Cartesian plane Multidimensional data n X x x x n X x x x n X m x m x m x nm Factorial plane Interpretation
More informationExploratory Factor Analysis
Exploratory Factor Analysis ( 探 索 的 因 子 分 析 ) Yasuyo Sawaki Waseda University JLTA2011 Workshop Momoyama Gakuin University October 28, 2011 1 Today s schedule Part 1: EFA basics Introduction to factor
More informationManifold Learning Examples PCA, LLE and ISOMAP
Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 28 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition
More informationPrincipal Component Analysis Application to images
Principal Component Analysis Application to images Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception http://cmp.felk.cvut.cz/
More informationIntroduction to Principal Component Analysis: Stock Market Values
Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationSolution based on matrix technique Rewrite. ) = 8x 2 1 4x 1x 2 + 5x x1 2x 2 2x 1 + 5x 2
8.2 Quadratic Forms Example 1 Consider the function q(x 1, x 2 ) = 8x 2 1 4x 1x 2 + 5x 2 2 Determine whether q(0, 0) is the global minimum. Solution based on matrix technique Rewrite q( x1 x 2 = x1 ) =
More informationFactor Rotations in Factor Analyses.
Factor Rotations in Factor Analyses. Hervé Abdi 1 The University of Texas at Dallas Introduction The different methods of factor analysis first extract a set a factors from a data set. These factors are
More informationPrincipal Component Analysis
Principal Component Analysis Principle Component Analysis: A statistical technique used to examine the interrelations among a set of variables in order to identify the underlying structure of those variables.
More information2 Robust Principal Component Analysis
Robust Multivariate Methods in Geostatistics Peter Filzmoser 1, Clemens Reimann 2 1 Department of Statistics, Probability Theory, and Actuarial Mathematics, Vienna University of Technology, A1040 Vienna,
More informationFactor Analysis  2 nd TUTORIAL
Factor Analysis  2 nd TUTORIAL Subject marks File sub_marks.csv shows correlation coefficients between subject scores for a sample of 220 boys. sub_marks
More informationα α λ α = = λ λ α ψ = = α α α λ λ ψ α = + β = > θ θ β > β β θ θ θ β θ β γ θ β = γ θ > β > γ θ β γ = θ β = θ β = θ β = β θ = β β θ = = = β β θ = + α α α α α = = λ λ λ λ λ λ λ = λ λ α α α α λ ψ + α =
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationCanonical Correlation Analysis
Canonical Correlation Analysis LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the similarities and differences between multiple regression, factor analysis,
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jintselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationPRINCIPAL COMPONENTS AND THE MAXIMUM LIKELIHOOD METHODS AS TOOLS TO ANALYZE LARGE DATA WITH A PSYCHOLOGICAL TESTING EXAMPLE
PRINCIPAL COMPONENTS AND THE MAXIMUM LIKELIHOOD METHODS AS TOOLS TO ANALYZE LARGE DATA WITH A PSYCHOLOGICAL TESTING EXAMPLE Markela Muca Llukan Puka Klodiana Bani Department of Mathematics, Faculty of
More informationMultivariate Analysis of Variance (MANOVA): I. Theory
Gregory Carey, 1998 MANOVA: I  1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models  part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationMultivariate Analysis of Variance (MANOVA)
Multivariate Analysis of Variance (MANOVA) Aaron French, Marcelo Macedo, John Poulsen, Tyler Waterson and Angela Yu Keywords: MANCOVA, special cases, assumptions, further reading, computations Introduction
More information3. Regression & Exponential Smoothing
3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a
More informationproblem arises when only a nonrandom sample is available differs from censored regression model in that x i is also unobserved
4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a nonrandom
More information, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (
Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we
More informationPractical Considerations for Using Exploratory Factor Analysis in Educational Research
A peerreviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationHow to report the percentage of explained common variance in exploratory factor analysis
UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: LorenzoSeva, U. (2013). How to report
More informationLecture 5 Principal Minors and the Hessian
Lecture 5 Principal Minors and the Hessian Eivind Eriksen BI Norwegian School of Management Department of Economics October 01, 2010 Eivind Eriksen (BI Dept of Economics) Lecture 5 Principal Minors and
More informationSF2940: Probability theory Lecture 8: Multivariate Normal Distribution
SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,
More informationMehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics
INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree
More informationMultivariate normal distribution and testing for means (see MKB Ch 3)
Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 Onesample ttest (univariate).................................................. 3 Twosample ttest (univariate).................................................
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More information11/20/2014. Correlational research is used to describe the relationship between two or more naturally occurring variables.
Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection
More informationUsing the Singular Value Decomposition
Using the Singular Value Decomposition Emmett J. Ientilucci Chester F. Carlson Center for Imaging Science Rochester Institute of Technology emmett@cis.rit.edu May 9, 003 Abstract This report introduces
More information