Principal Components Analysis (PCA)


1 Principal Components Analysis (PCA) Janette Walde Department of Statistics University of Innsbruck
2 Outline I: Introduction (Idea of PCA; Principle of the Method); Decomposing an Association Matrix; Which association matrix to use?; Requirements; Interpreting the components; Rotation of components; How many components to retain?; Application; Factor Analysis (Theoretical Concept; Extraction Methods; Factor Scores)
3 Outline II Literature
4 Introduction Idea of PCA Idea of PCA I Suppose that we have a matrix of data X with dimension n x p, where p is large. A central problem in multivariate data analysis is dimension reduction: is it possible to describe, with accuracy, the values of the p variables with a smaller number r < p of new variables (variable reduction)? The principal component analysis approach consists in providing an adequate representation of the information with a smaller number of variables constructed as linear combinations of the (centered) originals.
5 Introduction Idea of PCA Idea of PCA II We begin by identifying a group of variables whose variance we believe can be represented more parsimoniously by a smaller set of components, or factors. The end result of the principal components analysis will tell us which variables can be represented by which components, and which variables should be retained as individual variables because the factor solution does not adequately represent their information. Relate observed variables to latent variables.
6 Introduction Principle of the Method PCA Principle: Axis Rotation [figure]
7 Introduction Principle of the Method Notation: linear combinations of variables. For i = 1, ..., n objects (observation units), PCA transforms the j = 1, ..., p variables (X_1, X_2, ..., X_p) into k = 1, ..., p new uncorrelated variables (Z_1, Z_2, ..., Z_p), called principal components or factors: z_ik = c_1k x_i1 + c_2k x_i2 + ... + c_pk x_ip, where z_ik is the value (score) of component k for object i, x_ij are the values of the original variables for object i, and c_jk are the weights (coefficients) that indicate how much each original variable j contributes to the linear combination forming component k.
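As an illustrative sketch (synthetic data; all names are my own, not from the slides), the scores z_ik for every object and component can be computed in a single matrix product once the weights c_jk are known; here the weights are taken, as in PCA, from the eigenvectors of the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))         # n = 30 objects, p = 4 variables
Xc = X - X.mean(axis=0)              # PCA uses centered variables

# The weights c_jk are the eigenvector coefficients of the covariance matrix
S = np.cov(Xc, rowvar=False)
eigvals, C = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]    # largest eigenvalue first
eigvals, C = eigvals[order], C[:, order]

# z_ik = c_1k x_i1 + c_2k x_i2 + ... + c_pk x_ip, for all i and k at once
Z = Xc @ C
```

Because the columns of C are eigenvectors of S, the scores in Z are uncorrelated and the variance of column k equals the k-th eigenvalue.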
8 Introduction Principle of the Method Linear combinations of variables II Depending on the analysis, these new variables are termed variously discriminant functions, canonical functions or variates, principal components, or factors. The derived variables are extracted so that the first explains most of the variance in the original variables, the second explains most of the remaining variance, and so on. The new derived variables are independent of each other. Although the number of components that can be derived equals the number of original variables, p, we hope that the first few components summarize most of the variation in the original variables.
9 Decomposing an Association Matrix Eigenvalues and Eigenvectors I When there are more than two variables, the components are extracted in practice by a spectral decomposition of a covariance or correlation matrix. The matrix approach to deriving components produces two important pieces of information: the eigenvalues and eigenvectors of the association matrix. Eigenvalues, also called characteristic roots (λ_1, ..., λ_p): estimates of the eigenvalues provide measures of the amount of the original total variance explained by each of the new derived variables. The sum of all the eigenvalues equals the sum of the variances of the original variables. PCA rearranges the variance in the original variables so that it is concentrated in the first few new components.
10 Decomposing an Association Matrix Eigenvalues and Eigenvectors II Eigenvectors (characteristic vectors): eigenvectors are lists of coefficients or weights (c_jk) showing how much each original variable contributes to each new derived variable. The eigenvectors are usually scaled so that the sum of squared coefficients for each eigenvector equals one. Eigenvectors are orthogonal.
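Both properties can be checked numerically on made-up data (a sketch of my own, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(1)
# four variables with deliberately different variances
X = rng.normal(size=(60, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
S = np.cov(X, rowvar=False)

eigvals, C = np.linalg.eigh(S)       # columns of C are the eigenvectors

# the eigenvalues repartition the total variance: their sum is trace(S)
total_variance = np.trace(S)
# unit-length, mutually orthogonal eigenvectors: C'C is the identity
gram = C.T @ C
```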
11 Which association matrix to use? Which association matrix to use? The choice is between the covariance matrix and the correlation matrix: The covariance matrix is based on mean-centered variables and is appropriate when the variables are measured in comparable units and differences in variance between variables make an important contribution to the interpretation. The correlation matrix is based on z-standardized variables and is necessary when the variables are measured in very different units and we wish to ignore differences between variances. The choice depends on how much we want different variances among variables to influence our results.
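The practical consequence can be demonstrated with a toy example (illustrative code of my own, not from the slides): rescaling one variable, e.g. recording it in grams instead of kilograms, changes a covariance-based PCA drastically but leaves a correlation-based PCA untouched.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))        # three variables in comparable units
X_rescaled = X.copy()
X_rescaled[:, 0] *= 1000.0           # same data, first variable in new units

def first_component_share(M):
    """Fraction of total variance carried by the first component of M."""
    ev = np.linalg.eigvalsh(M)
    return ev.max() / ev.sum()

# covariance matrix: the rescaled variable dominates the first component
cov_share = first_component_share(np.cov(X_rescaled, rowvar=False))
# correlation matrix: unaffected by the change of units
corr_share = first_component_share(np.corrcoef(X_rescaled, rowvar=False))
```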
12 Requirements Requirements of PCA I The variables included must be metric level or dichotomous (dummy-coded) nominal level. The sample size must be greater than 50 (preferably 100). The ratio of cases to variables must be 5 to 1 or larger. Significance of the elements of the correlation matrix: the correlation matrix for the variables must contain 2 or more correlations of 0.30 or greater. Inverse of the correlation matrix: off-diagonal elements of the inverse of the correlation matrix should be near zero. Bartlett test (test of sphericity): the Bartlett test of sphericity must be statistically significant. H_0: variables are uncorrelated. H_1: variables are correlated.
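The Bartlett test of sphericity can be sketched directly from the correlation matrix; the statistic below is the standard chi-square approximation -((n-1)-(2p+5)/6) ln det(R) with p(p-1)/2 degrees of freedom (synthetic data; this is an illustration, not the slides' software output):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
# two strongly correlated variables plus one independent variable
X = np.column_stack([x1,
                     x1 + 0.3 * rng.normal(size=n),
                     rng.normal(size=n)])
p = X.shape[1]
R = np.corrcoef(X, rowvar=False)

# H0: the variables are uncorrelated (R is the identity matrix)
stat = -((n - 1) - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) / 2
p_value = chi2.sf(stat, df)          # small p-value: reject H0
```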
13 Requirements Requirements of PCA II Anti-image matrix: the anti-image correlation matrix contains partial correlation coefficients, and the anti-image covariance matrix contains partial covariances. Most of the off-diagonal elements should be small in a good factor model. Rule of thumb: the correlation matrix is not suitable for factor analysis if the proportion of off-diagonal elements of the anti-image covariance matrix that are unequal to zero (> 0.09) exceeds 25%. Kaiser-Meyer-Olkin criterion: the measure of sampling adequacy (MSA) shows to what extent the original variables belong together. If the MSA is less than 0.5, the correlation matrix is not suitable.
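A sketch of the KMO measure of sampling adequacy (illustrative data of my own; the anti-image partial correlations are obtained from the inverse of the correlation matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
# six observed variables driven by two latent factors plus noise
f = rng.normal(size=(200, 2))
loadings = np.array([[1, 0], [1, 0], [1, 0],
                     [0, 1], [0, 1], [0, 1]], dtype=float)
X = f @ loadings.T + 0.4 * rng.normal(size=(200, 6))

R = np.corrcoef(X, rowvar=False)
R_inv = np.linalg.inv(R)

# partial correlations (anti-image) from the inverse correlation matrix
d = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(d, d)

# KMO: squared correlations relative to squared partial correlations
off = ~np.eye(R.shape[0], dtype=bool)
kmo = (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (partial[off] ** 2).sum())
```

Values of kmo above 0.5 suggest the correlation matrix is usable; here the clear two-factor structure yields an adequate value.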
14 Interpreting the components Interpreting the components The eigenvectors provide the coefficients (c_jk) for each variable in the linear combination for each component. The further each coefficient is from zero, the greater the contribution that variable makes to that component. Component loadings are simple correlations (Pearson's r) between the components and the original variables. Ideally we would like a situation where each variable loads strongly on only one component and the loadings are close to plus/minus one or zero. For interpretation we look at loadings with absolute value greater than 0.5.
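For a correlation-matrix PCA the loading of variable j on component k equals c_jk times sqrt(λ_k); a quick check with made-up data (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.normal(size=100)
b = rng.normal(size=100)
# three correlated variables (the second is a noisy mixture of the others)
X = np.column_stack([a, 0.7 * a + 0.5 * b + 0.5 * rng.normal(size=100), b])

Z = X - X.mean(axis=0)
Zs = Z / Z.std(axis=0, ddof=1)       # z-standardize: PCA on correlations

R = np.corrcoef(Zs, rowvar=False)
eigvals, C = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, C = eigvals[order], C[:, order]
scores = Zs @ C

# loading = Pearson correlation between each variable and each component
loadings = np.array([[np.corrcoef(Zs[:, j], scores[:, k])[0, 1]
                      for k in range(3)] for j in range(3)])
```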
15 Rotation of components Rotation of components I The common situation where numerous variables load moderately on each component can sometimes be alleviated by a second rotation of the components after the initial PCA. The aim of this additional rotation is to obtain simple structure, where the coefficients within a component are as close to one or zero as possible. There are two types of factor rotation methods: Orthogonal rotations. For example, the varimax procedure rotates the axes such that they remain at 90 degrees (perpendicular) to each other. Assumes uncorrelated factors.
16 Rotation of components Rotation of components II Oblique rotation (e.g., direct oblimin) rotates the axes such that they can have any angle (i.e., other than 90 degrees). Allows factors to be correlated. One can specify the parameter delta to control the extent to which factors can be correlated among themselves. Delta should be 0 or negative, with 0 yielding the most highly correlated factors and large negative numbers yielding nearly orthogonal solutions (rule of thumb: -5 is almost orthogonal).
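A minimal varimax implementation (the widely used SVD-based iteration; this sketch is mine, not the slides') illustrates how an orthogonal rotation pushes loadings toward one or zero while preserving each variable's communality:

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonally rotate loading matrix L toward simple structure."""
    p, k = L.shape
    Rot = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ Rot
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        Rot = u @ vt                 # orthogonal by construction
        d_new = s.sum()
        if d_new < d * (1 + tol):    # stop when the criterion stalls
            break
        d = d_new
    return L @ Rot

# two components that load moderately on all four variables
L = np.array([[0.7, 0.5], [0.7, 0.5], [0.5, -0.7], [0.5, -0.7]])
L_rot = varimax(L)
```

Because the rotation matrix is orthogonal, the row sums of squared loadings (the communalities) are unchanged; only the distribution of loading across components changes.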
17 How many components to retain? How many components to retain? Interpretability: it is important to examine the interpretability of the components and make sure that those providing a biologically interpretable result are retained. Eigenvalues-greater-than-one rule: if the PCA is based on a correlation matrix, keep any component that has an eigenvalue greater than one. Scree diagram: plot the eigenvalues for each component and look for an obvious break (or elbow). Test of eigenvalue equality: Bartlett's test when using a covariance matrix. Analysis of residuals.
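The eigenvalues-greater-than-one rule is straightforward to apply (toy data of my own with one real factor plus two noise variables):

```python
import numpy as np

rng = np.random.default_rng(6)
f = rng.normal(size=200)
# three variables sharing one underlying factor, two pure-noise variables
X = np.column_stack([f + 0.5 * rng.normal(size=200) for _ in range(3)]
                    + [rng.normal(size=200) for _ in range(2)])

R = np.corrcoef(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser's rule: keep components whose eigenvalue exceeds 1
n_keep = int((eigvals > 1).sum())
explained = eigvals / eigvals.sum()  # proportions, e.g. for a scree diagram
```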
18 How many components to retain? Analysis of residuals (Q-values) When we retain fewer than all p components, we can only estimate the original data, and some of the information in the original data will not be explained by the components: this is the residual. Alternatively, we can measure the difference between the observed correlations or covariances and the correlations or covariances predicted from the retained components; this is termed the residual correlation or covariance matrix. We have a residual term for each variable for each object, and the sum (across variables) of the squared residuals, often termed Q, can be derived for each object. Unusually large values of Q for any observation indicate that the retained components do not adequately represent the original data set for that object.
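A sketch of the Q statistic on synthetic data: reconstruct each object from the first r components and sum the squared residuals across variables.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(40, 5)) @ rng.normal(size=(5, 5))  # correlated variables
Xc = X - X.mean(axis=0)

eigvals, C = np.linalg.eigh(np.cov(Xc, rowvar=False))
C = C[:, np.argsort(eigvals)[::-1]]  # components in decreasing order

r = 2                                # number of retained components
C_r = C[:, :r]
X_hat = Xc @ C_r @ C_r.T             # reconstruction from r components
residual = Xc - X_hat
Q = (residual ** 2).sum(axis=1)      # one Q value per object
```

Objects with unusually large Q are poorly represented by the retained components; with all p components the reconstruction is exact and Q vanishes.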
19 Application Application Lovett et al. (2000) studied the chemistry of forested watersheds in the Catskill Mountains in New York State. They chose 29 sites (observations) on first- and second-order streams and measured the concentrations of ten chemical variables (NO3, total organic N, total N, NH4, dissolved organic C, SO4, Cl, Ca2+, Mg2+, H+), averaged over three years, and four watershed variables (maximum elevation, sample elevation, length of stream, watershed area). Only the chemical variables are used for the PCA. Preliminary checks of the data showed that one stream, Winnisook Brook, was severely acidified, with a concentration of H+ far in excess of the other streams, so this site was omitted from further analysis. Additionally, three variables (dissolved organic C, Cl, and H+) were very strongly skewed and were log10-transformed.
20 Application Descriptive Statistics [table: mean, standard deviation, and analysis N for the ten chemical variables NO3, TON, TN, NH4, SO4, CA, MG, LDOC, LCL, LH; values not preserved in the transcription]
21 Application Total Variance Explained [table: initial eigenvalues per component with total, % of variance, and cumulative %; values not preserved] Extraction Method: Principal Component Analysis.
22 Application Scree Plot [plot of eigenvalue against component number]
23 Application Component Matrix (a) [table: loadings of NO3, TON, TN, NH4, SO4, CA, MG, LDOC, LCL, LH on the components; values not preserved] Extraction Method: Principal Component Analysis. (a) 3 components extracted.
24 Application Rotated Component Matrix (a) [table: rotated loadings of NO3, TON, TN, NH4, SO4, CA, MG, LDOC, LCL, LH; values not preserved] Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. (a) Rotation converged in 5 iterations.
25 Application Rotation Sums of Squared Loadings [table: total, % of variance, and cumulative % per component; values not preserved] Component Plot in Rotated Space [plot of the variables NO3, TN, LH, LDOC, CA, SO4, NH4, LCL, MG, TON in the space of components 1, 2, and 3]
26 Factor Analysis Theoretical Concept Communalities I Communality is the total amount of variance an original variable shares with all other variables included in the analysis. Two important concepts of communalities: 1. Principal components analysis assumes that the total variance of the original variables can be explained via the components and uses 1 as the starting value for the communalities. PCA does not explicitly estimate communalities, as in the underlying theoretical model the communalities are set equal to the total variance.
27 Factor Analysis Theoretical Concept Communalities II 2. However, if we assume that besides a common variance each variable also has a specific variance, the communalities have to be less than 1. Additionally, different factor extraction methods can be applied. As a starting estimate for the communality of variable j, in many cases the highest squared correlation coefficient with another variable is employed: 1 >= h_j^2 >= R_j^2 >= max_k {r_jk^2}. 3. Using the theoretical concept of specific variances of each original variable and employing an appropriate factor extraction method, we are conducting a Factor Analysis.
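The starting estimate described above is easy to compute from the correlation matrix (illustrative data; names are my own):

```python
import numpy as np

rng = np.random.default_rng(8)
# six observed variables driven by two latent factors plus noise
f = rng.normal(size=(150, 2))
X = f @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(150, 6))
R = np.corrcoef(X, rowvar=False)

# starting communality of variable j: its highest squared correlation
R2 = R ** 2
np.fill_diagonal(R2, 0.0)            # ignore r_jj = 1
h2_start = R2.max(axis=1)            # max_k r_jk^2, a value below 1
```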
28 Factor Analysis Extraction Methods Extraction Methods in Factor Analysis I Principal components refers to the principal components model, in which items are assumed to be exact linear combinations of factors. The principal components method assumes that components (factors) are uncorrelated. It also assumes that the communality of each item sums to 1 over all components (factors), implying that each item has zero unique variance. The remaining factor extraction methods allow the variance of each item to be a function of both the item communality and a nonzero unique item variance. The following are methods of common factor analysis:
29 Factor Analysis Extraction Methods Extraction Methods in Factor Analysis II Principal axis factoring uses squared multiple correlations as initial estimates of the communalities. These communalities are entered into the diagonal of the correlation matrix before factors are extracted from this matrix. ...
30 Factor Analysis Factor Scores Factor Scores I A factor score coefficient matrix shows the coefficients by which items are multiplied to obtain factor scores. 1. Regression Scores: regression factor scores have a mean of 0 and a variance equal to the squared multiple correlation between the estimated factor scores and the true factor values. They can be correlated even when factors are assumed to be orthogonal. The sum of squared discrepancies between true and estimated factors over individuals is minimized. 2. Bartlett Scores: Bartlett factor scores have a mean of 0. The sum of squares of the unique factors over the range of items is minimized.
31 Factor Analysis Factor Scores Factor Scores II 3. Anderson-Rubin Scores: Anderson-Rubin factor scores are a modification of Bartlett scores that ensures orthogonality of the estimated factors. They have a mean of 0 and a standard deviation of 1.
32 Literature Literature Lovett, G.M., Weathers, K.C., and Sobczak, W. V. (2000). Nitrogen saturation and retention in forested watersheds of the Catskill Mountains, New York. Ecological Applications 10, pp
CHAPTER 6 USEFULNESS AND BENEFITS OF EBANKING / INTERNET BANKING SERVICES 6.1 Introduction In the previous two chapters, bank customers adoption of ebanking / internet banking services and functional
More informationFactor Analysis Using SPSS
Psychology 305 p. 1 Factor Analysis Using SPSS Overview For this computer assignment, you will conduct a series of principal factor analyses to examine the factor structure of a new instrument developed
More informationMarkov Chains, part I
Markov Chains, part I December 8, 2010 1 Introduction A Markov Chain is a sequence of random variables X 0, X 1,, where each X i S, such that P(X i+1 = s i+1 X i = s i, X i 1 = s i 1,, X 0 = s 0 ) = P(X
More informationMultivariate Statistical Inference and Applications
Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A WileyInterscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationPrincipal Component Analysis Application to images
Principal Component Analysis Application to images Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception http://cmp.felk.cvut.cz/
More informationApplied Multivariate Analysis
Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models
More informationExploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models
Exploratory Factor Analysis: rotation Psychology 588: Covariance structure and factor models Rotational indeterminacy Given an initial (orthogonal) solution (i.e., Φ = I), there exist infinite pairs of
More informationExtending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances?
1 Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances? André Beauducel 1 & Norbert Hilger University of Bonn,
More informationUse of Factor Scores in Multiple Regression Analysis for Estimation of Body Weight by Several Body Measurements in Brown Trouts (Salmo trutta fario)
INTERNATİONAL JOURNAL OF AGRİCULTURE & BİOLOGY ISSN Print: 1560 8530; ISSN Online: 1814 9596 10 164/DJZ/2010/12 4 611 615 http://www.fspublishers.org Full Length Article Use of Factor Scores in Multiple
More informationMultivariate Analysis (Slides 4)
Multivariate Analysis (Slides 4) In today s class we examine examples of principal components analysis. We shall consider a difficulty in applying the analysis and consider a method for resolving this.
More informationExploratory Factor Analysis
Exploratory Factor Analysis ( 探 索 的 因 子 分 析 ) Yasuyo Sawaki Waseda University JLTA2011 Workshop Momoyama Gakuin University October 28, 2011 1 Today s schedule Part 1: EFA basics Introduction to factor
More informationSPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011
SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 Statistical techniques to be covered Explore relationships among variables Correlation Regression/Multiple regression Logistic regression Factor analysis
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x  x) B. x 3 x C. 3x  x D. x  3x 2) Write the following as an algebraic expression
More informationEXPLORATORY FACTOR ANALYSIS IN MPLUS, R AND SPSS. sigbert@wiwi.huberlin.de
EXPLORATORY FACTOR ANALYSIS IN MPLUS, R AND SPSS Sigbert Klinke 1,2 Andrija Mihoci 1,3 and Wolfgang Härdle 1,3 1 School of Business and Economics, HumboldtUniversität zu Berlin, Germany 2 Department of
More informationTopic 10: Factor Analysis
Topic 10: Factor Analysis Introduction Factor analysis is a statistical method used to describe variability among observed variables in terms of a potentially lower number of unobserved variables called
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationPARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA
PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance
More informationPRINCIPAL COMPONENT ANALYSIS
1 Chapter 1 PRINCIPAL COMPONENT ANALYSIS Introduction: The Basics of Principal Component Analysis........................... 2 A Variable Reduction Procedure.......................................... 2
More informationThe ith principal component (PC) is the line that follows the eigenvector associated with the ith largest eigenvalue.
More Principal Components Summary Principal Components (PCs) are associated with the eigenvectors of either the covariance or correlation matrix of the data. The ith principal component (PC) is the line
More informationPATTERNS OF ENVIRONMENTAL MANAGEMENT IN THE CHILEAN MANUFACTURING INDUSTRY: AN EMPIRICAL APPROACH
PATTERS OF EVIROMETAL MAAGEMET I THE CHILEA MAUFACTURIG IDUSTRY: A EMPIRICAL APPROACH Dr. Maria Teresa RuizTagle Research Associate, University of Cambridge, UK Research Associate, Universidad de Chile,
More informationLecture 7: Singular Value Decomposition
0368324801Algorithms in Data Mining Fall 2013 Lecturer: Edo Liberty Lecture 7: Singular Value Decomposition Warning: This note may contain typos and other inaccuracies which are usually discussed during
More informationData analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
More informationThe Power Method for Eigenvalues and Eigenvectors
Numerical Analysis Massoud Malek The Power Method for Eigenvalues and Eigenvectors The spectrum of a square matrix A, denoted by σ(a) is the set of all eigenvalues of A. The spectral radius of A, denoted
More informationStatistics for Business Decision Making
Statistics for Business Decision Making Faculty of Economics University of Siena 1 / 62 You should be able to: ˆ Summarize and uncover any patterns in a set of multivariate data using the (FM) ˆ Apply
More informationResearch Methodology: Tools
MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 02: Item Analysis / Scale Analysis / Factor Analysis February 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi
More informationSummary of week 8 (Lectures 22, 23 and 24)
WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry
More informationPartial Least Squares (PLS) Regression.
Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component
More informationby the matrix A results in a vector which is a reflection of the given
Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the yaxis We observe that
More information