Lecture 7: Analysis of Factors and Canonical Correlations

Måns Thulin
Department of Mathematics, Uppsala University
Multivariate Methods, 6/5 2011

Outline

- Factor analysis
  - Basic idea: factor the covariance matrix
  - Methods for factoring
- Canonical correlation analysis
  - Basic idea: correlations between sets
  - Finding the canonical correlations

Factor analysis

Assume that we have a data set with many variables and that it is reasonable to believe that all these, to some extent, depend on a few underlying but unobservable factors. The purpose of factor analysis is to find dependencies on such factors and to use this to reduce the dimensionality of the data set. In particular, the covariance matrix is described by the factors.

Factor analysis: an early example

C. Spearman (1904), General Intelligence, Objectively Determined and Measured, The American Journal of Psychology.

Children's performance in mathematics ($X_1$), French ($X_2$) and English ($X_3$) was measured and the correlation matrix $R$ was computed (its entries were lost in this transcription).

Assume the following model:
$$X_1 = \lambda_1 f + \epsilon_1, \quad X_2 = \lambda_2 f + \epsilon_2, \quad X_3 = \lambda_3 f + \epsilon_3,$$
where $f$ is an underlying common factor ("general ability"), $\lambda_1, \lambda_2, \lambda_3$ are factor loadings and $\epsilon_1, \epsilon_2, \epsilon_3$ are random disturbance terms.

Factor analysis: an early example

Model: $X_i = \lambda_i f + \epsilon_i$, $i = 1, 2, 3$, with the unobservable factor $f$ = "general ability".

The variation of $\epsilon_i$ consists of two parts:
- a part that represents the extent to which an individual's mathematics ability, say, differs from her general ability;
- a measurement error due to the experimental setup, since an examination is only an approximate measure of ability in the subject.

The relative sizes of $\lambda_i f$ and $\epsilon_i$ tell us to what extent variation between individuals can be described by the factor.
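
This one-factor model is easy to explore by simulation. A minimal R sketch follows; the loadings and specific variances are made-up illustrations (chosen so that each variable has unit variance), not Spearman's estimates:

## Simulate Spearman's one-factor model X_i = lambda_i * f + eps_i.
## The loadings and specific variances below are illustrative, chosen so
## that Var(X_i) = lambda_i^2 + psi_i = 1.
set.seed(1)
n      <- 1000
lambda <- c(0.9, 0.8, 0.7)     # factor loadings for X1, X2, X3
psi    <- c(0.19, 0.36, 0.51)  # specific variances

f   <- rnorm(n)                                         # common factor f
eps <- sapply(psi, function(p) rnorm(n, sd = sqrt(p)))  # disturbances eps_i
X   <- outer(f, lambda) + eps                           # n x 3 data matrix

round(cor(X), 2)             # sample correlations
round(lambda %o% lambda, 2)  # model-implied correlations lambda_i * lambda_j

Off the diagonal, the sample correlations should be close to $\lambda_i \lambda_j$, which is exactly the structure a single common factor imposes on $R$.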

Factor analysis: linear model

In factor analysis, a linear model is assumed:
$$X = \mu + LF + \epsilon,$$
where
- $X$ ($p \times 1$) = observable random vector;
- $L$ ($p \times m$) = matrix of factor loadings;
- $F$ ($m \times 1$) = common factors: unobserved values of factors which describe major features of members of the population;
- $\epsilon$ ($p \times 1$) = errors/specific factors: measurement error and variation not accounted for by the common factors.

Here $\mu_i$ is the mean of variable $i$, $\epsilon_i$ is the $i$th specific factor, $F_j$ is the $j$th common factor, and $l_{ij}$ is the loading of the $i$th variable on the $j$th factor.

The above model differs from ordinary linear models in that the independent variables $LF$ are unobservable and random.

Factor analysis: linear model

We assume that the unobservable random vectors $F$ and $\epsilon$ satisfy the following conditions:
- $F$ and $\epsilon$ are independent;
- $E(\epsilon) = 0$ and $\mathrm{Cov}(\epsilon) = \Psi$, where $\Psi$ is a diagonal matrix;
- $E(F) = 0$ and $\mathrm{Cov}(F) = I$.

Thus, the factors are assumed to be uncorrelated. This is called the orthogonal factor model.

Factor analysis: factoring $\Sigma$

Given the model $X = \mu + LF + \epsilon$, what is $\Sigma$? See blackboard!

We find that $\mathrm{Cov}(X) = LL' + \Psi$, where
$$V(X_i) = l_{i1}^2 + \dots + l_{im}^2 + \psi_i \quad \text{and} \quad \mathrm{Cov}(X_i, X_k) = l_{i1} l_{k1} + \dots + l_{im} l_{km}.$$
Furthermore, $\mathrm{Cov}(X, F) = L$, so that $\mathrm{Cov}(X_i, F_j) = l_{ij}$.

If $m$, the number of factors, is much smaller than $p$, the number of measured attributes, then the covariance of $X$ can be described by the $pm$ elements of $L$ and the $p$ nonzero elements of $\Psi$, rather than the $(p^2 + p)/2$ elements of $\Sigma$.

Factor analysis: factoring $\Sigma$

The marginal variance $\sigma_{ii}$ can also be partitioned into two parts:
- The $i$th communality: the proportion of the variance of the $i$th measurement $X_i$ contributed by the factors $F_1, F_2, \dots, F_m$.
- The uniqueness or specific variance: the remaining proportion of the variance of the $i$th measurement, associated with $\epsilon_i$.

From $\mathrm{Cov}(X) = LL' + \Psi$ we have that
$$\sigma_{ii} = \underbrace{l_{i1}^2 + l_{i2}^2 + \dots + l_{im}^2}_{\text{communality } h_i^2} + \underbrace{\psi_i}_{\text{specific variance}},$$
so $\sigma_{ii} = h_i^2 + \psi_i$, $i = 1, 2, \dots, p$.
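
A small numerical check of this decomposition in R; the loading matrix and specific variances below ($p = 4$, $m = 2$) are illustrative choices, not estimates from data:

## Verify Cov(X) = LL' + Psi and the partition sigma_ii = h_i^2 + psi_i.
## L and Psi are illustrative, not estimated from any data set.
L   <- matrix(c(0.8,  0.7, 0.3, 0.2,
                0.2, -0.3, 0.8, 0.7), nrow = 4)
Psi <- diag(c(0.32, 0.42, 0.27, 0.47))

Sigma <- tcrossprod(L) + Psi  # LL' + Psi
h2    <- rowSums(L^2)         # communalities h_i^2
cbind(variance = diag(Sigma), communality = h2, specific = diag(Psi))
## Each row satisfies sigma_ii = h_i^2 + psi_i (here all variances equal 1).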

Orthogonal transformations

Consider an orthogonal matrix $T$. See blackboard!

We find that the factor loadings $L$ are determined only up to an orthogonal matrix $T$. With $L^* = LT$ and $F^* = T'F$, the pairs $(L^*, F^*)$ and $(L, F)$ give equally valid decompositions.

The communalities, given by the diagonal elements of $LL' = (L^*)(L^*)'$, are also unaffected by the choice of $T$.
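
This indeterminacy is easy to verify numerically: rotating the illustrative $L$ from the previous sketch by any orthogonal $T$ leaves both $LL'$ and the communalities unchanged.

## Rotate the loadings by an orthogonal matrix T and compare.
L     <- matrix(c(0.8,  0.7, 0.3, 0.2,
                  0.2, -0.3, 0.8, 0.7), nrow = 4)  # illustrative loadings
theta <- pi / 6
Tmat  <- matrix(c(cos(theta), sin(theta),
                 -sin(theta), cos(theta)), 2, 2)   # 2x2 rotation, T'T = I
Lstar <- L %*% Tmat                                # L* = LT
max(abs(tcrossprod(L) - tcrossprod(Lstar)))        # ~ 0: LL' = L*(L*)'
max(abs(rowSums(L^2) - rowSums(Lstar^2)))          # ~ 0: communalities agree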

Factor analysis: estimation

The factor model is determined only up to an orthogonal transformation of the factors. How then can $L$ and $\Psi$ be estimated? Is some $L^* = LT$ better than others?

Idea: find some $L$ and $\Psi$, and then consider various rotations $L^* = LT$.

Methods for estimation:
- Principal component method
- Principal factor method
- Maximum likelihood method

Factor analysis: principal component method

Let $(\lambda_i, e_i)$ be the eigenvalue-eigenvector pairs of $\Sigma$, with $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p > 0$. From the spectral theorem (and the principal components lecture) we know that
$$\Sigma = \lambda_1 e_1 e_1' + \lambda_2 e_2 e_2' + \dots + \lambda_p e_p e_p'.$$
Let $L = (\sqrt{\lambda_1}\,e_1, \sqrt{\lambda_2}\,e_2, \dots, \sqrt{\lambda_p}\,e_p)$. Then
$$\Sigma = \lambda_1 e_1 e_1' + \dots + \lambda_p e_p e_p' = LL' = LL' + 0.$$
Thus $L$ is given by $\sqrt{\lambda_i}$ times the coefficients of the principal components, and $\Psi = 0$.

Factor analysis: principal component method

Now, if $\lambda_{m+1}, \lambda_{m+2}, \dots, \lambda_p$ are small, then the first $m$ principal components explain most of $\Sigma$. Thus, with $L_m = (\sqrt{\lambda_1}\,e_1, \dots, \sqrt{\lambda_m}\,e_m)$,
$$\Sigma \approx L_m L_m'.$$
With specific factors, this becomes
$$\Sigma \approx L_m L_m' + \Psi,$$
where $\psi_i = \sigma_{ii} - \sum_{j=1}^m l_{ij}^2$.

As estimators for the factor loadings and specific variances, we take
$$\tilde{L} = \tilde{L}_m = \left(\sqrt{\hat{\lambda}_1}\,\hat{e}_1, \dots, \sqrt{\hat{\lambda}_m}\,\hat{e}_m\right),$$
where $(\hat{\lambda}_i, \hat{e}_i)$ are the eigenvalue-eigenvector pairs of the sample covariance matrix $S$, and
$$\tilde{\psi}_i = s_{ii} - \sum_{j=1}^m \tilde{l}_{ij}^2.$$
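
The principal component solution is a few lines in R. A minimal sketch; the function name pc_factor and the simulated data are ours, for illustration:

## Principal component method: factor S (or R) via its eigendecomposition.
pc_factor <- function(S, m) {
  e <- eigen(S, symmetric = TRUE)
  L <- e$vectors[, 1:m, drop = FALSE] %*% diag(sqrt(e$values[1:m]), m)
  psi <- diag(S) - rowSums(L^2)  # psi_i = s_ii - sum_j l_ij^2
  list(loadings = L, psi = psi,
       residual = S - (tcrossprod(L) + diag(psi)))  # how close is S to LL' + Psi?
}

X   <- matrix(rnorm(200 * 5), 200, 5)  # arbitrary data for illustration
fit <- pc_factor(cor(X), m = 2)
fit$loadings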

Factor analysis: principal component method

In many cases, the correlation matrix $R$ (which is also the covariance matrix of the standardized data) is used instead of $S$, to avoid problems related to measurements being on different scales.

Example: consumer-preference data, p. 491 in J&W. See R code! Is this a good example? Can factor analysis be applied to ordinal data?

Example: stock price data, p. 493 in J&W. See R code!

Factor analysis: other estimation methods

Principal factor method: a modification of the principal component approach that uses initial estimates of the specific variances.

Maximum likelihood method: assuming normal data, the maximum likelihood estimators of $L$ and $\Psi$ are derived. In general the estimators must be calculated by numerical maximization of a function of matrices.
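
In R, the maximum likelihood method is implemented in the built-in factanal(), which also reports a likelihood ratio test of whether $m$ factors suffice. A sketch on arbitrary simulated data:

## Maximum likelihood factor analysis with stats::factanal().
## factanal() works with the correlation matrix internally, so the scale
## problems mentioned above do not arise.
X   <- matrix(rnorm(200 * 6), 200, 6)  # placeholder data for illustration
fit <- factanal(X, factors = 2, rotation = "none")
fit$loadings      # estimated L
fit$uniquenesses  # estimated diagonal of Psi
fit$PVAL          # test of the hypothesis that 2 factors are sufficient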

Factor analysis: factor rotation

Given an orthogonal matrix $T$, a different but equally valid estimator of the factor loadings is $L^* = LT$. We would like to find an $L^*$ that gives a nice and simple interpretation of the corresponding factors.

Ideally: a pattern of loadings such that each variable loads highly on a single factor and has small to moderate loadings on the remaining factors.

The varimax criterion: define $\tilde{l}^*_{ij} = \hat{l}^*_{ij}/\hat{h}_i$. This scaling gives variables with smaller communalities more influence. Select the orthogonal transformation $T$ that makes
$$V = \frac{1}{p} \sum_{j=1}^{m} \left[ \sum_{i=1}^{p} \tilde{l}^{*4}_{ij} - \frac{1}{p} \left( \sum_{i=1}^{p} \tilde{l}^{*2}_{ij} \right)^{2} \right]$$
as large as possible.
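
R's stats::varimax() maximizes precisely this criterion; normalize = TRUE applies the Kaiser scaling $\tilde{l}^*_{ij} = \hat{l}^*_{ij}/\hat{h}_i$ described above. A sketch, rotating a maximum likelihood solution on placeholder data:

## Varimax rotation of an estimated loading matrix.
X   <- matrix(rnorm(200 * 6), 200, 6)  # placeholder data, as before
L   <- loadings(factanal(X, factors = 2, rotation = "none"))
rot <- varimax(L, normalize = TRUE)    # Kaiser-normalized varimax
rot$loadings  # rotated loadings L* = LT
rot$rotmat    # the orthogonal matrix T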

Factor analysis: factor rotation

Example: consumer-preference data, p. 508 in J&W. See R code!

Example: stock price data, p. 510 in J&W. See R code!

Factor analysis: limitations of the orthogonal factor model

- The factor model is determined only up to an orthogonal transformation of the factors.
- Linearity: the linear covariance approximation $LL' + \Psi$ may not be appropriate.
- The factor model is most useful when $m$ is small, but in many cases $mp + p$ parameters are not adequate and $\Sigma$ is not close to $LL' + \Psi$.

Factor analysis: further topics

Some further topics of interest are:
- Tests for the number of common factors.
- Factor scores.
- The oblique factor model, in which $\mathrm{Cov}(F)$ is not diagonal; the factors are correlated.

Canonical correlations

Canonical correlation analysis (CCA) is a means of assessing the relationship between two sets of variables. The idea is to study the correlation between a linear combination of the variables in one set and a linear combination of the variables in another set.

Canonical correlations: the exams problem

In a classical CCA problem, exams in five topics are considered. Exams are closed-book (C) or open-book (O): Mechanics (C), Vectors (C), Algebra (O), Analysis (O), Statistics (O).

Question: how highly correlated is a student's performance in closed-book exams with her performance in open-book exams?

Canonical correlations: basic idea

Given a data set $X$, partition the collection of variables into two sets:
$$X^{(1)} \; (p \times 1) \quad \text{and} \quad X^{(2)} \; (q \times 1), \qquad \text{where we assume } p \leq q.$$
For these random vectors:
$$E(X^{(1)}) = \mu^{(1)}, \quad \mathrm{Cov}(X^{(1)}) = \Sigma_{11},$$
$$E(X^{(2)}) = \mu^{(2)}, \quad \mathrm{Cov}(X^{(2)}) = \Sigma_{22},$$
and $\mathrm{Cov}(X^{(1)}, X^{(2)}) = \Sigma_{12} = \Sigma_{21}'$.

Canonical correlations: basic idea

Introduce the linear combinations
$$U = a'X^{(1)}, \qquad V = b'X^{(2)}.$$
For these, $\mathrm{Var}(U) = a'\Sigma_{11}a$, $\mathrm{Var}(V) = b'\Sigma_{22}b$ and $\mathrm{Cov}(U, V) = a'\Sigma_{12}b$.

Goal: seek $a$ and $b$ such that
$$\mathrm{Corr}(U, V) = \frac{a'\Sigma_{12}b}{\sqrt{a'\Sigma_{11}a}\,\sqrt{b'\Sigma_{22}b}}$$
is as large as possible.

Canonical correlations: some definitions

The first pair of canonical variables is the pair of linear combinations $U_1$ and $V_1$ having unit variances which maximize the correlation
$$\mathrm{Corr}(U, V) = \frac{a'\Sigma_{12}b}{\sqrt{a'\Sigma_{11}a}\,\sqrt{b'\Sigma_{22}b}}.$$
The second pair of canonical variables is the pair of linear combinations $U_2$ and $V_2$ having unit variances which maximize the correlation among all choices that are uncorrelated with the first pair of canonical variables.

The $k$th pair of canonical variables is the pair of linear combinations $U_k$ and $V_k$ having unit variances which maximize the correlation among all choices that are uncorrelated with the previous $k-1$ canonical variable pairs.

The correlation between the $k$th pair of canonical variables is called the $k$th canonical correlation.

Canonical correlations: solution

Result: the $k$th pair of canonical variables is given by
$$U_k = e_k'\Sigma_{11}^{-1/2}X^{(1)} \quad \text{and} \quad V_k = f_k'\Sigma_{22}^{-1/2}X^{(2)}.$$
We have that
$$\mathrm{Var}(U_k) = \mathrm{Var}(V_k) = 1 \quad \text{and} \quad \mathrm{Cov}(U_k, V_k) = \rho_k,$$
where $\rho_1^2 \geq \rho_2^2 \geq \dots \geq \rho_p^2$, and $(\rho_k^2, e_k)$ and $(\rho_k^2, f_k)$ are the eigenvalue-eigenvector pairs of
$$\Sigma_{11}^{-1/2}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1/2} \quad \text{and} \quad \Sigma_{22}^{-1/2}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1/2},$$
respectively.

Canonical correlations: solution

Alternatively, the canonical variable coefficient vectors $a$ and $b$ and the corresponding correlations can be found by solving the eigenvalue equations
$$\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\,a = \rho^2 a,$$
$$\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\,b = \rho^2 b.$$
This is often more practical when computing the coefficients and the correlations.
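
These eigenvalue equations translate directly into R; with raw data, base R's cancor() does the whole computation. A sketch (the helper name cca_from_cov is ours, for illustration):

## Canonical correlations from the partitioned covariance/correlation blocks.
cca_from_cov <- function(S11, S12, S22) {
  M  <- solve(S11) %*% S12 %*% solve(S22) %*% t(S12)  # S11^-1 S12 S22^-1 S21
  ev <- eigen(M)                   # eigenvalues are the squared correlations
  list(rho = sqrt(Re(ev$values)),  # canonical correlations (check signs!)
       a   = Re(ev$vectors))       # coefficient vectors a_k in the columns
}

X1 <- matrix(rnorm(100 * 2), 100, 2)  # arbitrary first set, p = 2
X2 <- matrix(rnorm(100 * 3), 100, 3)  # arbitrary second set, q = 3
cancor(X1, X2)$cor                    # sample canonical correlations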

Canonical correlations: head lengths of sons

Example: consider the first two sons born in each of $n$ families. For these, consider the following measurements:
- $X_1$: head length of first son
- $X_2$: head breadth of first son
- $X_3$: head length of second son
- $X_4$: head breadth of second son

The sample mean vector $\bar{x}$, the sample covariance matrix $S$ and the correlation blocks $R_{11}$, $R_{12} = R_{21}'$ and $R_{22}$ are computed from the observations (the numerical values were lost in this transcription; see the R code).

Canonical correlations: head lengths of sons

The eigenvalues of $R_{11}^{-1}R_{12}R_{22}^{-1}R_{21}$ are the squared canonical correlations, so the canonical correlations $\hat\rho_1$ and $\hat\rho_2$ are their square roots (or are they? Remember to check signs!). From the eigenvectors we obtain the canonical correlation vectors $a_1, a_2$ and $b_1, b_2$ (the numerical values were lost in this transcription; see the R code).

Note: the first canonical correlation $\hat{\rho}_1$ exceeds any of the individual correlations between a variable of the first set and a variable of the second set.

Canonical correlations: head lengths of sons

The first pair of canonical correlation variables is
$$U_1 = 0.727\,x_1^{(1)} + \cdots\,x_2^{(1)}, \qquad V_1 = 0.684\,x_1^{(2)} + \cdots\,x_2^{(2)}$$
(the coefficients of $x_2^{(1)}$ and $x_2^{(2)}$, their signs, and the value of $\mathrm{Corr}(U_1, V_1)$ were lost in this transcription).

Interpretation: the sum of length and breadth, i.e. the head size, of each brother. These variables are highly negatively correlated between brothers.

Canonical correlations: head lengths of sons

The second pair of canonical correlation variables is
$$U_2 = 0.704\,x_1^{(1)} + \cdots\,x_2^{(1)}, \qquad V_2 = 0.709\,x_1^{(2)} + \cdots\,x_2^{(2)}$$
(again, the remaining coefficients and the value of $\mathrm{Corr}(U_2, V_2)$ were lost in this transcription).

Interpretation: these seem to measure the difference between length and breadth. The head shapes of the first and second brothers appear to have little correlation.

Canonical correlations: poor summary of variability

Example: consider a covariance matrix with $p = q = 2$,
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$
(the numerical entries were lost in this transcription), in which most of the variance of the first set lies in $X_1^{(1)}$ and most of the variance of the second set lies in $X_2^{(2)}$. It can be shown that the first pair of canonical variates is
$$U_1 = X_2^{(1)}, \qquad V_1 = X_1^{(2)},$$
with correlation $\rho_1 = \mathrm{Corr}(U_1, V_1)$.

However, $U_1 = X_2^{(1)}$ provides a very poor summary of the variability in the first set: most of the variability in the first set is in $X_1^{(1)}$, which is uncorrelated with $U_1$. (The same situation holds for $V_1 = X_1^{(2)}$ in the second set.)
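
Since the entries of $\Sigma$ were lost in this transcription, the sketch below uses assumed values with the structure the slide describes (large variances on $X_1^{(1)}$ and $X_2^{(2)}$, a single cross-covariance between $X_2^{(1)}$ and $X_1^{(2)}$); they reproduce the phenomenon but are not necessarily the original numbers.

## Illustrative covariance matrix; the entries are assumptions, not the
## values from the original slide.
Sigma <- matrix(c(100, 0,    0,    0,
                  0,   1,    0.95, 0,
                  0,   0.95, 1,    0,
                  0,   0,    0,    100), 4, 4)
S11 <- Sigma[1:2, 1:2]; S12 <- Sigma[1:2, 3:4]; S22 <- Sigma[3:4, 3:4]

M  <- solve(S11) %*% S12 %*% solve(S22) %*% t(S12)
ev <- eigen(M)
sqrt(ev$values[1])  # first canonical correlation (0.95 for these values)
ev$vectors[, 1]     # a1 = (0, 1): U1 = X2^(1), which carries almost none
                    # of the first set's variance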

Canonical correlations: further topics

Some further topics of interest are:
- Tests of $\Sigma_{12} = 0$.
- Sample descriptive measures.
- The connection to MANOVA.

Summary

Factor analysis
- Basic idea: factor the covariance matrix
- Methods for factoring: principal components, maximum likelihood

Canonical correlation analysis
- Basic idea: correlations between sets
- Finding the canonical correlations: eigenvalues and eigenvectors
