# Multivariate Analysis (Slides 13)

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Multivariate Analysis (Slides 13) The final topic we consider is Factor Analysis. A Factor Analysis is a mathematical approach for attempting to explain the correlation between a large set of variables in terms of a small number of underlying factors. The main assumption of factor analysis is that it is not possible to observe the underlying factors directly, i.e., they are latent. 1

2 Sound familiar? Factor analysis is a dimension reduction technique and shares the same objective as principal components analysis. Both attempt to describe the relationships in a large set of variables through a smaller number of dimensions. However, the factor analysis model is much more elaborate. 2

3 Example The performance of a group children in Irish (x 1 ), English (x 2 ), and Maths (x 3 ) was recorded. The correlation matrix for their scores was: The dimensionality of this matrix can be reduced from m = 3 to m = 1 by expressing the three variables as: x 1 = λ 1 f +ǫ 1 x 2 = λ 2 f +ǫ 2 x 3 = λ 3 f +ǫ 3 3

4 Example cont d The f in these equations is an underlying common factor, the λ i s are known as factor loadings, whilst the ǫ i s are known as errors or specific factors. The common factor can often be given an interpretation, e.g., general ability. The specific factors ǫ i will have small variance if x i is closely related to general ability. 4

5 The factor model The observable random vector X = (X 1,X 2,...,X m ) has mean µ and covariance matrix Σ. The factor model states that X is linearly dependent upon a few unobservable random variables f 1,f 2,...,f p called common factors and m additional sources of variation ǫ 1,ǫ 2,...,ǫ m called specific factors. Hence: X 1 µ 1 = λ 11 f 1 +λ 12 f 2 + +λ 1p f p +ǫ 1.. X m µ m = λ m1 f 1 +λ m2 f 2 + +λ mp f p +ǫ m X µ = Λf +ǫ. 5

6 The factor model The λ ij value is called the factor loading of the i-th variable on the j-th factor. Λ is the matrix of factor loadings. Note that the i-th specific factor ǫ i is associated only with the response X i. Note that f 1,f 2,...,f p,ǫ 1,ǫ 2,...,ǫ m are all unobservable random variables. 6

7 Assumptions Under the factor model it is assumed that: E[f] = 0 Cov[f] = I = ψ ψ E[ǫ] = 0 Cov[ǫ] = Ψ = ψ p 3. f and ǫ are independent and so Cov[f,ǫ] = 0. 7

8 Orthogonal Factor Model Note the above is referred to as the orthogonal factor model. However, if it is assumed that Cov[f] is not diagonal, then the model is referred to as the oblique factor model. The orthogonal factor model implies a specific covariance structure for X: Σ = Cov[X] = E[(X µ)(x µ) T ] = E[(Λf +ǫ)(λf +ǫ) T ] = E[(Λf)(Λf) T +ǫ(λf) T +(Λf)ǫ T +ǫǫ T ] = ΛE[ff T ]Λ T +E[ǫf T ]Λ T +ΛE[fǫ T ]+E[ǫǫ T ] = ΛΛ T +Ψ It can also be shown that Cov[X,f] = E[(X µ)(f 0)] = Λ, hence the covariance of the observed variable X i and the unobserved factor f j is the factor loading λ ij. 8

9 Variance The variance of X can be split into two parts. The first portion of the variance for the i-th component arises from the m common factors, and is referred to as the i-th communality. The remainder of the variance for the i-th component is due to the specific factor, and is referred to as the uniqueness. Denoting the i-th communality by h 2 i, then: σ 2 i = λ 2 i1 +λ 2 i2 + +λ 2 ip +ψ i = h 2 i +ψ i Var[X i ] = communality +uniqueness The i-th communality is the sum of squares of the loadings of the i-th variable on the p common factors. 9

10 Variance cont d The factor model assumes that the m+m(m 1)/2 variances and covariances of X can be reproduced from the mp factor loadings λ ij and the m specific variances ψ i. In the case where p << m the factor model provides a simplified version of the covariation in X with fewer parameters than the m(m+1)/2 parameters in Σ. For example, if X = (X 1,X 2,...,X 12 ) and a factor model with p = 2 is appropriate, then the 12 13/2 = 78 elements of Σ are described in terms of the pm+m = = 36 parameters λ ij and ψ i of the factor model. Unfortunately, and as we will see, most covariance matrices cannot be uniquely factored as ΛΛ +Ψ where p << m. 10

11 Scale invariance Rescaling the variables of X is equivalent to letting Y = CX, where C = diag(c i ). If the factor model holds with Λ = Λ x and Ψ = Ψ x, then Y Cµ = CΛ x f +Cǫ So, Var[Y] = CΣC = CΛ x Λ T xc+cψ x C Hence the factor model holds for Y with factor loading matrix Λ y = CΛ x and specific variances Ψ y = CΨ x C = diag(c 2 i ψ ii). In other words, factor analysis (unlike PCA and every other technique covered) is not affected by a re-scaling of the variables. 11

12 Non-uniqueness: Proof To demonstrate that most covariance matrices Σ cannot be uniquely factored as ΛΛ +Ψ, where p << m, let G be any m m orthogonal matrix (GG T = I). Then, X µ = Λf +ǫ = ΛGG T f +ǫ = Λ f +ǫ Here Λ = ΛG and f = G T f. It follows that E[f ] = 0 and Cov[f ] = I. Thus it is impossible, given the data X, to distinguish between Λ and Λ, since they both generate the same covariance matrix Σ, i.e., Σ = ΛΛ T +Ψ = ΛGG T Λ +Ψ = Λ Λ T +Ψ 12

13 Continued This issue leads to the idea of factor rotations, since orthogonal matrices correspond to rotations of the coordinate system of X (we ll return to this idea later). Usually we estimate one possibility for Λ and Ψ and rotate the resulting loading matrix (multiply by an orthogonal matrix G) so as to ease interpretation. Once the loadings and specific variances are estimated, estimated values for the factors themselves (called factor scores) are constructed. 13

14 Maximum likelihood factor analysis When the data x are assumed to be normally distributed, then estimates for Λ and Ψ can be obtained by maximizing the likelihood. The normal location parameter is replaced by its MLE x, whilst the log-likelihood of the data depends on Λ and Ψ through Σ. To ensure Λ is well defined (invariance is caused by orthogonal transformations), the computationally convenient uniqueness condition is enforced: Here is a diagonal matrix. Λ T Ψ 1 Λ = Numerical optimization of the log-likelihood can then be performed to obtain the MLEs ˆΛ and ˆΨ. 14

15 Maximum likelihood factor analysis However, problems may occur if ψ ii = 0. This happens when the uniqueness is zero, i.e., the variance in the variable X i is completely accounted for by the factors f i. Also, during estimation we may find that ψ ii < 0. This solution is clearly improper and is known as a Heywood case. Such situations occur if there are too many common factors, too few common factors, not enough data, or the inappropriate application of the model for the dataset. In most computer programs this problem is resolved by re-setting ψ ii to be a small positive number before progressing. 15

16 Factor rotation If the initial loadings are subject to an orthogonal transformation (i.e., multiplied by an orthogonal matrix G), the covariance matrix Σ can still be reproduced. An orthogonal transformation corresponds to a rigid rotation or reflection of the coordinate axes. Hence the orthogonal transformation of the factor loadings (and the implied transformation of the factors) is called a factor rotation. From a mathematical viewpoint, it is immaterial whether Λ or Λ = ΛG is reported since Σ = ΛΛ T +Ψ = ΛGG T Λ T +Ψ = Λ Λ T +Ψ. However, when the objective is statistical interpretation, it may be that one rotation is more useful than alternatives. 16

17 Controversial? This feature of factor analysis is not without controversy. Some authors argue that by rotating the factors the analyst is in someway manipulating the results. Others argue that the technique is simply akin to sharpening the focus of a microscope so as to see the detail more clearly. In a sense the orientation of the factor solution reported is arbitrary anyhow. In any case, factor analysis is generally used as an intermediate step to reduce the dimensionality of a data set prior to other statistical analyses (similar to PCA). 17

18 A Simpler Structure The idea of a simpler structure needs further explanation. One ideal would be to rotate the factors so that each variable has a large loading on a single factor and small loadings on the others. The variables can then be split into disjoint sets, each of which is associated with one factor. A factor j can then be interpreted as an average quality over those variables i for which λ ij is large. 18

19 Kaiser s Varimax Rotation First note that the squared loading λ 2 ij is the proportion of the variance in variable i that is attributable to common factor j: Var[X i ] = λ 2 i1 +λ 2 i2 + +λ 2 ip +ψ i = h 2 i +ψ i. We aim for a rotation that makes the squared loadings λ 2 ij small, i.e., few medium-sized values. either large or Let λ ij = λ ij /h i be the final rotated loadings scaled by the square root of the communalities. 19

20 Kaiser s Varimax Rotation The varimax procedure selects the orthogonal transformation G maximizing the sum of the column variances across all factors j = 1,...p: ( V = 1 p m λ 4 ij 1 p m ) 2 λ 2 m m 2 ij j=1 i=1 j=1 i=1 p (variance of the squares of scaled factor loadings for jth factor) j=1 Maximizing V corresponds to spreading out the squares of the loadings on each factor as much as possible. Hence both groups of large and negligible coefficients are found in any column of Λ. Scaling the rotated loadings has the effect of giving variables with small communalities relatively more weight. After G has been determined the loadings λ ij are multiplied by h i to ensure the original communalities are preserved. 20

21 Oblique Rotations The degree of correlation allowed between factors is generally small (two highly correlated factors one factor). An oblique rotation corresponds to a non-rigid rotation of the coordinate system, i.e., the resulting axes need no longer be perpendicular. Oblique rotations, therefore, relax the orthogonality constraint in order to gain simplicity in the interpretation. The promax rotation is an example of an oblique rotation. Its name derives from Procrustean rotation. Oblique rotations are much less popular than their orthogonal counterparts, but this trend may change as new techniques, such as independent component analysis, are developed. 21

22 Factor scores Interest usually lies in the parameters of a factor model (i.e., λ ij and ǫ i ) but the estimated values of the common factors (the factor scores, i.e., ˆf) may also be required. The location of each original observation in the reduced factor space is often necessary as input for a subsequent analysis. One method which can be used to estimate the factor scores is the regression method. From multivariate normal theory it can be shown that: ˆf = Λ T Σ 1 X 22

23 Decathlon The men s decathlon event in the Seoul 98 Olympic games involved 34 competitors who each competed in the following disciplines: 100 metres, Long jump, Shot Put, High jump, 400 metres, 110 metre hurdles, Discus, Pole vault, Javelin, 1500 metres. The result for each competitor in each event was recorded, along with their final overall score. Factor analysis can be used to explore the structure of this data. In order to ease interpretation of results, the results of the 100m, 400m, hurdle event and the 1500m event were converted to the negative of the result (so a high score in any event implies a good performance). Most pairs of events are positively correlated to some degree. Factor analysis attempts to explain the correlation between a large set of variables in terms of a small number of underlying latent factors. 23

24 Decathlon A factor analysis model with four factors was fitted to the data with the following results: Uniquenesses: m100m LongJump Shot HighJump m400m mhurdles Discus PoleVault Javelin m1500m Loadings: Factor1 Factor2 Factor3 Factor4 m100m LongJump Shot HighJump m400m mhurdles Discus PoleVault Javelin m1500m

27 Interpretation Continued The default setting in R is to perform a varimax rotation to the resulting factors (so there are few intermediate loadings reported). Factor 1: 100 metres, 400 metres and the hurdles event load highly on this factor, so it could be interpreted as the running speed factor. Factor 2: The shot put, discus and javelin load highly on this factor, so it could be interpreted as arm strength. Factor 3: The 1500 metres event loads highly on this factor, so it could be interpreted as endurance. Factor 4: Hurdles, high jump, 100 metres, long jump and the pole vault all load higher on this factor than the other events, so it could be interpreted as leg strength. 27

28 PCA vs Factor Analysis Since both PCA and factor analysis have similar aims (i.e., to reduce the dimensionality of a data set), it is worth highlighting the difference in the two approaches. PCA looks for linear combinations of the data matrix X that are uncorrelated and of high variance, whilst factor analysis seeks unobserved linear combinations of the variables representing underlying fundamental quantities. PCA makes no assumptions about the form of the covariance matrix, whilst factor analysis assumes that the data comes from a well-defined model in which specific assumptions hold (e.g., E[f] = 0, etc). PCA: data PCs. FA: factors data. When specific variances are large they are absorbed into the PCs whereas factor analysis makes special provision for them. When the specific variances are small, PCA and FA give similar results. 28

29 PCA vs Factor Analysis The two analyses are often performed together. For example, you can conduct a principal components analysis to determine the number of factors to extract in a factor analytic study. 29

### Factor Analysis. Factor Analysis

Factor Analysis Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we

### Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

and Principal Components Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings

### Common factor analysis

Common factor analysis This is what people generally mean when they say "factor analysis" This family of techniques uses an estimate of common variance among the original variables to generate the factor

### Introduction to Principal Components and FactorAnalysis

Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### Introduction to Matrix Algebra

Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

### Multivariate Analysis (Slides 4)

Multivariate Analysis (Slides 4) In today s class we examine examples of principal components analysis. We shall consider a difficulty in applying the analysis and consider a method for resolving this.

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### FACTOR ANALYSIS NASC

FACTOR ANALYSIS NASC Factor Analysis A data reduction technique designed to represent a wide range of attributes on a smaller number of dimensions. Aim is to identify groups of variables which are relatively

### Exploratory Factor Analysis: rotation. Psychology 588: Covariance structure and factor models

Exploratory Factor Analysis: rotation Psychology 588: Covariance structure and factor models Rotational indeterminacy Given an initial (orthogonal) solution (i.e., Φ = I), there exist infinite pairs of

### Exploratory Factor Analysis

Introduction Principal components: explain many variables using few new variables. Not many assumptions attached. Exploratory Factor Analysis Exploratory factor analysis: similar idea, but based on model.

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

Factor Analysis Advanced Financial Accounting II Åbo Akademi School of Business Factor analysis A statistical method used to describe variability among observed variables in terms of fewer unobserved variables

### Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA

PROC FACTOR: How to Interpret the Output of a Real-World Example Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA ABSTRACT THE METHOD This paper summarizes a real-world example of a factor

### Factor Analysis. Chapter 420. Introduction

Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.

### Exploratory Factor Analysis

Exploratory Factor Analysis Definition Exploratory factor analysis (EFA) is a procedure for learning the extent to which k observed variables might measure m abstract variables, wherein m is less than

### The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23

(copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, high-dimensional

### Basics Inversion and related concepts Random vectors Matrix calculus. Matrix algebra. Patrick Breheny. January 20

Matrix algebra January 20 Introduction Basics The mathematics of multiple regression revolves around ordering and keeping track of large arrays of numbers and solving systems of equations The mathematical

### Statistics for Business Decision Making

Statistics for Business Decision Making Faculty of Economics University of Siena 1 / 62 You should be able to: ˆ Summarize and uncover any patterns in a set of multivariate data using the (FM) ˆ Apply

### FACTOR ANALYSIS. Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables.

FACTOR ANALYSIS Introduction Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables Both methods differ from regression in that they don t have

### Principal Components Analysis (PCA)

Principal Components Analysis (PCA) Janette Walde janette.walde@uibk.ac.at Department of Statistics University of Innsbruck Outline I Introduction Idea of PCA Principle of the Method Decomposing an Association

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### Doing Quantitative Research 26E02900, 6 ECTS Lecture 2: Measurement Scales. Olli-Pekka Kauppila Rilana Riikkinen

Doing Quantitative Research 26E02900, 6 ECTS Lecture 2: Measurement Scales Olli-Pekka Kauppila Rilana Riikkinen Learning Objectives 1. Develop the ability to assess a quality of measurement instruments

### Overview of Factor Analysis

Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,

### Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003

Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003 FA is not worth the time necessary to understand it and carry it out. -Hills, 1977 Factor analysis should not

### Random Vectors and the Variance Covariance Matrix

Random Vectors and the Variance Covariance Matrix Definition 1. A random vector X is a vector (X 1, X 2,..., X p ) of jointly distributed random variables. As is customary in linear algebra, we will write

### Factor Analysis. Sample StatFolio: factor analysis.sgp

STATGRAPHICS Rev. 1/10/005 Factor Analysis Summary The Factor Analysis procedure is designed to extract m common factors from a set of p quantitative variables X. In many situations, a small number of

### A Introduction to Matrix Algebra and Principal Components Analysis

A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra

### Principal Component Analysis

Principal Component Analysis ERS70D George Fernandez INTRODUCTION Analysis of multivariate data plays a key role in data analysis. Multivariate data consists of many different attributes or variables recorded

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis

### MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix.

MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix. Inverse matrix Definition. Let A be an n n matrix. The inverse of A is an n n matrix, denoted

### The Scalar Algebra of Means, Covariances, and Correlations

3 The Scalar Algebra of Means, Covariances, and Correlations In this chapter, we review the definitions of some key statistical concepts: means, covariances, and correlations. We show how the means, variances,

### Topic 10: Factor Analysis

Topic 10: Factor Analysis Introduction Factor analysis is a statistical method used to describe variability among observed variables in terms of a potentially lower number of unobserved variables called

### Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

### Matrices provide a compact notation for expressing systems of equations or variables. For instance, a linear function might be written as: b k

MATRIX ALGEBRA FOR STATISTICS: PART 1 Matrices provide a compact notation for expressing systems of equations or variables. For instance, a linear function might be written as: y = x 1 b 1 + x 2 + x 3

### Factor Rotations in Factor Analyses.

Factor Rotations in Factor Analyses. Hervé Abdi 1 The University of Texas at Dallas Introduction The different methods of factor analysis first extract a set a factors from a data set. These factors are

### Introduction to Principal Component Analysis: Stock Market Values

Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from

### MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued).

MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors Jordan canonical form (continued) Jordan canonical form A Jordan block is a square matrix of the form λ 1 0 0 0 0 λ 1 0 0 0 0 λ 0 0 J = 0

### 1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 00 Matrix Algebra Hervé Abdi Lynne J. Williams Introduction Sylvester developed the modern concept of matrices in the 9th

### MAT Solving Linear Systems Using Matrices and Row Operations

MAT 171 8.5 Solving Linear Systems Using Matrices and Row Operations A. Introduction to Matrices Identifying the Size and Entries of a Matrix B. The Augmented Matrix of a System of Equations Forming Augmented

### 4. Joint Distributions of Two Random Variables

4. Joint Distributions of Two Random Variables 4.1 Joint Distributions of Two Discrete Random Variables Suppose the discrete random variables X and Y have supports S X and S Y, respectively. The joint

### How to report the percentage of explained common variance in exploratory factor analysis

UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: Lorenzo-Seva, U. (2013). How to report

### UNIT 2 MATRICES - I 2.0 INTRODUCTION. Structure

UNIT 2 MATRICES - I Matrices - I Structure 2.0 Introduction 2.1 Objectives 2.2 Matrices 2.3 Operation on Matrices 2.4 Invertible Matrices 2.5 Systems of Linear Equations 2.6 Answers to Check Your Progress

### CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Lecture 9: Continuous

CSC2515 Fall 2007 Introduction to Machine Learning Lecture 9: Continuous Latent Variable Models 1 Example: continuous underlying variables What are the intrinsic latent dimensions in these two datasets?

### T-test & factor analysis

Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### Principal Component Analysis

Principal Component Analysis Principle Component Analysis: A statistical technique used to examine the interrelations among a set of variables in order to identify the underlying structure of those variables.

### Sections 2.11 and 5.8

Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and

### Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

### Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

### Penalized regression: Introduction

Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

### The president of a Fortune 500 firm wants to measure the firm s image.

4. Factor Analysis A related method to the PCA is the Factor Analysis (FA) with the crucial difference that in FA a statistical model is constructed to explain the interrelations (correlations) between

### Summary of week 8 (Lectures 22, 23 and 24)

WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry

### Multiple regression - Matrices

Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,

### Cofactor Expansion: Cramer s Rule

Cofactor Expansion: Cramer s Rule MATH 322, Linear Algebra I J. Robert Buchanan Department of Mathematics Spring 2015 Introduction Today we will focus on developing: an efficient method for calculating

### Multivariate Analysis of Variance (MANOVA): I. Theory

Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the

### Orthogonal Projections

Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

### Matrix Algebra LECTURE 1. Simultaneous Equations Consider a system of m linear equations in n unknowns: y 1 = a 11 x 1 + a 12 x 2 + +a 1n x n,

LECTURE 1 Matrix Algebra Simultaneous Equations Consider a system of m linear equations in n unknowns: y 1 a 11 x 1 + a 12 x 2 + +a 1n x n, (1) y 2 a 21 x 1 + a 22 x 2 + +a 2n x n, y m a m1 x 1 +a m2 x

### = [a ij ] 2 3. Square matrix A square matrix is one that has equal number of rows and columns, that is n = m. Some examples of square matrices are

This document deals with the fundamentals of matrix algebra and is adapted from B.C. Kuo, Linear Networks and Systems, McGraw Hill, 1967. It is presented here for educational purposes. 1 Introduction In

### MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

### L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

### Chapter 20. Vector Spaces and Bases

Chapter 20. Vector Spaces and Bases In this course, we have proceeded step-by-step through low-dimensional Linear Algebra. We have looked at lines, planes, hyperplanes, and have seen that there is no limit

### Covariance and Correlation

Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such

### 3. The Multivariate Normal Distribution

3. The Multivariate Normal Distribution 3.1 Introduction A generalization of the familiar bell shaped normal density to several dimensions plays a fundamental role in multivariate analysis While real data

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur. Lecture - 2 Simple Linear Regression

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 2 Simple Linear Regression Hi, this is my second lecture in module one and on simple

### Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree

### Linear Codes. In the V[n,q] setting, the terms word and vector are interchangeable.

Linear Codes Linear Codes In the V[n,q] setting, an important class of codes are the linear codes, these codes are the ones whose code words form a sub-vector space of V[n,q]. If the subspace of V[n,q]

### SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

### Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

### Elasticity Theory Basics

G22.3033-002: Topics in Computer Graphics: Lecture #7 Geometric Modeling New York University Elasticity Theory Basics Lecture #7: 20 October 2003 Lecturer: Denis Zorin Scribe: Adrian Secord, Yotam Gingold

### Linear Algebra Review. Vectors

Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

### Math 202-0 Quizzes Winter 2009

Quiz : Basic Probability Ten Scrabble tiles are placed in a bag Four of the tiles have the letter printed on them, and there are two tiles each with the letters B, C and D on them (a) Suppose one tile

### Solution to Homework 2

Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

### Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances?

1 Extending the debate between Spearman and Wilson 1929: When do single variables optimally reproduce the common part of the observed covariances? André Beauducel 1 & Norbert Hilger University of Bonn,

### Psychology 7291, Multivariate Analysis, Spring 2003. SAS PROC FACTOR: Suggestions on Use

: Suggestions on Use Background: Factor analysis requires several arbitrary decisions. The choices you make are the options that you must insert in the following SAS statements: PROC FACTOR METHOD=????

### ( % . This matrix consists of \$ 4 5 " 5' the coefficients of the variables as they appear in the original system. The augmented 3 " 2 2 # 2 " 3 4&

Matrices define matrix We will use matrices to help us solve systems of equations. A matrix is a rectangular array of numbers enclosed in parentheses or brackets. In linear algebra, matrices are important

### Chapter 19. General Matrices. An n m matrix is an array. a 11 a 12 a 1m a 21 a 22 a 2m A = a n1 a n2 a nm. The matrix A has n row vectors

Chapter 9. General Matrices An n m matrix is an array a a a m a a a m... = [a ij]. a n a n a nm The matrix A has n row vectors and m column vectors row i (A) = [a i, a i,..., a im ] R m a j a j a nj col

### Chapter 6. Orthogonality

6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

### Exploratory Factor Analysis of Demographic Characteristics of Antenatal Clinic Attendees and their Association with HIV Risk

Doi:10.5901/mjss.2014.v5n20p303 Abstract Exploratory Factor Analysis of Demographic Characteristics of Antenatal Clinic Attendees and their Association with HIV Risk Wilbert Sibanda Philip D. Pretorius

### Quadratic forms Cochran s theorem, degrees of freedom, and all that

Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

### 9 Matrices, determinants, inverse matrix, Cramer s Rule

AAC - Business Mathematics I Lecture #9, December 15, 2007 Katarína Kálovcová 9 Matrices, determinants, inverse matrix, Cramer s Rule Basic properties of matrices: Example: Addition properties: Associative:

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

### PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance

### FEATURE. Abstract. Introduction. Background

FEATURE From Data to Information: Using Factor Analysis with Survey Data Ronald D. Fricker, Jr., Walter W. Kulzy, and Jeffrey A. Appleget, Naval Postgraduate School; rdfricker@nps.edu Abstract In irregular

### October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix

Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,

### Tutorial on Exploratory Data Analysis

Tutorial on Exploratory Data Analysis Julie Josse, François Husson, Sébastien Lê julie.josse at agrocampus-ouest.fr francois.husson at agrocampus-ouest.fr Applied Mathematics Department, Agrocampus Ouest

### Images and Kernels in Linear Algebra By Kristi Hoshibata Mathematics 232

Images and Kernels in Linear Algebra By Kristi Hoshibata Mathematics 232 In mathematics, there are many different fields of study, including calculus, geometry, algebra and others. Mathematics has been

### Linear and Piecewise Linear Regressions

Tarigan Statistical Consulting & Coaching statistical-coaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Hands-on Data Analysis

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and