PCA Principal Component Analysis


PCA - Principal Component Analysis. Dr. Saed Sayad, University of Toronto, 2010. saed.sayad@utoronto.ca

Basic Statistics

Statistics - Example

X      X - Xbar   (X - Xbar)^2
0      -10        100
8      -2         4
12     2          4
20     10         100
SumX = 40, Count = 4, Average = 10, Sxx = 208, Variance = 69.33, SD = 8.33

Y      Y - Ybar   (Y - Ybar)^2
8      -2         4
9      -1         1
11     1          1
12     2          4
SumY = 40, Count = 4, Average = 10, Syy = 10, Variance = 3.33, SD = 1.83

X      Y      X - Xbar   Y - Ybar   (X - Xbar)(Y - Ybar)
0      8      -10        -2         20
8      9      -2         -1         2
12     11     2          1          2
20     12     10         2          20
Count = 4, Sxy = 44, Covariance = 14.67
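To make the arithmetic behind the table explicit, here is a minimal Python sketch (plain Python, not part of the slides) that reproduces the same numbers; the variable names are illustrative.

```python
# Reproduce the slide's example statistics for the four (X, Y) observations.
X = [0, 8, 12, 20]
Y = [8, 9, 11, 12]
n = len(X)

mean_x = sum(X) / n                      # 10.0
mean_y = sum(Y) / n                      # 10.0

Sxx = sum((x - mean_x) ** 2 for x in X)                       # 208
Syy = sum((y - mean_y) ** 2 for y in Y)                       # 10
Sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))  # 44

var_x = Sxx / (n - 1)    # 69.33 (sample variance)
var_y = Syy / (n - 1)    # 3.33
cov_xy = Sxy / (n - 1)   # 14.67
sd_x = var_x ** 0.5      # 8.33
sd_y = var_y ** 0.5      # 1.83
```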

Covariance Matrix
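The covariance matrix itself is not reproduced in the transcript; assembling it from the statistics on the previous slide, a minimal NumPy sketch (not part of the slides) gives:

```python
# The 2x2 covariance matrix for the X/Y example above.
import numpy as np

X = np.array([0, 8, 12, 20], dtype=float)
Y = np.array([8, 9, 11, 12], dtype=float)

C = np.cov(X, Y)       # sample covariance matrix (n - 1 in the denominator)
# C is approximately:
# [[69.33, 14.67],
#  [14.67,  3.33]]
```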

Matrix Algebra: Example of one non-eigenvector and one eigenvector

Matrix Algebra: Example of how a scaled eigenvector is still an eigenvector
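The worked examples on these two slides are shown only as figures. A minimal NumPy sketch of the same ideas, with an assumed 2x2 matrix [[2, 3], [2, 1]] (chosen so that its eigenvalue is 4, matching the Eigenvalues slide later on):

```python
# One non-eigenvector, one eigenvector, and a scaled eigenvector.
import numpy as np

A = np.array([[2, 3],
              [2, 1]], dtype=float)

v_non = np.array([1, 3], dtype=float)   # not an eigenvector
print(A @ v_non)                        # [11.  5.]  -> not a multiple of [1, 3]

v_eig = np.array([3, 2], dtype=float)   # an eigenvector of A
print(A @ v_eig)                        # [12.  8.]  = 4 * [3, 2]

v_scaled = 2 * v_eig                    # scale the eigenvector first...
print(A @ v_scaled)                     # [24. 16.]  = 4 * [6, 4]  -> still an eigenvector
```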

Eigenvector Properties. Eigenvectors can only be found for square matrices, and not every square matrix has them. A symmetric n x n matrix (such as a covariance matrix) has n of them. Another property of eigenvectors is that even if we scale the vector by some amount before we multiply it by the matrix, we still get the same multiple of it as a result. This is because scaling a vector only makes it longer; it does not change its direction. Lastly, the eigenvectors of a symmetric matrix are perpendicular to each other, which means that you can express the data in terms of these perpendicular eigenvectors instead of expressing it in terms of the x and y axes.
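A quick check of the perpendicularity and unit-length properties, using the covariance matrix computed from the earlier example (a sketch, not part of the original slides):

```python
# Eigenvectors of a symmetric matrix (e.g. a covariance matrix) are perpendicular.
import numpy as np

C = np.array([[69.33, 14.67],
              [14.67,  3.33]])              # covariance matrix from the earlier example

eigvals, eigvecs = np.linalg.eigh(C)        # eigh handles symmetric matrices
v1, v2 = eigvecs[:, 0], eigvecs[:, 1]       # columns are unit-length eigenvectors

print(np.dot(v1, v2))                       # ~0.0  -> perpendicular
print(np.linalg.norm(v1), np.linalg.norm(v2))  # 1.0 1.0
```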

Standardized Eigenvectors. We like to find eigenvectors whose length is exactly one. This is because the length of a vector does not affect whether it is an eigenvector, whereas its direction does. So, to keep eigenvectors standard, whenever we find an eigenvector we usually scale it to have a length of 1, so that all eigenvectors have the same length.

Standardized Eigenvectors: Eigenvector, Vector Length, Eigenvector with length of one
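A minimal sketch of the normalization shown on this slide (not part of the slides), assuming the eigenvector [3, 2] from the earlier example:

```python
# Scale an eigenvector to unit length.
import numpy as np

v = np.array([3.0, 2.0])
length = np.linalg.norm(v)     # sqrt(3^2 + 2^2) = sqrt(13) ~ 3.606
v_unit = v / length            # [0.832, 0.555], still an eigenvector, now of length 1
print(np.linalg.norm(v_unit))  # 1.0
```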

Eigenvalues. The eigenvalue is the amount by which the original vector was scaled after multiplication by the square matrix; 4 is the eigenvalue associated with the eigenvector in the example. No matter what multiple of the eigenvector we take before multiplying it by the square matrix, we always get 4 times the scaled vector. Eigenvectors and eigenvalues always come in pairs.

PCA. PCA is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analysing data. The other main advantage of PCA is that once you have found these patterns, you can compress the data, i.e. reduce the number of dimensions, without much loss of information. This technique is used in image compression.

PCA Original and Adjusted Data: Original Data, Original Data - Average

PCA Original Data plot

Calculate Covariance Matrix

Calculate Eigenvectors and Eigenvalues from Covariance Matrix
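The worked data set on these slides appears only as figures. A minimal NumPy sketch of the same steps (mean-adjust, covariance matrix, then eigenvectors and eigenvalues), reusing the small X/Y example from the Basic Statistics slide as stand-in data (an assumption, not the slides' own data):

```python
import numpy as np

data = np.array([[0,  8],
                 [8,  9],
                 [12, 11],
                 [20, 12]], dtype=float)   # rows = data items, columns = dimensions

mean = data.mean(axis=0)                   # [10., 10.]
data_adjust = data - mean                  # "Original Data - Average"

cov = np.cov(data_adjust, rowvar=False)    # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: the covariance matrix is symmetric
print(eigvals)                             # one small and one large eigenvalue
print(eigvecs)                             # unit-length eigenvectors in the columns
```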

Eigenvectors Plot

Choosing components and forming a Feature Vector. The eigenvector with the highest eigenvalue is the principal component of the data set: it captures the most significant relationship between the data dimensions. In general, once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance. Now, if we like, we can decide to ignore the components of lesser significance. We do lose some information, but if the eigenvalues are small, we don't lose much. If we leave out some components, the final data set will have fewer dimensions than the original.
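A minimal sketch of this step (not part of the slides), continuing with the stand-in X/Y data; k = 1 keeps only the principal component:

```python
import numpy as np

data = np.array([[0, 8], [8, 9], [12, 11], [20, 12]], dtype=float)
data_adjust = data - data.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))

order = np.argsort(eigvals)[::-1]          # component indices, most significant first
k = 1                                      # ignore the less significant component
feature_vector = eigvecs[:, order[:k]]     # columns = the eigenvectors we keep
explained = eigvals[order[:k]].sum() / eigvals.sum()
print(explained)                           # fraction of the total variance retained
```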

Feature Vector

Deriving the new data set

Final Data = Row Feature Vector × Row Data Adjust

Row Feature Vector is the matrix with the eigenvectors in its columns, transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top. Row Data Adjust is the mean-adjusted data, transposed, i.e. the data items are in the columns, with each row holding a separate dimension. Final Data has the final data items in columns and the dimensions along the rows; it is the original data expressed solely in terms of the chosen eigenvectors.
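A minimal sketch of the projection (not part of the slides), again with the stand-in X/Y data; here both eigenvectors are kept, so no information is lost:

```python
import numpy as np

data = np.array([[0, 8], [8, 9], [12, 11], [20, 12]], dtype=float)
data_adjust = data - data.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))
order = np.argsort(eigvals)[::-1]

row_feature_vector = eigvecs[:, order].T            # eigenvectors in rows, most significant first
row_data_adjust = data_adjust.T                     # dimensions in rows, data items in columns
final_data = row_feature_vector @ row_data_adjust   # data expressed in terms of the eigenvectors
print(final_data)                                   # one row per component, one column per item
```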

Deriving the new data set: Two Eigenvectors

Get the original data back

Row Data Adjust = Row Feature Vector^T × Final Data
Row Original Data = (Row Feature Vector^T × Final Data) + Original Mean

Because the eigenvectors are unit length and orthogonal, the inverse of Row Feature Vector is simply its transpose.
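A sketch of the reconstruction with the same stand-in data (not part of the slides); with all components kept, the original data is recovered exactly:

```python
import numpy as np

data = np.array([[0, 8], [8, 9], [12, 11], [20, 12]], dtype=float)
mean = data.mean(axis=0)
data_adjust = data - mean
eigvals, eigvecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))
row_feature_vector = eigvecs[:, np.argsort(eigvals)[::-1]].T   # eigenvectors in rows

final_data = row_feature_vector @ data_adjust.T                # forward transform
row_data_adjust = row_feature_vector.T @ final_data            # transpose undoes it
row_original_data = row_data_adjust + mean[:, np.newaxis]      # add the mean back
print(np.allclose(row_original_data, data.T))                  # True
```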

Principal Component Regression: PCA + MLR

PCR

X = T P^T
Y = T B + E
B = (T^T T)^-1 T^T Y

T is a matrix of SCORES, X is the DATA matrix, P is a matrix of LOADINGS, Y is the dependent variable vector, and B is the regression coefficients vector. http://chemeng.utoronto.ca/~datamining/

PCR. In PCR the X matrix is replaced by the T matrix, which has fewer, orthogonal variables. There is no issue with inverting the T^T T matrix (unlike in MLR), because the scores are orthogonal. PCR also resolves the issue of collinearity, which can in turn reduce the prediction error. However, there is no guarantee that a PCR model will work better than an MLR model built on the same dataset. http://chemeng.utoronto.ca/~datamining/
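A minimal PCR sketch on synthetic, illustrative data (the data, the number of components k, and the variable names are assumptions, not from the slides): X is replaced by the top-k score matrix T, and B = (T^T T)^-1 T^T Y is computed directly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))                        # illustrative data matrix
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=50)      # deliberately collinear columns
y = X @ np.array([1.0, 0.5, -0.3, 0.8, 0.0]) + 0.1 * rng.normal(size=50)

Xc = X - X.mean(axis=0)                             # mean-center X
yc = y - y.mean()                                   # mean-center y

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
k = 3                                               # assumed number of components
P = eigvecs[:, order[:k]]                           # loadings (top-k eigenvectors)
T = Xc @ P                                          # scores: fewer, orthogonal variables

B = np.linalg.inv(T.T @ T) @ T.T @ yc               # T^T T is well conditioned (orthogonal scores)
y_hat = T @ B + y.mean()                            # predictions on the original scale
```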

Reference: www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

Questions?