Canonical Correlation Analysis




Canonical Correlation Analysis
Lecture 11, August 4, 2011
Advanced Multivariate Statistical Methods, ICPSR Summer Session #2
Lecture #11 - 8/4/2011, Slide 1 of 39

Today's Lecture: Canonical Correlation Analysis — what it is, how it works, how to do such an analysis, and examples of uses of canonical correlations.

Purpose: In general, when we have univariate data there are times when we would like to measure the linear relationship between variables. The simplest case is when we have two variables and all we are interested in is measuring their linear relationship; here we would just use the bivariate correlation. Another case is multiple regression, where we have several independent variables and one dependent variable; in this case we would use the multiple correlation coefficient (R²). So, it would be nice if we could expand the ideas used in these cases to a situation where we have several y variables and several x variables.

Concept: From Webster's Dictionary, canonical means "reduced to the simplest or clearest schema possible." In describing canonical correlation, we will start with the basic cases where we only have two variables and build on them until we get to canonical correlations:
1. First we will look at the bivariate correlation.
2. Then we will see what was done to generalize bivariate correlation to the multiple correlation coefficient.
3. Finally, these discussions will lead us right to what happens in canonical correlation analysis.

Bivariate Correlation: Begin by thinking of just two variables, y and x. In this case the correlation describes the extent to which one variable relates to (can predict) the other. That is, the stronger the correlation, the more we will know about y by just knowing x. [The slide shows two scatterplots: no relationship and a strong positive relationship.]

Multiple Correlation: On the other hand, if we have one y and multiple x variables, we can no longer look at a simple relationship between two variables. But we can look at how well the set of x variables predicts y by computing the regression line. Using the regression line we can compute the predicted values ŷ = x'b and compare them to y. Specifically, we now have only two variables, y and ŷ, so we can compute a simple correlation. Note: we started with something more complicated (many x variables) and changed it into something for which we could compute a simple correlation (between y and ŷ).
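The point that the multiple correlation is just the simple correlation between y and ŷ can be checked numerically. A minimal sketch in Python with NumPy, using simulated data (the coefficients and sample size here are illustrative, not the house data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 27 observations, 4 predictors (mirroring the house example)
X = rng.normal(size=(27, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.3]) + rng.normal(size=27)

# Fit the multiple regression by least squares (with an intercept)
X1 = np.column_stack([np.ones(len(y)), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ b

# The multiple correlation R is the simple correlation between y and y-hat
R = np.corrcoef(y, y_hat)[0, 1]

# Equivalent check: R^2 from the usual sums-of-squares definition
R2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(np.isclose(R ** 2, R2))  # True
```

The agreement holds for any OLS fit with an intercept: squaring the correlation between y and ŷ recovers the usual R².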

Multiple Correlation Example (from Weisberg, 1985, p. 240): Property taxes on a house are supposedly dependent on the current market value of the house. Since houses actually sell only rarely, the sale price of each house must be estimated every year when property taxes are set. Regression methods are sometimes used to build a prediction function. We have data for 27 houses sold in the mid-1970s in Erie, Pennsylvania:
x1: current taxes (local, school, and county), in hundreds of dollars
x2: number of bathrooms
x3: living space, in thousands of square feet
x4: age of house, in years
y: actual sale price, in thousands of dollars

Multiple Correlation Example: To compute the multiple correlation of x1, x2, x3, and x4 with y, first compute the multiple regression for all x variables and y:

proc reg data=house;
  model y=x1-x4;
  output out=newdata p=yhat;
run;

Then take the predicted values given by the model, ŷ, and correlate them with y:

proc corr data=newdata;
  var yhat y;
run;

Multiple Correlation Example: [SAS output shown on slide.]

Multiple Correlation Example: [SAS output shown on slide.] The output shows the multiple correlation between x1, x2, x3, x4 and y.

Canonical Correlation: Canonical correlation seeks the correlation between multiple x variables and multiple y variables. Now we have several y variables and several x variables, so neither of our previous two examples applies directly, but we can take the points from the previous cases and use them for this new case. We could look at how well the set of x variables predicts the set of y variables, but in doing this we still would not be able to compute a simple correlation. On the other hand, in multiple regression we found a linear combination of the variables, b'x, to get a single variable. In our case we have two sets of variables, so it makes sense to define two linear combinations: one for the x variables (b1) and one for the y variables (a1).

Canonical Correlation: In the simple case where we have a single linear combination for each set of variables, we can compute the simple correlation between these two linear combinations. The first canonical correlation is the correlation between these two new variables (b1'x and a1'y). So how do we pick the linear transformations? The transformations b1 and a1 are chosen so that the correlation between the two new variables is maximized. Notice that this idea is really no different from what we did in multiple regression, and it also sounds similar to something we have done in PCA.

Canonical Correlation (one last thing): Think back to PCA, where we said that a single linear combination did not account for all of the information present in a data set. There we determined how many linear combinations were needed to capture more information, where the linear combinations were all uncorrelated. We can do the same thing here: we can define more pairs of linear combinations (b_i and a_i, i = 1, ..., s, where s = min(p, q), p is the number of x variables, and q is the number of y variables). Each pair of linear combinations maximizes the correlation between the new variables under the constraint that they are uncorrelated with all previous linear combinations.

To show how to compute canonical correlations, first consider our original covariance matrix from our example:

         x1       x2      x3       x4        y
x1    8.3100   1.0700  1.3400  15.0300  37.7400
x2    1.0700   0.1800  0.2100   1.2500   5.6000
x3    1.3400   0.2100  0.3100   1.4000   7.4200
x4   15.0300   1.2500  1.4000 197.4900  62.3900
y    37.7400   5.6000  7.4200  62.3900 204.7000

From this matrix, we will define four new sub-matrices, from which we will calculate our correlations:

S = [ S_xx  S_xy ]
    [ S_yx  S_yy ]

where S_xx contains the covariances among x1 through x4, S_yy the covariances among the y variables (here just the variance of y), and S_xy = S_yx' the covariances between the two sets.

So how do we compute the canonical correlations? To begin, note that we could define the squared multiple correlation R²_M as

R²_M = S_yx S_xx⁻¹ S_xy / S_yy

which can be rewritten as:

R²_M = S_yy⁻¹ S_yx S_xx⁻¹ S_xy

For canonical correlations, however, we will focus on the matrix formed by this part of the equation (note this was just a scalar when y had only one variable).

We first compute the square roots of the eigenvalues (r1, r2, ..., rs) and the eigenvectors (a1, a2, ..., as) of:

S_yy⁻¹ S_yx S_xx⁻¹ S_xy

Then we compute the square roots of the eigenvalues (r1, r2, ..., rs) and the eigenvectors (b1, b2, ..., bs) of:

S_xx⁻¹ S_xy S_yy⁻¹ S_yx

Conveniently, the eigenvalues of both matrices are equal (and lie between zero and one)! The square roots of the eigenvalues are the successive canonical correlations between the successive pairs of linear combinations, and the eigenvectors give the linear transformations for the new linear combinations.
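These eigenproblems are easy to set up directly from the partitioned covariance matrix. A sketch in Python with NumPy, on simulated data (the variable counts and names are illustrative, not from the lecture examples):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: two sets of variables, p = 3 x's and q = 3 y's, n = 27
n, p, q = 27, 3, 3
X = rng.normal(size=(n, p))
Y = 0.7 * X + rng.normal(size=(n, q))  # build in some cross-correlation

# Partition the sample covariance matrix of (x, y)
S = np.cov(np.column_stack([X, Y]), rowvar=False)
Sxx, Sxy = S[:p, :p], S[:p, p:]
Syx, Syy = S[p:, :p], S[p:, p:]

# Eigen-decompose Syy^{-1} Syx Sxx^{-1} Sxy; the square roots of its
# eigenvalues are the canonical correlations r_1, ..., r_s
M = np.linalg.solve(Syy, Syx) @ np.linalg.solve(Sxx, Sxy)
eigvals = np.sort(np.linalg.eigvals(M).real)[::-1]
canon_corrs = np.sqrt(np.clip(eigvals, 0, 1))

# The companion matrix Sxx^{-1} Sxy Syy^{-1} Syx has the same eigenvalues
M2 = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Syx)
eigvals2 = np.sort(np.linalg.eigvals(M2).real)[::-1]
print(np.allclose(eigvals, eigvals2))  # True
```

The eigenvectors of M and M2 (not extracted above) would give the a_i and b_i weight vectors for the canonical variates.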

Example #2: To illustrate canonical correlations, consider the following analysis. Three physiological and three exercise variables are measured on 27 middle-aged men in a fitness club. The variables collected are:
Weight (in pounds, x1)
Waist size (in inches, x2)
Pulse rate (in beats per minute, x3)
Number of chin-ups performed (y1)
Number of sit-ups performed (y2)
Number of jumping jacks performed (y3)
The goal of the analysis is to determine the relationship between the physiological measurements and the exercises.

Example #2: To run a canonical correlation analysis, use the following code:

proc cancorr data=fit all
  vprefix=physiological vname='Physiological Measurements'
  wprefix=exercises wname='Exercises';
  var Weight Waist Pulse;
  with Chins Situps Jumps;
run;

Example #2: [PROC CANCORR output shown on slides 20-21.]

Standardized Weights: Just like in PCA and factor analysis, we are interested in interpreting the weights of the linear combinations. However, if our variables are on different scales the weights are difficult to interpret. So we can standardize them, which is the same as computing the canonical correlations and linear combinations from the correlation matrix instead of the variance/covariance matrix. We can also compute the standardized coefficients (c and d) directly:

c = diag(S_yy)^(1/2) a   and   d = diag(S_xx)^(1/2) b

Example #2: [Standardized canonical coefficients from the PROC CANCORR output shown on slide.]

Canonical Correlation Properties:
1. Canonical correlations are invariant. This means that, like any correlation, scale changes (such as standardizing) will not change the correlations. However, they will change the eigenvectors.
2. The first canonical correlation is the best we can do with associations: it is larger than any of the simple correlations or any multiple correlation with the variables under study.
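The invariance property can be checked numerically: canonical correlations computed from the covariance matrix and from the correlation matrix agree, even when the variables are on wildly different scales. A sketch in Python (simulated data; names and scale factors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 27, 3, 3
X = rng.normal(size=(n, p)) * np.array([1.0, 10.0, 100.0])  # mixed scales
Y = rng.normal(size=(n, q)) + 0.6 * X[:, [0]]

def squared_canon_corrs(S, p):
    """Squared canonical correlations from a partitioned (p+q) x (p+q) matrix."""
    Sxx, Sxy = S[:p, :p], S[:p, p:]
    Syx, Syy = S[p:, :p], S[p:, p:]
    M = np.linalg.solve(Syy, Syx) @ np.linalg.solve(Sxx, Sxy)
    return np.sort(np.linalg.eigvals(M).real)[::-1]

Z = np.column_stack([X, Y])
S = np.cov(Z, rowvar=False)        # covariance matrix
R = np.corrcoef(Z, rowvar=False)   # correlation matrix (standardized scale)

# Invariance: the canonical correlations agree whichever matrix we use
print(np.allclose(squared_canon_corrs(S, p), squared_canon_corrs(R, p)))  # True
```

The eigenvectors, by contrast, do change under rescaling, which is exactly why the standardized weights from the previous slide are needed for interpretation.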

Hypothesis Test for the Correlations: We begin by testing that at least the first (the largest) canonical correlation is significantly different from zero. This is the same as testing H0: Σ_xy = 0 (or B1 = 0); if we cannot get a significant relationship out of the optimal linear combination of the variables, there is no linear relationship between the two sets. This is tested using Wilks' lambda:

Λ1 = |S| / (|S_yy| |S_xx|)

or, equivalently (where r_i² is the i-th eigenvalue of the matrix produced from the submatrices of the covariance matrix):

Λ1 = ∏_{i=1}^{s} (1 − r_i²)
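The equivalence of the two forms of Wilks' lambda can be verified numerically. A sketch in Python (simulated data; here p = q, so all eigenvalues of the submatrix product enter the determinant identity):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 27, 3, 3
X = rng.normal(size=(n, p))
Y = 0.5 * X + rng.normal(size=(n, q))

S = np.cov(np.column_stack([X, Y]), rowvar=False)
Sxx, Sxy = S[:p, :p], S[:p, p:]
Syx, Syy = S[p:, :p], S[p:, p:]

# Squared canonical correlations r_i^2
r2 = np.sort(np.linalg.eigvals(
    np.linalg.solve(Syy, Syx) @ np.linalg.solve(Sxx, Sxy)).real)[::-1]

# Wilks' lambda two ways: determinant ratio vs. product over eigenvalues
lam_det = np.linalg.det(S) / (np.linalg.det(Syy) * np.linalg.det(Sxx))
lam_prod = np.prod(1 - r2)
print(np.isclose(lam_det, lam_prod))  # True
```

The identity follows from the block-determinant formula |S| = |S_xx| |S_yy − S_yx S_xx⁻¹ S_xy|, so the ratio collapses to |I − S_yy⁻¹ S_yx S_xx⁻¹ S_xy| = ∏(1 − r_i²).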

The Rest: In this case

Λ1 = ∏_{i=1}^{s} (1 − r_i²)

can be compared to Λ_{α, p, q, N−1−q} (or to Λ_{α, q, p, N−1−p}). In general, to test the k-th and subsequent canonical correlations, we can compute

Λ_k = ∏_{i=k}^{s} (1 − r_i²)

which can be compared to Λ_{α, p−k+1, q−k+1, N−k−q} (or to Λ_{α, q−k+1, p−k+1, N−k−p}).

Example #2: [Hypothesis-test output from PROC CANCORR shown on slide.]

Interpretation: Because in many ways a canonical correlation analysis is similar to what we discussed in PCA, the interpretation methods are also similar. Specifically, we will discuss four methods that are used to interpret the results:
1. Standardized coefficients
2. Correlation between the canonical variates (the linear combinations) and each variable
3. Rotation
4. Redundancy analysis

Standardized Coefficients: Because the standardized variables are on the same scale, they can be directly compared. The variables that are most important to the association are the ones with the largest absolute weights (i.e., the weights determine importance). To interpret what the linear combination is capturing, we also consider the sign of each weight.

Correlation of the Linear Combinations with the Variables: This was mentioned in PCA and EFA. That is, we compute our linear combinations and then compute the correlation between each linear combination (canonical variate) and each of the actual variables. These correlations are typically called the loadings or structure coefficients. As was the case in PCA, this ignores the overall multidimensional structure, so it is not a recommended basis for interpretation.

Rotation: We could try rotating the weights of the analysis to provide a more interpretable result. For this we rely on the spatial representation of what is going on with the data: every linear combination projects our observations onto a different dimension. Sometimes these dimensions are difficult to interpret (i.e., based on the signs and magnitudes of the weights), and sometimes we can rotate them so that the weights are easier to interpret (some become large and some become small). Rotations in CCA are not recommended, however, because we lose the optimal interpretation of the analysis.

Redundancy: Another method for interpretation is a redundancy analysis (this, again, is often disliked by statisticians because it only summarizes univariate relationships).

Redundancy: [Redundancy analysis output shown on slides 33-34.]

Example #3: In a study of social support and mental health, measures of the following seven variables were taken on 405 subjects:
Total social support
Family social support
Friend social support
Significant-other social support
Depression
Loneliness
Stress
The researchers were interested in determining the relationship between social support and mental health. How about using a canonical correlation analysis?

*SAS Example #3;
data depress (type=corr);
  _type_='CORR';
  input _name_ $ v1-v7;
  label v1='total social support'
        v2='family social support'
        v3='friend social support'
        v4='significant other social support'
        v5='depression'
        v6='loneliness'
        v7='stress';
datalines;
v1  1.0000  .       .       .       .       .       .
v2  0.8280  1.0000  .       .       .       .       .
v3  0.8136  0.5192  1.0000  .       .       .       .
v4  0.8569  0.5972  0.6109  1.0000  .       .       .
v5 -0.3691 -0.3218 -0.3150 -0.3044  1.0000  .       .
v6 -0.6282 -0.4945 -0.5774 -0.5266  0.5368  1.0000  .
v7 -0.1849 -0.2049 -0.1132 -0.1291  0.4872  0.2846  1.0000
;
proc cancorr data=depress all corr edf=404
  vprefix=mental_health vname='Mental Health'
  wprefix=social_support wname='Social Support';
  var v1-v4;
  with v5-v7;
run;

Example #3: [PROC CANCORR output shown on slides 36-37.]

In general, the results from a canonical correlation routine are related to:
1. Regression
2. Discriminant analysis (we will learn this next week)
3. MANOVA
However, the goals of canonical correlation also overlap with the information provided by a confirmatory factor analysis or structural equation model.

Final Thought: The midterm was accomplished using MANOVA and MANCOVA. Canonical correlation analysis is a complicated analysis that provides many results of interest to researchers. Perhaps because of its complicated nature, canonical correlation analysis is not often used. Last week: Nebraska... This week: Texas... After that: The world. Tomorrow: Lab Day! Meet in Helen Newberry's Michigan Lab.