# Factor Analysis

NC STATE UNIVERSITY

Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we assume that such latent variables, or factors, exist.

## The Orthogonal Factor Model

$$
\begin{aligned}
X_1 - \mu_1 &= l_{1,1} F_1 + l_{1,2} F_2 + \dots + l_{1,m} F_m + \epsilon_1, \\
X_2 - \mu_2 &= l_{2,1} F_1 + l_{2,2} F_2 + \dots + l_{2,m} F_m + \epsilon_2, \\
&\;\;\vdots \\
X_p - \mu_p &= l_{p,1} F_1 + l_{p,2} F_2 + \dots + l_{p,m} F_m + \epsilon_p,
\end{aligned}
$$

where:

- $F_1, F_2, \dots, F_m$ are the common factors (latent variables);
- $l_{i,j}$ is the loading of variable $i$, $X_i$, on factor $j$, $F_j$;
- $\epsilon_i$ is a specific factor, affecting only $X_i$.

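In matrix form:

$$
\underset{p \times 1}{X} - \underset{p \times 1}{\mu} = \underset{p \times m}{L}\,\underset{m \times 1}{F} + \underset{p \times 1}{\epsilon}.
$$

To make this identifiable, we further assume, with no loss of generality:

$$
E(F) = \underset{m \times 1}{0}, \qquad \operatorname{Cov}(F) = \underset{m \times m}{I}, \qquad E(\epsilon) = \underset{p \times 1}{0}, \qquad \operatorname{Cov}(\epsilon, F) = \underset{p \times m}{0}
$$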

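and, with serious loss of generality:

$$
\operatorname{Cov}(\epsilon) = \Psi = \operatorname{diag}(\psi_1, \psi_2, \dots, \psi_p).
$$

In terms of the observable variables $X$, these assumptions mean that

$$
E(X) = \mu, \qquad \operatorname{Cov}(X) = \Sigma = \underset{p \times m}{L}\,\underset{m \times p}{L'} + \underset{p \times p}{\Psi}.
$$

Usually $X$ is standardized, so $\Sigma = R$. The observable $X$ and the unobservable $F$ are related by

$$
\operatorname{Cov}(X, F) = L.
$$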

Some terminology: the $(i, i)$ entry of the matrix equation $\Sigma = LL' + \Psi$ is

$$
\underbrace{\sigma_{i,i}}_{\operatorname{Var}(X_i)} = \underbrace{l_{i,1}^2 + l_{i,2}^2 + \dots + l_{i,m}^2}_{\text{Communality}} + \underbrace{\psi_i}_{\text{Specific variance}},
$$

or

$$
\sigma_{i,i} = h_i^2 + \psi_i,
$$

where

$$
h_i^2 = l_{i,1}^2 + l_{i,2}^2 + \dots + l_{i,m}^2
$$

is the $i$th communality.

Note that if $T$ is $(m \times m)$ orthogonal, then $(LT)(LT)' = LL'$, so loadings $LT$ generate the same $\Sigma$ as $L$: loadings are not unique.
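A small numerical check of these two points may help. The sketch below is not from the slides: the loading matrix `L` and the 30 degree rotation are made-up illustrative values, and the variables are assumed standardized so that $\psi_i = 1 - h_i^2$.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def communalities(loadings):
    """h_i^2 = sum_j l_{i,j}^2, one value per variable (row)."""
    return [sum(l * l for l in row) for row in loadings]

# Hypothetical loadings for p = 3 variables on m = 2 factors.
L = [[0.8, 0.3],
     [0.7, -0.4],
     [0.2, 0.6]]

# An orthogonal T: rotation through an arbitrary angle.
theta = math.radians(30)
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

LT = matmul(L, T)
common_part = matmul(L, transpose(L))        # L L'
common_part_rot = matmul(LT, transpose(LT))  # (LT)(LT)'

h2 = communalities(L)
psi = [1.0 - h for h in h2]  # specific variances when Var(X_i) = 1
```

Comparing `common_part` with `common_part_rot` entry by entry shows that the rotated loadings reproduce the same $LL'$, and `communalities(LT)` matches `h2`.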

## Existence of Factor Representation

For any $p$, every $(p \times p)$ $\Sigma$ can be factorized as $\Sigma = LL'$ for $(p \times p)$ $L$, which is a factor representation with $m = p$ and $\Psi = 0$; however, $m = p$ is not much use; we usually want $m \ll p$.

For $p = 3$, every $(3 \times 3)$ $\Sigma$ can be represented as $\Sigma = LL' + \Psi$ for $(3 \times 1)$ $L$, which is a factor representation with $m = 1$, but $\Psi$ may have negative elements.
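To see how a negative specific variance can arise when $p = 3$ and $m = 1$, one can solve the off-diagonal equations $l_i l_j = r_{i,j}$ directly. The sketch below assumes a correlation matrix with three positive correlations; the function name and the numerical values are illustrative, not taken from the slides.

```python
import math

def one_factor_3x3(r12, r13, r23):
    """Exact one-factor representation of a 3x3 correlation matrix.

    Solves l_i * l_j = r_ij for the off-diagonal entries, then sets
    psi_i = 1 - l_i^2.  Assumes all three correlations are positive.
    """
    l1 = math.sqrt(r12 * r13 / r23)
    l2 = math.sqrt(r12 * r23 / r13)
    l3 = math.sqrt(r13 * r23 / r12)
    loadings = [l1, l2, l3]
    psi = [1.0 - l * l for l in loadings]
    return loadings, psi

# A valid (positive definite) correlation matrix with r12 = r13 = 0.8, r23 = 0.4.
loadings, psi = one_factor_3x3(0.8, 0.8, 0.4)
```

Here the first communality is $0.8 \times 0.8 / 0.4 = 1.6$, so $\psi_1 = -0.6$: the representation exists algebraically but is not a proper factor model.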

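In general, we can only approximate $\Sigma$ by $LL' + \Psi$.

Principal components method: the spectral decomposition of $\Sigma$ is

$$
\Sigma = E \Lambda E' = \left(E \Lambda^{1/2}\right)\left(E \Lambda^{1/2}\right)' = LL'
$$

with $m = p$.

If $\lambda_1 + \lambda_2 + \dots + \lambda_m \gg \lambda_{m+1} + \dots + \lambda_p$, and $L^{(m)}$ is the first $m$ columns of $L$, then

$$
\Sigma \approx L^{(m)} L^{(m)\prime}
$$

gives such an approximation with $\Psi = 0$.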

The remainder term $\Sigma - L^{(m)} L^{(m)\prime}$ is non-negative definite, so its diagonal entries are non-negative, and we can get a closer approximation as

$$
\Sigma \approx L^{(m)} L^{(m)\prime} + \Psi^{(m)},
$$

where $\Psi^{(m)} = \operatorname{diag}\left(\Sigma - L^{(m)} L^{(m)\prime}\right)$.

SAS proc factor program and output:

```sas
proc factor data = all method = prin;
  var cvx -- xom;
  title 'Method = Principal Components';

proc factor data = all method = prin nfact = 2 plot;
  var cvx -- xom;
  title 'Method = Principal Components, 2 factors';
run;
```
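The principal components method can be sketched for the simplest case $m = 1$. This is only an illustration under stated assumptions: the correlation matrix `R` is made up, and the leading eigenpair is found by plain power iteration rather than a full spectral decomposition.

```python
import math

def top_eigenpair(a, iters=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix by
    power iteration (assumes the dominant eigenvalue is positive)."""
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * a[i][k] * v[k] for i in range(n) for k in range(n))
    return lam, v

def principal_component_fit(sigma):
    """One-factor principal component solution:
    L = sqrt(lambda_1) e_1 and Psi = diag(Sigma - L L')."""
    lam, e = top_eigenpair(sigma)
    loadings = [math.sqrt(lam) * x for x in e]
    psi = [sigma[i][i] - loadings[i] ** 2 for i in range(len(sigma))]
    return loadings, psi

# Hypothetical correlation matrix for p = 3 variables.
R = [[1.00, 0.72, 0.63],
     [0.72, 1.00, 0.56],
     [0.63, 0.56, 1.00]]

loadings, psi = principal_component_fit(R)
```

By construction the diagonal of $L^{(1)} L^{(1)\prime} + \Psi^{(1)}$ matches the diagonal of `R` exactly; only the off-diagonal entries are approximated.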

## Principal Factor Solution

Recall the Orthogonal Factor Model

$$
X - \mu = LF + \epsilon,
$$

which implies

$$
\Sigma = LL' + \Psi.
$$

The $m$-factor Principal Component solution is to approximate $\Sigma$ (or, if we standardize the variables, $R$) by a rank-$m$ matrix using the spectral decomposition

$$
\Sigma = \lambda_1 e_1 e_1' + \dots + \lambda_m e_m e_m' + \lambda_{m+1} e_{m+1} e_{m+1}' + \dots + \lambda_p e_p e_p'.
$$

The first $m$ terms give the best rank-$m$ approximation to $\Sigma$.

We can sometimes achieve higher communalities ($= \operatorname{diag}(LL')$) by either:

- specifying an initial estimate of the communalities;
- iterating the solution;

or both.

Suppose we are working with $R$. Given initial communalities $h_i^2$, form the reduced correlation matrix

$$
R_r = \begin{pmatrix}
h_1^2 & r_{1,2} & \cdots & r_{1,p} \\
r_{2,1} & h_2^2 & \cdots & r_{2,p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p,1} & r_{p,2} & \cdots & h_p^2
\end{pmatrix}
$$

Now use the spectral decomposition of $R_r$ to find its best rank-$m$ approximation

$$
R_r \approx L_r L_r'.
$$

New communalities are

$$
h_i^2 = \sum_{j=1}^{m} l_{i,j}^2.
$$

Find $\Psi$ by equating the diagonal terms:

$$
\psi_i = 1 - h_i^2, \quad \text{or} \quad \Psi = I - \operatorname{diag}\left(L_r L_r'\right).
$$

This is the Principal Factor solution. The Principal Component solution is the special case where the initial communalities are all 1.

In proc factor, use `method = prin` as for the Principal Component solution, but also specify the initial communalities:

- the `priors = ...` option on the proc factor statement specifies a method, such as squared multiple correlations (`priors = SMC`);
- the `priors` statement provides explicit numerical values.

SAS program and output:

```sas
proc factor data = all method = prin priors = smc;
  title 'Method = Principal Factors';
  var cvx -- xom;
run;
```

In this case, the communalities are smaller than for the Principal Component solution.

Other choices for the `priors` option include:

- `MAX`: maximum absolute correlation with any other variable;
- `ASMC`: Adjusted SMC (adjusted to make their sum equal to the sum of the maximum absolute correlations);
- `ONE`: 1;
- `RANDOM`: uniform on (0, 1).

## Iterated Principal Factors

One issue with both Principal Components and Principal Factors: if $S$ or $R$ is exactly in the form $LL' + \Psi$ (or, more likely, approximately in that form), neither method produces $L$ and $\Psi$ (unless you specify the true communalities).

Solution: iterate! Use the new communalities as initial communalities to get another set of Principal Factors. Repeat until nothing much changes.

In proc factor, use `method = prinit`; you may also specify the initial communalities (default = `ONE`).

SAS program and output:

```sas
proc factor data = all method = prinit;
  title 'Method = Iterated Principal Factors';
  var cvx -- xom;
run;
```

The communalities are still smaller than for the Principal Component solution, but larger than for Principal Factors.
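The iteration itself is short enough to sketch for $m = 1$. This is an illustration, not the algorithm as implemented in proc factor: the matrix `R` is constructed to be exactly $ll' + \Psi$ with $l = (0.9, 0.8, 0.7)$, initial communalities are all 1, a fixed number of iterations replaces a convergence test, and the leading eigenpair again comes from power iteration.

```python
import math

def top_eigenpair(a, iters=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix by
    power iteration (assumes the dominant eigenvalue is positive)."""
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * a[i][k] * v[k] for i in range(n) for k in range(n))
    return lam, v

def iterated_principal_factor(r, n_iter=300):
    """One-factor iterated principal factors on a correlation matrix r."""
    p = len(r)
    h2 = [1.0] * p  # initial communalities, as with priors = ONE
    loadings = [0.0] * p
    for _ in range(n_iter):
        # Reduced correlation matrix: communalities on the diagonal.
        reduced = [[h2[i] if i == k else r[i][k] for k in range(p)]
                   for i in range(p)]
        lam, e = top_eigenpair(reduced)
        loadings = [math.sqrt(lam) * x for x in e]
        h2 = [l * l for l in loadings]  # new communalities
    psi = [1.0 - h for h in h2]
    return loadings, psi

# Off-diagonals are products of the hypothetical loadings 0.9, 0.8, 0.7.
R = [[1.00, 0.72, 0.63],
     [0.72, 1.00, 0.56],
     [0.63, 0.56, 1.00]]

loadings, psi = iterated_principal_factor(R)
```

The first pass is the Principal Component solution, whose communalities are too large for this `R`; repeated passes move the loadings toward the values used to build the matrix.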

## Likelihood Methods

If we assume that $X \sim N_p(\mu, \Sigma)$ with $\Sigma = LL' + \Psi$, we can fit by maximum likelihood:

- $\hat{\mu} = \bar{x}$;
- $L$ is not identified without a constraint (uniqueness condition) such as $L' \Psi^{-1} L = $ diagonal;
- still no closed form equation for $\hat{L}$; numerical optimization required.

We can also test hypotheses about $m$ with the likelihood ratio test (Bartlett's correction improves the $\chi^2$ approximation):

- $H_0 : m = m_0$;
- $H_A : m > m_0$;
- $-2 \log(\text{likelihood ratio}) \sim \chi^2$ with $\frac{1}{2}\left[(p - m_0)^2 - p - m_0\right]$ degrees of freedom.

Degrees of freedom $> 0 \iff m_0 < \frac{1}{2}\left(2p + 1 - \sqrt{8p + 1}\right)$.

E.g. for $p = 5$, $m_0 < 2.298$, so $m_0 \le 2$:

| $p$ | $m_0$ | degrees of freedom |
|---|---|---|
| 5 | 0 | 10 |
| 5 | 1 | 5 |
| 5 | 2 | 1 |
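The degrees-of-freedom formula and the bound on $m_0$ are easy to tabulate. The helper names below are illustrative only.

```python
import math

def lrt_degrees_of_freedom(p, m0):
    """Degrees of freedom (1/2)[(p - m0)^2 - p - m0] for H0: m = m0.

    (p - m0)^2 - p - m0 is always even, so integer division is exact.
    """
    return ((p - m0) ** 2 - p - m0) // 2

def max_testable_factors(p):
    """Largest integer m0 with positive degrees of freedom, i.e. the
    largest integer strictly below (2p + 1 - sqrt(8p + 1)) / 2."""
    bound = (2 * p + 1 - math.sqrt(8 * p + 1)) / 2
    return math.ceil(bound) - 1

# (m0, degrees of freedom) pairs for p = 5.
table = [(m0, lrt_degrees_of_freedom(5, m0))
         for m0 in range(max_testable_factors(5) + 1)]
```

For $p = 5$ this gives the pairs $(0, 10)$, $(1, 5)$ and $(2, 1)$; with $m_0 = 3$ the formula returns $-2$, so that hypothesis cannot be tested this way.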

In proc factor, use `method = ml`; you may also specify the initial communalities (default = `SMC`).

SAS program and output:

```sas
proc factor data = all method = ml;
  var cvx -- xom;
  title 'Method = Maximum Likelihood';

proc factor data = all method = ml heywood plot;
  var cvx -- xom;
  title 'Method = Maximum Likelihood with Heywood fixup';

proc factor data = all method = ml ultraheywood plot;
  var cvx -- xom;
  title 'Method = Maximum Likelihood with Ultra-Heywood fixup';
run;
```

Note that the iteration can produce communalities > 1! Two fixes:

- use the `Heywood` option on the proc factor statement, which caps the communalities at 1;
- use the `UltraHeywood` option on the proc factor statement, which allows the iteration to continue with communalities > 1.

## Scaling and the Likelihood

If the maximum likelihood estimates for a data matrix $X$ are $\hat{L}$ and $\hat{\Psi}$, and

$$
\underset{n \times p}{Y} = \underset{n \times p}{X}\,\underset{p \times p}{D}
$$

is a scaled data matrix, with the columns of $X$ scaled by the entries of the diagonal matrix $D$, then the maximum likelihood estimates for $Y$ are $D\hat{L}$ and $D^2 \hat{\Psi}$.

That is, the mle's are invariant to scaling:

$$
\hat{\Sigma}_Y = D \hat{\Sigma}_X D.
$$

Proof: $L_Y(\mu, \Sigma) = L_X\left(D^{-1}\mu,\; D^{-1} \Sigma D^{-1}\right)$, up to a constant factor that does not involve the parameters.

No distinction between covariance and correlation matrices.

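## Weighting and the Likelihood

Recall the uniqueness condition

$$
L' \Psi^{-1} L = \Delta, \text{ diagonal.}
$$

Write

$$
\begin{aligned}
\Sigma^* &= \Psi^{-1/2} \Sigma \Psi^{-1/2} \\
&= \Psi^{-1/2} (LL' + \Psi) \Psi^{-1/2} \\
&= \left(\Psi^{-1/2} L\right)\left(\Psi^{-1/2} L\right)' + I_p \\
&= L^* L^{*\prime} + I_p.
\end{aligned}
$$

$\Sigma^*$ is the weighted covariance matrix.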

Here $L^* = \Psi^{-1/2} L$ and

$$
L^{*\prime} L^* = L' \Psi^{-1} L = \Delta.
$$

Note:

$$
\Sigma^* L^* = L^* L^{*\prime} L^* + L^* = L^* \Delta + L^* = L^* (\Delta + I_m),
$$

so the columns of $L^*$ are the (unnormalized) eigenvectors of $\Sigma^*$, the weighted covariance matrix.

Also

$$
(\Sigma^* - I_p) L^* = L^* \Delta,
$$

so the columns of $L^*$ are also the eigenvectors of

$$
\Sigma^* - I_p = \Psi^{-1/2} (\Sigma - \Psi) \Psi^{-1/2},
$$

the weighted reduced covariance matrix.

Since the likelihood analysis is transparent to scaling, the weighted reduced correlation matrix gives essentially the same results as the weighted reduced covariance matrix.
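The eigenvector relation for the weighted covariance matrix can be checked numerically in the case $m = 1$, where $\Delta$ is a scalar. The loadings and specific variances below are made-up values chosen so that $\Sigma$ has unit diagonal.

```python
import math

L = [0.9, 0.8, 0.7]       # hypothetical one-factor loadings (m = 1)
psi = [0.19, 0.36, 0.51]  # specific variances, so Sigma has unit diagonal
p = len(L)

# Sigma = L L' + Psi
sigma = [[L[i] * L[k] + (psi[i] if i == k else 0.0) for k in range(p)]
         for i in range(p)]

# Weighted quantities: L* = Psi^{-1/2} L and Sigma* = Psi^{-1/2} Sigma Psi^{-1/2}
L_star = [L[i] / math.sqrt(psi[i]) for i in range(p)]
sigma_star = [[sigma[i][k] / math.sqrt(psi[i] * psi[k]) for k in range(p)]
              for i in range(p)]

# Delta = L' Psi^{-1} L = L*' L*, a scalar when m = 1
delta = sum(x * x for x in L_star)

# Sigma* L* should equal L* (Delta + 1)
lhs = [sum(sigma_star[i][k] * L_star[k] for k in range(p)) for i in range(p)]
rhs = [x * (delta + 1.0) for x in L_star]
```

The vectors `lhs` and `rhs` agree to rounding error, so `L_star` is an eigenvector of `sigma_star` with eigenvalue `delta + 1`.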

## Factor Rotation

In the orthogonal factor model

$$
X - \mu = LF + \epsilon,
$$

factor loadings are not always easily interpreted.

J&W (p. 504): "Ideally, we should like to see a pattern of loadings such that each variable loads highly on a single factor and has small to moderate loadings on the remaining factors."

That is, each row of $L$ should have a single large entry.

Recall from the corresponding equation

$$
\Sigma = LL' + \Psi
$$

that $L$ and $LT$ give the same $\Sigma$ for any orthogonal $T$. We can choose $T$ to make the rotated loadings $LT$ more readily interpreted.

Note that rotation changes neither $\Sigma$ nor $\Psi$, and hence the communalities are also unchanged.

## The Varimax Criterion

Kaiser proposed a criterion that measures interpretability:

- $\hat{L}$ is some set of loadings with communalities $\hat{h}_i^2$, $i = 1, 2, \dots, p$;
- $\hat{L}^*$ is a set of rotated loadings, $\hat{L}^* = \hat{L} T$;
- $\tilde{l}^*_{i,j} = \hat{l}^*_{i,j} / \hat{h}_i$ are scaled loadings;
- the criterion is

$$
V = \frac{1}{p} \sum_{j=1}^{m} \left[ \sum_{i=1}^{p} \tilde{l}^{*4}_{i,j} - \frac{1}{p} \left( \sum_{i=1}^{p} \tilde{l}^{*2}_{i,j} \right)^2 \right].
$$

Note that the term in [ ]s is proportional to the variance of the squared scaled loadings $\tilde{l}^{*2}_{i,j}$ in column $j$.

Making this variance large tends to produce two clusters of scaled loadings, one of small values and one of large values.

So each column of the rotated loading matrix tends to contain:

- a group of large loadings, which identify the variables associated with the factor;
- the remaining loadings, which are small.
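The criterion can be evaluated directly from a loading matrix. The sketch below uses a made-up $4 \times 2$ loading matrix and a crude one-degree grid search over the rotation angle. Varimax implementations in statistical software use an iterative algorithm instead, so this only illustrates what is being maximized.

```python
import math

def varimax_criterion(loadings):
    """Kaiser's V for a p x m loading matrix, using loadings scaled by the
    square roots of the communalities."""
    p, m = len(loadings), len(loadings[0])
    h = [math.sqrt(sum(l * l for l in row)) for row in loadings]
    scaled = [[loadings[i][j] / h[i] for j in range(m)] for i in range(p)]
    v = 0.0
    for j in range(m):
        sq = [scaled[i][j] ** 2 for i in range(p)]
        v += sum(s * s for s in sq) - sum(sq) ** 2 / p
    return v / p

def rotate(loadings, theta):
    """Right-multiply a p x 2 loading matrix by a rotation through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

# Hypothetical unrotated loadings: every variable loads on both factors.
L = [[0.7, 0.5], [0.6, 0.6], [0.6, -0.5], [0.5, -0.6]]

# Crude grid search over rotation angles from 0 to 90 degrees.
angles = [math.radians(d) for d in range(0, 91)]
best = max(angles, key=lambda t: varimax_criterion(rotate(L, t)))
L_rot = rotate(L, best)
```

After rotation each row of `L_rot` has one loading that is large in absolute value and one near zero, while the row sums of squares (the communalities) are the same as for `L`.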

Example: Weekly returns for the 30 Dow Industrials stocks from January, 2005 to March, 2007 (115 returns).

R code to rotate Principal Components 2 to 10:

```r
dowprcomp = prcomp(dow, scale. = TRUE);
dowvmax = varimax(dowprcomp$rotation[, 2:10], normalize = FALSE);
loadings(dowvmax);
```

Note: when R prints the loadings, entries with absolute value below a cutoff (default: 0.1) are printed as blanks, to draw attention to the larger values.

Loadings (printed table; numeric entries not preserved): columns PC2, PC3, PC4, PC5, PC6, PC7, PC8, PC9, PC10; rows AA, AIG, AXP, BA, CAT, C, DD, DIS, GE, GM, HD, HON, HPQ, IBM, INTC, JNJ, JPM, KO, MCD, MMM, MO, MRK, MSFT, PFE, PG, T, UTX, VZ, WMT, XOM.

In proc factor, use `rotate = varimax`; you may also request plots both before (`preplot`) and after (`plot`) rotation.

SAS program and output:

```sas
proc factor data = all method = prinit nfact = 2
            rotate = varimax preplot plot out = stout;
  title 'Method = Iterated Principal Factors with Varimax Rotation';
  var cvx -- xom;
run;
```

## Factor Scores

Interpretation of a factor analysis is usually based on the factor loadings.

Sometimes we need the (estimated) values of the unobserved factors for further analysis: the factor scores.

In Principal Components Analysis, typically the principal components are used, scaled to have variance 1.

In other types of factor analysis, two methods are used.

### Bartlett's Weighted Least Squares

Suppose that in the equation

$$
X - \mu = LF + \epsilon,
$$

$L$ is known.

We can view the equation as a regression of $X$ on $L$, with coefficients $F$ and heteroscedastic errors $\epsilon$ with variance matrix $\Psi$.

This suggests using

$$
\hat{f} = \left(L' \Psi^{-1} L\right)^{-1} L' \Psi^{-1} (x - \mu)
$$

to estimate $F$.

With $L$, $\Psi$, and $\mu$ replaced by estimates, and for the $j$th observation $x_j$, this gives

$$
\hat{f}_j = \left(\hat{L}' \hat{\Psi}^{-1} \hat{L}\right)^{-1} \hat{L}' \hat{\Psi}^{-1} (x_j - \bar{x})
$$

as estimated values of the factors.

The sample mean of the scores is 0.

If the factor loadings are ML estimates, $\hat{L}' \hat{\Psi}^{-1} \hat{L}$ is a diagonal matrix $\hat{\Delta}$, and the sample covariance matrix of the scores is

$$
\frac{n}{n - 1} \left(I + \hat{\Delta}^{-1}\right).
$$

In particular, the sample correlations of the factor scores are zero.

### Regression Method

The second method depends on the normal distribution assumption.

$X$ and $F$ have a joint multivariate normal distribution, so the conditional distribution of $F$ given $X$ is also multivariate normal.

The Best Linear Unbiased Predictor is the conditional mean.

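This leads to

$$
\begin{aligned}
\hat{f}_j &= \hat{L}' \left(\hat{L} \hat{L}' + \hat{\Psi}\right)^{-1} (x_j - \bar{x}) \\
&= \left(I + \hat{L}' \hat{\Psi}^{-1} \hat{L}\right)^{-1} \hat{L}' \hat{\Psi}^{-1} (x_j - \bar{x}).
\end{aligned}
$$

The two methods are related by

$$
\hat{f}_j^{LS} = \left[I + \left(\hat{L}' \hat{\Psi}^{-1} \hat{L}\right)^{-1}\right] \hat{f}_j^{R}.
$$

In proc factor, use `out = <data set name>` on the proc factor statement; proc factor uses the regression method.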
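For a single factor both scoring formulas reduce to scalar arithmetic, which makes the relation between them easy to verify. The loadings, specific variances and centered observation below are made-up values, treated as if they were the estimates.

```python
L = [0.9, 0.8, 0.7]            # hypothetical one-factor loadings (m = 1)
psi = [0.19, 0.36, 0.51]       # hypothetical specific variances
x_centered = [1.2, 0.4, -0.3]  # one observation minus the sample mean
p = len(L)

# Delta = L' Psi^{-1} L and L' Psi^{-1} (x - xbar); both are scalars when m = 1.
delta = sum(L[i] ** 2 / psi[i] for i in range(p))
weighted = sum(L[i] * x_centered[i] / psi[i] for i in range(p))

f_ls = weighted / delta           # Bartlett's weighted least squares score
f_reg = weighted / (1.0 + delta)  # regression-method score

# The regression weights w = (1 + Delta)^{-1} Psi^{-1} L should satisfy
# (L L' + Psi) w = L, which is the same as w' = L' (L L' + Psi)^{-1}.
w = [L[i] / psi[i] / (1.0 + delta) for i in range(p)]
sigma_w = [sum((L[i] * L[k] + (psi[i] if i == k else 0.0)) * w[k]
               for k in range(p)) for i in range(p)]
```

Here `f_ls` equals `(1 + 1 / delta) * f_reg`, so the regression score is the weighted least squares score shrunk toward zero, and `sigma_w` reproduces `L`, confirming that the two expressions for the regression score agree.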

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003

Exploratory Factor Analysis Brian Habing - University of South Carolina - October 15, 2003 FA is not worth the time necessary to understand it and carry it out. -Hills, 1977 Factor analysis should not

### Smith Barney Portfolio Manager Institute Conference

Smith Barney Portfolio Manager Institute Conference Richard E. Cripps, CFA Portfolio Strategy Group March 2006 The EquityCompass is an investment process focused on selecting stocks and managing portfolios

### SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

### Common factor analysis

Common factor analysis This is what people generally mean when they say "factor analysis" This family of techniques uses an estimate of common variance among the original variables to generate the factor

### CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

### Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis

### problem arises when only a non-random sample is available differs from censored regression model in that x i is also unobserved

4 Data Issues 4.1 Truncated Regression population model y i = x i β + ε i, ε i N(0, σ 2 ) given a random sample, {y i, x i } N i=1, then OLS is consistent and efficient problem arises when only a non-random

### Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA

PROC FACTOR: How to Interpret the Output of a Real-World Example Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA ABSTRACT THE METHOD This paper summarizes a real-world example of a factor

### Statistics for Business Decision Making

Statistics for Business Decision Making Faculty of Economics University of Siena 1 / 62 You should be able to: ˆ Summarize and uncover any patterns in a set of multivariate data using the (FM) ˆ Apply

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Overview of Factor Analysis

Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Sections 2.11 and 5.8

Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and

### Read chapter 7 and review lectures 8 and 9 from Econ 104 if you don t remember this stuff.

Here is your teacher waiting for Steve Wynn to come on down so I could explain index options to him. He never showed so I guess that t he will have to download this lecture and figure it out like everyone

### Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

### ADVANCED FORECASTING MODELS USING SAS SOFTWARE

ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 gjha_eco@iari.res.in 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting

### PRINCIPAL COMPONENT ANALYSIS

1 Chapter 1 PRINCIPAL COMPONENT ANALYSIS Introduction: The Basics of Principal Component Analysis........................... 2 A Variable Reduction Procedure.......................................... 2

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

### Lesson 5 Save and Invest: Stocks Owning Part of a Company

Lesson 5 Save and Invest: Stocks Owning Part of a Company Lesson Description This lesson introduces students to information and basic concepts about the stock market. In a bingo game, students become aware

### 5.2 Customers Types for Grocery Shopping Scenario

------------------------------------------------------------------------------------------------------- CHAPTER 5: RESULTS AND ANALYSIS -------------------------------------------------------------------------------------------------------

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Multivariate Analysis of Variance (MANOVA): I. Theory

Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the

### To do a factor analysis, we need to select an extraction method and a rotation method. Hit the Extraction button to specify your extraction method.

Factor Analysis in SPSS To conduct a Factor Analysis, start from the Analyze menu. This procedure is intended to reduce the complexity in a set of data, so we choose Data Reduction from the menu. And the

### 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

Frank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c11 2013/9/9 page 221 le-tex 221 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial

### Linear Classification. Volker Tresp Summer 2015

Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

S&P 500 vs. DJIA Stock Index Futures Spread Trading S&P MidCap 400 vs. S&P SmallCap 600 Second Quarter 2008 2 Contents Introduction S&P 500 vs. DJIA Introduction Index Methodology, Calculations and Weightings

### Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

### 9.2 User s Guide SAS/STAT. The FACTOR Procedure. (Book Excerpt) SAS Documentation

SAS/STAT 9.2 User s Guide The FACTOR Procedure (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.2 User s Guide. The correct bibliographic citation for the complete

### VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

### Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

### Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

### Analysis of Financial Data Using Non-Negative Matrix Factorization

International Mathematical Forum,, 008, no. 8, 8-870 Analysis of Financial Data Using Non-Negative Matrix Factorization Konstantinos Drakakis UCD CASL, University College Dublin Belfield, Dublin, Ireland

### Medical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu

Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?

### A Brief Introduction to SPSS Factor Analysis

A Brief Introduction to SPSS Factor Analysis SPSS has a procedure that conducts exploratory factor analysis. Before launching into a step by step example of how to use this procedure, it is recommended

### Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

### Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

### Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501

PRINCIPAL COMPONENTS ANALYSIS (PCA) Steven M. Ho!and Department of Geology, University of Georgia, Athens, GA 30602-2501 May 2008 Introduction Suppose we had measured two variables, length and width, and

### 3. Regression & Exponential Smoothing

3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a

### Recall the basic property of the transpose (for any A): v A t Aw = v w, v, w R n.

ORTHOGONAL MATRICES Informally, an orthogonal n n matrix is the n-dimensional analogue of the rotation matrices R θ in R 2. When does a linear transformation of R 3 (or R n ) deserve to be called a rotation?

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### Section 5.3. Section 5.3. u m ] l jj. = l jj u j + + l mj u m. v j = [ u 1 u j. l mj

Section 5. l j v j = [ u u j u m ] l jj = l jj u j + + l mj u m. l mj Section 5. 5.. Not orthogonal, the column vectors fail to be perpendicular to each other. 5..2 his matrix is orthogonal. Check that

### Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit A recommended minimal set of fit indices that should be reported and interpreted when reporting the results of SEM analyses:

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### Gamma Distribution Fitting

Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

### 6. Cholesky factorization

6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

### Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

### 1 Short Introduction to Time Series

ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The

### Sales forecasting # 2

Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

### The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression

The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression The SVD is the most generally applicable of the orthogonal-diagonal-orthogonal type matrix decompositions Every

### Penalized regression: Introduction

Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

### State Space Time Series Analysis

State Space Time Series Analysis p. 1 State Space Time Series Analysis Siem Jan Koopman http://staff.feweb.vu.nl/koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2011 State

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### Organization & Analysis of Stock Option Market Data. A Professional Master's Project. Submitted to the Faculty of the WORCESTER

1 Organization & Analysis of Stock Option Market Data A Professional Master's Project Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE In partial fulfillment of the requirements for the

### Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

### HLM software has been one of the leading statistical packages for hierarchical

Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

### Joint models for classification and comparison of mortality in different countries.

Joint models for classification and comparison of mortality in different countries. Viani D. Biatat 1 and Iain D. Currie 1 1 Department of Actuarial Mathematics and Statistics, and the Maxwell Institute

### A Brief Introduction to Factor Analysis

1. Introduction A Brief Introduction to Factor Analysis Factor analysis attempts to represent a set of observed variables X 1, X 2. X n in terms of a number of 'common' factors plus a factor which is unique

### Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

### Statistics 104: Section 6!

Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC

### Degrees of Freedom and Model Search

Degrees of Freedom and Model Search Ryan J. Tibshirani Abstract Degrees of freedom is a fundamental concept in statistical modeling, as it provides a quantitative description of the amount of fitting performed

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### 1. Introduction to multivariate data

. Introduction to multivariate data. Books Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall Krzanowski, W.J. Principles of multivariate analysis. Oxford.000 Johnson,

### Partial Least Squares (PLS) Regression.

Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component

### GLM, insurance pricing & big data: paying attention to convergence issues.

GLM, insurance pricing & big data: paying attention to convergence issues. Michaël NOACK - michael.noack@addactis.com Senior consultant & Manager of ADDACTIS Pricing Copyright 2014 ADDACTIS Worldwide.

### Linear Models and Conjoint Analysis with Nonlinear Spline Transformations

Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Warren F. Kuhfeld Mark Garratt Abstract Many common data analysis models are based on the general linear univariate model, including

### DISCRIMINANT FUNCTION ANALYSIS (DA)

DISCRIMINANT FUNCTION ANALYSIS (DA) John Poulsen and Aaron French Key words: assumptions, further reading, computations, standardized coefficents, structure matrix, tests of signficance Introduction Discriminant

### ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

### APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY

APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY In the previous chapters the budgets of the university have been analyzed using various techniques to understand the

### 2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables)

Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data. 2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables). 3. Univariate and multivariate

### A Multivariate Statistical Analysis of Crime Rate in US Cities

A Multivariate Statistical Analysis of Crime Rate in US Cities. Kendall Williams, Howard University, Washington DC, k_r_williams@howard.edu. Ralph Gedeon, University of Florida, Gainesville, FL, ralphael@ufl.edu. July 2004

### THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok

THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE Alexander Barvinok Papers are available at http://www.math.lsa.umich.edu/~barvinok/papers.html This is a joint work with J.A. Hartigan

### 1 Theory: The General Linear Model

QMIN GLM Theory - 1.1 1 Theory: The General Linear Model 1.1 Introduction Before digital computers, statistics textbooks spoke of three procedures: regression, the analysis of variance (ANOVA), and the

### Bayesian logistic betting strategy against probability forecasting. Akimichi Takemura, Univ. Tokyo. November 12, 2012

Bayesian logistic betting strategy against probability forecasting Akimichi Takemura, Univ. Tokyo (joint with Masayuki Kumon, Jing Li and Kei Takeuchi) November 12, 2012 arxiv:1204.3496. To appear in Stochastic

### Data analysis process

Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis

### 7 Time series analysis

7 Time series analysis In Chapters 16, 17, 33-36 in Zuur, Ieno and Smith (2007), various time series techniques are discussed. Applying these methods in Brodgar is straightforward, and most choices are

### Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

### CS229 Lecture notes. Andrew Ng

CS229 Lecture notes Andrew Ng Part X Factor analysis When we have data x^(i) ∈ R^n that comes from a mixture of several Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually

### The Monte Carlo Framework, Examples from Finance and Generating Correlated Random Variables

Monte Carlo Simulation: IEOR E4703 Fall 2004 © 2004 by Martin Haugh The Monte Carlo Framework, Examples from Finance and Generating Correlated Random Variables 1 The Monte Carlo Framework Suppose we wish

### STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarification of zonation procedure described on pp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarification of zonation procedure described on pp. 238-239 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

### Response variables assume only two values, say Y j = 1 or = 0, called success and failure (spam detection, credit scoring, contracting.

Prof. Dr. J. Franke All of Statistics 1.52 Binary response variables - logistic regression Response variables assume only two values, say Y j = 1 or = 0, called success and failure (spam detection, credit

### Citi Volatility Balanced Beta (VIBE) Equity Eurozone Net Total Return Index Index Methodology. Citi Investment Strategies

Citi Volatility Balanced Beta (VIBE) Equity Eurozone Net Total Return Index Citi Investment Strategies 21 November 2011 Table of Contents Citi Investment Strategies Part A: Introduction 1 Part B: Key Information

### 1 Simple Linear Regression I Least Squares Estimation

Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

### Pattern Recognition and Machine Learning. Chapter 4: Linear Models for Classification

Pattern Recognition and Machine Learning Chapter 4: Linear Models for Classification Representing the target values for classification If there are only two classes, we typically use a single real valued output

### Probabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur

Probabilistic Linear Classification: Logistic Regression Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 18, 2016 Probabilistic Machine Learning (CS772A) Probabilistic Linear Classification:

### Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

### CS 688 Pattern Recognition Lecture 4. Linear Models for Classification

CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(

### ANALYZING INVESTMENT RETURN OF ASSET PORTFOLIOS WITH MULTIVARIATE ORNSTEIN-UHLENBECK PROCESSES

ANALYZING INVESTMENT RETURN OF ASSET PORTFOLIOS WITH MULTIVARIATE ORNSTEIN-UHLENBECK PROCESSES by Xiaofeng Qian Doctor of Philosophy, Boston University, 27 Bachelor of Science, Peking University, 2 a Project

### Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: SIMPLIS Syntax Files

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng LISREL for Windows: SIMPLIS Files Table of contents SIMPLIS SYNTAX FILES... 1 The structure of the SIMPLIS syntax file... 1 \$CLUSTER command... 4

### Exploratory data analysis for microarray data

Exploratory data analysis for microarray data Anja von Heydebreck Max Planck Institute for Molecular Genetics, Dept. Computational Molecular Biology, Berlin, Germany heydebre@molgen.mpg.de Visualization

### 5. Multiple regression

5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

### Manifold Learning Examples PCA, LLE and ISOMAP

Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 2008 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition

### α = u v. In other words, Orthogonal Projection

Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v