Factor Analysis
Malcolm Hall
1 Factor Analysis Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we assume that such latent variables, or factors, exist. NC STATE UNIVERSITY 1 / 38
2 The Orthogonal Factor Model equation:

X_1 - µ_1 = l_{1,1} F_1 + l_{1,2} F_2 + ... + l_{1,m} F_m + ε_1,
X_2 - µ_2 = l_{2,1} F_1 + l_{2,2} F_2 + ... + l_{2,m} F_m + ε_2,
...
X_p - µ_p = l_{p,1} F_1 + l_{p,2} F_2 + ... + l_{p,m} F_m + ε_p,

where: F_1, F_2, ..., F_m are the common factors (latent variables); l_{i,j} is the loading of variable i, X_i, on factor j, F_j; ε_i is a specific factor, affecting only X_i.
3 In matrix form:

X - µ = L F + ε,

where X - µ and ε are (p×1), L is (p×m), and F is (m×1). To make this identifiable, we further assume, with no loss of generality:

E(F) = 0_{m×1}
Cov(F) = I_{m×m}
E(ε) = 0_{p×1}
Cov(ε, F) = 0_{p×m}
4 and with serious loss of generality:

Cov(ε) = Ψ = diag(ψ_1, ψ_2, ..., ψ_p).

In terms of the observable variables X, these assumptions mean that

E(X) = µ,
Cov(X) = Σ = L L' + Ψ,

where L is (p×m) and Ψ is (p×p). Usually X is standardized, so Σ = R. The observable X and the unobservable F are related by Cov(X, F) = L.
5 Some terminology: the (i, i) entry of the matrix equation Σ = LL' + Ψ is

σ_{i,i} = l_{i,1}^2 + l_{i,2}^2 + ... + l_{i,m}^2 + ψ_i = h_i^2 + ψ_i,

where h_i^2 = l_{i,1}^2 + l_{i,2}^2 + ... + l_{i,m}^2 is the i-th communality and ψ_i is the specific variance.

Note that if T is (m×m) orthogonal, then (LT)(LT)' = LL', so loadings LT generate the same Σ as L: loadings are not unique.
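The non-uniqueness of the loadings is easy to verify numerically. A minimal sketch (Python/numpy, with made-up 3-variable, 2-factor loadings rather than anything from the stock example) showing that L and LT reproduce the same Σ and the same communalities:

```python
import numpy as np

# Hypothetical loadings (p = 3 variables, m = 2 factors) and specific variances
L = np.array([[0.9, 0.1],
              [0.8, 0.3],
              [0.2, 0.7]])
Psi = np.diag([0.18, 0.27, 0.47])

Sigma = L @ L.T + Psi

# Any orthogonal T (here a 30-degree rotation) generates the same Sigma
theta = np.pi / 6
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
LT = L @ T

print(np.allclose(LT @ LT.T + Psi, Sigma))                      # True
# Communalities (row sums of squared loadings) are also unchanged
print(np.allclose((L**2).sum(axis=1), (LT**2).sum(axis=1)))     # True
```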
6 Existence of Factor Representation

For any p, every (p×p) Σ can be factorized as Σ = LL' for (p×p) L, which is a factor representation with m = p and Ψ = 0; however, m = p is not much use, since we usually want m ≪ p.

For p = 3, every (3×3) Σ can be represented as Σ = LL' + Ψ for (3×1) L, which is a factor representation with m = 1, but Ψ may have negative elements.
7 In general, we can only approximate Σ by LL' + Ψ.

Principal components method: the spectral decomposition of Σ is

Σ = EΛE' = (EΛ^{1/2})(EΛ^{1/2})' = LL',

with m = p. If λ_1 + λ_2 + ... + λ_m ≫ λ_{m+1} + ... + λ_p, and L_{(m)} is the first m columns of L, then

Σ ≈ L_{(m)} L_{(m)}'

gives such an approximation with Ψ = 0.
8 The remainder term Σ - L_{(m)} L_{(m)}' is non-negative definite, so its diagonal entries are non-negative; we can get a closer approximation as

Σ ≈ L_{(m)} L_{(m)}' + Ψ_{(m)},

where Ψ_{(m)} = diag(Σ - L_{(m)} L_{(m)}').

SAS proc factor program and output:

proc factor data = all method = prin;
   var cvx -- xom;
   title 'Method = Principal Components';

proc factor data = all method = prin nfact = 2 plot;
   var cvx -- xom;
   title 'Method = Principal Components, 2 factors';
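The extraction step can be sketched in a few lines of numpy (Python used here as a neutral illustration; the correlation matrix is made up, not the stock data): eigendecompose, scale the first m eigenvectors by the square roots of their eigenvalues, and take Ψ from the residual diagonal.

```python
import numpy as np

def pc_factor(Sigma, m):
    """Principal-component factor solution: L_(m) = first m eigenvectors
    scaled by sqrt(eigenvalues); Psi_(m) = diag of the residual."""
    lam, E = np.linalg.eigh(Sigma)       # eigh returns ascending eigenvalues
    lam, E = lam[::-1], E[:, ::-1]       # reorder to descending
    L = E[:, :m] * np.sqrt(lam[:m])
    Psi = np.diag(np.diag(Sigma - L @ L.T))
    return L, Psi

# Hypothetical 4x4 correlation matrix
R = np.array([[1.0, 0.6, 0.5, 0.3],
              [0.6, 1.0, 0.4, 0.2],
              [0.5, 0.4, 1.0, 0.3],
              [0.3, 0.2, 0.3, 1.0]])
L, Psi = pc_factor(R, m=1)
# The diagonal of R is reproduced exactly: communality + specific variance
print(np.allclose(np.diag(L @ L.T + Psi), np.diag(R)))  # True
```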
9 Principal Factor Solution

Recall the Orthogonal Factor Model X = LF + ε, which implies Σ = LL' + Ψ.

The m-factor Principal Component solution is to approximate Σ (or, if we standardize the variables, R) by a rank-m matrix using the spectral decomposition

Σ = λ_1 e_1 e_1' + ... + λ_m e_m e_m' + λ_{m+1} e_{m+1} e_{m+1}' + ... + λ_p e_p e_p'.

The first m terms give the best rank-m approximation to Σ.
10 We can sometimes achieve higher communalities (= diag(LL')) by either:
- specifying an initial estimate of the communalities,
- iterating the solution,
or both.

Suppose we are working with R. Given initial communalities h_i^2, form the reduced correlation matrix

R_r = [ h_1^2    r_{1,2}  ...  r_{1,p}
        r_{2,1}  h_2^2    ...  r_{2,p}
        ...
        r_{p,1}  r_{p,2}  ...  h_p^2  ]
11 Now use the spectral decomposition of R_r to find its best rank-m approximation R_r ≈ L_r L_r'. New communalities are

h_i^2 = Σ_{j=1}^m l_{i,j}^2.

Find Ψ by equating the diagonal terms:

ψ_i = 1 - h_i^2, or Ψ = I - diag(L_r L_r').
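One pass of this update can be sketched as follows (Python/numpy illustration; the correlation matrix is hypothetical, and the initial communalities use the SMC formula h_i^2 = 1 - 1/[R^{-1}]_{ii}):

```python
import numpy as np

def principal_factor(R, m, h2):
    """One principal-factor step: put initial communalities h2 on the
    diagonal of R, take the best rank-m approximation, and read off
    new communalities and Psi."""
    Rr = R.copy()
    np.fill_diagonal(Rr, h2)                 # reduced correlation matrix R_r
    lam, E = np.linalg.eigh(Rr)
    lam, E = lam[::-1], E[:, ::-1]           # descending eigenvalues
    Lr = E[:, :m] * np.sqrt(np.maximum(lam[:m], 0))
    h2_new = (Lr**2).sum(axis=1)             # row sums of squared loadings
    Psi = np.diag(1 - h2_new)                # equate diagonal terms
    return Lr, h2_new, Psi

# Hypothetical correlation matrix; priors from squared multiple correlations
R = np.array([[1.0, 0.6, 0.5, 0.3],
              [0.6, 1.0, 0.4, 0.2],
              [0.5, 0.4, 1.0, 0.3],
              [0.3, 0.2, 0.3, 1.0]])
smc = 1 - 1 / np.diag(np.linalg.inv(R))     # priors = SMC
Lr, h2_new, Psi = principal_factor(R, m=1, h2=smc)
```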
12 This is the Principal Factor solution. The Principal Component solution is the special case where the initial communalities are all 1.

In proc factor, use method = prin as for the Principal Component solution, but also specify the initial communalities:
- the priors = ... option on the proc factor statement specifies a method, such as squared multiple correlations (priors = SMC);
- the priors statement provides explicit numerical values.
13 SAS program and output:

proc factor data = all method = prin priors = smc;
   title 'Method = Principal Factors';
   var cvx -- xom;

In this case, the communalities are smaller than for the Principal Component solution.
14 Other choices for the priors option include:
- MAX: maximum absolute correlation with any other variable;
- ASMC: Adjusted SMC (adjusted to make their sum equal to the sum of the maximum absolute correlations);
- ONE: 1;
- RANDOM: uniform on (0, 1).
15 Iterated Principal Factors

One issue with both Principal Components and Principal Factors: even if S or R is exactly of the form LL' + Ψ (or, more likely, approximately of that form), neither method recovers L and Ψ (unless you specify the true communalities).

Solution: iterate! Use the new communalities as initial communalities to get another set of Principal Factors. Repeat until nothing much changes.
16 In proc factor, use method = prinit; you may also specify the initial communalities (default = ONE).

SAS program and output:

proc factor data = all method = prinit;
   title 'Method = Iterated Principal Factors';
   var cvx -- xom;

The communalities are still smaller than for the Principal Component solution, but larger than for Principal Factors.
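The iteration is a short loop around the principal-factor step. A sketch (Python/numpy, with a made-up one-factor example; tolerance and iteration cap are arbitrary choices, not proc factor's defaults) showing that when R is exactly of the form LL' + Ψ the iteration does recover the true communalities:

```python
import numpy as np

def iterated_principal_factor(R, m, h2=None, tol=1e-8, max_iter=500):
    """Iterated principal factors (cf. method = prinit): repeat the
    principal-factor step, feeding the new communalities back in."""
    h2 = np.ones(R.shape[0]) if h2 is None else h2.copy()  # default priors = ONE
    for _ in range(max_iter):
        Rr = R.copy()
        np.fill_diagonal(Rr, h2)
        lam, E = np.linalg.eigh(Rr)
        lam, E = lam[::-1], E[:, ::-1]
        L = E[:, :m] * np.sqrt(np.maximum(lam[:m], 0))
        h2_new = (L**2).sum(axis=1)
        if np.max(np.abs(h2_new - h2)) < tol:
            break
        h2 = h2_new
    return L, h2_new

# Exact one-factor structure: R = L0 L0' + Psi0
L0 = np.array([[0.9], [0.7], [0.5]])
Psi0 = np.diag(1 - (L0**2).ravel())
R = L0 @ L0.T + Psi0
L, h2 = iterated_principal_factor(R, m=1)
print(np.round(h2, 3))  # close to the true communalities [0.81, 0.49, 0.25]
```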
17 Likelihood Methods

If we assume that X ~ N_p(µ, Σ) with Σ = LL' + Ψ, we can fit by maximum likelihood:
- µ̂ = x̄;
- L is not identified without a constraint (uniqueness condition) such as L'Ψ^{-1}L = diagonal;
- still no closed form equation for L̂; numerical optimization is required.
18 We can also test hypotheses about m with the likelihood ratio test (Bartlett's correction improves the χ² approximation):

H_0: m = m_0;  H_A: m > m_0;

-2 log(likelihood ratio) ~ χ² with (1/2)[(p - m_0)² - p - m_0] degrees of freedom.

Degrees of freedom > 0 ⟺ m_0 < (1/2)(2p + 1 - √(8p + 1)).

E.g. for p = 5, m_0 < 2.30, so m_0 ≤ 2:

m_0   degrees of freedom
 1            5
 2            1
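The degrees-of-freedom formula and the bound on m_0 above are easy to evaluate; a small Python sketch reproducing the p = 5 numbers:

```python
import math

def lrt_df(p, m0):
    """Degrees of freedom for the LRT of H0: m = m0 in a p-variable
    factor model: [(p - m0)^2 - p - m0] / 2."""
    return ((p - m0)**2 - p - m0) / 2

p = 5
# df > 0 requires m0 < (2p + 1 - sqrt(8p + 1)) / 2
bound = (2 * p + 1 - math.sqrt(8 * p + 1)) / 2
print(round(bound, 2))                 # 2.3, so m0 can be 1 or 2
print(lrt_df(5, 1), lrt_df(5, 2))      # 5.0 1.0
```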
19 In proc factor, use method = ml; you may also specify the initial communalities (default = SMC).

SAS program and output:

proc factor data = all method = ml;
   var cvx -- xom;
   title 'Method = Maximum Likelihood';

proc factor data = all method = ml heywood plot;
   var cvx -- xom;
   title 'Method = Maximum Likelihood with Heywood fixup';

proc factor data = all method = ml ultraheywood plot;
   var cvx -- xom;
   title 'Method = Maximum Likelihood with Ultra-Heywood fixup';
20 Note that the iteration can produce communalities > 1! Two fixes:
- the Heywood option on the proc factor statement caps the communalities at 1;
- the UltraHeywood option on the proc factor statement allows the iteration to continue with communalities > 1.
21 Scaling and the Likelihood

If the maximum likelihood estimates for an (n×p) data matrix X are L̂ and Ψ̂, and Y = X D is a scaled data matrix, with the columns of X scaled by the entries of the (p×p) diagonal matrix D, then the maximum likelihood estimates for Y are DL̂ and D²Ψ̂. That is, the mle's are equivariant under scaling:

Σ̂_Y = D Σ̂_X D.
22 Proof: L_Y(µ, Σ) = L_X(D^{-1}µ, D^{-1}ΣD^{-1}).

Consequence: no distinction between covariance and correlation matrices.
23 Weighting and the Likelihood

Recall the uniqueness condition L'Ψ^{-1}L = Δ, Δ diagonal. Write

Σ* = Ψ^{-1/2} Σ Ψ^{-1/2}
   = Ψ^{-1/2}(LL' + Ψ)Ψ^{-1/2}
   = (Ψ^{-1/2}L)(Ψ^{-1/2}L)' + I_p
   = L* L*' + I_p.

Σ* is the weighted covariance matrix.
24 Here L* = Ψ^{-1/2}L and L*'L* = L'Ψ^{-1}L = Δ. Note:

Σ* L* = (L* L*' + I_p) L* = L* Δ + L* = L* (Δ + I_m),

so the columns of L* are the (unnormalized) eigenvectors of Σ*, the weighted covariance matrix.
25 Also

(Σ* - I_p) L* = L* Δ,

so the columns of L* are also the eigenvectors of

Σ* - I_p = Ψ^{-1/2}(Σ - Ψ)Ψ^{-1/2},

the weighted reduced covariance matrix.

Since the likelihood analysis is transparent to scaling, the weighted reduced correlation matrix gives essentially the same results as the weighted reduced covariance matrix.
26 Factor Rotation

In the orthogonal factor model X - µ = LF + ε, factor loadings are not always easily interpreted. J&W (p 504): "Ideally, we should like to see a pattern of loadings such that each variable loads highly on a single factor and has small to moderate loadings on the remaining factors." That is, each row of L should have a single large entry.
27 Recall from the corresponding equation Σ = LL' + Ψ that L and LT give the same Σ for any orthogonal T. We can choose T to make the rotated loadings LT more readily interpreted.

Note that rotation changes neither Σ nor Ψ, and hence the communalities are also unchanged.
28 The Varimax Criterion

Kaiser proposed a criterion that measures interpretability:
- L̂ is some set of loadings with communalities ĥ_i^2, i = 1, 2, ..., p;
- L̂* is a set of rotated loadings, L̂* = L̂T;
- l̃*_{i,j} = l̂*_{i,j}/ĥ_i are scaled loadings;
- the criterion is

V = (1/p) Σ_{j=1}^m [ Σ_{i=1}^p (l̃*_{i,j})^4 - (1/p)(Σ_{i=1}^p (l̃*_{i,j})²)² ].
29 Note that the term in [ ]'s is p times the variance of the (l̃*_{i,j})² in column j. Making this variance large tends to produce two clusters of scaled loadings, one of small values and one of large values. So each column of the rotated loading matrix tends to contain:
- a group of large loadings, which identify the variables associated with the factor;
- the remaining loadings, which are small.
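The clustering effect can be seen directly by evaluating V on loadings with simple structure versus the same loadings rotated away from it (a Python/numpy sketch with made-up loadings; communalities, and hence the scaling, are unchanged by the rotation):

```python
import numpy as np

def varimax_criterion(L, h2):
    """Kaiser's raw varimax criterion V: for each column of the scaled
    loadings, sum of 4th powers minus (sum of squares)^2 / p, averaged
    over columns."""
    p = L.shape[0]
    Ls = L / np.sqrt(h2)[:, None]            # scale row i by sqrt(communality)
    s2 = Ls**2
    return ((s2**2).sum(axis=0) - s2.sum(axis=0)**2 / p).sum() / p

# Simple structure: each variable loads on exactly one factor
L_simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
h2 = (L_simple**2).sum(axis=1)
c = np.sqrt(0.5)
T = np.array([[c, -c], [c, c]])              # 45-degree rotation
L_mixed = L_simple @ T

print(varimax_criterion(L_simple, h2))       # approximately 0.5
print(varimax_criterion(L_mixed, h2))        # approximately 0.0
```

Simple structure maximizes the criterion here; the 45-degree rotation spreads every variable evenly over both factors and drives V to zero.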
30 Example: Weekly returns for the 30 Dow Industrials stocks from January, 2005 to March, 2007 (115 returns).

R code to rotate Principal Components 2-10:

dowprcomp = prcomp(dow, scale. = TRUE)
dowvmax = varimax(dowprcomp$rotation[, 2:10], normalize = FALSE)
loadings(dowvmax)

Note: when R prints the loadings, entries with absolute value below a cutoff (default: 0.1) are printed as blanks, to draw attention to the larger values.
31-32 [Rotated loadings table, PC2-PC10, for the 30 Dow tickers: AA, AIG, AXP, BA, CAT, C, DD, DIS, GE, GM, HD, HON, HPQ, IBM, INTC, JNJ, JPM, KO, MCD, MMM, MO, MRK, MSFT, PFE, PG, T, UTX, VZ, WMT, XOM. The numeric entries did not survive transcription.]
33 In proc factor, use rotate = varimax; you may also request plots both before (preplot) and after (plot) rotation.

SAS program and output:

proc factor data = all method = prinit nfact = 2 rotate = varimax
            preplot plot out = stout;
   title 'Method = Iterated Principal Factors with Varimax Rotation';
   var cvx -- xom;
34 Factor Scores

Interpretation of a factor analysis is usually based on the factor loadings. Sometimes we need the (estimated) values of the unobserved factors for further analysis: the factor scores.

In Principal Components Analysis, typically the principal components are used, scaled to have variance 1. In other types of factor analysis, two methods are used.
35 Bartlett's Weighted Least Squares

Suppose that in the equation X - µ = LF + ε, L is known. We can view the equation as a regression of X on L, with coefficients F and heteroscedastic errors ε with variance matrix Ψ. This suggests using

f̂ = (L'Ψ^{-1}L)^{-1} L'Ψ^{-1}(x - µ)

to estimate F.
36 With L, Ψ, and µ replaced by estimates, and for the j-th observation x_j, this gives

f̂_j = (L̂'Ψ̂^{-1}L̂)^{-1} L̂'Ψ̂^{-1}(x_j - x̄)

as estimated values of the factors.

The sample mean of the scores is 0. If the factor loadings are ML estimates, L̂'Ψ̂^{-1}L̂ is a diagonal matrix Δ̂, and the sample covariance matrix of the scores is

(n/(n-1)) (I + Δ̂^{-1}).

In particular, the sample correlations of the factor scores are zero.
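A minimal numpy sketch of Bartlett's weighted-least-squares scores (the loadings, specific variances, and simulated data are hypothetical). Because the scores are a linear map of the centered data, their sample mean is exactly zero:

```python
import numpy as np

def bartlett_scores(X, L, Psi):
    """Bartlett's WLS factor scores:
    f_j = (L' Psi^-1 L)^-1 L' Psi^-1 (x_j - xbar)."""
    Xc = X - X.mean(axis=0)
    Pinv = np.linalg.inv(Psi)
    A = np.linalg.inv(L.T @ Pinv @ L) @ L.T @ Pinv
    return Xc @ A.T

# Hypothetical loadings and specific variances; simulate n = 200 observations
rng = np.random.default_rng(1)
L = np.array([[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.5, 0.5]])
Psi = np.diag(1 - (L**2).sum(axis=1))
X = rng.standard_normal((200, 2)) @ L.T + rng.standard_normal((200, 4)) @ np.sqrt(Psi)

scores = bartlett_scores(X, L, Psi)
print(np.allclose(scores.mean(axis=0), 0))  # True: sample mean of scores is 0
```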
37 Regression Method

The second method depends on the normal distribution assumption:
- X and F have a joint multivariate normal distribution;
- the conditional distribution of F given X is also multivariate normal;
- the Best Linear Unbiased Predictor is the conditional mean.
38 This leads to

f̂_j = L̂'(L̂L̂' + Ψ̂)^{-1}(x_j - x̄)
    = (I + L̂'Ψ̂^{-1}L̂)^{-1} L̂'Ψ̂^{-1}(x_j - x̄).

The two methods are related by

f̂_j^{LS} = [I + (L̂'Ψ̂^{-1}L̂)^{-1}] f̂_j^{R}.

In proc factor, use out = <data set name> on the proc factor statement; proc factor uses the regression method.
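The relation between the two kinds of scores holds as a matrix identity for any loadings (a consequence of the Woodbury identity), and can be checked numerically with hypothetical loadings and simulated data (Python/numpy sketch):

```python
import numpy as np

def regression_scores(X, L, Psi):
    """Regression-method scores: f_j = L'(LL' + Psi)^-1 (x_j - xbar)."""
    Xc = X - X.mean(axis=0)
    A = L.T @ np.linalg.inv(L @ L.T + Psi)
    return Xc @ A.T

def bartlett_scores(X, L, Psi):
    """Bartlett's WLS scores: f_j = (L'Psi^-1 L)^-1 L'Psi^-1 (x_j - xbar)."""
    Xc = X - X.mean(axis=0)
    Pinv = np.linalg.inv(Psi)
    A = np.linalg.inv(L.T @ Pinv @ L) @ L.T @ Pinv
    return Xc @ A.T

rng = np.random.default_rng(0)
L = np.array([[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.5, 0.5]])
Psi = np.diag(1 - (L**2).sum(axis=1))
X = rng.standard_normal((50, 2)) @ L.T + rng.standard_normal((50, 4)) @ np.sqrt(Psi)

Delta = L.T @ np.linalg.inv(Psi) @ L
f_R = regression_scores(X, L, Psi)
f_LS = bartlett_scores(X, L, Psi)
# f_LS = [I + Delta^-1] f_R, observation by observation
print(np.allclose(f_LS, f_R @ (np.eye(2) + np.linalg.inv(Delta)).T))  # True
```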
More informationMultivariate normal distribution and testing for means (see MKB Ch 3)
Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................
More informationLinear Models for Continuous Data
Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear
More informationIntroduction to Principal Component Analysis: Stock Market Values
Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from
More informationNotes on Symmetric Matrices
CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.
More informationTopic 10: Factor Analysis
Topic 10: Factor Analysis Introduction Factor analysis is a statistical method used to describe variability among observed variables in terms of a potentially lower number of unobserved variables called
More information5. Orthogonal matrices
L Vandenberghe EE133A (Spring 2016) 5 Orthogonal matrices matrices with orthonormal columns orthogonal matrices tall matrices with orthonormal columns complex matrices with orthonormal columns 5-1 Orthonormal
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationLeast-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationEigenvalues, Eigenvectors, Matrix Factoring, and Principal Components
Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they
More information5.2 Customers Types for Grocery Shopping Scenario
------------------------------------------------------------------------------------------------------- CHAPTER 5: RESULTS AND ANALYSIS -------------------------------------------------------------------------------------------------------
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationLesson 5 Save and Invest: Stocks Owning Part of a Company
Lesson 5 Save and Invest: Stocks Owning Part of a Company Lesson Description This lesson introduces students to information and basic concepts about the stock market. In a bingo game, students become aware
More informationTo do a factor analysis, we need to select an extraction method and a rotation method. Hit the Extraction button to specify your extraction method.
Factor Analysis in SPSS To conduct a Factor Analysis, start from the Analyze menu. This procedure is intended to reduce the complexity in a set of data, so we choose Data Reduction from the menu. And the
More informationMultivariate Analysis of Variance (MANOVA): I. Theory
Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationIntroduction to Fixed Effects Methods
Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed
More informationVariance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers
Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications
More information5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
More informationSIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.
SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation
More informationLinear Programming. March 14, 2014
Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1
More information10.2 ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS. The Jacobi Method
578 CHAPTER 1 NUMERICAL METHODS 1. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS As a numerical technique, Gaussian elimination is rather unusual because it is direct. That is, a solution is obtained after
More informationEstimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
More information