# Principal Component Analysis

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Principal Component Analysis ERS70D George Fernandez INTRODUCTION Analysis of multivariate data plays a key role in data analysis. Multivariate data consists of many different attributes or variables recorded for each observation. If there are p variables in a database, each variable could be regarded as constituting a different dimension, in a p-dimensional hyperspace. multi-dimensional hyperspace is often difficult to visualize, and thus the main objectives of unsupervised learning methods are to reduce dimensionality, scoring all observations based on a composite index and clustering similar observations together based on multi-attributes. summarizing multivariate attributes by, two or three that can be displayed graphically with minimal loss of information is useful in knowledge discovery. 2

2 PRINCIPAL COMPONENT ANALYSIS Because it is hard to visualize multi-dimensional space, principal components analysis (PCA), a popular multivariate technique, is mainly used to reduce the dimensionality of p multi-attributes to two or three dimensions. PCA summarizes the variation in a correlated multi-attribute to a set of uncorrelated components, each of which is a particular linear combination of the original variables. The extracted uncorrelated components are called principal components (PC) and are estimated from the eigenvectors of the covariance or correlation matrix of the original variables. Therefore, the objective of PCA is to achieve parsimony and reduce dimensionality by extracting the smallest number components that account for most of the variation in the original multivariate data and to summarize the data with little loss of information. 3 PRINCIPAL COMPONENT ANALYSIS In PCA, uncorrelated PC s are extracted by linear transformations of the original variables so that the first few PC s contain most of the variations in the original dataset. These PCs are extracted in decreasing order of importance so that the first PC accounts for as much of the variation as possible and each successive component accounts for a little less. Following PCA, analyst tries to interpret the first few principal components in terms of the original variables, and thereby have a greater understanding of the data. To reproduce the total system variability of the original p variables, we need all p PCs. However, if the first few PCs account for a large proportion of the variability (80-90%), we have achieved our objective of dimension reduction. Because the first principal component accounts for the co-variation shared by all attributes, this may be a better estimate than simple or weighted averages of the original variables. Thus, PCA can be useful when there is a severe high-degree of correlation present in the multi-attributes. 4 2

3 PRINCIPAL COMPONENT ANALYSIS In PCA, the extractions of PC can be made using either original multivariate datasets or using the covariance or the correlation matrix if the original dataset is not available. In deriving PC, the correlation matrix is commonly used when different variables in the dataset are measured using different units (annual income, educational level, numbers of cars owned per family) or if different variables have different variances. Using the correlation matrix is equivalent to standardizing the variables to zero mean and unit standard deviation. 5 Applications of PCA analysis Hall R.I, Leavitt P.R,Quinlan R., Dixit A.S, Smol, J.P 999 Effects of agriculture, urbanization, and climate on water quality in the northern Great plains. Limnol. Oceanogr. 44(3, part 2) PCA of sample scores and selected species from (a) diatom percent abundances, (b) fossil pigment concentrations, and Chironomid percent abundances. 6 3

4 Applications of PCA analysis PCA2 3.9% S. hantzschii 3-76 F. capun. S. niagare PCA 45.5% A. granulata 7 Correlation matrix Y2 Y4 X4 X8 X X5 Y midrprce <.000 < <.000 <.000 Y ctympg <.000 <.000 <.000 <.000 <.000 X hp <.000 < <.000 <.000 X pcap < <.000 <.000 X width <.000 <.000 <.000 <.000 <.000 X weight <.000 <.000 <.000 <.000 <

5 Eigen value / vector decomposition of a correlation matrics r2 r32 r2 r3 r3 r23 = p j= λjaja j ' 9 PCA TERMINOLOGY Eigenvalues measure the amount of the variation explained by each PC and will be largest for the first PC and smaller for the subsequent PCs. An eigenvalue greater than indicates that PCs account for more variance than accounted by one of the original variables in standardized data. This is commonly used as a cutoff point for which PCs are retained. Eigenvectors provides the weights to compute the uncorrelated PC, which are the linear combination of the centered standardized or centered un-standardized original variables. 0 5

6 Eigenvectors and eigen values Eigenvectors 2 Y2 midrprce Y4 ctympg X4 hp X8 pcap X width X5 weight Eigenvalues of the Correlation Matrix: Total = 6 Eigenvalue Difference Proportion Cumulative PCA PC = p j a jx PC- principal compnent aj= Linear coefficient eigen vectors 2 6

7 Estimating the Number of PC Scree Test: Plotting the eigenvalues against the corresponding PC produces a scree plot that illustrates the rate of change in the magnitude of the eigenvalues for the PC. The rate of decline tends to be fast first then levels off. The elbow, or the point at which the curve bends, is considered to indicate the maximum number of PC to extract. One less PC than the number at the elbow might be appropriate if you are concerned about getting an overly defined solution. 3 cars:scree PLOT and Parallel Analysis - PCA 4 E 3 Eigenvalue 2 0 p pe p E p E E E Number of PC PLOT E E E Eigenvalue p p p ev 4 7

8 PCA TERMINOLOGY PC loadings are correlation coefficients between the PC scores and the original variables. PC loadings measure the importance of each variable in accounting for the variability in the PC. It is possible to interpret the first few PCs in terms of 'overall' effect or a 'contrast' between groups of variables based on the structures of PC loadings. high correlation between PC and a variable indicates that the variable is associated with the direction of the maximum amount of variation in the dataset. More than one variable might have a high correlation with PC. A strong correlation between a variable and PC2 indicates that the variable is responsible for the next largest variation in the data perpendicular to PC, and so on. if a variable does not correlate to any PC, or correlates only with the last PC, or one before the last PC, this usually suggests that the variable has little or no contribution to the variation in the dataset. Therefore, PCA may often indicate which variables in a dataset are important and which ones may be of little consequence. Some of these low-performance variables might therefore be removed from consideration in order to simplify the overall analyses.. 5 PC loadings Factor Pattern Factor Factor2 Y2 midrprce Y4 ctympg X4 hp X8 pcap X width X5 weight

9 PC Scores PC scores are the derived composite scores computed for each observation based on the eigenvectors for each PC. The means of PC scores are equal to zero, as these are the linear combination of the centered variables. These uncorrelated PC scores can be used in subsequent analyses, to check for multivariate normality, to detect multivariate outliers, or as a remedial measure in regression analysis with severe multi-colliniarity. 7 PC Scores Obs ID Factor Factor

10 BI-PLOT DISPLAY OF PCA Bi-plot display is a visualization technique for investigating the inter-relationships between the observations and variables in multivariate data. To display a bi-plot, the data should be considered as a matrix, in which the column represents the variable space while the row represents the observational space. The term bi-plot means it is a plot of two dimensions with the observation and variable spaces plotted simultaneously. In PCA, relationships between PC scores and PCA loadings associated with any two PCs can be illustrated in a bi-plot display. 9 Bi-Plot display cars: Factor method: p - Rotation-none Factor2 score Y X X X Y2 X Factor score 20 0

### Introduction to Principal Components and FactorAnalysis

Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a

### Introduction to Principal Component Analysis: Stock Market Values

Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from

### Factor Analysis. Chapter 420. Introduction

Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.

### Principal Components. Sample StatFolio: pca.sgp

Principal Components Summary The Principal Components procedure is designed to extract k principal components from a set of p quantitative variables X. The principal components are defined as the set of

### Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA

PROC FACTOR: How to Interpret the Output of a Real-World Example Rachel J. Goldberg, Guideline Research/Atlanta, Inc., Duluth, GA ABSTRACT THE METHOD This paper summarizes a real-world example of a factor

### Principal Components Analysis (PCA)

Principal Components Analysis (PCA) Janette Walde janette.walde@uibk.ac.at Department of Statistics University of Innsbruck Outline I Introduction Idea of PCA Principle of the Method Decomposing an Association

### Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

### Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

### What do the dimensions or axes represent? Ordinations

Ordinations Goal order or arrange samples along one or two meaningful dimensions (axes or scales) Ordinations Not specifically designed for hypothesis testing Similar to goals of clustering, so why not

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501

PRINCIPAL COMPONENTS ANALYSIS (PCA) Steven M. Ho!and Department of Geology, University of Georgia, Athens, GA 30602-2501 May 2008 Introduction Suppose we had measured two variables, length and width, and

### Common factor analysis

Common factor analysis This is what people generally mean when they say "factor analysis" This family of techniques uses an estimate of common variance among the original variables to generate the factor

### Multivariate Analysis (Slides 4)

Multivariate Analysis (Slides 4) In today s class we examine examples of principal components analysis. We shall consider a difficulty in applying the analysis and consider a method for resolving this.

### Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov

Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

### Statistics for Business Decision Making

Statistics for Business Decision Making Faculty of Economics University of Siena 1 / 62 You should be able to: ˆ Summarize and uncover any patterns in a set of multivariate data using the (FM) ˆ Apply

### BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

### Principal Component Analysis

Principal Component Analysis Principle Component Analysis: A statistical technique used to examine the interrelations among a set of variables in order to identify the underlying structure of those variables.

### Exploratory data analysis for microarray data

Eploratory data analysis for microarray data Anja von Heydebreck Ma Planck Institute for Molecular Genetics, Dept. Computational Molecular Biology, Berlin, Germany heydebre@molgen.mpg.de Visualization

### Principal Component Analysis Application to images

Principal Component Analysis Application to images Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception http://cmp.felk.cvut.cz/

### FACTOR ANALYSIS. Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables.

FACTOR ANALYSIS Introduction Factor Analysis is similar to PCA in that it is a technique for studying the interrelationships among variables Both methods differ from regression in that they don t have

### The ith principal component (PC) is the line that follows the eigenvector associated with the ith largest eigenvalue.

More Principal Components Summary Principal Components (PCs) are associated with the eigenvectors of either the covariance or correlation matrix of the data. The ith principal component (PC) is the line

### Random Vectors and the Variance Covariance Matrix

Random Vectors and the Variance Covariance Matrix Definition 1. A random vector X is a vector (X 1, X 2,..., X p ) of jointly distributed random variables. As is customary in linear algebra, we will write

### FACTOR ANALYSIS EXPLORATORY APPROACHES. Kristofer Årestedt

FACTOR ANALYSIS EXPLORATORY APPROACHES Kristofer Årestedt 2013-04-28 UNIDIMENSIONALITY Unidimensionality imply that a set of items forming an instrument measure one thing in common Unidimensionality is

### Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree

### Step 5: Conduct Analysis. The CCA Algorithm

Model Parameterization: Step 5: Conduct Analysis P Dropped species with fewer than 5 occurrences P Log-transformed species abundances P Row-normalized species log abundances (chord distance) P Selected

### PCA to Eigenfaces. CS 510 Lecture #16 March 23 th A 9 dimensional PCA example

PCA to Eigenfaces CS 510 Lecture #16 March 23 th 2015 A 9 dimensional PCA example is dark around the edges and bright in the middle. is light with dark vertical bars. is light with dark horizontal bars.

### Canonical Correlation

Chapter 400 Introduction Canonical correlation analysis is the study of the linear relations between two sets of variables. It is the multivariate extension of correlation analysis. Although we will present

### A Introduction to Matrix Algebra and Principal Components Analysis

A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra

### Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

and Principal Components Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings

### Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis

### JORIND 9(2) December, ISSN

AN ANALYSIS OF RURAL POVERTY IN OYO STATE: A PRINCIPAL COMPONENT APPROACH O.I. Osowole Department of Statistics University of Ibadan and Rita Ugbechie Department of Mathematics, College of Education Agbor,

Factor Analysis Advanced Financial Accounting II Åbo Akademi School of Business Factor analysis A statistical method used to describe variability among observed variables in terms of fewer unobserved variables

### Lecture 7: Principal component analysis (PCA) What is PCA? Why use PCA?

Lecture 7: Princial comonent analysis (PCA) Rationale and use of PCA The underlying model (what is a rincial comonent anyway?) Eigenvectors and eigenvalues of the samle covariance matrix revisited! PCA

### Clustering and Data Mining in R

Clustering and Data Mining in R Workshop Supplement Thomas Girke December 10, 2011 Introduction Data Preprocessing Data Transformations Distance Methods Cluster Linkage Hierarchical Clustering Approaches

### Medical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu

Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?

### MULTIVARIATE DATA ANALYSIS WITH PCA, CA AND MS TORSTEN MADSEN 2007

MULTIVARIATE DATA ANALYSIS WITH PCA, CA AND MS TORSTEN MADSEN 2007 Archaeological material that we wish to analyse through formalised methods has to be described prior to analysis in a standardised, formalised

### T-test & factor analysis

Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue

### Multivariate Analysis (Slides 13)

Multivariate Analysis (Slides 13) The final topic we consider is Factor Analysis. A Factor Analysis is a mathematical approach for attempting to explain the correlation between a large set of variables

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Extended control charts

Extended control charts The control chart types listed below are recommended as alternative and additional tools to the Shewhart control charts. When compared with classical charts, they have some advantages

### Multivariate Analysis

Table Of Contents Multivariate Analysis... 1 Overview... 1 Principal Components... 2 Factor Analysis... 5 Cluster Observations... 12 Cluster Variables... 17 Cluster K-Means... 20 Discriminant Analysis...

### 1. The standardised parameters are given below. Remember to use the population rather than sample standard deviation.

Kapitel 5 5.1. 1. The standardised parameters are given below. Remember to use the population rather than sample standard deviation. The graph of cross-validated error versus component number is presented

### Introduction to Matrix Algebra

Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

### Doing Quantitative Research 26E02900, 6 ECTS Lecture 2: Measurement Scales. Olli-Pekka Kauppila Rilana Riikkinen

Doing Quantitative Research 26E02900, 6 ECTS Lecture 2: Measurement Scales Olli-Pekka Kauppila Rilana Riikkinen Learning Objectives 1. Develop the ability to assess a quality of measurement instruments

### A Brief Introduction to Factor Analysis

1. Introduction A Brief Introduction to Factor Analysis Factor analysis attempts to represent a set of observed variables X 1, X 2. X n in terms of a number of 'common' factors plus a factor which is unique

### DISCRIMINANT FUNCTION ANALYSIS (DA)

DISCRIMINANT FUNCTION ANALYSIS (DA) John Poulsen and Aaron French Key words: assumptions, further reading, computations, standardized coefficents, structure matrix, tests of signficance Introduction Discriminant

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

### FACTOR ANALYSIS NASC

FACTOR ANALYSIS NASC Factor Analysis A data reduction technique designed to represent a wide range of attributes on a smaller number of dimensions. Aim is to identify groups of variables which are relatively

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Overview of Factor Analysis

Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,

### Nonlinear Iterative Partial Least Squares Method

Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

### Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

### Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant

Statistical Analysis NBAF-B Metabolomics Masterclass Mark Viant 1. Introduction 2. Univariate analysis Overview of lecture 3. Unsupervised multivariate analysis Principal components analysis (PCA) Interpreting

### 5.2 Customers Types for Grocery Shopping Scenario

------------------------------------------------------------------------------------------------------- CHAPTER 5: RESULTS AND ANALYSIS -------------------------------------------------------------------------------------------------------

### Teaching Multivariate Analysis to Business-Major Students

Teaching Multivariate Analysis to Business-Major Students Wing-Keung Wong and Teck-Wong Soon - Kent Ridge, Singapore 1. Introduction During the last two or three decades, multivariate statistical analysis

### Using Principal Components Analysis in Program Evaluation: Some Practical Considerations

http://evaluation.wmich.edu/jmde/ Articles Using Principal Components Analysis in Program Evaluation: Some Practical Considerations J. Thomas Kellow Assistant Professor of Research and Statistics Mercer

### Feature Extraction and Selection. More Info == Better Performance? Curse of Dimensionality. Feature Space. High-Dimensional Spaces

More Info == Better Performance? Feature Extraction and Selection APR Course, Delft, The Netherlands Marco Loog Feature Space Curse of Dimensionality A p-dimensional space, in which each dimension is a

### 1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 00 Matrix Algebra Hervé Abdi Lynne J. Williams Introduction Sylvester developed the modern concept of matrices in the 9th

### A Solution Manual and Notes for: Exploratory Data Analysis with MATLAB by Wendy L. Martinez and Angel R. Martinez.

A Solution Manual and Notes for: Exploratory Data Analysis with MATLAB by Wendy L. Martinez and Angel R. Martinez. John L. Weatherwax May 7, 9 Introduction Here you ll find various notes and derivations

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Factor Analysis. Sample StatFolio: factor analysis.sgp

STATGRAPHICS Rev. 1/10/005 Factor Analysis Summary The Factor Analysis procedure is designed to extract m common factors from a set of p quantitative variables X. In many situations, a small number of

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### Data analysis process

Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis

### SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 Statistical techniques to be covered Explore relationships among variables Correlation Regression/Multiple regression Logistic regression Factor analysis

### A Study of Internet Routing Stability Using Link Weight

A Study of Internet Routing Stability Using Link Weight Mohit Lad Jong Han Park Tiziana Refice Lixia Zhang ABSTRACT The global Internet routing infrastructure is a large scale distributed system where

### A Survey on Outlier Detection Techniques for Credit Card Fraud Detection

IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud

### Research Methodology: Tools

MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 02: Item Analysis / Scale Analysis / Factor Analysis February 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi

### MULTIVARIATE DATA ANALYSIS i.-*.'.. ' -4

SEVENTH EDITION MULTIVARIATE DATA ANALYSIS i.-*.'.. ' -4 A Global Perspective Joseph F. Hair, Jr. Kennesaw State University William C. Black Louisiana State University Barry J. Babin University of Southern

### A Brief Introduction to SPSS Factor Analysis

A Brief Introduction to SPSS Factor Analysis SPSS has a procedure that conducts exploratory factor analysis. Before launching into a step by step example of how to use this procedure, it is recommended

### Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### A Demonstration of Hierarchical Clustering

Recitation Supplement: Hierarchical Clustering and Principal Component Analysis in SAS November 18, 2002 The Methods In addition to K-means clustering, SAS provides several other types of unsupervised

### Introduction to Multivariate Data Analysis and Visualization By Dmitry Grapov 2012

Introduction to Multivariate Data Analysis and Visualization By Dmitry Grapov 2012 This is an accompanying text to the introductory workshop in multivariate data analysis and visualization, taught at University

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Exploratory Factor Analysis

Introduction Principal components: explain many variables using few new variables. Not many assumptions attached. Exploratory Factor Analysis Exploratory factor analysis: similar idea, but based on model.

### APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

### CLASSIFICATION OF EUROPEAN UNION COUNTRIES FROM DATA MINING POINT OF VIEW, USING SAS ENTERPRISE GUIDE

CLASSIFICATION OF EUROPEAN UNION COUNTRIES FROM DATA MINING POINT OF VIEW, USING SAS ENTERPRISE GUIDE Abstract Ana Maria Mihaela Iordache 1 Ionela Catalina Tudorache 2 Mihai Tiberiu Iordache 3 With the

### Principal Component Analysis (PCA) and Factor Analysis in Oasis montaj. Geosoft Technical Note.

Principal Component Analysis (PCA) and Factor Analysis in Oasis montaj Geosoft Technical Note www.geosoft.com Principal Component and Factor Analysis Principal Component Analysis (PCA) and Factor Analysis

### Examples on Variable Selection in PCA in Sensory Descriptive and Consumer Data

Examples on Variable Selection in PCA in Sensory Descriptive and Consumer Data Per Lea, Frank Westad, Margrethe Hersleth MATFORSK, Ås, Norway Harald Martens KVL, Copenhagen, Denmark 6 th Sensometrics Meeting

### Lecture - 32 Regression Modelling Using SPSS

Applied Multivariate Statistical Modelling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 32 Regression Modelling Using SPSS (Refer

### 4. Matrix Methods for Analysis of Structure in Data Sets:

ATM 552 Notes: Matrix Methods: EOF, SVD, ETC. D.L.Hartmann Page 64 4. Matrix Methods for Analysis of Structure in Data Sets: Empirical Orthogonal Functions, Principal Component Analysis, Singular Value

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### A Comparison of Variable Selection Techniques for Credit Scoring

1 A Comparison of Variable Selection Techniques for Credit Scoring K. Leung and F. Cheong and C. Cheong School of Business Information Technology, RMIT University, Melbourne, Victoria, Australia E-mail:

### Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis

Joint Use of Factor Analysis () and Data Envelopment Analysis () for Ranking of Data Envelopment Analysis Reza Nadimi, Fariborz Jolai Abstract This article combines two techniques: data envelopment analysis

### PRINCIPAL COMPONENTS AND THE MAXIMUM LIKELIHOOD METHODS AS TOOLS TO ANALYZE LARGE DATA WITH A PSYCHOLOGICAL TESTING EXAMPLE

PRINCIPAL COMPONENTS AND THE MAXIMUM LIKELIHOOD METHODS AS TOOLS TO ANALYZE LARGE DATA WITH A PSYCHOLOGICAL TESTING EXAMPLE Markela Muca Llukan Puka Klodiana Bani Department of Mathematics, Faculty of

### Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement

Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive

### The Classical Linear Regression Model

The Classical Linear Regression Model 1 September 2004 A. A brief review of some basic concepts associated with vector random variables Let y denote an n x l vector of random variables, i.e., y = (y 1,

### Lecture 7: Factor Analysis. Laura McAvinue School of Psychology Trinity College Dublin

Lecture 7: Factor Analysis Laura McAvinue School of Psychology Trinity College Dublin The Relationship between Variables Previous lectures Correlation Measure of strength of association between two variables

### Cluster Analysis. Chapter. Chapter Outline. What You Will Learn in This Chapter

5 Chapter Cluster Analysis Chapter Outline Introduction, 210 Business Situation, 211 Model, 212 Distance or Dissimilarities, 213 Combinatorial Searches with K-Means, 216 Statistical Mixture Model with

### GIVING MEANINGFUL INTERPRETATION TO ORDINATION AXES: ASSESSING LOADING SIGNIFICANCE IN PRINCIPAL COMPONENT ANALYSIS

Ecology, 84(9), 2003, pp. 2347 2363 2003 by the Ecological Society of America GIVING MEANINGFUL INTERPRETATION TO ORDINATION AXES: ASSESSING LOADING SIGNIFICANCE IN PRINCIPAL COMPONENT ANALYSIS PEDRO R.

### Volume 2, Issue 9, September 2014 International Journal of Advance Research in Computer Science and Management Studies

Volume 2, Issue 9, September 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com

### Exploratory Factor Analysis of Demographic Characteristics of Antenatal Clinic Attendees and their Association with HIV Risk

Doi:10.5901/mjss.2014.v5n20p303 Abstract Exploratory Factor Analysis of Demographic Characteristics of Antenatal Clinic Attendees and their Association with HIV Risk Wilbert Sibanda Philip D. Pretorius

### Joint models for classification and comparison of mortality in different countries.

Joint models for classification and comparison of mortality in different countries. Viani D. Biatat 1 and Iain D. Currie 1 1 Department of Actuarial Mathematics and Statistics, and the Maxwell Institute

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### 9 Hedging the Risk of an Energy Futures Portfolio UNCORRECTED PROOFS. Carol Alexander 9.1 MAPPING PORTFOLIOS TO CONSTANT MATURITY FUTURES 12 T 1)

Helyette Geman c0.tex V - 0//0 :00 P.M. Page Hedging the Risk of an Energy Futures Portfolio Carol Alexander This chapter considers a hedging problem for a trader in futures on crude oil, heating oil and