# Using Principal Component Analysis in Loan Granting


## Transcription

BULETINUL Universităţii Petrol Gaze din Ploieşti, Vol. LXII No. 1/, Seria Matematică - Informatică - Fizică

Irina Ioniţă, Daniela Şchiopu

Petroleum-Gas University of Ploiesti, Informatics Department, Ploieşti, Romania

Abstract

This paper describes the utility of Principal Component Analysis (PCA) in the banking domain, more exactly in the consumer lending problem. PCA is a powerful tool for analyzing high-dimensional data. When an applicant requests a loan for personal needs, a credit officer collects data from the applicant and computes a score. The factors analyzed can be significant as well as insignificant. PCA can help in this case to extract those factors that produce a better credit scoring model. The data set used for the analysis is provided by a public database containing credit data from a German bank. The results emphasize the utility of PCA in the banking sector for reducing the dimension of the data without much loss of information.

Keywords: principal component analysis, consumer lending, credit scoring model, banking domain

Introduction

In the banking domain, knowing which decisions are best is a permanent concern for managers. The credit department is an active banking area with higher risk. Here, credit officers analyze the customers' credit application forms and calculate a score. The factors considered can influence the credit scoring model to a greater or lesser extent, and identifying the factors with higher significance is not a simple task. Principal Component Analysis (PCA) is a powerful tool for analyzing data by reducing the number of dimensions without important loss of information, and it has been applied to datasets across scientific domains [16, 17].

PCA is also known as an unsupervised dimensionality reduction technique that transforms the data linearly and projects the original data onto a new set of parameters called factors (further on, we will use the term factor with the meaning of principal component), while retaining as much as possible of the variation present in the data set. In this paper we discuss the use of PCA in the credit approval problem, considering a set of records provided by a German bank [24]. The results indicate the utility of eliminating variables with a minimal influence on the credit scoring model in order to make better decisions in consumer loan granting. The instrument used to apply the PCA technique was SPSS [25]. The paper contains a section with the theoretical background of PCA, a section on the application of PCA in the banking domain and a final section presenting a case study.

Principal Component Analysis

PCA is considered the oldest technique in multivariate analysis. It was first introduced by Pearson in 1901 and underwent several modifications until it was generalized by Loeve in 1963 [21]. PCA is a method that reduces the dimensionality of a dataset by finding a new set of variables, smaller than the original set of variables [15]. This efficient reduction of the number of variables is achieved by obtaining orthogonal linear combinations of the original variables, the so-called Principal Components (PCs) [12]. PCA is useful for compressing data and for finding patterns in high-dimensional data.

PCA and Factor Analysis (FA) are both methods for data reduction. FA analyzes only the variance shared among the variables, while PCA analyzes all of the variance. Concepts such as eigenvalues, eigenvectors, loadings and scores are characteristic of these statistical methods. The main steps of the PCA algorithm are shown in Figure 1, adapted from [18].

Fig. 1. Principal Components Analysis steps

The mathematical equations for PCA are presented below. We consider a set of n observations on a vector of p variables, organized in a matrix X (n x p):

$$\{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^p. \tag{1}$$

The PCA method finds p artificial variables (principal components). Each principal component is a linear combination of the columns of the matrix X, in which the weights are the elements of an eigenvector of the data covariance matrix or, provided the data are centered and standardized, of the correlation matrix [7]. The principal components are uncorrelated.
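The steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the SPSS procedure used in the paper: the function name `pca_correlation` and the synthetic data `X_demo` are assumptions introduced here, and the eigen-decomposition is applied to the correlation matrix of centered and standardized data.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (n observations x p variables)."""
    X = np.asarray(X, dtype=float)
    # Center and standardize each variable (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the original variables.
    R = (Z.T @ Z) / (Z.shape[0] - 1)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    # Reorder from the largest to the smallest eigenvalue.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Scores: each column is one principal component evaluated on the data.
    scores = Z @ eigenvectors
    return eigenvalues, eigenvectors, scores

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
X_demo[:, 1] += 0.8 * X_demo[:, 0]  # make two variables correlated
eigenvalues, eigenvectors, scores = pca_correlation(X_demo)
```

For standardized data the eigenvalues sum to the number of variables, and the covariance matrix of the scores is diagonal, which is the "uncorrelated components" property stated above.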

The first principal component of the set, obtained by the linear transformation, is:

$$z_1 = a_1^T x_j = \sum_{i=1}^{p} a_{i1} x_{ij}, \quad j = 1, \ldots, n. \tag{2}$$

In equation (2), the vectors $a_1$ and $x_j$ are:

$$a_1 = (a_{11}, a_{21}, \ldots, a_{p1}) \tag{3}$$

$$x_j = (x_{1j}, x_{2j}, \ldots, x_{pj}). \tag{4}$$

The weight vector $a_1$ is chosen such that the variance of $z_1$ is maximal, under the usual normalization $a_1^T a_1 = 1$. All principal components start at the origin of the coordinate axes. The first PC is the direction of maximum variance from the origin, while subsequent PCs are orthogonal to the first PC and describe the maximum residual variance. For example, when we work with two dimensions, we have the situation depicted in Figure 2, adapted from [23].

Fig. 2. Principal Components

In the next section, we survey various applications of the PCA method and consider the use of this data reduction method in the banking sector.

PCA and Banking Domain

PCA is applied in various domains such as medicine [11], face detection and recognition [3], signal processing [5], banking [1] etc. As discussed earlier in this paper, PCA is an effective transformation method for reducing a large number of correlated variables in situations in which variable selection is hard to achieve. The result of PCA is a set of new uncorrelated variables that can be used directly by credit scoring techniques.

Continuous changes in the banking world require remodeling strategies, adapting to new financial trends and managing knowledge. A bank wants to keep its balance on the market and to obtain benefits with minimum costs. Managers have to find better ways of making decisions in order to increase their credibility and to place their institution at the top. The basic objectives of bank management have focused on the need to balance liquidity, assets, credit and interest rate risks in order to minimize the risk of bankruptcy. Analyzing the factors that may affect these risks is an important job for managers.
An example of factor analysis (a method similar to PCA) applied in the banking domain to identify risk exposure is presented in [14]. The results of this analysis indicated that liquidity and interest, domestic market, international market, business operation and credit are the factors affecting banks' risk exposure. Managers have to consider these factors when formulating the risk management strategy in order to avoid any situation of bankruptcy.

The credit department is confronted with various problems in the loan granting process. When a customer applies for a loan, the credit officer requests some financial and nonfinancial data and calculates a score. If the customer obtains a good score, the file with the necessary documents is analyzed and sent to the Central Bank. Other verification procedures are then applied (for example, the analysis of the customer's credit history by the Credit Bureau). An affirmative or negative response is sent back to the credit officer and the customer is informed. In the favorable case, after the contract is signed, the bank credits the customer's account with the value of the loan granted.

The credit score system used today was designed to provide lenders with financial profiles of consumers who wished to borrow money. The lenders' biggest concern was whether or not an individual had the ability to repay a loan on the established terms, and what percentage of risk might be involved [6, 8, 10]. A credit score is calculated by a mathematical equation that evaluates many types of information found in a consumer's credit file (duration of loan, loan amount, number of years in service, number of years of residence, marital status, education level etc.) [2, 4, 9]. By comparing this information with the repayment patterns stored in hundreds of thousands of consumers' past credit reports, the score identifies the lender's level of future credit risk. For example, the FICO credit score model takes into consideration five factors to create a model for credit scoring [19]:

- Payment history (35% significance);
- Outstanding credit balances (30% significance);
- Credit history (15% significance);
- Type of credit (10% significance);
- Inquiries (10% significance).
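To make the role of the five percentages concrete, the toy sketch below combines five sub-scores into one weighted rating. It is purely illustrative and is not the FICO formula: only the weights come from the list above, while the function `weighted_rating`, the 0 to 1 scale of the sub-scores and the example applicant are assumptions introduced here.

```python
# Weights as listed above for the FICO model [19].
FICO_WEIGHTS = {
    "payment_history": 0.35,
    "outstanding_balances": 0.30,
    "credit_history": 0.15,
    "type_of_credit": 0.10,
    "inquiries": 0.10,
}

def weighted_rating(sub_scores, weights=FICO_WEIGHTS):
    """Weighted average of sub-scores in [0, 1] (hypothetical scale)."""
    missing = set(weights) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(weights[name] * sub_scores[name] for name in weights)

# Hypothetical applicant, invented for illustration.
example_applicant = {
    "payment_history": 0.9,
    "outstanding_balances": 0.5,
    "credit_history": 0.8,
    "type_of_credit": 1.0,
    "inquiries": 0.6,
}
rating = weighted_rating(example_applicant)
```

The point of the sketch is only that a factor with a 35% weight moves the result three and a half times as much as a factor with a 10% weight for the same change in its sub-score.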
A FICO score can range between 300 and 850 and is a measure of client creditworthiness. Most credit bureau scores used in the U.S. are produced by Fair Isaac and Company, or FICO. FICO scores are provided to lenders by the three major credit reporting agencies: Equifax, Experian and TransUnion [20]. The FICO score is considered an efficient predictive scoring model designed to evaluate customer credit risk [19]. This credit scoring model has been widely developed and is used by many credit bureaus around the world [19], in Asia/Pacific Rim (including South Korea, Singapore, India, Taiwan and Thailand), Europe/Middle East (including Ireland, Poland, Sweden, Saudi Arabia and Turkey) and Latin America (including Brazil, Mexico, Peru and Panama). The FICO score has also been introduced in Romania.

In order to develop a credit scoring system that assists credit officers in their decision process, we formulate the hypothesis of using PCA to identify the variables with a minimal effect on the computed credit score and to eliminate them from the scoring model. The next section presents our case study. The software package used in this case is SPSS [25].

Case Study

The dataset used in our case study was obtained from a public database that contains credit data of a German bank [24]. We chose 500 records from this database and organized them in a table in SPSS [25]. In Table 1 we present the first five rows. The table contains information about the duration of the credit, credit history, purpose of the loan, credit amount, savings account, years employed, payment rate, personal status, residency, property, age, housing, number of credits at the bank, job, dependents and credit approval (the target variable). The first 15 variables will be denoted further on by V1 to V15 and the target variable by VT.

Table 1. Credit Data (columns: V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11, V12, V13, V14, V15, VT)

The criteria we take into account for the number of retained factors are: the cumulative percentage of variation explained by the retained factors should be higher than 50%, and the variance (eigenvalue) of each retained factor should be higher than 1.

First, PCA summarizes the pattern of intercorrelations between variables. The variables that are highly correlated with one another are grouped together into factors. The correlation matrix for the first 13 variables is shown in Table 2. A value of 0 for the correlation coefficient indicates the absence of a statistical linkage between variables. We note that only the V5 variable is not well correlated with any of the others.

Table 2. The Correlation Matrix

| | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | V11 | V12 | V13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1 | | | | | | | | | | | | |
| V2 | .103 | 1 | | | | | | | | | | | |
| V3 | .108 | .149 | 1 | | | | | | | | | | |
| V4 | .611 | .115 | .201 | 1 | | | | | | | | | |
| V5 | -.058 | -.054 | .006 | -.084 | 1 | | | | | | | | |
| V6 | .011 | .122 | -.019 | -.074 | .035 | 1 | | | | | | | |
| V7 | .057 | .087 | -.063 | -.288 | .007 | .136 | 1 | | | | | | |
| V8 | -.053 | -.049 | .060 | -.025 | .075 | .091 | .033 | 1 | | | | | |
| V9 | .052 | .104 | .121 | .024 | .011 | .220 | .030 | -.112 | 1 | | | | |
| V10 | -.188 | -.096 | -.069 | -.247 | -.049 | -.094 | -.024 | .002 | -.103 | 1 | | | |
| V11 | -.036 | .198 | .104 | .021 | .003 | .270 | .124 | .038 | .327 | -.130 | 1 | | |
| V12 | -.141 | -.091 | -.067 | -.136 | .058 | -.095 | -.079 | -.036 | -.046 | .318 | -.342 | 1 | |
| V13 | -.031 | .452 | .131 | .016 | .015 | .099 | .043 | -.046 | .064 | -.022 | .149 | -.090 | 1 |

KMO (Kaiser-Meyer-Olkin) is a statistical test that indicates the degree of association of the variables [16]. In our case, the value of the KMO test is an argument in favor of the existence of factors, suggesting that factoring is appropriate. The communality of a given variable can be interpreted as the proportion of variation in that variable that is explained by the retained factors. A factor loading represents the correlation between a variable and a factor that has been extracted from the data. The communality of the variable $V_i$ ($i = 1, \ldots, 15$) is computed as the sum of the squared loadings for that variable.
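As a small worked check of this definition, the sketch below sums the squared loadings of two variables. The loadings are the rows for V1 and V4 copied from Table 5 (Rotated Component Matrix); the helper name `communalities` is an assumption introduced here. Because the loadings are rounded to three decimals, the results can differ from Table 3 in the last digit.

```python
import numpy as np

def communalities(loadings):
    """Sum of squared loadings per variable (one row per variable)."""
    loadings = np.asarray(loadings, dtype=float)
    return np.sum(loadings ** 2, axis=1)

# Loadings of V1 and V4 on the seven retained factors, copied from Table 5.
loadings_v1_v4 = np.array([
    [0.013, 0.904, 0.091, 0.034, -0.089, -0.038, -0.001],   # V1
    [-0.054, 0.809, 0.186, 0.038, 0.347, -0.025, -0.066],   # V4
])
h2 = communalities(loadings_v1_v4)
print(np.round(h2, 3))  # close to the Table 3 values .837 (V1) and .819 (V4)
```

Since an orthogonal rotation does not change the row sums of squares, the same communalities would be obtained from the unrotated loadings.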
Low values of communality indicate that the analyzed variable is inadequately represented by the factorial model. Here, most of the communalities lie between 0.5 and 0.9 (see Table 3). Using the PCA method, the fifteen variables are reduced to seven factors, as shown in Table 4. These seven factors can be used further as predictors.

The variance in the correlation matrix is repackaged into 15 eigenvalues. Each eigenvalue represents the amount of variance that has been captured by one component. In general, once the eigenvectors of the covariance matrix are found, the next step is to order them by eigenvalue, from highest to lowest. This gives the components in order of significance, and we can decide to ignore the components of lesser significance. This procedure implies losing some information, but if the discarded eigenvalues are small, the loss of information is minimal.
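The retention rule can be checked numerically. The sketch below is an illustration rather than the SPSS computation: it takes the 15 initial eigenvalues reported in Table 4, converts them to percentages of variance, and applies the two criteria stated earlier (eigenvalue greater than 1, cumulative explained variance above 50%). The variable names are assumptions introduced here.

```python
import numpy as np

# Initial eigenvalues of the 15 components, copied from Table 4.
eigenvalues = np.array([2.325, 1.854, 1.330, 1.172, 1.123, 1.077, 1.037,
                        0.933, 0.813, 0.725, 0.693, 0.629, 0.543, 0.474, 0.271])

# Percentage of variance explained by each component and its running total.
explained = 100.0 * eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)

# Criterion 1: keep the components whose eigenvalue exceeds 1.
n_retained = int(np.sum(eigenvalues > 1.0))

# Criterion 2: the retained components should explain more than 50%.
retained_share = cumulative[n_retained - 1]

print(n_retained, round(float(retained_share), 1))
```

With these eigenvalues, seven components are retained and together they account for roughly two thirds of the total variance, in line with the figures reported in Table 4.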

Table 3. Communalities

| Variable | Initial | Extraction |
|---|---|---|
| V1 | 1.000 | .837 |
| V2 | 1.000 | .707 |
| V3 | 1.000 | .482 |
| V4 | 1.000 | .819 |
| V5 | 1.000 | .713 |
| V6 | 1.000 | .633 |
| V7 | 1.000 | .691 |
| V8 | 1.000 | .714 |
| V9 | 1.000 | .730 |
| V10 | 1.000 | .458 |
| V11 | 1.000 | .602 |
| V12 | 1.000 | .651 |
| V13 | 1.000 | .707 |
| V14 | 1.000 | .621 |
| V15 | 1.000 | .553 |

The first part of Table 4 (the first three columns) presents the eigenvalues for our data and the proportions of variance for the fifteen components. Only the first seven components have eigenvalues greater than 1.

Table 4. Total Variance Explained

| Component | Initial Eigenvalues: Total | % of Variance | Cumulative % | Extraction Sums of Squared Loadings: Total | % of Variance | Cumulative % | Rotation Sums of Squared Loadings: Total | % of Variance | Cumulative % |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2.325 | 15.500 | 15.500 | 2.325 | 15.500 | 15.500 | 1.641 | 10.942 | 10.942 |
| 2 | 1.854 | 12.357 | 27.858 | 1.854 | 12.357 | 27.858 | 1.638 | 10.921 | 21.863 |
| 3 | 1.330 | 8.870 | 36.727 | 1.330 | 8.870 | 36.727 | 1.609 | 10.725 | 32.588 |
| 4 | 1.172 | 7.815 | 44.542 | 1.172 | 7.815 | 44.542 | 1.551 | 10.339 | 42.927 |
| 5 | 1.123 | 7.488 | 52.030 | 1.123 | 7.488 | 52.030 | 1.229 | 8.191 | 51.118 |
| 6 | 1.077 | 7.183 | 59.213 | 1.077 | 7.183 | 59.213 | 1.147 | 7.648 | 58.766 |
| 7 | 1.037 | 6.910 | 66.123 | 1.037 | 6.910 | 66.123 | 1.104 | 7.358 | 66.123 |
| 8 | .933 | 6.222 | 72.345 | | | | | | |
| 9 | .813 | 5.422 | 77.767 | | | | | | |
| 10 | .725 | 4.831 | 82.598 | | | | | | |
| 11 | .693 | 4.618 | 87.216 | | | | | | |
| 12 | .629 | 4.195 | 91.411 | | | | | | |
| 13 | .543 | 3.623 | 95.034 | | | | | | |
| 14 | .474 | 3.161 | 98.195 | | | | | | |
| 15 | .271 | 1.805 | 100.000 | | | | | | |

Extraction Method: Principal Component Analysis.

We select varimax, the most widely used method of factor rotation. Its purpose is to maximize the variance of the squared loadings within each factor, so that each variable loads strongly on as few factors as possible; in this way varimax minimizes the complexity of the components. The columns Extraction Sums of Squared Loadings contain the eigenvalues and the explained and cumulative variance for the seven retained factors, in the context of the initial factorial solution (without rotation). The explained variance is distributed among the factors as follows: factor I 15.500%, factor II 12.357%, factor III 8.870% and so on. The proportion of the total variation explained by the seven factors is 66.123%.
The individual communalities tell how well the model works for the individual variables, and the total communality gives an overall assessment of performance. The columns Rotation Sums of Squared Loadings report the same quantities for the rotated solution. As a result of rotation, a new distribution of the explained variance among the factors can be observed (factor I 10.942%, factor II 10.921%, factor III 10.725% and so on), with the same total variation (66.123%). The remaining variation, up to 100%, is unexplained by this model. After rotation, factors I and II lose part of their explained variance in favor of factors III to VII.
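The effect of rotation described above (variance redistributed among the factors while the total stays fixed) can be illustrated with a compact varimax routine. This is a commonly used SVD-based formulation written as a sketch, not the SPSS implementation: it omits the Kaiser normalization that SPSS applies, and the function names `varimax` and `varimax_criterion` and the small loading matrix `unrotated` are assumptions introduced here.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(np.asarray(loadings) ** 2, axis=0)))

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (raw, without Kaiser normalization)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)          # rotation matrix, starts as "no rotation"
    objective = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient-like target of the varimax criterion.
        col_ss = np.sum(rotated ** 2, axis=0)
        target = rotated ** 3 - rotated @ np.diag(col_ss) / p
        # Closest orthogonal matrix via the singular value decomposition.
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break          # no meaningful improvement any more
        objective = new_objective
    return L @ R, R

# Small hypothetical loading matrix: 4 variables, 2 factors.
unrotated = np.array([[0.7, 0.5],
                      [0.8, 0.4],
                      [0.6, -0.6],
                      [0.5, -0.7]])
rotated, rotation = varimax(unrotated)
```

Because the rotation matrix is orthogonal, the communality of every variable and the total explained variance are unchanged; only the split of that variance between the factors moves, which is what the Rotation Sums of Squared Loadings columns of Table 4 show.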

Fig. 3. SPSS Scree Plot

The scree plot (see Figure 3) presents graphically the eigenvalues of all the principal components obtained; the numeric values are given in Table 4. Table 5 shows the loadings after factor rotation.

Table 5. Rotated Component Matrix

| Variable | Component 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| V9 | .724 | .016 | .063 | .007 | .172 | -.391 | .136 |
| V6 | .695 | .085 | -.124 | .074 | -.231 | .260 | -.038 |
| V11 | .619 | -.151 | .405 | .165 | .003 | 1.98E-005 | -.071 |
| V1 | .013 | .904 | .091 | .034 | -.089 | -.038 | -.001 |
| V4 | -.054 | .809 | .186 | .038 | .347 | -.025 | -.066 |
| V12 | -.174 | -.045 | -.711 | -.047 | .102 | -.160 | .272 |
| V14 | -.248 | .052 | .687 | .029 | .138 | -.192 | .171 |
| V10 | -.160 | -.290 | -.581 | .021 | .034 | -.071 | -.062 |
| V13 | .028 | -.081 | .026 | .835 | .038 | -.018 | -.001 |
| V2 | .116 | .142 | .012 | .815 | -.045 | -.032 | -.078 |
| V7 | .100 | -.069 | .152 | .179 | -.777 | .033 | .127 |
| V3 | .069 | .087 | .193 | .310 | .550 | .085 | .163 |
| V8 | -.025 | -.040 | .055 | -.056 | .030 | .831 | .120 |
| V5 | .120 | -.063 | -.037 | -.004 | .067 | .225 | .799 |
| V15 | .294 | -.040 | -.040 | .139 | .268 | .326 | -.516 |

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 14 iterations.

The data in Table 5 lead to the final conclusions on the factorial structure of the analyzed variables, as follows:

- Factor I has a good correlation with variables V9, V6, V11 (and will be labeled TimeFactor);
- Factor II has a good correlation with V1, V4 (and will be labeled CreditFactor);
- Factor III has a good correlation with V14 (and will be labeled JobFactor);
- Factor IV has a good correlation with V13, V2 (and will be labeled CustomerCreditFactor);
- Factor V has a good correlation with V3 (and will be labeled PurposeFactor);
- Factor VI has a good correlation with V8, V15 (and will be labeled CustomerFamilyFactor);

- Factor VII has a good correlation with V5 (and will be labeled SavingsFactor).

A related work is presented in [13], where the authors present an improved credit scoring method for a Chinese commercial bank, intended for credit card risk management. They also choose PCA to calculate the weights of the original indexes and to obtain a predicting function for computing the score of new applicants. In their case, the KMO value was likewise computed with SPSS.

Conclusions

PCA is a method for multivariate data analysis and is used in many fields to extract relevant information from confusing data sets. PCA also provides a way of identifying patterns in data and of expressing the data in a way that highlights their similarities and differences. An advantage of using PCA consists in quantifying the importance of each dimension for describing the variability of a data set. This method reduces the number of dimensions without much loss of information.

In this paper we presented the possible use of PCA in the banking domain to reduce the dimension of the data related to the credit problem. In our case study, we used 15 variables describing consumer loan data; by means of PCA, we obtained only seven factors that concentrate more than 60% of the information provided by the original 15 variables. Future work will focus on developing a credit scoring system based on the use of PCA in the banking domain.

References

1. Abu-Shanab, E., Pearson, M. - Internet Banking in Jordan: An Arabic Instrument Validation Process, The International Arab Journal of Information Technology, Vol. 6, No. 3, 2009, pp.
2. Basno, C., Dardac, N. - Management bancar, Editura Economică, Bucureşti
3. El-Bakry, H. M. - New Fast Principal Component Analysis for Face Detection, Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 11, No. 2, 2007, pp.
4. Hand, D. J., Henley, W. E. - Statistical Classification Methods in Consumer Credit Scoring: A Review, J. R. Statist. Soc., 160, 1997, pp.
5. Helmy, A. K., El-Taweel, G. H. S. - Authentication Scheme Based on Principal Component Analysis for Satellite Images, International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 2, No. 3, 2009, pp.
6. Lyn, C. T. - A survey of credit and behavioral scoring: forecasting financial risk of lending to consumers, accessed
7. Marinoiu, C. - Grades-based Characterization of Freshmen Using PCA, Petroleum-Gas University of Ploiesti Bulletin, Mathematics, Informatics, Physics Series, vol. LXI, no. 2
8. Mester, L. J. - What's the Point of Credit Scoring, Business Review, September-October
9. Olteanu, A., Olteanu, F. M., Badea, L. - Management bancar. Caracteristici, strategii, studii de caz, Editura Dareco, Bucureşti
10. Rosenbaum, B. Jr. - Your Credit Score. What It Means to You as a Prospective Home Buyer, accessed
11. Sinescu, I. et al. - Principal Component Analysis and Classification with Application in Medicine, accessed
12. Sustersic, M., Mramor, D., Zupan, J. - Consumer Credit Scoring Models With Limited Data, EFA 2007 Ljubljana Meetings Paper, accessed
13. Xu, W., Liu, J., Li, J. - An Improved Credit Scoring Method for Chinese Commercial Banks, accessed
14. Yap, V. C. et al. - Factor Affecting Banks Risk Exposure: Evidence from Malaysia, European Journal of Economics, Finance and Administrative Science, ISSN Issue 19, 2010, pp.
15. Ye, J. - Principal Component Analysis, /NOTES/PCA.ppt, accessed
16. *** - Annotated SPSS Output Principal Components Analysis, output/principal_components.htm, accessed
17. *** - A Tutorial on Principal Component Analysis, accessed
18. *** - Essential Steps of Principal Component Analysis, multivariatestatistics/principalcomponent.html, accessed
19. *** - FICO score, accessed
20. *** - accessed
21. *** - Introduction to Pattern Recognition. Lecture 5: Dimensionality reduction (PCA), accessed
22. *** - Know the Score. Educate Yourself about Credit Scores and Critical Credit Issues, accessed
23. *** - Principal Component Analysis (PCA), pcatutorial.pdf, accessed
24. *** - Source data, accessed
25. *** - SPSS (Statistical Package for the Social Sciences), accessed 2010

Using Principal Component Analysis in the Loan Granting Problem

Summary (translated from Romanian)

This paper describes the utility of Principal Component Analysis (PCA) in the banking domain, more exactly in solving the problem of granting consumer loans. PCA is a powerful instrument for analyzing high-dimensional data. When there is a request for a personal loan, the credit officer collects data from the applicant and computes a score. The factors considered can influence the credit analysis differently. PCA can help in this case to extract the factors that best describe the credit scoring model. The data used in the application were taken from a public database that contains credit data from a German bank. The results underline the utility of applying PCA in the banking sector to reduce the dimension of the data without large losses of useful information.

### BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

### CLASSIFICATION OF EUROPEAN UNION COUNTRIES FROM DATA MINING POINT OF VIEW, USING SAS ENTERPRISE GUIDE

CLASSIFICATION OF EUROPEAN UNION COUNTRIES FROM DATA MINING POINT OF VIEW, USING SAS ENTERPRISE GUIDE Abstract Ana Maria Mihaela Iordache 1 Ionela Catalina Tudorache 2 Mihai Tiberiu Iordache 3 With the

### Partial Least Squares (PLS) Regression.

Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component

### A.MORTGAGE LENDER B. CREDIT CARD ISSUER C.HOME INSURER E. ELECTRIC COMPANY F. LANDLORD G.ALL OF THE ABOVE D.CELL PHONE COMPANY

1. WHICH OF THE FOLLOWING MIGHT USE CREDIT SCORES? A.MORTGAGE LENDER B. CREDIT CARD ISSUER C.HOME INSURER E. ELECTRIC COMPANY F. LANDLORD G.ALL OF THE ABOVE D.CELL PHONE COMPANY 1. WHICH OF THE FOLLOWING

### Title: Lending Club Interest Rates are closely linked with FICO scores and Loan Length

Title: Lending Club Interest Rates are closely linked with FICO scores and Loan Length Introduction: The Lending Club is a unique website that allows people to directly borrow money from other people [1].

### How to report the percentage of explained common variance in exploratory factor analysis

UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: Lorenzo-Seva, U. (2013). How to report

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Getting Started in Factor Analysis (using Stata 10) (ver. 1.5)

Getting Started in Factor Analysis (using Stata 10) (ver. 1.5) Oscar Torres-Reyna Data Consultant otorres@princeton.edu http://dss.princeton.edu/training/ Factor analysis is used mostly for data reduction

### APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY

APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY In the previous chapters the budgets of the university have been analyzed using various techniques to understand the

### Chapter 7 Factor Analysis SPSS

Chapter 7 Factor Analysis SPSS Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of correlations within a set of observed variables. Factor analysis is often

### Research Methodology: Tools

MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 02: Item Analysis / Scale Analysis / Factor Analysis February 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi

### PRINCIPAL COMPONENT ANALYSIS

1 Chapter 1 PRINCIPAL COMPONENT ANALYSIS Introduction: The Basics of Principal Component Analysis........................... 2 A Variable Reduction Procedure.......................................... 2

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### InvestorsInvestmentDecisionsinCapitalMarketKeyFactors

Global Journal of Management and Business Research: C Finance Volume 15 Issue 4 Version 1.0 Year 2015 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc. (USA)

### Volume 2, Issue 9, September 2014 International Journal of Advance Research in Computer Science and Management Studies

Volume 2, Issue 9, September 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com

### U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540

U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540 ENTERPRISE FINANCIAL DISTRESS PREDICTION BASED ON BACKWARD PROPAGATION NEURAL NETWORK: AN EMPIRICAL STUDY ON THE CHINESE LISTED EQUIPMENT

### FICO Credit Score Algorithm. Group 14

FICO Credit Score Algorithm Group 14 What is FICO? Fico is short for Fair Isaac and Co. It is the most widely used credit score. The Fair Isaac Company developed custom software back in the 1980s that

### Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

### AN IMPROVED CREDIT SCORING METHOD FOR CHINESE COMMERCIAL BANKS

AN IMPROVED CREDIT SCORING METHOD FOR CHINESE COMMERCIAL BANKS Jianping Li Jinli Liu Weixuan Xu 1.University of Science & Technology of China, Hefei, 230026, P.R. China 2.Institute of Policy and Management

### Your credit score. What it is. What it means.

Your credit score. What it is. What it means. Y ou may have heard of credit scores and wonder what they are. How do they affect your ability to get a loan? How do they affect the interest rate and points

### Principal Component Analysis Application to images

Principal Component Analysis Application to images Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception http://cmp.felk.cvut.cz/

### Introduction to Principal Component Analysis: Stock Market Values

Chapter 10 Introduction to Principal Component Analysis: Stock Market Values The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from

### Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree

### Determinants of Non Performing Loans: Case of US Banking Sector

141 Determinants of Non Performing Loans: Case of US Banking Sector Irum Saba 1 Rehana Kouser 2 Muhammad Azeem 3 Non Performing Loan Rate is the most important issue for banks to survive. There are lots

### Exploratory Factor Analysis

Introduction Principal components: explain many variables using few new variables. Not many assumptions attached. Exploratory Factor Analysis Exploratory factor analysis: similar idea, but based on model.

### 1. The standardised parameters are given below. Remember to use the population rather than sample standard deviation.

Kapitel 5 5.1. 1. The standardised parameters are given below. Remember to use the population rather than sample standard deviation. The graph of cross-validated error versus component number is presented

### LESSON 7 -- CREDIT REPORTS AND CREDIT SCORES

LESSON 7 -- CREDIT REPORTS AND CREDIT SCORES LESSON DESCRIPTION AND BACKGROUND Using the Better Money Habits video What s the Difference Between a Credit Report and a Credit Score? (www.bettermoneyhabits.com),

### A Survey on Outlier Detection Techniques for Credit Card Fraud Detection

IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud

### 2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) F 2 X 4 U 4

1 Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) 3. Univariate and multivariate

Understanding Your Credit Score Contents Your Credit Score A Vital Part of Your Credit Health...... 1 How Credit Scoring Helps You........ 2 Your Credit Report The Basis of Your Score............. 4 How

### 4. There are no dependent variables specified... Instead, the model is: VAR 1. Or, in terms of basic measurement theory, we could model it as:

1 Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data 2. Linearity (in the relationships among the variables--factors are linear constructions of the set of variables; the critical source

### Joint models for classification and comparison of mortality in different countries.

Joint models for classification and comparison of mortality in different countries. Viani D. Biatat 1 and Iain D. Currie 1 1 Department of Actuarial Mathematics and Statistics, and the Maxwell Institute

### Some Statistical Applications In The Financial Services Industry

Some Statistical Applications In The Financial Services Industry Wenqing Lu May 30, 2008 1 Introduction Examples of consumer financial services credit card services mortgage loan services auto finance

### CREDIT SCORES: AN OFTEN-OVERLOOKED, BUT ESSENTIAL, ELEMENT IN FINANCIAL PLANNING Learn Why They re Important and How You Can Improve Yours

CREDIT SCORES: AN OFTEN-OVERLOOKED, BUT ESSENTIAL, ELEMENT IN FINANCIAL PLANNING Learn Why They re Important and How You Can Improve Yours ENGLEWOOD, COLORADO Have you applied for a mortgage, car loan,

### User Behaviour on Google Search Engine

104 International Journal of Learning, Teaching and Educational Research Vol. 10, No. 2, pp. 104 113, February 2014 User Behaviour on Google Search Engine Bartomeu Riutord Fe Stamford International University

### Behavioral Entropy of a Cellular Phone User

Behavioral Entropy of a Cellular Phone User Santi Phithakkitnukoon 1, Husain Husna, and Ram Dantu 3 1 santi@unt.edu, Department of Comp. Sci. & Eng., University of North Texas hjh36@unt.edu, Department

### Standard 7: The student will identify the procedures and analyze the 3 responsibilities of borrowing money.

TEACHER GUIDE 7.3 BORROWING MONEY PAGE 1 Standard 7: The student will identify the procedures and analyze the 3 responsibilities of borrowing money. Your Credit Score Priority Academic Student Skills Personal

### Improved Neural Network Performance Using Principal Component Analysis on Matlab

Improved Neural Network Performance Using Principal Component Analysis on Matlab Improved Neural Network Performance Using Principal Component Analysis on Matlab Junita Mohamad-Saleh Senior Lecturer School

### Relationship Quality as Predictor of B2B Customer Loyalty. Shaimaa S. B. Ahmed Doma

Relationship Quality as Predictor of B2B Customer Loyalty Shaimaa S. B. Ahmed Doma Faculty of Commerce, Business Administration Department, Alexandria University Email: Shaimaa_ahmed24@yahoo.com Abstract

### Bulletin of the Petroleum Gas University of Ploieşti Description of the Website and the Content Management Application

BULETINUL Universităţii Petrol Gaze din Ploieşti Vol. LIX No. 1/2007 67-72 Seria Matematică - Informatică - Fizică Bulletin of the Petroleum Gas University of Ploieşti Description of the Website and the

### EFFECT OF ENVIRONMENTAL CONCERN & SOCIAL NORMS ON ENVIRONMENTAL FRIENDLY BEHAVIORAL INTENTIONS

169 EFFECT OF ENVIRONMENTAL CONCERN & SOCIAL NORMS ON ENVIRONMENTAL FRIENDLY BEHAVIORAL INTENTIONS Joshi Pradeep Assistant Professor, Quantum School of Business, Roorkee, Uttarakhand, India joshipradeep_2004@yahoo.com

### A Introduction to Matrix Algebra and Principal Components Analysis

A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra

### A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication

2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management

### Pull and Push Factors of Migration: A Case Study in the Urban Area of Monywa Township, Myanmar

Pull and Push Factors of Migration: A Case Study in the Urban Area of Monywa Township, Myanmar By Kyaing Kyaing Thet Abstract: Migration is a global phenomenon caused not only by economic factors, but

### Your Credit Score. 800-491-2328 www.congressionalfcu.org. score of at least 620 for approval and 760 for the best interest rate.

800-491-2328 www.congressionalfcu.org Your Credit Score Your credit score can have a major impact on your life. Not only do creditors typically check your score when deciding whether or not to approve

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

and Principal Components Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings

### EXPLORATORY FACTOR ANALYSIS IN MPLUS, R AND SPSS. sigbert@wiwi.hu-berlin.de

EXPLORATORY FACTOR ANALYSIS IN MPLUS, R AND SPSS Sigbert Klinke 1,2 Andrija Mihoci 1,3 and Wolfgang Härdle 1,3 1 School of Business and Economics, Humboldt-Universität zu Berlin, Germany 2 Department of

### Multivariate Analysis (Slides 4)

Multivariate Analysis (Slides 4) In today s class we examine examples of principal components analysis. We shall consider a difficulty in applying the analysis and consider a method for resolving this.

### The Flow of Funds Into and Out of Business

BULETINUL Universităţii Petrol Gaze din Ploieşti Vol. LX No. 2/2008 69-76 Seria Ştiinţe Economice The Flow of Funds Into and Out of Business Mihail Vincenţiu Ivan Petroleum-Gas University of Ploieşti,

### An introduction to. Principal Component Analysis & Factor Analysis. Using SPSS 19 and R (psych package) Robin Beaumont robin@organplayers.co.

An introduction to Principal Component Analysis & Factor Analysis Using SPSS 19 and R (psych package) Robin Beaumont robin@organplayers.co.uk Monday, 23 April 2012 Acknowledgment: The original version

### THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA

Abstract THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Dorina CLICHICI 44 Tatiana COLESNICOVA 45 The purpose of this research is to estimate the impact of several