Credit Risk Estimation Model Development Process: Main Steps and Model Improvement



Similar documents
U + PV(Interest Tax Shield)

Risk and Return: Estimating Cost of Capital

Maintenance Scheduling Optimization for 30kt Heavy Haul Combined Train in Daqin Railway

FIXED INCOME ATTRIBUTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

Credit Risk Models. August 24 26, 2010

An Anatomy of Futures Returns: Risk Premiums and Trading Strategies

Labor Demand. 1. The Derivation of the Labor Demand Curve in the Short Run:

Financial Services [Applications]

Fuzzy Numbers in the Credit Rating of Enterprise Financial Condition

Data quality in Accounting Information Systems

Neural Networks and Support Vector Machines

Classification of Bad Accounts in Credit Card Industry

Supply Chain Forecasting Model Using Computational Intelligence Techniques

GUIDE TO LISTING ON NASDAQ OMX FIRST NORTH

A RAPID METHOD FOR WATER TARGET EXTRACTION IN SAR IMAGE

Retrospective Test for Loss Reserving Methods - Evidence from Auto Insurers

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

Quality assurance of solar thermal systems with the ISFH- Input/Output-Procedure

Equity forecast: Predicting long term stock price movement using machine learning

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

An Approach for Constructing Evaluation Model of Suitability Assessment of Agile Methods using Analytic Hierarchy Process

Investment Analysis (FIN 670) Fall Homework 5

Copyright 2005 IEEE. Reprinted from IEEE MTT-S International Microwave Symposium 2005

Why focus on assessment now?

Factors Affecting the Adoption of Global Sourcing: A Case Study of Tuskys Supermarkets, Kenya

Bank Customers (Credit) Rating System Based On Expert System and ANN

Software Development Cost and Time Forecasting Using a High Performance Artificial Neural Network Model

The Calculation of Marginal Effective Tax Rates

Offshore Oilfield Development Planning under Uncertainty and Fiscal Considerations

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Algorithmic Scoring Models

Additional sources Compilation of sources:

Power functions: f(x) = x n, n is a natural number The graphs of some power functions are given below. n- even n- odd

Novelty Detection in image recognition using IRF Neural Networks properties

Prediction of Stock Performance Using Analytical Techniques

A MPCP-Based Centralized Rate Control Method for Mobile Stations in FiWi Access Networks

Research on supply chain risk evaluation based on the core enterprise-take the pharmaceutical industry for example

ANALYTICAL STUDY AND MODELING OF STATISTICAL METHODS FOR FINANCIAL DATA ANALYSIS: THEORETICAL ASPECT

Credit Risk Assessment of POS-Loans in the Big Data Era

Fairfield Public Schools

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

A performance analysis of EtherCAT and PROFINET IRT

Statistics for Retail Finance. Chapter 8: Regulation and Capital Requirements

Statistics in Retail Finance. Chapter 2: Statistical models of default

A Multi-level Artificial Neural Network for Residential and Commercial Energy Demand Forecast: Iran Case Study

SIMPLIFIED CBA CONCEPT AND EXPRESS CHOICE METHOD FOR INTEGRATED NETWORK MANAGEMENT SYSTEM

Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management

MSCA Introduction to Statistical Concepts

(Part.1) FOUNDATIONS OF RISK MANAGEMENT

The Design and Implementation of a Solar Tracking Generating Power System

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios

Towards applying Data Mining Techniques for Talent Mangement

Integrated airline scheduling

Leveraging Ensemble Models in SAS Enterprise Miner

THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS

Neural Network Applications in Stock Market Predictions - A Methodology Analysis

NTC Project: S01-PH10 (formerly I01-P10) 1 Forecasting Women s Apparel Sales Using Mathematical Modeling

FINANCIAL ANALYSIS GUIDE

U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN

Financial Markets and Valuation - Tutorial 5: SOLUTIONS. Capital Asset Pricing Model, Weighted Average Cost of Capital & Practice Questions

A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode

Real-Time Reservoir Model Updating Using Ensemble Kalman Filter Xian-Huan Wen, SPE and Wen. H. Chen, SPE, ChevronTexaco Energy Technology Company

Using quantum computing to realize the Fourier Transform in computer vision applications

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Classifier

Predicting borrowers chance of defaulting on credit loans

Lean Six Sigma Analyze Phase Introduction. TECH QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY

Simple Linear Regression Inference

A logistic regression approach to estimating customer profit loss due to lapses in insurance

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Handling missing data in Stata a whirlwind tour

Machine Learning with MATLAB David Willingham Application Engineer

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Java Modules for Time Series Analysis

Interpretation of Somers D under four simple models

Hewlett-Packard 12C Tutorial

Accounting Information and Stock Price Reaction of Listed Companies Empirical Evidence from 60 Listed Companies in Shanghai Stock Exchange

Statistics Graduate Courses

Pattern recognition using multilayer neural-genetic algorithm

Inequality: Methods and Tools

How To Make A Credit Risk Model For A Bank Account

Social Media Mining. Data Mining Essentials

Options on Stock Indices, Currencies and Futures

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Transcription:

Inzinerine Ekonomika-Engineering Economics, 211, 22(2), 126-133 Credit Risk Estimation Model Development Process: Main Steps and Model Improvement Ricardas Mileris, Vytautas Boguslauskas Kaunas University o Technology K. Donelaicio str. 73, LT-4439, Kaunas, Lithuania e-mail: ricardas.mileris@ktu.lt, vytautas.boguslauskas@ktu.lt http://dx.doi.org/1.5755/j1.ee.22.2.39 The attribution o credit ratings or clients is a very important issue in the banking sector. Banks must evaluate credit risk o credit applicants by using standardized (external rating institutions) or internal ratings-based (IRB) methods. Banks which decided to use IRB method attempt to develop precise internal credit rating models or the evaluation o creditworthiness o their borrowers. The internal rating method or the estimation o deault probability requires to collect the deault inormation rom the historical data in banks. The major studies about deault determinating actors are based on classiication methods (Zhou, Xie, Yuan, 28). A classiication model considers the deault measurement as the pattern recognition where all borrowers are divided to non-deault and deault groups based on their inancial and non inancial data. Banks attempt to construct an evaluation model that can be used to discriminate new sample. This research ocuses on a credit rating model development which could attribute credit ratings or Lithuanian companies. The steps o a model s development and improvement process are described in this paper. The model s development begins with the selection o initial variables (inancial ratios) characterizing deault and non-deault companies. 2 inancial ratios o 5 years were calculated according to annual inancial reports. Then statistical and artiicial intelligence methods were selected or the classiication o companies into two groups: deault and non-deault. A discriminant analysis, logistic regression and artiicial neural networks (multilayer perceptron) were applied or this purpose. Oten statistical methods are not able to operate with a large amount o data, so the analysis o variance, Kolmogorov-Smirnov test and actor analysis were applied or data reduction. Artiicial neural networks oten are able to analyze a large amount o data so variable selection was accomplished by the network itsel calculating ranks o importance or every initial variable. There were constructed 15 classiication models and their classiication accuracy was measured by calculating correct classiication rates. The most accurate was a logistic regression model analyzing data o 3 years (97% o correctly classiied companies). Then the sample o companies was supplemented with new data and changes in classiication accuracy were estimated. The signiicant decrease o classiication accuracy conditioned the need o model update. For this reason the logistic regression coeicients were recalculated. In order to classiy nondeault companies into 7 classes: proitability, liquidity, inancial structure and individual possibility o deault estimated by a logistic regression model were determined as rating criterions. Then the rating scale was constructed and credit ratings were attributed or companies in the sample. The calculated probabilities o deault indicated that some lower ratings have lower probabilities o deault. These imperections were corrected by the modiication o a rating scale. The research has shown that the developed model is a valid tool or the estimation o credit risk. Keywords: bank, credit ratings, credit risk, classiication methods. Introduction Banking activity is exposed to various risks. Understanding and quantiying these risks is crucial or bank management as well as or stability o the whole economy (Sieczka, Holyst, 29). Banks tend to lend to irms with high credit quality and not to lend to low credit quality irms. So the most important actor in determining lending practices is credit risk (Daniels, Ramirez, 28). According to Du & Einig (29), research considering credit risk has been one o the most active areas o recent inancial research, with signiicant eorts deployed to analyse the meaning, role, and inluence o credit ratings. The development o the credit rating prediction models has attracted lots o research interests in academic and business community. Many researchers have attempted to construct automatic classiication systems using methods rom data mining, such as statistical and artiicial intelligence techniques (Huang, 29). These techniques include discriminant analysis (Min, Lee, 28), logistic regression (Psillaki, Tsolas, Margaritis, 21), k-nearest neighbour (Twala, 21), decision trees (Paleologo, Elissee, Antonini, 21), artiicial neural networks (Angelini, Tollo, Roli, 28; Yu, Wang, Lai, 28), support vector machine (Danenas, Garsva, 29; Lee, 27), nearest subspace method (Zhou, Jiang, Shi, 21), hybrid models (Lin, 29), etc. Mostly the scientiic publications about estimation o credit risk present models which classiy companies into 2 groups: deault or non-deault. This research is intended or the development o credit risk estimation model which could classiy reliable companies into seven classes and not reliable companies into one class. The object o this research is credit risk estimation models. The aim o this research is to develop credit risk estimation model and to describe the development process. ISSN 1392 2785 (print) ISSN 229 5839 (online) -126-

Ricardas Mileris, Vytautas Boguslauskas. Credit Risk Estimation Model Development Process: Main Steps and Model... The methods o this research: 1. Analysis o scientiic publications. 2. Development o a credit risk estimation model. The developed model will allow the bank to estimate the credit risk o Lithuanian companies. Internal credit rating models in banks The activity o business companies is directly or indirectly inluenced by various internal and external actors (Boguslauskas, Adlyte, 21). Unavourable business environment, unexpected and unavourable events, the risky decisions o enterprise managers may make any enterprise insolvent and lead it towards bankruptcy (Purvinis, Sukys, Virbickaite, 25). Diicult business environment stimulates to look or new ways to assess the changing situation as well as to implement and manage new means or business continuity (Valackiene, 21). These negative actors and ability to manage them also inluence the credit risk o companies which must be measured by banks properly. With the rapid growth in credit industry and the management o large loan portolios, internal credit rating models have been extensively used or the credit admission evaluation. In general, the bank s internal measures o credit risk are based on assessments o borrower and transaction risk. Mostly banks base their rating methodology on borrower s deault risk and typically assign a borrower to a credit rating grade (Butera, Fa, 26). The credit rating models are developed to classiy loan applicants as either a reliable group (accepted) or not reliable group (rejected) with their related characteristics based on the data o the previous accepted and rejected applicants (Tsai, Chen, 21). The guidelines or measuring credit risk are suggested in the Basel II Capital Accord (Weißbach, Tschiersch, Lawrenz, 29). With the implementation o the Basel II Capital Accord in 26, banks are now also allowed to use internal data and rating models or the purpose o estimating own deault probabilities (PD) and calculating regulatory capital (Ryser, Denzler, 29). According to the Basel Committee on Banking Supervision banks are required to measure the one year deault probability. So usually statistical credit risk estimation models try to predict the probability that a loan applicant or existing borrower will deault in one year (Fantazzini, Figini, 29). In addition, banks should also provide the loss given deault (LGD) a measure o how much per unit o exposure bank will lose i the client deaults (Butera, Fa, 26). Credit rating models allow to rate clients credibility and to evaluate expected losses o bank (Sieczka, Holyst, 29). One o the central ideas o IRB method that banks are ree to choose their own method when: 1. A meaningul dierentiation o risk between classes is achieved. 2. Plausible, intuitive, and current input data are used. 3. All relevant inormation is taken into account (Ryser, Denzler, 29). Commercial banks can use statistical tools to discriminate between reliable and not reliable borrowers and to measure the credit quality according to internal rating scales (Ryser, Denzler, 29). There are several well-known statistical techniques or constructing credit rating predictions. These techniques include logistic regression, discriminant analysis, linear probit model, etc. (Hwang, Chung, Chu, 21). Also methods o artiicial intelligence are widely used or classiication o companies. In addition, the hybridization approach has been an active research area to improve the classiication (prediction) perormance over single learning approaches. In general, it is based on combining two dierent machine learning techniques. Thereore, to develop a hybrid learning credit model, there are three dierent ways to combine the two machine learning techniques: 1. Combining two classiication techniques. 2. Combining two clustering techniques. 3. One clustering technique combined with one classiication technique (Tsai, Chen, 21). These statistical and machine learning techniques analyze the particular data set about credit applicants. So in credit risk estimation model development the successul selection o inormative variables is very important. According to Arslan & Karan (29), dierent actors aect credit risks o irms. Jiao, Syau & Lee (27) maintained that the data can be classiied into three categories: 1. Financial conditions: liquidity, inancial structure, proitability and eiciency ratios. 2. Management measures: administrator s management experiences, stockholders structure type, conditions o capital increment during the last years, etc. 3. Characteristics and perspectives o products and competitions: equipment and technologies, product marketability, collateral, economic conditions o the industry in the next year. Thus, the total credit rating can be obtained by summarizing these three categories. It should be noted that only the irst category (inancial conditions) can be expressed numerically. Most o the quantities in the other two categories cannot be expressed numerically easily. To overcome this diiculty, uzzy numbers can be used or linguistic expressions (Jiao, Syau, Lee, 27). The interaction between dierent risk actors and credit ratings was analyzed in many publications. The importance o inancial actors is widely accepted because its impact is measurable. The relevance o non-inancial actors is mainly considered as not decisive (Grunert, Norden, Weber, 25). So companies should seek to present a true and air view o their inancial perormance and results because banks need inormative and truthul accounting data or making right decisions (Rudzioniene, 26). The results o a credit risk estimation are credit ratings. Usually credit ratings range rom AAA to D. Basing on the rating scale o Standard & Poor s, it is possible to group ratings into three categories: AAA, AA, A as category 1. BBB as category 2. Below BBB as category 3. Firms in the {AAA, AA, A} category mean that they have demonstrated strong capacity to meet their inancial obligations. Firms receiving BBB rating mean that they - 127 -

Inzinerine Ekonomika-Engineering Economics, 211, 22(2), 126-133 have adequate capacity to meet their inancial commitments. However, irms receiving ratings below BBB mean that they are regarded as having speculative characteristics (Hwang, Chung, Chu, 21). The internal rating models have many beneits or banks. These beneits include reducing the cost o credit analysis, enabling aster decisions, ensuring credit collections, and diminishing possible risks. Even a slight improvement in credit rating accuracy may reduce large credit risks and translate into signiicant uture savings (Tsai, Chen, 21). In addition Ince & Aktan (29) airm that credit rating models, used to model the potential risk o loan applications, may be an eective substitute or the use o judgment among inexperienced loan oicers, thus helping to control bad debt losses. When the credit risk estimation model is developed, it is important to evaluate its quality. The low-quality rating models have two important negative eects on a bank s inancial stability. First, return on the portolio will be lower than expected. I a borrower receives too avorable rating, he will most likely accept the loan oer, whereas rejection is more likely in the case o a rating error in the opposite direction, i.e., the bank will earn money rom adverse selection. Second, due to the adverse selection eect, the calculated regulatory capital held by the bank is too low. As more borrowers with too avorable ratings are in the portolio, the resulting regulatory capital based on these ratings may be inadequate (Hornik, Jankowitsch, Lingo, Pichler, Winkler, 21). Credit risk estimation model development The model or rating o Lithuanian companies was developed in order to measure their credit risk. The main steps o a model s development process are described below. 1. Variable selection. For typical classiication problems, values or a set o independent variables are given in a set o (training) examples, upon which a model is developed to categorize uture observations into appropriate classes (Piramuthu, 26). In this case the analyst must have the set o variables which allow to estimate the credit risk o companies successuly. Developing the credit risk estimation model, the set o initial variables consisted o 2 inancial ratios o 5 years. So there were 1 independent variables X i and one dependent variable Y (deault or non-deault). 2. Data analysis methods selection. Discriminant analysis (DA), logistic regression (LR) and artiicial neural networks (ANN) methods were applied or classiication o companies into 2 groups: deault and non-deault. Discriminant analysis is a commonly used technique to classiy a set o observations into predeined classes in various ields (Huang, Kao, Wang, 27). It is a classiication method that can predict the group membership o a newly sampled observation. In DA, a group o observations whose memberships are already identiied are used or the estimation o weights in a discriminant unction. A new sample is classiied into one o several groups by DA results (Sueyoshi, 26). The goal o linear DA is to ind a linear transormation that maximizes class separability in the reduced dimensional space (Park, Park, 28). The mathematical model is: n z = α + βix i (1) i= 1 where α is an intercept, x i is the value o independent variable, β i is the coeicient o the variable x i. Two discriminant unctions were constructed or deault (z 1 ) and non-deault (z ) companies in every developed DA model. The company was numbered into group which z gets higher value. Logistic regression models allow to estimate the indivudual possibility o deault or every company in the range o [; 1]. In a logistic regression model the possibility o individual to deault is expressed as ollows: n exp α + i Pn = 1+ exp α + = 1 n β ixi β ixi 1 where α is an intercept, x i is the value o independent variable, β i is the coeicient o the variable x i (Dong, Lai, Yen, 21). The classiication threshold o P n in developed LR models was set to.5. A multilayer perceptron was used or the classiication o companies as one o the possible types o ANN (Figure 1). Inputs: X 1 X 2 X k Input layer Bias i= Figure 1. Multilayer perceptron In this network, the inormation lows orward to the output continuously without any eedback (Boguslauskas, Mileris, 29). The initial variables are ed into input nodes, while the output provides the orecast or the uture value. Hidden nodes with appropriate nonlinear transer unctions are used to process the inormation received by the input nodes. The model can be written as: m n yt = α + α j β ij xi β + j + ε (3) t j= 1 i= 1 where n is the number o input nodes, m is the number o hidden nodes, is a sigmoid transer unction, {α j, j =, 1,..., m} is a vector o weights rom the hidden to output Hidden layer Output layer (2) Weights Forecast: deault or non-deault - 128 -

Ricardas Mileris, Vytautas Boguslauskas. Credit Risk Estimation Model Development Process: Main Steps and Model... nodes, {β ij, i = 1, 2,..., n; j =, 1,..., m} are weights rom the input to hidden nodes, α and β j are weights o connections leading rom the bias which have values always equal to 1 (Azadeh, Ghaderi, Sohrabkhani, 27). 3. Data reduction. Not all initial variables are useul or the analysis, so it is important to ind inormative parameters or credit risk estimation model development in the set o initial variables. Also the statistical methods oten are not able to analyze high amount o data, so the need o data reduction arises. Four methods or data reduction were applied: the analysis o variance (), Kolmogorov-Smirnov test (K-S), actor analysis (FA) and ranks o importance in ANN (). Figure 2 illustrates the process o data reduction or classiication models. The 1 st column consists o the period o data (rom the last year to 5 years ago) and number o initial variables. The 2 nd and 3 rd columns show methods applied or data reduction. In the 4 th column there is a number o variables used or classiication o companies. The 5 th column consists o a code o classiication models (method and period o data analyzed). 1 year (X 1-X 2) 2 years (X 1-X 4) 3 years (X 1-X 6) 4 years (X 1-X 8) 5 years (X 1-X 1) FA FA FA FA K-S 6 DA1 Figure 2. Data reduction or classiication models The purpose o is to test or signiicant dierences between means o independent variables in groups o ailed and not ailed companies. I the means did not dier signiicantly the variable was rejected rom urther analysis. The Kolmogorov-Smirnov one sample test or normality is based on the maximum dierence between the sample cumulative distribution and the hypothesized cumulative distribution (Mileris, Boguslauskas, 21). This test allowed to veriy i the variable has the normal distribution. I the statistics D o K-S test is signiicant, then the hypothesis that the respective distribution is 9 LR1 15 ANN1 K-S 11 DA2 16 LR2 37 ANN2 K-S 12 DA3 25 LR3 5 ANN3 17 17 73 2 17 86 DA4 LR4 ANN4 DA5 LR5 ANN5 normal should be rejected. These variables were not included into the credit risk analysis. Factor analysis is a statistical method that is based on the correlation analysis o variables. The purpose is to reduce multiple variables to a lesser number o actors (Ocal, Oral, Erdis, Vural, 27). The reduction o observed initial variables to less actors allows to enhance interpretability and detect hidden structures in the data (Treiblmaier, Filzmoser, 21). These methods were applied or data reduction beore DA and LR models were developed. Artiicial neural networks are able to operate with a large amount o data, so all initial variables were ed to them. Ranks o importance were attributed or every variable by ANN itsel. The variables that > were included into urther analysis. 4. Estimation o classiication accuracy. Overall 15 models were developed or the classiication o companies into 2 groups. The correct classiication rates (CCR) were calculated or the estimation o classiication accuracy (Table 1). These rates indicate the proportion o correctly classiied companies by a certain model. Correct classiication rates, % Table 1 Model Years o inancial data analyzed 1 2 3 4 5 DA 77. 84. 84. 82.2 83.91 LR 83. 92. 97. 89.89 87.36 ANN 85.87 92.22 95.51 82.2 81.61 The highest classiication accuracy (97%) was reached by the logistic regression model which employed threeyears inancial data o companies. So this LR3 model was used or the estimation o credit risk. 5. Estimation o changes in classiication accuracy. The classiication model LR3 was developed using inancial ratios o 24-26 years. The data o 1 Lithuanaian companies was analyzed. Ater that the initial data set was complemented with inancial ratios o 26-28 years o 1 new companies and the changes in classiication accuracy were estimated. The test o model indicated that the correct classiication rate decreased to 82.23% (Table 2). It is the signiicant decrease (Δ 1 = - 14.77%) so the need o model update arised. 6. Update o classiication model. The regression coeicients were recalculated or the same 25 variables o LR3 model. The CCR o updated classiication model increased to 93.43% (Table 2). The increase inluenced on model update was 11.2% (Δ 2 ). Changes o CCR Table 2 LR3 Test Δ 1 Updated Δ 2 97. 82.23-14.77 93.43 +11.2 Changes o CCR show the importance o model update when bank gets new data about clients. The update can signiicantly improve the classiication model s perormance. 7. Determination o ratings criterions. The actor analysis has shown that 3 actors which explain 4.4% - 129 -

Inzinerine Ekonomika-Engineering Economics, 211, 22(2), 126-133 42.8% o total variance are: proitability, liquidity and inancial structure. So rom an initial set o variables 4 proitability, 2 liquidity and 1 inancial structure ratios were selected which had the highest correlation coeicients r with individual possibility o deault (P n ). Table 3 Correlation coeicients between inancial ratios and P n Group Financial ratio r r Proitability: NPM -.27147.27147 EBIT/TA -.5944.5944 NP/TA -.48495.48495 EBIT/S -.31458.31458 Liquidity: CR -.1571.1571 QR -.12926.12926 Financial structure: DR.18183.18183 In Table 3 NPM is a net proit margin, EBIT/TA is earnings beore interest and taxes to total assets, NP/TA is net proit to total assets, EBIT/S is earnings beore interest and taxes to sales, CR is current ratio, QR is quick ratio, DR is debt ratio. As a criterion o ratings the individual possibility o deault also included into analysis. So the proitability determines 5%, liquidity 25%, inancial structure 12.5%, individual possibility o deault 12.5% o credit rating. 8. Construction o rating scale. The rating model is based on classiication o companies according to logistic regression analysis results and rating criterions (Figure 3). X 1 X 2 X n LR3 Non-deault P n [;.5) Rating criterions Deault P n [.5; 1] D1 Figure 3. Attribution o credit ratings AAA The variables X i were analyzed by the logistic regression (LR3) model. According to estimated individual possibility o deault (P n ), companies were classiied into 2 classes: deault and non-deault. For companies classiied by LR3 model as deault the rating D1 was attributed. The Basel II Capital Accord requires that non-deault companies must be classiied at least into 7 classes. So according to the determined rating criterions ratings AAA C were attributed or reliable companies. In addition, i company by LR3 model was classiied as reliable but its inancial ratios are bad, or such companies rating D2 was attributed. In group o not deault companies 5 statistical characteristics were calculated or 7 inancial ratios: minimum value (Min), maximum value (Max), standard deviation (σ), mean (μ) and median (Me). The exceptions (E) o values were rejected rom urther analysis: AA A BBB BB B C D2 E (μ - 3σ; μ + 3σ) (4) Then Min, Max and Me values in the sample were recalculated. In order to create rating scale the interval o values or each inancial ratio was divided into 2 sections: Min Me (rom minimum to median). Me Max (om median to maximum). Every section was divided into 4 equal intervals and the margins o inancial ratios in these intervals were calculated. According to the values o inancial ratios and individual possibility o deault (P n ), scores were attributed or companies (Table 4). Table 4 Intervals o rating criterions and scores Score NPM EBIT/TA... P n 7 >.75 >.38....1 6 (.51;.75] (.28;.38]... (.1;.3]................1 -.772... >.541 The credit rating o a company depends on the sum o scores (Table 5). Table 5 Credit ratings and total scores Rating AAA AA A... D2 Total score 5-56 43-49 36-42... -7 9. Calculation o probabilities o deault. The probabilities o deault (PD) were calculated or every rating: NC PD = 1% (5) IC where N C number o deault companies in particular rating, I C - number o companies in particular rating. The probabilities o deault illustrated in Figure 4. % 12 1 8 6 4 2-3.8 6.1 1.9 8.3 14.3 91.1 AAA AA A BBB BB B C D1 D2 Figure 4. Probabilities o deault (%) and credit ratings Grunert, Norden & Weber (25) point out the necessity o a link between credit rating and probability o deault. The credit rating models o high quality have the inverse relation between credit ratings and PD values: when the credit ratings decrease the PD values constantly increase. In developed model ratings A and BBB have higher PD than rating BB. Also it was impossible to calculate PD or rating AAA because no one company got this rating. So it is worth to modiy the rating scale in order to improve the model s perormance. 1-13 -

Ricardas Mileris, Vytautas Boguslauskas. Credit Risk Estimation Model Development Process: Main Steps and Model... 1. Modiication o rating scale. The intervals o total scores were changed (Table 5) and all imperections were corrected (Figure 5). % 12 1 8 6 4 2 3 3.7 7.3 2 91.1 AAA AA A BBB BB B C D1 D2 Figure 5. PD (%) and credit ratings o modiied scale Companies with ratings D1 and D2 can be considered as one class o not reliable clients and bank should not credit them. These ratings can be merged to one rating D. The probability o deault o companies with ratings AAA, AA and A is equal to. The dierence between these ratings is inluenced on the inancial state o companies. The companies with higher ratings have higher average proitability and liquidity ratios. Also these companies have lower average debt ratio and individual 1 possibility o deault. So according to the rating o reliable companies, a bank can determine the interest rate or loans. Conclusions 1. This research conirmed that the discriminant analysis, logistic regression and artiicial neural networks are relevant methods or the classiication o banks clients. The highest classiication accuracy (97%) was reached by the logistic regression model. 2. The analysis o variance, Kolmogorov-Smirnov test and actor analysis are statistical methods capable to reduce a large amount o data and help to select variables or the estimation o credit risk. 3. The percentage o credit rating criterions suggested in this research is: proitability 5%, liquidity 25%, leverage 12.5%, individual possibility o deault estimated by logistic regression 12.5%. These criterions allowed to create valid rating scale or the measurement o Lithuanian companies credit risk. 4. The recalculation o classiication unctions coeicients when new data o clients available can improve the correct classiication results o the model. Also the changing o total scores intervals can improve a model s rating scale. 5. The described steps o the credit risk estimation model development can help other researchers to create models which allow to measure credit risk successuly. Reerences Angelini, E., Tollo, G., & Roli, A. (28). A Neural Network Approach or Credit Risk Evaluation. The Quarterly Review o Economics and Finance, 48(4), 733-755. Arslan, O., & Karan, M. B. (29). Credit Risks and Internationalization o SMEs. Journal o Business Economics and Management, 1(4), 361-368. Azadeh, A., Ghaderi, S. F., & Sohrabkhani, S. (27). Forecasting Electrical Consumption by Integration o Neural Network, Time Series and. Applied Mathematics and Computation, 186(2), 1753-1761. Boguslauskas, V., & Adlyte, R. (21). Evaluation o Criteria or the Classiication o Enterprises. Inzinerine Ekonomika- Engineering Economics, 21(2), 119-127. Boguslauskas, V., & Mileris, R. (29). Estimation o Credit Risk by Artiicial Neural Networks Models. Inzinerine Ekonomika-Engineering Economics(4), 7-14. Butera, G., & Fa, R. (26). An Integrated Multi-Model Credit Rating System or Private Firms. Review o Quantitative Finance and Accounting, 27(3), 311-34. Danenas, P., & Garsva, G. (29). Support Vector Machines and their Application in Credit Risk Evaluation Process. Transormations in Business & Economics, 3(18), 46-58. Daniels, K., & Ramirez, G. G. (28). Inormation, Credit Risk, Lender Specialization and Loan Pricing: Evidence rom the DIP Financing Market. Journal o Financial Services Research, 34(1), 35-59. Dong, G., Lai, K. K., & Yen, J. (21). Credit Scorecard Based on Logistic Regression with Random Coeicients. Procedia Computer Science, 1(1), 2457-2462. Du, A., & Einig, S. (29). Understanding Credit Ratings Quality: Evidence rom UK Debt Market Participants. The British Accounting Review, 41(2), 17-119. Fantazzini, D., & Figini, S. (29). Random Survival Forests Models or SME Credit Risk Measurement. Methodology and Computing in Applied Probability, 11(1), 29-45. Grunert, J., Norden, L., & Weber, M. (25). The Role o Non-Financial Factors in Internal Credit Ratings. Journal o Banking & Finance, 29(2), 59-531. Hornik, K., Jankowitsch, R., Lingo, M., Pichler, S., & Winkler, G. (21). Determinants o Heterogeneity in European Credit Ratings. Financial Markets and Portolio Management, 24(3), 271-287. - 131 -

Inzinerine Ekonomika-Engineering Economics, 211, 22(2), 126-133 Huang, Y., Kao, T. L., & Wang, T. H. (27). Inluence Functions and Local Inluence in Linear Discriminant Analysis. Computational Statistics & Data Analysis, 51(8), 3844-3861. Huang, S. C. (29). Integrating Nonlinear Graph Based Dimensionality Reduction Schemes with SVMs or Credit Rating Forecasting. Expert Systems with Applications, 36(4), 7515-7518. Hwang, R. C., Chung, H., & Chu, C. K. (21). Predicting Issuer Credit Ratings Using a Semiparametric Method. Journal o Empirical Finance, 17(1), 12-137. Ince, H., & Aktan, B. (29). A Comparison o Data Mining Techniques or Credit Scoring in Banking: A Managerial Perspective. Journal o Business Economics and Management, 1(3), 233-24. Jiao, Y., Syau, Y. R., & Lee, E. S. (27). Modelling Credit Rating by Fuzzy Adaptive Network. Mathematical and Computer Modelling, 45(5-6), 717-731. Lee, Y. C. (27). Application o Support Vector Machines to Corporate Credit Rating Prediction. Expert Systems with Applications, 33(1), 67-74. Lin, S. L. (29). A New Two-Stage Hybrid Approach o Credit Risk in Banking Industry. Expert Systems with Applications, 36(4), 8333-8341. Mileris, R., & Boguslauskas, V. (21). Data Reduction Inluence on the Accuracy o Credit Risk Estimation Models. Inzinerine Ekonomika-Engineering Economics, 21(1), 5-11. Min, J. H., & Lee, Y. C. (28). A Practical Approach to Credit Scoring. Expert Systems with Applications, 35(4), 1762-177. Ocal, M. E., Oral, E. L., Erdis, E., & Vural, G. (27). Industry Financial Ratios Application o Factor Analysis in Turkish Construction Industry. Building and Environment, 42(1), 385-392. Paleologo, G., Elissee, A., & Antonini, G. (21). Subagging or Credit Scoring Models. European Journal o Operational Research, 21(2), 49-499. Park, C. H., & Park, H. (28). A Comparison o Generalized Linear Discriminant Analysis Algorithms. Pattern Recognition, 41(3), 183-197. Piramuthu, S. (26). On Preprocessing Data or Financial Credit Risk Evaluation. Expert Systems with Applications, 3(3), 489-497. Psillaki, M., Tsolas, I. E., & Margaritis, D. (21). Evaluation o Credit Risk Based on Firm Perormance. European Journal o Operational Research, 21(3), 873-881. Purvinis, O., Sukys, P., & Virbickaite, R. (25). Research o Possibility o Bankruptcy Diagnostics Applying Neural Network. Inzinerine Ekonomika-Engineering Economics(1), 16-22. Rudzioniene, K. (26). Impact o Stakeholders Interests on Financial Accounting Policy-Making: The Case o Lithuania. Transormations in Business & Economics, 1(9), 51-64. Ryser, M., & Denzler, S. (29). Selecting Credit Rating Models: A Cross-Validation-Based Comparison o Discriminatory Power. Financial Markets and Portolio Management, 23(2), 187-23. Sieczka, P., & Holyst, J. A. (29). Collective Firm Bankruptcies and Phase Transition in Rating Dynamics. The European Physical Journal B - Condensed Matter and Complex Systems, 71(4), 461-466. Sueyoshi, T. (26). DEA-Discriminant Analysis: Methodological Comparison Among Eight Discriminant Analysis Approaches. European Journal o Operational Research, 169(1), 247-272. Treiblmaier, H., & Filzmoser, P. (21). Exploratory Factor Analysis Revisited: How Robust Methods Support the Detection o Hidden Multivariate Data Structures in IS Research. Inormation & Management, 47(4), 197-27. Tsai, C. F., & Chen, M. L. (21). Credit Rating by Hybrid Machine Learning Techniques. Applied Sot Computing, 1(2), 374-38. Twala, B. (21). Multiple Classiier Application to Credit Risk Assessment. Expert Systems with Applications, 37(4), 3326-3336. Valackiene, A. (21). Eicient Corporate Communication: Decisions in Crisis Management. Inzinerine Ekonomika- Engineering Economics, 21(1), 99-11. Weißbach, R., Tschiersch, P. & Lawrenz, C. (29). Testing Time-Homogeneity o Rating Transitions Ater Origination o Debt. Empirical Economics, 36(3), 575-596. Yu, L., Wang, S., & Lai, K. K. (28). Credit Risk Assessment with a Multistage Neural Network Ensemble Learning Approach. Expert Systems with Applications, 34(2), 1434-1444. Zhou, X., Jiang, W., & Shi, Y. (21). Credit Risk Evaluation by Using Nearest Subspace Method. Procedia Computer Science, 1(1), 2443-2449. Zhou, Y., Xie, S., & Yuan, Y. (28). Statistical Inerence on the Deault Probability in the Credit Risk Models. Systems Engineering - Theory & Practice, 28(8), 26-214. - 132 -

Ricardas Mileris, Vytautas Boguslauskas. Credit Risk Estimation Model Development Process: Main Steps and Model... Ričardas Mileris, Vytautas Boguslauskas Kredito rizikos vertinimo modelio sudarymo procesas: pagrindiniai etapai ir modelio tobulinimas Santrauka Kredito rizikos vertinimas yra vienas iš kredito suteikimo etapų. Pagrindinis kredito rizikos vertinimo tikslas yra nustatyti, ar verta suteikti kreditą. Bankai stengiasi sumažinti kredito riziką, nesuteikdami kreditų klientams, kurie turi mažiausias galimybes įvykdyti prisiimtus inansinius įsipareigojimus. Suteikus kreditą, skolininko kredito rizikos lygis taip pat būtinas banko kapitalo pakankamumui apskaičiuoti. Vienas iš kredito rizikos vertinimo būdų yra bankų vidaus kredito rizikos vertinimo modelių taikymas. Šiuos modelius bankai sudaro atsižvelgdami į savo poreikius ir turimus duomenis, pasirenka tinkamiausius duomenų analizės metodus. Mokslinėse publikacijose dažniausiai aprašomi modeliai, kuriais įmonės klasiikuojamos į 2 grupes (patikimų ir nepatikimų klientų). Bazelio bankų priežiūros komiteto rekomendacijose nurodyta, kad patikimos įmonės turi būti suskirstytos į ne mažiau kaip 7 grupes (reitingus). Todėl šiame tyrime buvo siekiama sudaryti įmonių kredito rizikos vertinimo modelį, kuriuo patikimi banko klientai klasiikuojami į 7 grupes, o nepatikimi į 1 grupę. Šio tyrimo objektas įmonių kredito rizikos vertinimo modeliai. Tyrimo tikslas sudaryti įmonių kredito rizikos vertinimo modelį ir aprašyti jo sudarymo proceso etapus. Tyrimo metodai: Mokslinių publikacijų, susijusių su kredito rizikos vertinimu, analizė. Įmonių kredito rizikos vertinimo modelio sudarymas. Įmonių aplinkos pokyčiai daro įtaką jų patikimumui kredito rizikos atžvilgiu ir keičia patikimumo vertinimo kriterijus. Todėl, atsižvelgiant į šiuos pokyčius, aktualu nuolat atnaujinti bankų taikomus klientų kredito rizikos vertinimo modelius. Tyrime buvo siekiama sudaryti modelį, kuriuo bankas galėtų kiekybiškai įvertinti įmonių kredito riziką. Taigi šiame straipsnyje aprašyta modelio esmė ir jo sudarymo proceso etapai. Modelio sudarymas pradėtas kintamųjų rinkinio ormavimu. Buvo suskaičiuota 2 santykinių inansinių rodiklių, kurie apima 5 metų įmonių veiklos rezultatus, t. y. buvo sudarytas pradinių 1 kintamųjų rinkinys. Toliau pasirinkti duomenų analizės metodai, kuriais galima klasiikuoti įmones į patikimų ir nepatikimų įmonių grupes: diskriminantinė analizė, logistinė regresija ir dirbtinių neuronų tinklai (DNT). Diskriminantinės analizės metodu įmonėms klasiikuoti į patikimų ir nepatikimų klientų grupes buvo sudarytos 2 klasiikavimo unkcijos. Klasiikuojant tiriamoji įmonė priskiriama tai grupei, kurios klasiikavimo unkcija įgyja didesnę reikšmę. Logistinės regresijos metodu įmonių klasiikavimo modelį galima sudaryti taip, kad būtų prognozuojamas ne dichotominis, o tolydusis kintamasis, kurio reikšmių intervalas yra [; 1]. Šiuo atveju nustatoma galimybė, kad įmonė neįvykdys prisiimtų inansinių įsipareigojimų bankui. Apskaičiuotos reikšmės įvertina įmonės individualią inansinių įsipareigojimų neįvykdymo galimybę nuo iki 1 proc. Nustatyta įmonių klasiikavimo slenksčio reikšmė,5. Trečiasis metodas, kuriuo buvo klasiikuojamos įmonės DNT. Pagrindinis DNT skirtumas, palyginti su statistiniais duomenų analizės metodais, yra tas, kad DNT neturi tikslaus išankstinio duomenų analizės modelio, o sudaro jį patys pagal į tinklą įvestą inormaciją. Pasirinktas analizei DNT tipas daugiasluoksnis perceptronas. Statistiniai objektų klasiikavimo metodai dažnai negali apdoroti didelio inormacijos kiekio. Taip pat į įmonių klasiikavimo procesą nėra tikslinga įtraukti mažai inormatyvių parametrų, todėl buvo mažinama pradinio kintamųjų rinkinio apimtis. Duomenų apimčiai sumažinti diskriminantinės analizės ir logistinės regresijos modeliuose pritaikyta vienaktorinė dispersinė analizė (), Kolmogorovo ir Smirnovo kriterijus bei aktorinė analizė. Atliekant vienaktorinę dispersinę analizę, buvo nustatyta, kurių kintamųjų vidurkiai skirtingų įmonių grupėse statistiškai reikšmingai nesiskiria. Kolmogorovo ir Smirnovo kriterijumi nustatyti požymiai, kurių pasiskirstymas grupėse nėra normalusis. Tokie požymiai nebuvo įtraukti į įmonių klasiikavimą. Faktorinės analizės tikslas pakeisti požymių, apibūdinančių stebimą reiškinį, rinkinį tam tikru aktorių skaičiumi. Atliekant aktorinę analizę, kintamieji suskirstomi į grupes pagal jų koreliacijas su tam tikrais tiesiogiai nestebimais (latentiniais) aktoriais. Faktorinė analizė leido didelį kiekį kintamųjų paaiškinti išskirtaisiais aktoriais. Taip sumažintas analizuojamų duomenų kiekis, o aktorinės analizės rezultatai įtraukti į analizę kitais daugiamatės statistikos metodais. DNT pasižymi galimybe apdoroti didelius inormacijos kiekius, todėl šiems modeliams duomenų apimtis, taikant papildomus statistinius duomenų analizės metodus, nebuvo mažinama. Analizei reikšmingi kintamieji buvo atrinkti pačių neuronų tinklų, atsižvelgiant į kintamųjų reikšmingumo rangus. Įmones suklasiikuoti į dvi grupes buvo sudaryta po 5 diskriminantinės analizės, logistinės regresijos ir DNT modelius. Jie skiriasi analizuojamų duomenų laikotarpiu (nuo 1 iki 5 metų). Tiksliausio modelio atranka buvo atliekama remiantis teisingo klasiikavimo rodiklio reikšmėmis. Didžiausia šio rodiklio reikšmė pasiekta logistinės regresijos modeliu analizuojant 3 metų įmonių duomenis (LR3 modelis). LR3 modelis buvo sudarytas analizuojant 24 26 m. Lietuvoje veikiančių ir bankrutavusių 1 įmonių duomenis. Buvo siekiama nustatyti modelio klasiikavimo tikslumo rodiklio pokytį, analizuojant vėlesnių laikotarpių įmonių inansinius rodiklius. Tuo tikslu analizuotų duomenų rinkinys buvo papildytas 26 28 m. Lietuvoje veikiančių 1 įmonių duomenimis. Teisingo klasiikavimo rodiklis, parodantis bendrą įmonių klasiikavimo tikslumą, sumažėjo 14,77 proc. Šis reikšmingas neigiamas pokytis lėmė modelio atnaujinimo poreikį. Atsižvelgiant į tai, buvo perskaičiuoti LR3 modelio regresijos lygties koeicientai. Tai padidino modelio klasiikavimo tikslumą 11,2 proc. Reitingai patikimoms įmonėms priskirti remiantis paskutiniojo analizuojamo laikotarpio santykiniais inansiniais rodikliais. Finansinių rodiklių atranka buvo atliekama skaičiuojant porinės koreliacijos koeicientus tarp logistinės regresijos modeliu nustatytos įmonių individualios inansinių įsipareigojimų neįvykdymo galimybės ir santykinių rodiklių reikšmių. Analizei atrinkti labiausiai koreliuojantys rodikliai. Įmonės kredito reitingą lemia: 5 proc. įmonės pelningumas, 25 proc. likvidumas, 12,5 proc. inansų struktūra, 12,5 proc. individuali įmonės įsipareigojimų neįvykdymo galimybė. Analizuojamų rodiklių reikšmių intervalas buvo padalytas į dvi atkarpas: nuo mažiausios reikšmės iki medianos ir nuo medianos iki didžiausios reikšmės. Kiekviena iš šių atkarpų padalyta į 4 vienodas dalis, suskaičiuotos rodiklių reikšmių ribos. Buvo sudaryti rodiklių reikšmių intervalai ir pagal juos įmonei priskiriamas balas. Pagal įmonės balų sumą įmonei priskiriamas kredito reitingas. Kiekvieno reitingo grupės įmonėms buvo suskaičiuotos inansinių įsipareigojimų neįvykdymo tikimybės. Mažėjant reitingui kokybiško kredito rizikos vertinimo modelio tikimybės turi būti išsidėsčiusios didėjančiai. Tuo tikslu sudarytas modelis buvo modiikuojamas, t. y. pakeisti balų intervalai, pagal kuriuos priskiriamas reitingas. Įmonių reitingavimas vykdomas dviem etapais. Įmonės santykiniai inansiniai rodikliai analizuojami atnaujintu logistinės regresijos modeliu. Pagal gautą individualią įmonės inansinių įsipareigojimų neįvykdymo galimybę įmonė klasiikuojama į patikimų arba nepatikimų įmonių grupę. Nepatikimų įmonių grupės įmonėms priskiriamas reitingas D1. Patikimų įmonių grupės įmonės analizuojamos vertinimo balais metodu. Pagal šios analizės rezultatus įmonei priskiriamas kredito reitingas nuo AAA iki D2. Siekiant padidinti modelio veiksmingumą, įmonėms, kurios logistinės regresijos modeliu klasiikuotos kaip patikimos, tačiau vertinimo balais metodu jų rezultatai žemiausi, priskiriamas reitingas D2, reiškiantis įmonės inansinių įsipareigojimų neįvykdymą. Nepatikimos įmonės gali būti klasiikuojamos ir į vieną grupę. Todėl reitingus D1 ir D2 galima sujungti į reitingą D. Tyrimas parodė, kad sudarytas įmonių kredito rizikos vertinimo modelis gali būti veiksminga priemonė Lietuvoje veikiančių įmonių kredito rizikai vertinti. Raktažodžiai: bankas, kredito reitingai, kredito rizika, klasiikavimo metodai. The article has been reviewed. Received in September, 21; accepted in April, 211. - 133 -