Loss Given Default models for UK retail credit cards


Tony Bellotti and Jonathan Crook
Credit Research Centre, University of Edinburgh Business School
William Robertson Building, 50 George Square, Edinburgh EH8 9JY
19 March 2009

Abstract

Loss Given Default is an important measure of credit loss used by financial institutions to compute risk within credit portfolios, expected loss on individual loans and capital requirements. The Basel II Capital Accord gives banks the opportunity to calculate their own estimates of Loss Given Default. Based on UK data for major retail credit cards, we build several models of Loss Given Default based on account level data including Tobit, a decision tree model, a Beta and fractional logit transformation. However, we find that Ordinary Least Squares models with macroeconomic variables perform best for forecast of Loss Given Default at account and portfolio level on independent holdout data sets. The inclusion of macroeconomic conditions in the model is important since it provides a means to model Loss Given Default in downturn conditions as required by Basel II and enables stress testing.

Key Words: Loss given default; retail credit; regression; model comparison; macroeconomic; forecast.
1. Introduction

Loss Given Default (LGD) is the loss incurred by a financial institution when an obligor defaults on a loan, given as the fraction of exposure at default (EAD) unpaid after some period of time. It is usual for LGD to have a value between 0 and 1 where 0 means the balance is fully recovered and 1 means total loss of EAD.

LGD is an important value that banks need to estimate accurately for several reasons. Firstly, it can be used along with probability of default (PD) and EAD to estimate expected financial loss. Secondly, a forecast of LGD for an individual can help determine the collection policy to be used for that individual following default. For example, if high LGD is expected, then more effort may be employed to help reduce this loss. Thirdly, an estimate of LGD, and therefore portfolio financial risk, is an integral part of the operational calculation of capital requirements to cover credit loss during extreme economic conditions. The Basel II Capital Accord [2006] allows banks the opportunity to estimate LGD using their own models with the advanced internal ratings based (IRB) approach.

In this paper we focus on modelling and forecasting LGD for UK retail credit cards based on account variables (AVs) and also the inclusion of macroeconomic variables (MVs). Our prior expectation is that as interest rates rise, so the cost of mortgage and other debt increases, making it more difficult for an obligor to repay outstanding credit card balances and so increasing the mean LGD. Equally, an increase in unemployment level means more people find themselves in circumstances where they cannot repay credit and this would also increase the mean LGD. On the other hand, an increase in earnings would mean more people have more income available to pay off debt and would therefore decrease mean LGD. In addition it is possible that some defaulters are more likely to be less able to repay than others when the state of the economy changes.
For example, those who are unemployed at the time of credit card application may be particularly sensitive to interest rate increases, as may home owners. Similarly borrowers with a higher default balance may be particularly sensitive to increases in interest rates. For this reason we also consider interactions between MVs and account data.

We consider four key research questions:
Q1. Which credit card application and default variables are the key drivers of retail LGD?
Q2. What is the best modelling approach for retail LGD?
Q3. How well do the models perform at forecasting LGD?
Q4. Does the inclusion of MVs lead to improved models of retail LGD?

We investigate these questions by building several alternative models of LGD. We find that there are many important drivers of LGD taken from application details and default information. Given that the distribution of LGD is a bimodal U-shape, we consider a Tobit model and a decision tree model along with various transformations of the dependent variable. Although LGD is not easy to model and poor model fit is typical, we find that models can be built which provide improved estimates of LGD and good forecasts of mean LGD across a portfolio of accounts. Surprisingly, we find that the best forecasting model is Ordinary Least Squares (OLS) regression.

Economic conditions are included as values of MVs for bank interest rate, unemployment level and earnings growth at time of account default. We find that the first two MVs are statistically significant explanatory variables that give rise to improved forecasts of LGD in holdout tests at both an account and portfolio level. Building LGD models with MVs also addresses the Basel II requirement to estimate downturn LGD since stressed values of MVs can be used in the model to forecast LGD during poor economic conditions. We demonstrate how this can be done by stressing interest rate values.

The modelling and forecast of LGD for retail credit using macroeconomic conditions is a new area of study. There is an extensive literature regarding LGD models for corporate loans (see e.g. Altman et al [2005]). However, there is less about forecasting LGD. An exception is Gupton and Stein [2005] who describe a predictive LGD model for corporate loans using Moody's KMV Losscalc software.
There is also very little literature regarding retail credit LGD, even though this is a large financial market: total lending in the UK consumer credit market reached over £1.4 trillion in 2009 [source: Bank of England]. Grippa et al [2005] publish empirical LGD models for a sample of 20,724 Italian accounts that includes small businesses along with households. They observe differences in LGD and recovery periods across different geographic regions and different recovery channels. They also conducted a
multivariate analysis that showed a statistically significant negative relationship between the presence of collateral or personal guarantee and LGD, and a positive relationship with size of loan. However, the range of variables used is far more limited than would be available to a financial institution that has made credit card or personal loans and the study did not attempt to forecast LGD.

Dermine and de Carvalho [2005] model LGD for loans to small and medium sized firms in Portugal. They apply mortality analysis and include annual GDP growth as an explanatory variable. However, they found that GDP growth was not significant. They suggest that this may be due to the fact that the period of analysis, , did not include a significant recession. We may also note that their training sample size (374 defaults) was relatively small and may not have been large enough for a significant relationship between the economy and LGD to be discovered.

Querci [2005] provides an LGD model for loans to small businesses and individuals by an Italian bank. This study shows the importance of regional differences on LGD variation but does not include time varying macroeconomic conditions. Figlewski et al [2007] model the effect of macroeconomic factors on corporate default with a detailed study of numerous economic conditions including unemployment level, inflation, GDP and a production index. They found that many of these MVs were significant explanatory variables. Saurina and Trucharte [2007] model PD for retail mortgage portfolios in Spain. They show that the GDP growth rate is a significant cyclical variable in the regression and has a negative sign as we would expect. That is, during downturns (low GDP growth), PD increases. However, they also include an interest rate variable and, although it has a positive sign and is significant, they report that including interest rates does not improve accuracy.
The novelties of our paper are that unlike published work (1) we consider forecasts of LGD for retail credit cards, (2) we report results of model comparison, (3) we include macroeconomic conditions in our models and (4) we do this using a very large sample across several different credit card products.

In section 2 we describe our modelling and performance assessment methods. In section 3 we discuss the application and macroeconomic data used. In section 4 we provide model comparisons and test results along with a description of an explanatory model with MVs. Finally in Section 5 we provide some conclusions and discussion.
2. Method

We consider several models as combinations of different variables, modelling frameworks and data transformations.

2.1 Models

In general, for retail credit, there are five categories of circumstances that will affect the amount an individual repays on a defaulted loan and can be used to build models of LGD:

(1) individual details, some of which can be collected at time of application such as age, income, employment, housing status and address;
(2) account information at default: date or age of account at default and outstanding balance;
(3) changes in personal circumstances of an obligor over time;
(4) macroeconomic or business conditions on date of default, or possibly with a lag or lead on date of default;
(5) operational decisions made by the bank, such as the level of risk they were willing to accept on the credit product and the process they use to follow up bad debt.

Of these, the richest source of explanatory variables we have is the information provided at time of application for credit along with the credit bureau score collected by the bank at time of application. This is data falling into category (1). We also include category (2) data, account information at default. Including this data implies the model is conditional on default. It is possible to build models unconditionally but this is outside the scope of this paper. It is difficult for a lender to extract data in category (3). It cannot easily keep track of an individual's employment status or, even less so, his or her personal difficulties, such as divorce or illness, that may lead him or her to be unable to fully repay debt. It is possible to use account behaviour data or a behavioural score but we do not do this in this study since such information is not homogeneous within the data we have. It is understood that LGD is likely to be time dependent, varying over the business cycle [Schuermann 2005]. Therefore we include macroeconomic conditions (4).
Including a bank's operational decisions (5) for each credit card product over time could also be fruitful. However, this information was not available for our study.
To answer Q4, we build and compare models with and without MVs. The question can be further explored by contrasting with models that also include interaction terms between AVs and MVs. To help answer Q3 we should contrast our models with a simple model built with no variables. Within the context of OLS this simple model effectively forecasts LGD to be the mean value taken from the training data set. If the more complex models give improved forecasts over this simple model then the variables we include provide useful information for LGD estimation. Therefore we have four model structures based on including different explanatory variables:

Simple: no covariates in the model.
AV: account variables only.
AV&MV: account and macroeconomic variables.
AV&MV with interactions: also includes interaction terms between AVs and MVs. It is not feasible to include all interaction terms so variable selection is used as described later in Section 3.3.

We do not restrict our models to OLS but also consider three alternatives. Tobit and a decision tree model are considered since they have a structure better suited to the bimodal nature of LGD. Least absolute value regression is also considered since absolute error may be a more sensible criterion for estimating LGD than least square error.

Since LGD follows a truncated distribution with a large number of cases at the extreme values 0 and 1, the Tobit model may be better suited since it takes account of bounds on a dependent variable $y_i$ through truncation. The two-tailed Tobit model uses a latent variable $y_i^*$ to model boundary cases such that

$$y_i = \min\left(1, \max\left(0, y_i^*\right)\right) \quad \text{where} \quad y_i^* = \beta^T x_i + \varepsilon_i.$$

Assuming the distribution of the residuals conditional on $x_i$ is normal, the following log-likelihood function is constructed for maximum likelihood estimation of $\beta$ and variance of residuals $\sigma^2$:

$$\log L(\beta, \sigma) = \sum_{0 < y_i < 1} \log\left[\frac{1}{\sigma}\,\phi\left(\frac{y_i - \beta^T x_i}{\sigma}\right)\right] + \sum_{y_i = 0} \log\left[1 - \Phi\left(\frac{\beta^T x_i}{\sigma}\right)\right] + \sum_{y_i = 1} \log \Phi\left(\frac{\beta^T x_i - 1}{\sigma}\right)$$
where $\phi$ and $\Phi$ are the probability density and cumulative distribution functions for the standard normal distribution respectively. This likelihood function is constructed by considering the probabilities of the dependent variable being between the boundaries and also on each boundary separately [Greene 1997, pp ].

The decision tree model uses two logistic regression submodels to model the special cases for total loss and no loss, i.e. LGD = 1 and 0 respectively, as binary classification problems. Then, if 0<LGD<1, an OLS regression model is used. This decision tree model is illustrated in Figure 1. We may expect this decision tree to be good at modelling LGD due to the large number of boundary cases at 0 and 1 which allow us to naturally approach the problem as a hybrid of two classification problems and a regression problem. This approach is meaningful since we may suppose there are special conditions which would make a customer pay back the full amount of debt or pay back nothing, rather than just a portion. LGD is forecast for an account $i$ as the expected value given the three submodels, i.e.

$$(1 - p_{0i})\left(p_{1i} + (1 - p_{1i})\,L_i\right)$$

where $p_{0i}$ is the probability of LGD=0 for account $i$ computed from the second estimated logistic regression model, $p_{1i}$ is the probability of LGD=1 from the first estimated logistic regression model and $L_i$ is the estimate of LGD assuming the loss is fractional and computed from the regression model.

[Figure 1 diagram: "Is LGD=1?" True: LGD=1. False: "Is LGD=0?" True: LGD=0. False: regression of LGD for 0<LGD<1.]

Figure 1. Decision tree model. The first two binary decision boxes (white) are modelled by logistic regression, whilst the final decision box models LGD using OLS regression.
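As an illustration of the two model structures just described, the following minimal Python sketch evaluates the two-tailed Tobit log-likelihood for given linear predictors and combines the three decision tree submodels using the expression given in the text. The function names and argument layout are illustrative assumptions rather than the implementation used in the study, and no optimiser is included; in practice the log-likelihood would be maximised over $\beta$ and $\sigma$.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def tobit_loglik(y, xb, sigma):
    """Two-tailed Tobit log-likelihood for outcomes bounded at 0 and 1.

    y     : observed outcomes in [0, 1]
    xb    : linear predictors beta^T x_i, one per observation (hypothetical input)
    sigma : residual standard deviation, must be positive
    """
    total = 0.0
    for yi, mu in zip(y, xb):
        if yi <= 0.0:
            # Lower boundary: P(y* <= 0) = 1 - Phi(mu / sigma)
            total += math.log(1.0 - _STD_NORMAL.cdf(mu / sigma))
        elif yi >= 1.0:
            # Upper boundary: P(y* >= 1) = Phi((mu - 1) / sigma)
            total += math.log(_STD_NORMAL.cdf((mu - 1.0) / sigma))
        else:
            # Interior case: normal density of the latent variable
            total += math.log(_STD_NORMAL.pdf((yi - mu) / sigma) / sigma)
    return total


def tree_expected_lgd(p0, p1, frac_lgd):
    """Combine the three submodels with the expression given in the text.

    p0       : probability that LGD = 0 (logistic submodel)
    p1       : probability that LGD = 1 (logistic submodel)
    frac_lgd : OLS estimate of LGD assuming 0 < LGD < 1
    """
    return (1.0 - p0) * (p1 + (1.0 - p1) * frac_lgd)
```

For example, `tree_expected_lgd(0.0, 0.0, 0.4)` simply returns the fractional estimate 0.4, while a certain full recovery (`p0 = 1`) gives a forecast of 0. The sketch does not guard against probabilities so extreme that the logarithm of zero would be taken.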
As is conventional in the literature we model LGD in terms of recovery rate (RR) rather than LGD directly, where $RR = 1 - LGD$. Our working definition of RR is

$$RR = \frac{\text{sum of repayments made over a period } t \text{ following default}}{\text{outstanding balance at date of default}}.$$

The choice of recovery period is a business decision. We consider a recovery period of t=12 months following default. Often banks are interested in longer periods or want to estimate recovery at close of account, but as we discover, a 12 month model can be used to estimate for longer recovery periods.

It is possible to calculate LGD in alternative ways. For example, time after default could be an event, such as account charge off, rather than a fixed period; or the market value of the bad debt based on sale of the exposure in the market could be taken into account along with repayments; or administration costs of following up default may be included in the LGD calculation. However, these alternatives are beyond the scope of this paper. Loss of interest payments could also be included in the definition of LGD but this is not required by Basel II.

Since the distribution of RR is bimodal and U-shaped, we also consider modelling fractional logit, beta distribution and probit transformations of RR.

Fractional logit transformation:

$$T_{RR} = \log(RR) - \log(1 - RR).$$

The fractional logit model is particularly attractive since it deals specifically with response variables in the range 0 to 1, like RR, by transforming them into a larger range of values. It has been applied in several other econometric analyses [Papke & Wooldridge 1996] and was applied in particular by Dermine and Neto de Carvalho [2005] for modelling LGD for corporate loans.

Beta distribution transform:

$$T_{RR} = \Phi^{-1}\left(\mathrm{beta}(RR, \alpha, \beta, 0, 1)\right)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution and $\alpha$, $\beta$ are parameters estimated from training data using maximum likelihood estimation.
The Beta distribution is appealing since it is able to model bimodal variables with a U-shaped distribution over the interval 0 to 1. It is therefore particularly useful for RR
and tends to transform RR into an approximately normal distribution. Figure 2 shows a distribution of the Beta transformation of RR for the credit card default data we use. It illustrates how the transformation creates an approximately normal distribution between two extreme values. The Beta distribution has been used successfully in Moody's KMV Losscalc software package for modelling RR [Gupton & Stein 2005].

[Figure 2. Distribution of recovery rates following application of a beta distribution transformation. Horizontal axis: Beta(RR). Footnote 1: Axes values withheld for reasons of commercial confidentiality of data provider.]

Probit transformation:

$$T_{RR} = \Phi^{-1}\left(\frac{\left|\{i : R_i \le RR\}\right|}{n}\right)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution and $R_1, \ldots, R_n$ are observed RRs taken from training data. This transformation uses a nonparametric approach to transform RR into a normal distribution based on the empirical distribution in the training data.

2.2 Model assessment

For OLS we report adjusted $R^2$ for model fit. This is for two reasons. Firstly, the various models we consider are nested so inevitably those with additional covariates will give improved $R^2$ model fit. The adjusted $R^2$ compensates for the additional variables. Secondly, reporting $R^2$ is misleading since it does not give a fair comparison between different studies with different sample sizes.
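Returning briefly to the transformations of RR defined in Section 2.1, the fractional logit and probit transformations can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the clipping constant `eps` and the use of `n + 1` rather than `n` in the empirical distribution are assumptions introduced here so that boundary values of RR (exactly 0 or 1, or the largest training value) map to finite numbers. The Beta transformation is omitted because it needs a Beta cumulative distribution function, which the Python standard library does not provide.

```python
import math
from bisect import bisect_right
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def logit_transform(rr, eps=1e-6):
    """Fractional logit: log(RR) - log(1 - RR).

    RR is clipped to [eps, 1 - eps] first; eps is an assumption made here
    because the transformation is undefined at exactly 0 and 1.
    """
    rr = min(max(rr, eps), 1.0 - eps)
    return math.log(rr) - math.log(1.0 - rr)


def logit_inverse(t):
    """Map a transformed value back to a recovery rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def make_probit_transform(training_rr):
    """Build the empirical probit transformation from training recovery rates."""
    ordered = sorted(training_rr)
    n = len(ordered)

    def transform(rr):
        count = bisect_right(ordered, rr)  # number of training R_i <= rr
        # Assumption: divide by n + 1 and floor the count at 1 so the
        # argument of the inverse normal CDF stays strictly inside (0, 1).
        fraction = max(count, 1) / (n + 1)
        return _STD_NORMAL.inv_cdf(fraction)

    return transform
```

When a model is fitted on a transformed value, the forecast has to be mapped back to RR before performance is measured, which is why `logit_inverse` is included.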
We report coefficient estimates for the OLS model with RR as dependent variable. However, since RR is not normally distributed, the error terms may not be normally distributed. Therefore conventional estimators of standard errors may be biased. Instead we use a bootstrap to construct distributions for the coefficient estimates [Kennedy 2003, section 4.6]. As Lam and Veall [2002] show, when OLS is used with non-normal, and in particular bimodal, distributed error terms, the bootstrap gives accurate estimates of confidence intervals where the usual analytic method fails.

We report the relative effect of each MV within the model. The coefficient estimates for MVs are multiplied by their standard deviation in the training data to derive standardized coefficient estimates. They show change in RR for a one standard deviation change in the covariate value and are therefore comparable and give an indication of the relative importance of each MV within the model. This approach was taken to study the effects of MVs on corporate default by Figlewski et al [2007]. Since our credit card data spans the period from 1999 to 2005, standard deviations for each MV are computed based on values within this period.

Holdout test procedure for forecasts

To test the effectiveness of the LGD model for forecasts, we use a holdout sample, testing on credit card default data that is independent and follows chronologically after the period of the training data used to build the models. This approach allows us to simulate the expected operational use of LGD models in retail credit when a financial institution may want to assess the LGD risk on a new batch of defaults based on the performance of past defaults. In detail, we select cohorts of test data sets consisting only of accounts that default in a particular quarter. For each of these cohorts we train only on default data available prior to that quarter.
Since we need to measure LGD after a period of t months following default, we need to ensure that dates of default in the training data are at least t months prior to the beginning of the test quarter. This test procedure is illustrated in Figure 3. So, for example, if our test set was 2003Q3 and we are considering LGD after 12 months, our training data would consist of all cases that default within the period from 1999Q1 to 2002Q2. To estimate robust models with MVs we should train over the whole business cycle, which is usually considered to last between 3 and 5 years. Therefore we consider taking training data sets with a minimum of 3 years of defaults. This then gives us 10 quarters of test set data from 2003Q1 to 2005Q2. Results across these independent test sets then form a time series of forecast results.

[Figure 3 diagram: time axis from 1999Q1 to 2006Q2 showing the training data set (e.g. 1999Q1 to 2002Q2) with its recovery period (12 months) and the test data set of one quarter (e.g. 2003Q3) with its recovery period (12 months).]

Figure 3. Illustration of hold out test procedure for one quarter of test data. The test data set holds all cases that default in the same financial quarter and the training data set comprises all cases that default and have full recovery period prior to the test quarter.

The accuracy of forecasts relative to the observed true values is measured at the account level by mean square error (MSE). However, we are also interested in how well the model is able to estimate the observed, or true, mean LGD or RR over a portfolio of accounts. Therefore, for each test set quarter we measure the difference between forecast and observed mean RR across all test cases. If this difference is greater than zero, then the model is generally overestimating RR, whereas when it is less than zero, it is underestimating RR. The closer the difference between forecast and observed RR is to zero, the better the estimate. The mean value of both the MSE and absolute value of difference in forecast and observed mean RR (abs diff RR) across the several test quarters are reported in order to get an aggregate measure of performance.

Since RR must be between 0 and 1, we truncate all forecasts of RR to fall within that range prior to measuring performance for all cases and all models. Also, to generate comparable results, performance is measured between predicted and observed RR, regardless of which transformation of RR is modelled. Therefore, if a transformation of RR is used, the inverse transform is applied to extract predicted RR and performance is measured on this rather than the transformed value. This is reasonable
since ultimately the value we want to model is RR and the transform should merely be a means to that end.

Forecasting 24 month LGD using a 12 month LGD model

For these experiments, we will be assessing models of LGD after 12 months. However, financial institutions often follow a bad debt over several years, so they are also interested in forecasting LGD over a longer period of say 24 or 48 months. For this reason, we also assess the 12 month model to see how well it forecasts for longer periods in comparison with a 24 month model. If it does well, then this implies that a single LGD model may be used to forecast for any LGD period. A 12 month LGD model can be used to forecast 24 month LGD by calibrating the 12 month forecasts to 24 month forecasts. A simple way to do this is to use OLS regression on the 24 month LGD training data to build a linear model of 24 month RR with an intercept and 12 month RR as the explanatory variable. This model is then used to convert 12 month to 24 month forecasts. We use the same training and testing scheme as is set out above. For 24 month LGD we use 6 quarters of test data from 2003Q1 to 2004Q2.

3. Data

3.1 Application data

For this study we have available a data set consisting of over 55,000 credit card accounts in default over the period 1999 to 2005 for customers across the whole of the UK. Account holders are expected to make a minimum payment of outstanding balance each month. We define default as a case where a credit card holder has failed to make minimum payments for three consecutive months or more. This is a typical definition of default for credit cards [Thomas et al 2002, p.123] and is in line with the default definition given by the Basel II Accord [BCBS 2006]. The data consists of four different credit card products which are a selection from those offered by a financial institution. As is typical for LGD, its distribution in our data set is between 0 and 1 and approximately U-shaped (footnote 2).
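The holdout scheme and performance measures of Section 2.2 can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions: quarters are represented as `(year, quarter)` pairs, the recovery period is assumed to be a whole number of quarters, and the function names are hypothetical rather than taken from the study.

```python
def last_training_quarter(test_year, test_quarter, recovery_months=12):
    """Latest default quarter usable for training, as (year, quarter).

    Training defaults need a full recovery period before the test quarter
    begins. Assumes recovery_months is a multiple of 3.
    """
    index = test_year * 4 + (test_quarter - 1)  # running quarter index
    index -= recovery_months // 3 + 1           # step back past the recovery period
    return index // 4, index % 4 + 1


def truncate(rr):
    """Force a forecast recovery rate into the range [0, 1]."""
    return min(1.0, max(0.0, rr))


def forecast_performance(forecast_rr, observed_rr):
    """Return (MSE, forecast mean RR minus observed mean RR) for one test quarter.

    Forecasts are truncated to [0, 1] before either measure is computed.
    A positive difference means the model overestimates mean RR.
    """
    truncated = [truncate(f) for f in forecast_rr]
    n = len(truncated)
    mse = sum((f - o) ** 2 for f, o in zip(truncated, observed_rr)) / n
    diff = sum(truncated) / n - sum(observed_rr) / n
    return mse, diff
```

With a test quarter of 2003Q3 and a 12 month recovery period, `last_training_quarter(2003, 3)` gives `(2002, 2)`, matching the 1999Q1 to 2002Q2 training window in the example of Section 2.2. The aggregate measures reported in the paper would then be the mean of the MSE values and the mean of the absolute differences across test quarters.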
The calculation of LGD should ideally include administration costs for managing and implementing a collection procedure following account default. Unfortunately, this information was not available for the credit card data we used.

Footnote 2: For reasons of commercial confidentiality, to protect our data supplier, we cannot reveal exact distributions or values of LGD for our data, nor descriptive details of other variables. This is usual in this research area when large portfolios of live financial data are being studied. Nevertheless since our focus is on reporting forecast performance, this should not be a hindrance to scientific reporting.

The credit card data we use has many details extracted at time of application. These include the applicant's housing and employment status, age, income, total number of known credit cards and length of relationship with bank (time with bank). A credit bureau score is also provided and is from the same source and so is homogeneous across products. Demographic information is provided to classify the areas of residence of the credit card holder at time of application. The demographic information was from the same source for all credit card products and was coded into four broad categories: (1) council or poor housing, (2) rural, (3) suburban or wealthy area and (4) others.

Additionally, information at time of default is also included in the data. This consists of balance outstanding at default and age of credit card account. We include balance at default since there is strong evidence from past studies that it is an important effect [Grippa et al 2005; Dermine and de Carvalho 2006] and it makes sense to include it operationally, especially if it improves forecasts of LGD. Table 1 shows the full list of variables used.

Some variables, time with bank, income and age, have a small percentage of missing values (less than 6% of accounts). For each of these variables, we code missing values to 0 and create a dummy variable to indicate a missing value to capture the mean value amongst accounts with missing values.

Several studies have found that PD and LGD are positively correlated (see Rösche and Scheule (2006)). We also found this to be the case in our data.
Nevertheless, PD is not included in our models since it is effectively represented by including the application variables that are usually used to model it, along with macroeconomic variables that may explain the joint systematic risk to both PD and LGD (Altman et al 2005).

Any single credit card portfolio is liable to have operational effects that will alter overall risk over time for that specific product, such as changes in cutoffs on credit score when accepting applications. This may lead to idiosyncratic links to economic conditions and therefore poorer models using MVs. By combining data across several products the impact of these idiosyncratic effects will be reduced and changes in risk over time are more likely to be linked to more objective effects such as the economy. Additionally, combining several products into one data set will increase the training
set size. These two factors should lead to stronger MV models and we test this hypothesis by running experiments for all products combined and also for each product separately. When all products are included in the data set, a dummy variable is used to indicate which product the account belongs to, in order to model different levels of RR between products.

3.2 Macroeconomic variables

We consider these three series of macroeconomic data for the UK which we believe would have a strong direct effect on mean LGD for UK retail credit cards:

Selected UK retail banks base interest rates.
UK unemployment level: measured as thousands of adults (16+) unemployed.
(Earnings) UK earnings index (2000 = 100) for the whole economy including bonuses as a ratio of the retail price index.

These are all available from the UK Office for National Statistics as monthly data. We use non-seasonally adjusted data for earnings since we expect that seasonal changes in the economy may have some effect on abilities to repay. We would also have preferred to use non-seasonally adjusted data for the unemployment level but unfortunately this was not available. GDP growth for the UK is a common indicator of economic conditions but we have not included it since it is not available as monthly data, which is the granularity that lenders typically require and therefore the granularity we require for our models.

The MVs are included for each case at the date of default but it may be that if there is a relationship between MVs and LGD then this is lagged or led. For example, changes in interest rates may, in general, affect ability to pay several months later. For this reason, we also consider LGD models with MVs lagged or led by 6 months. Each MV has a time trend: interest rates and unemployment level are generally falling over the period whilst real earnings are steadily increasing.
Indeed, earnings generally increase exponentially with time; therefore we include them in our model as growth in log earnings over 12 months to remove this obvious time trend. The three MV time series we use are shown in Figure 4. We ensure that any model fit of MVs as explanatory variables is not simply because they follow a time trend that matches a trend in RR by including date of default explicitly in the models. If a MV
is a good explanatory variable simply because of a time trend, then the inclusion of date of default should weaken its effect within the model. Including date of default in the AV model also allows us to test whether improvements in forecasts are simply due to a time trend rather than specifically economic conditions.

[Figure 4. UK macroeconomic data. Monthly series from January 1999: interest rate and log earnings growth (rescaled) on one axis, unemployment level on the other.]

Different effects on LGD over time could be captured by using dummy variables for cohorts either at the yearly or quarterly level. There are two reasons we do not do this however. Firstly, the freedom gained when using any time dummies could simply absorb the effect we expect the MVs to explain. Secondly, although fine for explanatory models, it is not clear how such time dummies could be used for forecasting on a holdout sample since the dummy variables for the period of the holdout cohort will necessarily have the value 0 for all cases in the training data and so will not have a coefficient estimate.

High correlation between MVs is a potential problem since this could lead to multicollinearity within the LGD model and therefore distort parameter estimates. We can test for multicollinearity by measuring the variance inflation factor (VIF) given by $(1 - R^2)^{-1}$ when each MV is regressed on all other model covariates (Kennedy 2003). A high VIF indicates multicollinearity and a VIF greater than 5 is an indication that there may be a problem.
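The VIF calculation can be illustrated with a minimal Python sketch. As a simplifying assumption made here, the auxiliary regression uses a single other covariate, so that $R^2$ reduces to the squared sample correlation; in the paper each MV is regressed on all other model covariates, which would require a multiple regression. The function names are illustrative.

```python
def r_squared_simple(x, y):
    """R^2 from regressing y on x with an intercept (single-regressor case)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def vif(r_squared):
    """Variance inflation factor (1 - R^2)^(-1); undefined when R^2 equals 1."""
    return 1.0 / (1.0 - r_squared)
```

Under the rule of thumb quoted above, a VIF greater than 5 corresponds to an auxiliary $R^2$ above 0.8.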
3.3 Inclusion of interaction terms

Since there are many possible combinations of variables to form interaction terms, the number included in the model is controlled using forward variable selection. All AVs are included in the model but an iterative process is used to include MVs and interaction terms between AVs and MVs. At each step, each of the outstanding MVs and interaction terms not already in the model are added separately. The term that gives the greatest improvement in a fit criterion is added to the model. The process is repeated until no new terms are found that improve fit. There are several possible fit criteria that could be used and it is common to use an F-test. However, since we are interested in forecasting, we use Akaike's information criterion (AIC) [Akaike 1973]. This has the advantage that it takes account of the parameter space of the model and discourages complex models with large numbers of variables. In turn, this discourages overfitting to the training data set. We approximate AIC by

$$n \ln(\mathrm{MSE}) + 2p$$

where $n$ and $p$ are number of observations and number of parameters in the model respectively and MSE is the mean square error for observations in the training data. We find using the AIC criterion gives better forecasts than using the standard F-test. Further discussion of variable selection methods and use of AIC for predictive models is given by Miller [1990]. The variable selection procedure we use is further constrained so that for each interaction term included, its constitutive terms are also automatically included [Brambor 2005].

4 Results

Section 4.1 describes forecast performance for comparison of the different models. Then section 4.2 describes the best performing model for forecasts and its statistically significant explanatory variables. Models for longer recovery periods and stress tests are considered in further sections.

4.1 Model comparisons

Table 1 shows forecast results for different models.
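For reference, the forward selection procedure of Section 3.3 can be sketched in Python as follows. This is a minimal sketch, not the authors' code: `fit_mse` is a hypothetical callable, supplied by the user, that fits a model with the given terms on the training data and returns its training MSE. Because a lower AIC is better, the loop adds the term that lowers the approximate AIC most and stops when no term lowers it. The constraint that constitutive terms accompany each interaction term is not implemented here.

```python
import math


def approx_aic(mse, n, p):
    """AIC approximation from the text: n * ln(MSE) + 2p."""
    return n * math.log(mse) + 2 * p


def forward_select(base_terms, candidates, fit_mse, n):
    """Greedy forward selection of candidate terms by approximate AIC.

    base_terms : terms always in the model (for example all AVs)
    candidates : MVs and interaction terms that may be added
    fit_mse    : hypothetical callable, fit_mse(terms) -> training MSE
    n          : number of training observations
    """
    selected = list(base_terms)
    remaining = list(candidates)
    # Parameter count is the number of terms plus one for the intercept.
    best = approx_aic(fit_mse(selected), n, len(selected) + 1)
    while remaining:
        scored = [
            (approx_aic(fit_mse(selected + [term]), n, len(selected) + 2), term)
            for term in remaining
        ]
        aic, term = min(scored)
        if aic >= best:  # no remaining term improves the criterion
            break
        selected.append(term)
        remaining.remove(term)
        best = aic
    return selected
```

The penalty of 2 per parameter means a term is only added when it reduces training MSE by enough to offset the extra parameter, which is how the criterion discourages overfitting.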
Focussing on the first section, which shows results for a 12 month recovery period and for all products combined, it is clear that the standard OLS model with both AVs & MVs performs best for both measures of forecast performance. The more complex models (Tobit, decision tree and least absolute value regression) give worse performance and so does using any of the transformations of RR. We may expect OLS to do well for the MSE measure, but we may expect that one of the alternative models would be better when estimating mean RR over the portfolio. In particular we may expect least absolute value regression to be better since it uses a linear loss function. However, OLS gives the best forecasts of mean RR. This is a robust result which was obtained with many alternative experiments. The inclusion of interaction terms gives slightly worse forecasts than the AV&MV model, so including interaction terms does not provide any benefit in estimating LGD. It is also notable that the simple model that effectively forecasts the mean RR from the training data set does well and outperforms many of the more complex models. Nevertheless using AVs along with MVs shows considerable gain in performance over the simple model.

We also consider MVs with lags or leads of 6 months. We consider lags since it is possible that the effect of the economy on obligor behaviour may be delayed. We consider a lead since this would give the values of MVs midway through the recovery period. Table 1 shows that using lags or leads on MVs gives worse forecasts than using MV values at time of default and is almost as bad as not including MVs at all.

The second section of Table 1 shows results for separate products. It is clear that the AV&MV model does not do consistently well for separate products and rarely as well as when all products are combined. In many cases the simple model with no explanatory variables is the best performing for forecasting mean RR. This is due partly to operational peculiarities within each portfolio that may have a spurious link to macroeconomic movements and also partly to reduced training sample size. These results suggest that better LGD models can be built when data from different products are combined.
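The two forecast measures used for these comparisons can be sketched as follows, assuming they are computed per test quarter and then averaged across quarters: account-level MSE, and the absolute difference between observed and forecast mean RR. The function names and the two tiny quarters of data are hypothetical and made up for illustration.

```python
def mse(observed, forecast):
    """Mean square error between observed and forecast RR at account level."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed)


def abs_mean_rr_diff(observed, forecast):
    """Absolute difference between observed and forecast mean RR for one quarter."""
    return abs(sum(observed) / len(observed) - sum(forecast) / len(forecast))


def aggregate(quarters):
    """Average both measures over test quarters.

    quarters maps a quarter label to a pair (observed RR list, forecast RR list).
    """
    mses = [mse(obs, fc) for obs, fc in quarters.values()]
    diffs = [abs_mean_rr_diff(obs, fc) for obs, fc in quarters.values()]
    return sum(mses) / len(mses), sum(diffs) / len(diffs)


# Made-up observed and forecast recovery rates for two test quarters.
quarters = {
    "2003Q1": ([0.0, 1.0, 0.3, 0.6], [0.4, 0.6, 0.4, 0.5]),
    "2003Q2": ([0.2, 0.9, 0.1, 0.5], [0.3, 0.5, 0.4, 0.5]),
}
mean_mse, mean_abs_diff = aggregate(quarters)
print(round(mean_mse, 4), round(mean_abs_diff, 4))
```

The two measures can disagree: a model can have a large account-level MSE while its errors cancel at portfolio level, which is why both are reported.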
Table 1. Aggregate forecast results for different models. Abs RR diff is the absolute difference between observed and forecast mean RR over each test quarter. Results are given as mean values across the series of test quarters from 2003Q1 to 2005Q2 or 2004Q2 for 12 or 24 month recovery periods respectively. Figures in bold show best results for each recovery period and product group.

[Table 1 layout. Columns: recovery period, product, RR transform, model, explanatory variables, MSE, Abs RR diff. Sections: (1) 12 month recovery period, all products; (2) 12 month recovery period, separate products 1 to 4; (3) 24 month recovery period, all products. Models listed: OLS regression, Tobit, decision tree, least absolute value regression, and an OLS model built for the 12 month recovery period. RR transforms listed: none, Beta, Logit, Probit. Explanatory variable sets listed: None (simple), AV, AV & MV, AV & MV & interaction terms, AV & MV lag, AV & MV lead.]

Figures 5 and 6 show forecast time series results for the first three models listed in Table 1 in more detail for each test quarter. They show that the AV&MV model performs consistently better over time. Figure 5 shows that MSE is lower for AV&MV than either the simple or AV model and improves over time, possibly a result of having training data over a longer period of the business cycle to model the MV effects. Figure 6 shows that overall the AV&MV model forecasts mean RR much more closely than the other models, as is evidenced by how close the difference between forecast and observed mean RR is to zero. In contrast the AV model is consistently overestimating RR over time. In 2004Q2, the simple model achieves the best result but this is serendipitous since its forecasts are simply moving from underestimating to overestimating RR at that time. These results show that including MVs is important to improve LGD forecast results. It should be noted, however, that in our study at least 3 years of training data is required. We found that using less than this led to unstable models with MVs that occasionally over- or underestimate LGD by extreme amounts.

[Figure 5. MSE of forecasts for each test data set from 2003Q1 to 2005Q2 for three OLS models with different variables: simple (no variables), AV and AV&MV.]

[Figure 6. Difference between forecast and observed mean RR. Results relate to each test data set from 2003Q1 to 2005Q2 for three OLS models with different variables: simple (no variables), AV and AV&MV, with a zero line for reference.]
4.2 Explanatory model

Table 1 shows that the AV&MV model using OLS regression was the best performing forecaster, so we describe this model estimate in further detail. We report model fit results for LGD models built on all data from 1999 to [...]. Including MVs in the LGD model improves fit to training data and is statistically significant. The AV model has an adjusted $R^2$ model fit of [...]. When MVs are included this increases to [...]. When interaction terms between MVs and application variables are also included using forward selection, the adjusted $R^2$ is [...]. This small increase indicates that adding interactions does not give a noticeable improvement, reinforcing the results observed for forecasting.

Table 2 shows coefficient estimates using OLS regression with the AV&MV model. Since the error residuals are non-normal we have used bootstrap to compute statistical significance. The reported p-values are from a normal-based distribution imposed on bootstrap coefficient estimates. This is reasonable since we found that the 95% confidence intervals for the normal-based distributions closely matched those for the empirical percentile distribution, each sharing never less than 92% of the other's range.

Many of the model variables are statistically significant at a 0.01 level. In particular, housing status is important, with council and private tenants having generally lower RR than home owners; the longer the time that the customer has been with the bank (time with bank) and the longer the individual has held the credit card prior to default (time on books at default) both tend to increase RR; individuals with higher income also tend to have higher RR. All these are indicators of customer stability which we would expect to give lower risk. Higher credit bureau scores tend to give higher RR, which again shows that individuals with expected low credit risk tend to pay back more of their bad debt.
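The bootstrap significance calculation mentioned for Table 2 can be sketched as follows for a single regression slope. The details are assumptions, since the text does not spell them out: cases are resampled with replacement, the standard error is taken as the standard deviation of the bootstrap estimates, the p-value is two-sided from a standard normal applied to estimate / standard error, and the normal-based interval is compared with the empirical percentile interval. The function names and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def bootstrap_slope(x, y, repetitions=1000, seed=0):
    """Bootstrap standard error, normal-based p-value and two 95% intervals."""
    rng = random.Random(seed)
    n = len(x)
    estimate = ols_slope(x, y)
    draws = []
    for _ in range(repetitions):
        # Resample cases (pairs) with replacement and refit.
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    se = stdev(draws)
    z = estimate / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    draws.sort()
    percentile_ci = (draws[int(0.025 * repetitions)],
                     draws[int(0.975 * repetitions) - 1])
    normal_ci = (estimate - 1.96 * se, estimate + 1.96 * se)
    return {"estimate": estimate, "se": se, "z": z, "p_value": p_value,
            "normal_ci": normal_ci, "percentile_ci": percentile_ci}


# Simulated data with a weak negative effect and noisy residuals.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.2 - 0.05 * xi + rng.gauss(0, 0.3) for xi in x]
result = bootstrap_slope(x, y)
for key, value in result.items():
    print(key, value)
```

Comparing `normal_ci` with `percentile_ci` is one way to judge whether imposing a normal distribution on the bootstrap estimates is reasonable for a given coefficient.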
Also, the size of the initial balance at default has a negative effect on RR, which is what we would expect since higher outstanding debt is more difficult to pay back. We note a positive correlation between default balance and income, which suggests that people with high income tend to build up larger balances on their credit card. Indeed if we remove default balance from the model the sign on income becomes negative since it becomes a surrogate for the missing default balance variable. However, when both income and default balance are in the model together then the sign on income becomes positive, which is what we would expect since the availability of higher income implies a greater capacity to repay the outstanding balance. Employment status has less impact, although we see that home makers tend to have lower RR. Demographic information was important, with those living in areas classified as council or poor housing tending to have lower RR than those in rural, suburban or wealthy areas.

Table 2. Coefficient estimates for OLS recovery rate model. Training data set for this AV&MV model included all available cases from 1999 to [...]. Standard errors and p-values are computed with bootstrap estimation with 1,000 repetitions. Values in italics show standardized estimates for MVs. Dummy variables taking the value 0 or 1, for false and true respectively, are indicated by (D).

[Table 2 layout. Columns: Variable; Coefficient estimate; Standard error; z; P>|z|. Variables listed:
Intercept
Home status (D): Council tenant; Private tenant; Term time accommodation; Other; Lives with parents. Excluded category: Own home
Time with bank
No data on time with bank (D)
Income (log)
Income unknown (D)
Number of cards
Years at current address
Employment status (D): Home maker; Unemployed; Retired; Part time; Self employed; Other; Student. Excluded category: Employed
Age at default
Age unknown (D)
Credit bureau score
Demographic group (D): Council or poor housing area; Rural area; Suburban or wealthy area. Excluded category: Other
Credit card product (D): Excluded category: 4
Time on books at default (months)
Balance at default (log)
Date of default
Macroeconomic variables: Bank interest rates; Unemployment level; Earnings growth (log), each with a standardized estimate]

Table 2 shows that coefficient estimates for MVs have the expected signs. That is, the parameter estimate for interest rates is negative, meaning that higher interest rates at the time of default tend to give lower RR. Similarly, higher unemployment levels are also linked to lower RR. However, higher earnings growth, year-on-year, leads to increased RR, which suggests earnings growth leads to better recoveries. Bank interest rates and unemployment level are both statistically significant at a 0.01 level, although earnings growth is not statistically significant in the model. The coefficient estimate for date of default is small and not statistically significant, which implies that the effect of the MVs is not due to a simple time trend. Date of default is a stronger explanatory variable in the AV model and is statistically significant at a 0.01 level. Nevertheless, the results in Table 1, comparing AV and AV&MV, show that the time trend is not sufficient to explain the improved performance given by the model with MVs.

Table 2 also shows standardized parameter estimates for MVs. These give an indication of the relative size of effect of each MV. Bank interest rate clearly has the largest magnitude, with unemployment level having less than half the interest rate effect. Additionally we find that the VIF for any of the MVs when regressed on all other covariates in model (1) was always less than 2, which is sufficiently small that we should not expect that the results are affected by multicollinearity. Since including interaction terms did not improve forecasts or model fit we do not report interaction terms in the explanatory model.

4.3 Forecasting 24 month LGD with a 12 month LGD model

We test whether it is possible to use a model built for a 12 month recovery period to forecast for a 24 month period.
Following the procedure given in section [...] we get RR (24 months) = [...] RR (12 months) with adjusted $R^2$ = [...]. This model always gives a higher estimated RR after 24 months than after 12 months. This is intuitively correct since we would normally expect RR for an individual not to decrease over time. This model is used to compare the AV&MV model built on the 12 month period with one built on the 24 month period using the procedure described in Section [...]. Results are shown in the third section of Table 1. The 12 month MV model outperforms the 24 month model for forecasts of 24 month RR for both reported forecast measures. However, this is natural since the recovery period buffer between training and test data (see Figure 2) implies that the 12 month model is built from more recent data. These results indicate that working with a 12 month LGD model is sufficient since the same model can also be used to model longer periods. However, with further investigation, we expect that some hybrid model combining the 12 and 24 month trained LGD models would be the most effective.

4.4 Stress test using interest rates

We can use LGD models with MVs for stress testing by replacing values of MVs by stressed values. Therefore they can be used to model particular scenarios or as part of a larger simulation based on a distribution of past macroeconomic conditions. We conduct a univariate stress test by considering the effect of interest rate changes on the test data set in 2005Q2, which is the last quarter of test data available from our data set. Figure 4 shows that for the training period, extreme values of interest rate are 6% and 3.5%. When these values are used in the AV&MV model, the forecast mean RR for the test quarter changes by -17% and +28% respectively. This shows the shift in overall RR for the worst and best cases. A downturn RR 17% less than during normal or good conditions is not implausible.

5. Conclusion

In the introduction we posed four main research questions. We discuss conclusions relating to each of these questions in turn.

Q1. Which credit card application and default variables are the key drivers of retail LGD?

Our experiments have shown that application variables can be used to model LGD. In particular, Table 2 shows that home status, time with bank and credit bureau score are strong explanatory variables.
Additionally, we found that income and balance at default form a joint effect. The negative correlation of default balance with RR matches the findings of Dermine and Neto de Carvalho [2005] that size of loan is a significant explanatory variable for RR. We found that default balance contributes to forecasts of LGD since when it is removed from the AV&MV model, forecast performance is worse. Other variables at time of default such as age of obligor and age of account also influence LGD. Interestingly, age has a positive effect on LGD (negative on RR) and we found the linear relation between age and LGD remained even when age was replaced by several age categories using dummy variables. This is surprising since we would normally expect risk to reduce with maturity. That is, for PD models, typically the positive effect of age peaks in the mid-30s.

Q2. What is the best modelling approach for retail LGD?

Despite trying several different combinations of variables, models such as Tobit and a decision tree, and various transformations of the dependent variable, all of which should in theory be good models for bimodal LGD, it turns out the best forecast model in our experiments was simple OLS. Why this may be so is unclear, although we conjecture that since LGD is difficult to model, with poor model fit, regression forecasts tend to fall in a narrow range away from the boundary cases of 0 and 1, and therefore models dealing carefully with the boundary cases are superfluous in practice.

Q3. How well do the models perform at forecasting LGD?

Model fit is weak with $R^2$ = 0.11 but such low values are typical of modelling LGD. On a sample of 1,118 defaulted financial leases, De Laurentis and Riani [2005] report $R^2$ values from 0.20 to 0.45 after outliers were deliberately removed. On a data set of 374 defaulted loans to small and medium size firms, Dermine and Neto de Carvalho [2005] report a pseudo $R^2$ value of 0.13 when considering a 12-month recovery period. The lower $R^2$ value we report is partly a consequence of the large sample size of our data. In contrast, if we restrict our sample size to just 500 randomly selected cases we get $R^2$ = [...]. This is more typical of other studies but nevertheless it is misleading, and we get adjusted $R^2$ = 0.13, which is closer to the value for the full sample. Therefore we report adjusted $R^2$ in our main results and suggest its use for comparison of LGD models across studies with different sample sizes. For forecasts, the simple model which effectively forecasts mean LGD from the training data set does very well and outperforms many of the complex models, as can be seen in Table 1. Nevertheless we still see a modest improvement in MSE
More informationMSCI US Equity Indices Methodology
Index Construction Objectives and Methodology for the MSCI US Equity Indices Contents Section 1: US Equity Indices Methodology Overview... 5 1.1 Introduction... 5 1.2 Defining the US Equity Market Capitalization
More informationMORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel. +371 26387099, email: irina.genriha@inbox.
MORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel. +371 2638799, email: irina.genriha@inbox.lv Since September 28, when the crisis deepened in the first place
More informationANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION
ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? SAMUEL H. COX AND YIJIA LIN ABSTRACT. We devise an approach, using tobit models for modeling annuity lapse rates. The approach is based on data provided
More informationTest of Hypotheses. Since the NeymanPearson approach involves two statistical hypotheses, one has to decide which one
Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win $1 if head and lose $1 if tail. After 10 plays, you lost $2 in net (4 heads
More informationInterpreting Market Responses to Economic Data
Interpreting Market Responses to Economic Data Patrick D Arcy and Emily Poole* This article discusses how bond, equity and foreign exchange markets have responded to the surprise component of Australian
More informationInvestment Statistics: Definitions & Formulas
Investment Statistics: Definitions & Formulas The following are brief descriptions and formulas for the various statistics and calculations available within the ease Analytics system. Unless stated otherwise,
More informationAssociation Between Variables
Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi
More informationBasel Committee on Banking Supervision. An Explanatory Note on the Basel II IRB Risk Weight Functions
Basel Committee on Banking Supervision An Explanatory Note on the Basel II IRB Risk Weight Functions July 2005 Requests for copies of publications, or for additions/changes to the mailing list, should
More informationSection 6: Model Selection, Logistic Regression and more...
Section 6: Model Selection, Logistic Regression and more... Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Model Building
More informationAn Introduction to Ensemble Learning in Credit Risk Modelling
An Introduction to Ensemble Learning in Credit Risk Modelling October 15, 2014 Han Sheng Sun, BMO Zi Jin, Wells Fargo Disclaimer The opinions expressed in this presentation and on the following slides
More informationCounterparty Credit Risk for Insurance and Reinsurance Firms. Perry D. Mehta Enterprise Risk Management Symposium Chicago, March 2011
Counterparty Credit Risk for Insurance and Reinsurance Firms Perry D. Mehta Enterprise Risk Management Symposium Chicago, March 2011 Outline What is counterparty credit risk Relevance of counterparty credit
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 15 scale to 0100 scores When you look at your report, you will notice that the scores are reported on a 0100 scale, even though respondents
More informationPredicting Recovery Rates for Defaulting Credit Card Debt
Predicting Recovery Rates for Defaulting Credit Card Debt Angela Moore Quantitative Financial Risk Management Centre School of Management University of Southampton Abstract Defaulting credit card debt
More informationRegression with a Binary Dependent Variable
Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Takehome final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,
More informationA macroeconomic credit risk model for stress testing the Romanian corporate credit portfolio
Academy of Economic Studies Bucharest Doctoral School of Finance and Banking A macroeconomic credit risk model for stress testing the Romanian corporate credit portfolio Supervisor Professor PhD. Moisă
More informationAMF Instruction DOC Calculation of global exposure for authorised UCITS and AIFs
AMF Instruction DOC201115 Calculation of global exposure for authorised UCITS and AIFs Reference texts : Articles 41172 to 41181 and 42251 to 42260 1 of the AMF General Regulation 1. General provisions...
More informationTradeOff Study Sample Size: How Low Can We go?
TradeOff Study Sample Size: How Low Can We go? The effect of sample size on model error is examined through several commercial data sets, using five tradeoff techniques: ACA, ACA/HB, CVA, HBReg and
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationMemento on Eviews output
Memento on Eviews output Jonathan Benchimol 10 th June 2013 Abstract This small paper, presented as a memento format, reviews almost all frequent results Eviews produces following standard econometric
More informationUniversity of Wisconsin Oshkosh
University of Wisconsin Oshkosh Oshkosh Scholar Submission Volume IV Fall 2009 The Correlation Between Happiness and Inequality Kiley Williams, author Dr. Marianne Johnson, Economics, faculty adviser Kiley
More informationCONSULTATION PAPER P0162006 October 2006. Proposed Regulatory Framework on Mortgage Insurance Business
CONSULTATION PAPER P0162006 October 2006 Proposed Regulatory Framework on Mortgage Insurance Business PREFACE 1 Mortgage insurance protects residential mortgage lenders against losses on mortgage loans
More informationFinancial Evolution and Stability The Case of Hedge Funds
Financial Evolution and Stability The Case of Hedge Funds KENT JANÉR MD of Nektar Asset Management, a marketneutral hedge fund that works with a large element of macroeconomic assessment. Hedge funds
More informationEquifax Risk Score 2.0
Equifax Risk Score 2.0 Customer Handbook Prepared by: Larry Macdonald, Sr. Product Manager 10Jun2014 Table of Contents Introducing the Equifax Risk Score 2.0... 3 1. Modeling Concepts... 3 1.1 Population
More informationAsymmetry and the Cost of Capital
Asymmetry and the Cost of Capital Javier García Sánchez, IAE Business School Lorenzo Preve, IAE Business School Virginia Sarria Allende, IAE Business School Abstract The expected cost of capital is a crucial
More informationA Test Of The M&M Capital Structure Theories Richard H. Fosberg, William Paterson University, USA
A Test Of The M&M Capital Structure Theories Richard H. Fosberg, William Paterson University, USA ABSTRACT Modigliani and Miller (1958, 1963) predict two very specific relationships between firm value
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationAN ASSESSMENT OF CAPITAL REQUIREMENTS UNDER BASEL II: THE PORTUGUESE CASE*
Articles Part II AN ASSESSMENT OF CAPITAL REQUIREMENTS UNDER BASEL II: THE PORTUGUESE CASE* Paula Antão** Ana Lacerda** 1. INTRODUCTION Capital requirements for banks are of foremost importance for financial
More informationTHE IMPACT OF MACROECONOMIC FACTORS ON NONPERFORMING LOANS IN THE REPUBLIC OF MOLDOVA
Abstract THE IMPACT OF MACROECONOMIC FACTORS ON NONPERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Dorina CLICHICI 44 Tatiana COLESNICOVA 45 The purpose of this research is to estimate the impact of several
More informationChapter 8: Interval Estimates and Hypothesis Testing
Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8 Outline Clint s Assignment: Taking Stock Estimate Reliability: Interval Estimate Question o Normal Distribution versus the Student tdistribution:
More informationDeterminants of Non Performing Loans: Case of US Banking Sector
141 Determinants of Non Performing Loans: Case of US Banking Sector Irum Saba 1 Rehana Kouser 2 Muhammad Azeem 3 Non Performing Loan Rate is the most important issue for banks to survive. There are lots
More informationIntegrated Resource Plan
Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 6509629670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1
More informationSection A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA  Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA  Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
More informationUniversità della Svizzera italiana. Lugano, 20 gennaio 2005
Università della Svizzera italiana Lugano, 20 gennaio 2005 Default probabilities and business cycle regimes: a forwardlooking approach Chiara Pederzoli Università di Modena e Reggio Emilia 1 Motivation:
More informationA Proven Approach to Stress Testing Consumer Loan Portfolios
A Proven Approach to Stress Testing Consumer Loan Portfolios Interthinx, Inc. 2013. All rights reserved. Interthinx is a registered trademark of Verisk Analytics. No part of this publication may be reproduced,
More informationCausal Forecasting Models
CTL.SC1x Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental
More informationConsumer Confidence and UK House Price Inflation
Consumer Confidence and UK House Price Inflation Dean Garratt, Economist, Council of Mortgage Lenders Consumer confidence alone can explain up to twothirds of the variation in the annual growth of house
More information