Loss Given Default models for UK retail credit cards


Tony Bellotti and Jonathan Crook
Credit Research Centre, University of Edinburgh Business School
William Robertson Building, 50 George Square, Edinburgh EH8 9JY

19 March 2009

Abstract

Loss Given Default is an important measure of credit loss used by financial institutions to compute risk within credit portfolios, expected loss on individual loans and capital requirements. The Basel II Capital Accord gives banks the opportunity to calculate their own estimates of Loss Given Default. Based on UK data for major retail credit cards, we build several models of Loss Given Default on account-level data, including a Tobit model, a decision tree model, and Beta and fractional logit transformations of the dependent variable. However, we find that Ordinary Least Squares models with macroeconomic variables perform best for forecasting Loss Given Default at both account and portfolio level on independent hold-out data sets. The inclusion of macroeconomic conditions in the model is important since it provides a means to model Loss Given Default in downturn conditions, as required by Basel II, and enables stress testing.

Key words: loss given default; retail credit; regression; model comparison; macroeconomic; forecast.

1. Introduction

Loss Given Default (LGD) is the loss incurred by a financial institution when an obligor defaults on a loan, given as the fraction of exposure at default (EAD) unpaid after some period of time. It is usual for LGD to have a value between 0 and 1, where 0 means the balance is fully recovered and 1 means total loss of EAD. LGD is an important value that banks need to estimate accurately for several reasons. Firstly, it can be used along with probability of default (PD) and EAD to estimate expected financial loss. Secondly, a forecast of LGD for an individual can help determine the collection policy to be used for that individual following default. For example, if high LGD is expected, then more effort may be employed to help reduce this loss. Thirdly, an estimate of LGD, and therefore of portfolio financial risk, is an integral part of the operational calculation of capital requirements to cover credit loss during extreme economic conditions. The Basel II Capital Accord [2006] allows banks to estimate LGD using their own models under the advanced internal ratings based (IRB) approach.

In this paper we focus on modelling and forecasting LGD for UK retail credit cards based on account variables (AVs) and also on the inclusion of macroeconomic variables (MVs). Our prior expectation is that as interest rates rise, the cost of mortgage and other debt increases, making it more difficult for an obligor to repay outstanding credit card balances and so increasing mean LGD. Equally, an increase in unemployment means more people find themselves in circumstances where they cannot repay credit, which would also increase mean LGD. On the other hand, an increase in earnings would mean more people have more income available to pay off debt, and would therefore decrease mean LGD. In addition, it is possible that some defaulters are less able to repay than others when the state of the economy changes. For example, those who are unemployed at the time of credit card application may be particularly sensitive to interest rate increases, as may home owners. Similarly, borrowers with a higher default balance may be particularly sensitive to increases in interest rates. For this reason we also consider interactions between MVs and account data. We consider four key research questions:

Q1. Which credit card application and default variables are the key drivers of retail LGD?
Q2. What is the best modelling approach for retail LGD?
Q3. How well do the models perform at forecasting LGD?
Q4. Does the inclusion of MVs lead to improved models of retail LGD?

We investigate these questions by building several alternative models of LGD. We find that there are many important drivers of LGD taken from application details and default information. Given that the distribution of LGD is a bimodal U-shape, we consider a Tobit model and a decision tree model, along with various transformations of the dependent variable. Although LGD is not easy to model and poor model fit is typical, we nevertheless find that models can be built which provide improved estimates of LGD and good forecasts of mean LGD across a portfolio of accounts. Surprisingly, we find that the best forecasting model is Ordinary Least Squares (OLS) regression. Economic conditions are included as values of MVs for bank interest rate, unemployment level and earnings growth at the time of account default. We find that the first two MVs are statistically significant explanatory variables that give rise to improved forecasts of LGD in hold-out tests at both account and portfolio level. Building LGD models with MVs also addresses the Basel II requirement to estimate downturn LGD, since stressed values of MVs can be used in the model to forecast LGD during poor economic conditions. We demonstrate how this can be done by stressing interest rate values.

The modelling and forecasting of LGD for retail credit using macroeconomic conditions is a new area of study. There is an extensive literature regarding LGD models for corporate loans (see e.g. Altman et al [2005]). However, there is less about forecasting LGD. An exception is Gupton and Stein [2005], who describe a predictive LGD model for corporate loans using Moody's KMV LossCalc software.
There is also very little literature regarding retail credit LGD, even though this is a large financial market: total lending in the UK consumer credit market reached over £1.4 trillion in 2009 [source: Bank of England]. Grippa et al [2005] publish empirical LGD models for a sample of 20,724 Italian accounts that includes small businesses along with households. They observe differences in LGD and recovery periods across different geographic regions and different recovery channels. They also conducted a

multivariate analysis that showed a statistically significant negative relationship between the presence of collateral or a personal guarantee and LGD, and a positive relationship with size of loan. However, the range of variables used is far more limited than would be available to a financial institution that has made credit card or personal loans, and the study did not attempt to forecast LGD. Dermine and Neto de Carvalho [2006] model LGD for loans to small and medium sized firms in Portugal. They apply mortality analysis and include annual GDP growth as an explanatory variable. However, they found that GDP growth was not significant. They suggest that this may be because the period of analysis did not include a significant recession. We may also note that their training sample size (374 defaults) was relatively small and may not have been large enough for a significant relationship between the economy and LGD to be discovered. Querci [2005] provides an LGD model for loans to small businesses and individuals by an Italian bank. This study shows the importance of regional differences in LGD variation but does not include time-varying macroeconomic conditions. Figlewski et al [2007] model the effect of macroeconomic factors on corporate default with a detailed study of numerous economic conditions including unemployment level, inflation, GDP and a production index. They found that many of these MVs were significant explanatory variables. Saurina and Trucharte [2007] model PD for retail mortgage portfolios in Spain. They show that the GDP growth rate is a significant cyclical variable in the regression and has a negative sign, as we would expect: during downturns (low GDP growth), PD increases. However, they also include an interest rate variable and, although it has a positive sign and is significant, they report that including interest rates does not improve accuracy.
The novelties of our paper are that, unlike published work, (1) we consider forecasts of LGD for retail credit cards, (2) we report results of model comparison, (3) we include macroeconomic conditions in our models and (4) we do this using a very large sample across several different credit card products. In Section 2 we describe our modelling and performance assessment methods. In Section 3 we discuss the application and macroeconomic data used. In Section 4 we provide model comparisons and test results, along with a description of an explanatory model with MVs. Finally, in Section 5 we provide some conclusions and discussion.

2. Method

We consider several models as combinations of different variables, modelling frameworks and data transformations.

2.1 Models

In general, for retail credit, there are five categories of circumstance that will affect the amount an individual repays on a defaulted loan and can be used to build models of LGD: (1) individual details, some of which can be collected at time of application, such as age, income, employment, housing status and address; (2) account information at default: date or age of account at default and outstanding balance; (3) changes in personal circumstances of an obligor over time; (4) macroeconomic or business conditions on the date of default, or possibly with a lag or lead on the date of default; (5) operational decisions made by the bank, such as the level of risk it was willing to accept on the credit product and the process it uses to follow up bad debt. Of these, the richest source of explanatory variables we have is the information provided at time of application for credit, along with the credit bureau score collected by the bank at time of application. This is data falling into category (1). We also include category (2) data, account information at default. Including this data implies the model is conditional on default. It is possible to build models unconditionally but this is outside the scope of this paper. It is difficult for a lender to extract data in category (3): it cannot easily keep track of an individual's employment status or, even less so, his or her personal difficulties, such as divorce or illness, that may lead him or her to be unable to fully repay debt. It is possible to use account behaviour data or a behavioural score, but we do not do so in this study since such information is not homogeneous within the data we have. It is understood that LGD is likely to be time dependent, varying over the business cycle [Schuermann 2005]. Therefore we include macroeconomic conditions (4).
Including a bank's operational decisions (5) for each credit card product over time could also be fruitful. However, this information was not available for our study.

To answer Q4, we build and compare models with and without MVs. The question can be further explored by contrasting with models that also include interaction terms between AVs and MVs. To help answer Q3 we should contrast our models with a simple model built with no variables. Within the context of OLS this simple model effectively forecasts LGD to be the mean value taken from the training data set. If the more complex models give improved forecasts over this simple model then the variables we include provide useful information for LGD estimation. Therefore we have four model structures based on including different explanatory variables:

Simple: no covariates in the model.
AV: account variables only.
AV&MV: account and macroeconomic variables.
AV&MV with interactions: also includes interaction terms between AVs and MVs. It is not feasible to include all interaction terms, so variable selection is used, as described later in Section 3.3.

We do not restrict our models to OLS but also consider three alternatives. Tobit and a decision tree model are considered since they have a structure better suited to the bimodal nature of LGD. Least absolute value regression is also considered since the absolute error may be a more sensible criterion for estimating LGD than squared error. Since LGD follows a truncated distribution with a large number of cases at the extreme values 0 and 1, the Tobit model may be better suited since it takes account of bounds on a dependent variable y_i through truncation. The two-tailed Tobit model uses a latent variable y_i* to model boundary cases such that

y_i = min(1, max(0, y_i*)), where y_i* = x_i'β + ε_i.

Assuming the distribution of the residuals conditional on x is normal, the following log-likelihood function is constructed for maximum likelihood estimation of β and the residual variance σ²:

log L(β, σ) = Σ_{0 < y_i < 1} log[(1/σ) φ((y_i − x_i'β)/σ)] + Σ_{y_i = 0} log Φ(−x_i'β/σ) + Σ_{y_i = 1} log[1 − Φ((1 − x_i'β)/σ)]

where φ and Φ are the probability density and cumulative distribution functions of the standard normal distribution respectively. This likelihood function is constructed by considering the probabilities of the dependent variable lying between the boundaries and on each boundary separately [Greene 1997].

The decision tree model uses two logistic regression sub-models to model the special cases of total loss and no loss, i.e. LGD = 1 and LGD = 0 respectively, as binary classification problems. Then, if 0 < LGD < 1, an OLS regression model is used. This decision tree model is illustrated in Figure 1. We may expect this decision tree to be good at modelling LGD due to the large number of boundary cases at 0 and 1, which allow us to naturally approach the problem as a hybrid of two classification problems and a regression problem. This approach is meaningful since we may suppose there are special conditions which would make a customer pay back the full amount of debt, or pay back nothing, rather than just a portion. LGD is forecast for an account i as the expected value given the three sub-models, i.e.

(1 − p_0i)(p_1i + (1 − p_1i) L_i)

where p_0i is the probability of LGD = 0 for account i computed from the second estimated logistic regression model, p_1i is the probability of LGD = 1 from the first estimated logistic regression model, and L_i is the estimate of LGD, assuming the loss is fractional, computed from the regression model.

Figure 1. Decision tree model. The first decision ("Is LGD = 1?") and the second decision ("Is LGD = 0?") are modelled by logistic regression, whilst the final node (0 < LGD < 1) models LGD using OLS regression.
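As an illustration, the hybrid forecast can be assembled from off-the-shelf estimators. This is a minimal sketch assuming scikit-learn, with illustrative function and variable names rather than the paper's actual estimation code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

def fit_decision_tree_lgd(X, lgd):
    """Fit the hybrid model: one classifier for LGD=1, one for LGD=0,
    and an OLS regression on the strictly fractional cases."""
    clf_total = LogisticRegression(max_iter=1000).fit(X, (lgd == 1).astype(int))
    clf_zero = LogisticRegression(max_iter=1000).fit(X, (lgd == 0).astype(int))
    frac = (lgd > 0) & (lgd < 1)
    reg = LinearRegression().fit(X[frac], lgd[frac])
    return clf_total, clf_zero, reg

def forecast_lgd(models, X):
    clf_total, clf_zero, reg = models
    p1 = clf_total.predict_proba(X)[:, 1]   # P(LGD = 1)
    p0 = clf_zero.predict_proba(X)[:, 1]    # P(LGD = 0)
    L = np.clip(reg.predict(X), 0.0, 1.0)   # fractional-loss estimate
    # Expected LGD, as in the text: (1 - p0)(p1 + (1 - p1) L)
    return (1.0 - p0) * (p1 + (1.0 - p1) * L)
```

The combination step is just the expectation formula above; any classifier returning class probabilities could be substituted for the logistic regressions.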

As is conventional in the literature, we model LGD in terms of recovery rate (RR) rather than LGD directly, where RR = 1 − LGD. Our working definition of RR is

RR = (sum of repayments made over a period t following default) / (outstanding balance at date of default).

The choice of recovery period is a business decision. We consider a recovery period of t = 12 months following default. Often banks are interested in longer periods or want to estimate recovery at close of account but, as we discover, a 12 month model can be used to estimate for longer recovery periods. It is possible to calculate LGD in alternative ways. For example, time after default could be an event, such as account charge-off, rather than a fixed period; or the market value of the bad debt based on sale of the exposure in the market could be taken into account along with repayments; or administration costs of following up default may be included in the LGD calculation. However, these alternatives are beyond the scope of this paper. Loss of interest payments could also be included in the definition of LGD but this is not required by Basel II.

Since the distribution of RR is bimodal and U-shaped, we also consider modelling fractional logit, beta distribution and probit transformations of RR.

Fractional logit transformation: T(RR) = log(RR) − log(1 − RR). The fractional logit model is particularly attractive since it deals specifically with response variables in the range 0 to 1, like RR, by transforming them onto an unbounded range of values. It has been applied in several other econometric analyses [Papke & Wooldridge 1996] and was applied in particular by Dermine and Neto de Carvalho [2006] for modelling LGD for corporate loans.

Beta distribution transform: T(RR) = Φ⁻¹(Beta(RR; α, β, 0, 1)), where Φ is the cumulative distribution function of the standard normal distribution, Beta(·; α, β, 0, 1) is the cumulative distribution function of the Beta distribution on the interval [0, 1], and α, β are parameters estimated from training data using maximum likelihood estimation.
The Beta distribution is appealing since it is able to model bimodal variables with a U-shaped distribution over the interval 0 to 1. It is therefore particularly useful for RR

and tends to transform RR into an approximately normal distribution. Figure 2 shows a distribution of the Beta transformation of RR for the credit card default data we use. It illustrates how the transformation creates an approximately normal distribution between two extreme values. The Beta distribution has been used successfully in Moody's KMV LossCalc software package for modelling RR [Gupton & Stein 2005].

Figure 2. Distribution of recovery rates following application of a beta distribution transformation.[1]

Probit transformation: T(RR) = Φ⁻¹(|{i : R_i ≤ RR}| / (n + 1)), where Φ is the cumulative distribution function of the standard normal distribution and R_1, …, R_n are observed RRs taken from training data. This transformation uses a nonparametric approach to transform RR into a normal distribution based on the empirical distribution in the training data.

2.2 Model assessment

For OLS we report adjusted R² for model fit. This is for two reasons. Firstly, the various models we consider are nested, so inevitably those with additional covariates will give improved R² model fit; the adjusted R² compensates for the additional variables. Secondly, unadjusted R² does not give a fair comparison between different studies with different sample sizes.

[1] Axes values withheld for reasons of commercial confidentiality of data provider.
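The three transformations of RR can be sketched as follows. This is an illustrative implementation assuming SciPy; the epsilon guard at the boundaries is our assumption, since the text does not say how boundary values are handled before transformation:

```python
import numpy as np
from scipy import stats

def fractional_logit(rr, eps=1e-6):
    """T(RR) = log(RR) - log(1 - RR); eps guards the boundary cases 0 and 1."""
    rr = np.clip(rr, eps, 1 - eps)
    return np.log(rr) - np.log(1 - rr)

def beta_transform(rr, train_rr, eps=1e-6):
    """Phi^{-1}(BetaCDF(RR; a, b)), with a, b fitted to training RRs by
    maximum likelihood on the fixed support [0, 1]."""
    a, b, loc, scale = stats.beta.fit(np.clip(train_rr, eps, 1 - eps),
                                      floc=0, fscale=1)
    u = np.clip(stats.beta.cdf(rr, a, b), eps, 1 - eps)
    return stats.norm.ppf(u)

def empirical_probit(rr, train_rr):
    """Phi^{-1}(#{i: R_i <= RR} / (n + 1)) using the training-sample
    empirical distribution; ranks are clipped to keep the argument in (0, 1)."""
    train = np.sort(np.asarray(train_rr))
    n = len(train)
    ranks = np.clip(np.searchsorted(train, rr, side="right"), 1, n)
    return stats.norm.ppf(ranks / (n + 1))
```

Each transform is inverted after prediction (logistic, Beta quantile, or empirical quantile respectively) so that performance can be measured on RR itself, as described in the hold-out procedure.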

We report coefficient estimates for the OLS model with RR as dependent variable. However, since RR is not normally distributed, the error terms may not be normally distributed, and therefore conventional estimators of standard errors may be biased. Instead we use a bootstrap to construct distributions for the coefficient estimates [Kennedy 2003, section 4.6]. As Lam and Veall [2002] show, when OLS is used with non-normal, and in particular bimodal, error terms, the bootstrap gives accurate estimates of confidence intervals where the usual analytic method fails.

We report the relative effect of each MV within the model. The coefficient estimates for MVs are multiplied by their standard deviation in the training data to derive standardized coefficient estimates. They show the change in RR for a one standard deviation change in the covariate value; they are therefore comparable and give an indication of the relative importance of each MV within the model. This approach was taken to study the effects of MVs on corporate default by Figlewski et al [2007]. Since our credit card data spans the period from 1999 to 2005, standard deviations for each MV are computed based on values within this period.

Hold-out test procedure for forecasts

To test the effectiveness of the LGD model for forecasts, we use a hold-out sample, testing on credit card default data that is independent and follows chronologically after the period of the training data used to build the models. This approach allows us to simulate the expected operational use of LGD models in retail credit, when a financial institution may want to assess the LGD risk on a new batch of defaults based on the performance of past defaults. In detail, we select cohorts of test data sets consisting only of accounts that default in a particular quarter. For each of these cohorts we train only on default data available prior to that quarter.
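The cohort construction can be sketched as follows. This is a hypothetical implementation assuming pandas, with an illustrative `default_date` column name; the key point is that the training cohort must leave a full recovery period before the test quarter begins:

```python
import pandas as pd

def quarterly_holdout_splits(df, first_test, last_test, recovery_months=12):
    """Yield (quarter, train, test) cohorts: the test set holds accounts
    defaulting in one quarter; the training set holds defaults whose full
    recovery period ends before that quarter starts."""
    df = df.copy()
    df["quarter"] = df["default_date"].dt.to_period("Q")
    for q in pd.period_range(first_test, last_test, freq="Q"):
        # defaults must be at least `recovery_months` before the test quarter
        cutoff = q.start_time - pd.DateOffset(months=recovery_months)
        train = df[df["default_date"] < cutoff]
        test = df[df["quarter"] == q]
        yield q, train, test
```

Iterating this over successive quarters produces the time series of forecast results described below; each cohort's model sees only information that would have been available at forecast time.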
Since we need to measure LGD after a period of t months following default, we need to ensure that dates of default in the training data are at least t months prior to the beginning of the test quarter. This test procedure is illustrated in Figure 3. So, for example, if our test set was 2003Q3 and we are considering LGD after 12 months, our training data would consist of all cases that default within the period from 1999Q1 to 2002Q2. To estimate robust models with MVs we should train over the whole business cycle, which is usually considered to be between 3 and 5 years. Therefore we consider taking training data sets with a minimum of 3 years of defaults. This then gives us 10

quarters of test set data from 2003Q1 to 2005Q2. Results across these independent test sets then form a time series of forecast results.

Figure 3. Illustration of the hold-out test procedure for one quarter of test data, on a time axis from 1999Q1 to 2006Q2. The test data set holds all cases that default in the same financial quarter (e.g. 2003Q3) and the training data set comprises all cases that default (e.g. in 1999Q1 to 2002Q2) and have a full 12-month recovery period prior to the test quarter.

The accuracy of forecasts relative to the observed true values is measured at the account level by mean square error (MSE). However, we are also interested in how well the model is able to estimate the observed, or true, mean LGD or RR over a portfolio of accounts. Therefore, for each test set quarter we measure the difference between forecast and observed mean RR across all test cases. If this difference is greater than zero, the model is generally overestimating RR; when it is less than zero, it is underestimating RR. The closer the difference between forecast and observed mean RR is to zero, the better the estimate. The mean values of both the MSE and the absolute difference between forecast and observed mean RR (abs diff RR) across the several test quarters are reported in order to get an aggregate measure of performance. Since RR must be between 0 and 1, we truncate all forecasts of RR to fall within that range prior to measuring performance, for all cases and all models. Also, to generate comparable results, performance is measured between predicted and observed RR, regardless of which transformation of RR is modelled. Therefore, if a transformation of RR is used, the inverse transform is applied to extract predicted RR and performance is measured on this rather than the transformed value. This is reasonable

since ultimately the value we want to model is RR and the transform should merely be a means to that end.

Forecasting 24 month LGD using a 12 month LGD model

For these experiments, we will be assessing models of LGD after 12 months. However, financial institutions often follow a bad debt over several years, so they are also interested in forecasting LGD over a longer period of, say, 24 or 48 months. For this reason, we also assess the 12 month model to see how well it forecasts for longer periods in comparison with a 24 month model. If it does well, then this implies that a single LGD model may be used to forecast for any LGD period. A 12 month LGD model can be used to forecast 24 month LGD by calibrating the 12 month forecasts to 24 month forecasts. A simple way to do this is to use OLS regression on the 24 month LGD training data to build a linear model of 24 month RR with an intercept and 12 month RR as the explanatory variable. This model is then used to convert 12 month to 24 month forecasts. We use the same training and testing scheme as set out above. For 24 month LGD we use 6 quarters of test data from 2003Q1 to 2004Q2.

3. Data

3.1 Application data

For this study we have available a data set consisting of over 55,000 credit card accounts in default over the period 1999 to 2005, for customers across the whole of the UK. Account holders are expected to make a minimum payment of the outstanding balance each month. We define default as a case where a credit card holder has failed to make minimum payments for three consecutive months or more. This is a typical definition of default for credit cards [Thomas et al 2002, p.123] and is in line with the default definition given by the Basel II Accord [BCBS 2006]. The data consists of four different credit card products which are a selection from those offered by a financial institution. As is typical for LGD, its distribution in our data set is between 0 and 1 and approximately U-shaped.[2]
[2] For reasons of commercial confidentiality, to protect our data supplier, we cannot reveal exact distributions or values of LGD for our data, nor descriptive details of other variables. This is usual in this research area when large portfolios of live financial data are being studied. Nevertheless, since our focus is on reporting forecast performance, this should not be a hindrance to scientific reporting.

The calculation of LGD should ideally include

administration costs for managing and implementing a collection procedure following account default. Unfortunately, this information was not available for the credit card data we used. The credit card data we use has many details extracted at time of application. These include the applicant's housing and employment status, age, income, total number of known credit cards and length of relationship with the bank (time with bank). A credit bureau score is also provided; it is from the same source for all products and so is homogeneous across products. Demographic information is provided to classify the area of residence of the credit card holder at time of application. The demographic information was from the same source for all credit card products and was coded into four broad categories: (1) council or poor housing, (2) rural, (3) suburban or wealthy area and (4) others. Additionally, information at time of default is also included in the data. This consists of the balance outstanding at default and the age of the credit card account. We include balance at default since there is strong evidence from past studies that it is an important effect [Grippa et al 2005; Dermine and Neto de Carvalho 2006] and it makes sense to include it operationally, especially if it improves forecasts of LGD. Table 1 shows the full list of variables used. Some variables (time with bank, income and age) have a small percentage of missing values (less than 6% of accounts). For each of these variables, we code missing values to 0 and create a dummy variable to indicate a missing value, in order to capture the mean value amongst accounts with missing values. Several studies have found that PD and LGD are positively correlated (see Rösch and Scheule (2006)). We also found this to be the case in our data.
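The missing-value coding described above can be sketched as follows; the column names are illustrative, not the data provider's actual field names:

```python
import pandas as pd

def code_missing(df, cols=("time_with_bank", "income", "age")):
    """For each listed variable, set missing values to 0 and add a dummy
    flagging them, so the dummy's coefficient captures the mean effect
    amongst accounts with missing values."""
    df = df.copy()
    for col in cols:
        if col in df.columns:
            df[col + "_missing"] = df[col].isna().astype(int)
            df[col] = df[col].fillna(0)
    return df
```

This keeps every account in the sample while letting the regression estimate a separate level for accounts with missing information.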
Nevertheless, PD is not included in our models since it is effectively represented by including the application variables that are usually used to model it, along with macroeconomic variables that may explain the joint systematic risk to both PD and LGD (Altman et al 2005). Any single credit card portfolio is liable to have operational effects that will alter overall risk over time for that specific product, such as changes in cut-offs on credit score when accepting applications. This may lead to idiosyncratic links to economic conditions and therefore poorer models using MVs. By combining data across several products the impact of these idiosyncratic effects will be reduced and changes in risk over time are more likely to be linked to more objective effects such as the economy. Additionally, combining several products into one data set will increase the training

set size. These two factors should lead to stronger MV models and we test this hypothesis by running experiments for all products combined and also for each product separately. When all products are included in the data set, a dummy variable is used to indicate which product the account belongs to, in order to model different levels of RR between products.

3.2 Macroeconomic variables

We consider these three series of macroeconomic data for the UK, which we believe would have a strong direct effect on mean LGD for UK retail credit cards:

Selected UK retail banks' base interest rates.
UK unemployment level: measured as thousands of adults (16+) unemployed.
(Earnings) UK earnings index (2000 = 100) for the whole economy, including bonuses, as a ratio of the retail price index.

These are all available from the UK Office for National Statistics as monthly data. We use non-seasonally adjusted data for earnings since we expect that seasonal changes in the economy may have some effect on ability to repay. We would also have preferred to use non-seasonally adjusted data for the unemployment level but unfortunately this was not available. GDP growth for the UK is a common indicator of economic conditions but we have not included it since it is not available as monthly data, which is the granularity that lenders typically require and therefore the granularity we require for our models. The MVs are included for each case at the date of default, but if there is a relationship between MVs and LGD it may be lagged or led. For example, changes in interest rates may, in general, affect ability to pay several months later. For this reason, we also consider LGD models with MVs lagged or led by 6 months. Each MV has a time trend: interest rates and unemployment level are generally falling over the period whilst real earnings are steadily increasing.
Indeed, earnings generally increase exponentially with time; therefore we include earnings in our model as growth in log earnings over 12 months to remove this obvious time trend. The three MV time series we use are shown in Figure 4. We ensure that any model fit of MVs as explanatory variables is not simply because they follow a time trend that matches a trend in RR, by including date of default explicitly in the models. If a MV

is a good explanatory variable simply because of a time trend, then the inclusion of date of default should weaken its effect within the model. Including date of default in the AV model also allows us to test whether improvements in forecasts are simply due to a time trend rather than specifically economic conditions.

Figure 4. UK macroeconomic data: monthly interest rate, earnings growth (log, rescaled) and unemployment level, from January 1999.

Different effects on LGD over time could be captured by using dummy variables for cohorts at either the yearly or quarterly level. There are two reasons we do not do this, however. Firstly, the freedom gained when using any time dummies could simply absorb the effect we expect the MVs to explain. Secondly, although fine for explanatory models, it is not clear how such time dummies could be used for forecasting on a hold-out sample, since the dummy variables for the period of the hold-out cohort will necessarily have the value 0 for all cases in the training data and so will not have a coefficient estimate.

High correlation between MVs is a potential problem since this could lead to multicollinearity within the LGD model and therefore distort parameter estimates. We can test for multicollinearity by measuring the variance inflation factor (VIF), given by 1/(1 − R²) when each MV is regressed on all other model covariates (Kennedy 2003). A high VIF indicates multicollinearity and a VIF greater than 5 is an indication that there may be a problem.
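The VIF check follows directly from the definition; a minimal sketch using ordinary least squares via NumPy:

```python
import numpy as np

def vif(X):
    """Variance inflation factors: regress each column of X on the others
    (with intercept) by OLS and return 1 / (1 - R^2) for each column."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
        denom = 1.0 - r2
        # exact collinearity gives R^2 = 1; report an infinite VIF
        out.append(float("inf") if denom <= 0 else 1.0 / denom)
    return np.array(out)
```

Uncorrelated covariates give VIFs near 1, while near-collinear ones push the VIF well past the rule-of-thumb threshold of 5 mentioned above.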

3.3 Inclusion of interaction terms

Since there are many possible combinations of variables that could form interaction terms, the number included in the model is controlled using forward variable selection. All AVs are included in the model, but an iterative process is used to include MVs and interaction terms between AVs and MVs. At each step, each of the outstanding MVs and interaction terms not already in the model is added separately. The term that maximally improves a fit criterion is added to the model. The process is repeated until no new terms are found that improve fit. There are several possible fit criteria that could be used and it is common to use an F-test. However, since we are interested in forecasting, we use Akaike's information criterion (AIC) [Akaike 1973]. This has the advantage that it takes account of the parameter space of the model and discourages complex models with large numbers of variables. In turn, this discourages over-fitting to the training data set. We approximate AIC by

n ln(MSE) + 2p

where n and p are the number of observations and the number of parameters in the model respectively, and MSE is the mean square error for observations in the training data. We find that using the AIC criterion gives better forecasts than using the standard F-test. Further discussion of variable selection methods and use of AIC for predictive models is given by Miller [1990]. The variable selection procedure we use is further constrained so that for each interaction term included, its constitutive terms are also automatically included [Brambor 2005].

4. Results

Section 4.1 describes forecast performance for comparison of the different models. Then Section 4.2 describes the best performing model for forecasts and its statistically significant explanatory variables. Models for longer recovery periods and stress tests are considered in further sections.

4.1 Model comparisons

Table 1 shows forecast results for different models.
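The forward selection procedure with the AIC criterion in Section 3.3 can be sketched as follows; this is a simplified version that omits the constitutive-term constraint and uses plain OLS at each step:

```python
import numpy as np

def aic(mse, n, p):
    """AIC approximation used in the text: n*ln(MSE) + 2p."""
    return n * np.log(mse) + 2 * p

def forward_select(X_all, y, candidate_idx):
    """Greedily add candidate columns of X_all while AIC improves."""
    n = len(y)
    selected, remaining = [], list(candidate_idx)
    best_aic = np.inf
    improved = True
    while improved and remaining:
        improved = False
        scores = []
        for j in remaining:
            cols = selected + [j]
            Z = np.column_stack([np.ones(n)] + [X_all[:, c] for c in cols])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            mse = float(np.mean((y - Z @ beta) ** 2))
            scores.append((aic(mse, n, Z.shape[1]), j))
        cand_aic, j_best = min(scores)
        if cand_aic < best_aic:     # only keep the step if AIC improves
            best_aic = cand_aic
            selected.append(j_best)
            remaining.remove(j_best)
            improved = True
    return selected
```

Because a new term must reduce n·ln(MSE) by more than 2 to enter, the procedure stops before absorbing terms that only marginally improve in-sample fit.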
Focussing on the first section, showing results for a 12 month recovery period and for all products combined, it is clear that the standard OLS model with both AVs and MVs performs best on both

measures of forecast performance. The more complex models (Tobit, the decision tree, least absolute value regression) give worse performance, and so does using any of the transformations of RR. We may expect OLS to do well on the MSE measure, but we might expect one of the alternative models to be better when estimating mean RR over the portfolio. In particular, we might expect least absolute value regression to be better, since it minimizes a linear loss function. However, OLS forecasts the best estimates of mean RR. This is a robust result which was obtained across many alternative experiments. The inclusion of interaction terms gives slightly worse forecasts than the AV&MV model, so including interaction terms does not provide any benefit in estimating LGD. It is also notable that the simple model, which effectively forecasts the mean RR from the training data set, does well and outperforms many of the more complex models. Nevertheless, using AVs along with MVs shows considerable gain in performance over the simple model. We also consider MVs with lags or leads of 6 months. We consider lags since it is possible that the effect of the economy on obligor behaviour may be delayed. We consider a lead since this would give the values of MVs midway through the recovery period. Table 1 shows that using lags or leads on MVs gives worse forecasts than using MV values at the time of default, almost as bad as not including MVs at all. The second section of Table 1 shows results for separate products. It is clear that the AV&MV model does not do consistently well for separate products, and rarely as well as when all products are combined. In many cases the simple model with no explanatory variables performs best for forecasting mean RR. This is due partly to operational peculiarities within each portfolio that may have a spurious link to macroeconomic movements, and partly to reduced training sample size.
These results suggest that better LGD models can be built when data from different products are combined.

Table 1. Aggregate forecast results for different models. Abs RR diff is the absolute difference between observed and forecast mean RR over each test quarter. Results are given as mean values across the series of test quarters, from 2003Q1 to 2005Q2 or to 2004Q2 for the 12 and 24 month recovery periods respectively. Figures in bold show best results for each recovery period and product group.

[Table 1 structure: columns are recovery period (12 or 24 month), product (all, or products 1 to 4), RR transform (none, beta, logit, probit), model (OLS regression, Tobit, decision tree, least absolute value regression, beta model), explanatory variables (none/simple, AV, AV&MV, AV&MV with interaction terms, AV&MV with lagged or led MVs), and the two result columns MSE and Abs RR diff. The final section compares 24 month models with the OLS AV&MV model built for the 12 month recovery period. Numeric entries did not survive transcription.]

Figures 5 and 6 show forecast time series results for the first three models listed in Table 1 in more detail for each test quarter. They show that the AV&MV model performs consistently better over time. Figure 5 shows that MSE is lower for AV&MV than for either the simple or the AV model, and improves over time, possibly a result of having training data over a longer period of the business cycle with which to model the MV effects. Figure 6 shows that overall the AV&MV model forecasts mean RR much more closely than the other models, as evidenced by how close the difference between forecast and observed mean RR is to zero. In contrast, the AV model is

consistently over-estimating RR over time. In 2004Q2 the simple model achieves the best result, but this is serendipitous since its forecasts are simply moving from under-estimating to over-estimating RR at that time. These results show that including MVs is important for improving LGD forecast results. It should be noted, however, that in our study at least 3 years of training data are required. We found that using less than this led to unstable models with MVs that occasionally over- or under-estimate LGD by extreme amounts.

[Figure 5. MSE of forecasts for each test data set from 2003Q1 to 2005Q2 for three OLS models with different variables: simple (no variables), AV and AV&MV.]

[Figure 6. Difference between forecast and observed mean RR for each test data set from 2003Q1 to 2005Q2 for the same three OLS models, with a zero line for reference.]
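The two per-quarter evaluation measures shown in Figures 5 and 6 are straightforward to compute. A minimal sketch on hypothetical data (all names ours): account-level MSE, and forecast minus observed mean RR, whose absolute value averaged across quarters gives the Abs RR diff of Table 1.

```python
import numpy as np

def quarterly_forecast_measures(quarters, observed_rr, forecast_rr):
    """For each test quarter: MSE of account-level forecasts, and
    forecast minus observed mean RR (close to zero is good)."""
    results = {}
    for q in sorted(set(quarters)):
        mask = quarters == q
        err = forecast_rr[mask] - observed_rr[mask]
        results[q] = {
            "mse": float(np.mean(err ** 2)),
            "mean_rr_diff": float(forecast_rr[mask].mean() - observed_rr[mask].mean()),
        }
    return results

# hypothetical example: forecasts are unbiased in 2003Q1, biased upward in 2003Q2
quarters = np.array(["2003Q1"] * 3 + ["2003Q2"] * 3)
observed = np.array([0.2, 0.5, 0.8, 0.3, 0.4, 0.5])
forecast = np.array([0.3, 0.5, 0.7, 0.5, 0.6, 0.7])
m = quarterly_forecast_measures(quarters, observed, forecast)
# m["2003Q1"]["mean_rr_diff"] is 0.0; m["2003Q2"]["mean_rr_diff"] is 0.2
```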

4.2 Explanatory model

Table 1 shows that the AV&MV model using OLS regression was the best performing forecaster, so we describe this model estimate in further detail. We report model fit results for LGD models built on all available data from 1999 onwards. Including MVs in the LGD model improves fit to the training data, and the improvement is statistically significant. When interaction terms between MVs and application variables are also included using forward selection, the further increase in adjusted R² is small, indicating that adding interactions does not give a noticeable improvement and reinforcing the results observed for forecasting. Table 2 shows coefficient estimates using OLS regression with the AV&MV model. Since the error residuals are non-normal, we have used the bootstrap to compute statistical significance. The reported p-values are from a normal-based distribution imposed on the bootstrap coefficient estimates. This is reasonable since we found the 95% confidence intervals for the normal-based distributions matched closely those for the empirical percentile distribution, each sharing never less than 92% of the other's range. Many of the model variables are statistically significant at the 0.01 level. In particular, housing status is important, with council and private tenants having generally lower RR than home owners; the longer the customer has been with the bank (time with bank) and the longer the individual has held the credit card prior to default (time on books at default), the higher the RR tends to be; individuals with higher income also tend to have higher RR. All these are indicators of customer stability which we would expect to give lower risk. Higher credit bureau scores tend to give higher RR, which again shows that individuals with expected low credit risk tend to pay back more of their bad debt.
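The bootstrap procedure used for Table 2 can be sketched as follows: resample cases with replacement, refit the OLS model, take the spread of the resampled estimates as standard errors, and impose a normal approximation to obtain p-values. This is an illustrative reimplementation on synthetic data with skewed residuals; all names are ours.

```python
import numpy as np
from math import erfc, sqrt

def bootstrap_ols(y, X, n_boot=1000, seed=0):
    """Case-resampling bootstrap for OLS coefficients: standard errors from
    the spread of resampled estimates, two-sided p-values from a normal
    approximation imposed on those estimates."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    boots = np.empty((n_boot, k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)            # resample cases with replacement
        boots[b], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    se = boots.std(axis=0, ddof=1)
    z = beta / se
    p = np.array([erfc(abs(zi) / sqrt(2.0)) for zi in z])   # 2 * (1 - Phi(|z|))
    return beta, se, p

# hypothetical data with skewed (non-normal) residuals
rng = np.random.default_rng(2)
n = 300
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + (rng.exponential(1.0, size=n) - 1.0)    # mean-zero, skewed errors
beta, se, p = bootstrap_ols(y, X, n_boot=500, seed=3)
# the slope estimate is close to 0.5 and clearly significant
```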
Also, the size of the initial balance at default has a negative effect on RR, which is what we would expect since higher outstanding debt is more difficult to pay back. We note a positive correlation between default balance and income, which suggests that people with high income tend to build up larger balances on their credit cards. Indeed, if we remove default balance from the model, the sign on income becomes negative, since income becomes a surrogate for the missing default balance variable. However, when both income and default balance are in the model together, the sign on income becomes positive, which is what we would expect since the

Table 2. Coefficient estimates for the OLS recovery rate model. The training data set for this AV&MV model included all available cases from 1999 onwards. Standard errors and p-values are computed with bootstrap estimation with 1,000 repetitions. Values in italics show standardized estimates for MVs. Dummy variables, taking the value 0 or 1 for false and true respectively, are indicated by (D).

[Table 2 columns: variable, coefficient estimate, standard error, z, P>|z|. Variables: intercept; home status (D; council tenant, private tenant, term time accommodation, other, lives with parents; excluded category: own home); time with bank; no data on time with bank (D); income (log); income unknown (D); number of cards; years at current address; employment status (D; home maker, unemployed, retired, part time, self employed, other, student; excluded category: employed); age at default; age unknown (D); credit bureau score; demographic group (D; council or poor housing area, rural area, suburban or wealthy area; excluded category: other); credit card product (D; excluded category: 4); time on books at default (months); balance at default (log); date of default; macroeconomic variables (bank interest rates, unemployment level, earnings growth (log)), each with a standardized estimate. Numeric entries did not survive transcription.]

availability of higher income implies a greater capacity to repay the outstanding balance. Employment status has less impact, although we see that home makers tend to have lower RR. Demographic information was important, with those living in areas

classified as council or poor housing tending to have lower RR than those in rural, suburban or wealthy areas. Table 2 shows that the coefficient estimates for MVs have the expected signs. That is, the parameter estimate for interest rates is negative, meaning that higher interest rates at the time of default tend to give lower RR. Similarly, higher unemployment levels are also linked to lower RR, while higher year-on-year earnings growth is linked to increased RR, suggesting that earnings growth leads to better recoveries. Bank interest rates and unemployment level are both statistically significant at the 0.01 level, although earnings growth is not statistically significant in the model. The coefficient estimate for date of default is small and not statistically significant, which implies that the effect of the MVs is not due to a simple time trend. Date of default is a stronger explanatory variable in the AV model, where it is statistically significant at the 0.01 level. Nevertheless, the results in Table 1 comparing AV and AV&MV show that the time trend is not sufficient to explain the improved performance given by the model with MVs. Table 2 also shows standardized parameter estimates for MVs. These give an indication of the relative size of the effect of each MV. Bank interest rate clearly has the largest magnitude, with unemployment level having less than half the interest rate effect. Additionally, we find that the VIF for any of the MVs, when regressed on all other covariates in model (1), was always less than 2, which is sufficiently small that we should not expect the results to be affected by multicollinearity. Since including interaction terms did not improve forecasts or model fit, we do not report interaction terms in the explanatory model.

4.3 Forecasting 24 month LGD with a 12 month LGD model

We test whether it is possible to use a model built for a 12 month recovery period to forecast for a 24 month period.
Following the procedure given earlier, we estimate a linear calibration of the form RR(24 months) = a + b × RR(12 months). This model always gives a higher estimated RR after 24 months than after 12 months, which is intuitively correct since we would normally expect RR for an individual not to decrease over time. This model is used to compare the AV&MV model built on the 12 month period with one built on the 24 month

period using the procedure described above. Results are shown in the third section of Table 1. The 12 month MV model outperforms the 24 month model for forecasts of 24 month RR on both reported forecast measures. However, this is natural since the recovery period buffer between training and test data (see Figure 2) implies that the 12 month model is built from more recent data. These results indicate that working with a 12 month LGD model is sufficient, since the same model can also be used to model longer periods. However, with further investigation, we expect that some hybrid combining the 12 and 24 month trained LGD models would be the most effective.

4.4 Stress test using interest rates

We can use LGD models with MVs for stress testing by replacing the values of MVs with stressed values. They can therefore be used to model particular scenarios, or as part of a larger simulation based on a distribution of past macroeconomic conditions. We conduct a univariate stress test by considering the effect of interest rate changes on the test data set in 2005Q2, the last quarter of test data available in our data set. Figure 4 shows that, over the training period, the extreme values of the interest rate are 6% and 3.5%. When these values are used in the AV&MV model, the forecast mean RR for the test quarter changes by -17% and +28% respectively. This shows the shift in overall RR for the worst and best cases. A downturn RR 17% lower than during normal or good conditions is not implausible.

5. Conclusion

In the introduction we posed four main research questions. We discuss conclusions relating to each of these questions in turn.

Q1. Which credit card application and default variables are the key drivers of retail LGD? Our experiments have shown that application variables can be used to model LGD. In particular, Table 2 shows that home status, time with bank and credit bureau score are strong explanatory variables.
Additionally, we found that income and balance at default form a joint effect. The negative correlation of default balance with RR matches the finding of Dermine and Neto de Carvalho [2005] that size of loan is a significant explanatory variable for RR. We found that default balance contributes to

forecasts of LGD, since when it is removed from the AV&MV model, forecast performance is worse. Other variables at the time of default, such as age of obligor and age of account, also influence LGD. Interestingly, age has a positive effect on LGD (negative on RR), and we found that the linear relation between age and LGD remained even when age was replaced by several age categories using dummy variables. This is surprising, since we would normally expect risk to reduce with maturity; for PD models, typically the positive effect of age peaks in the mid-30s.

Q2. What is the best modelling approach for retail LGD? Despite trying several different combinations of variables, and models such as Tobit, a decision tree and various transformations of the dependent variable, all of which should in theory be good models for bimodal LGD, it turns out that the best forecast model in our experiments was simple OLS. Why this may be so is unclear, although we conjecture that since LGD is difficult to model, with poor model fit, regression forecasts tend to fall in a narrow range away from the boundary cases of 0 and 1, so models dealing carefully with the boundary cases are superfluous in practice.

Q3. How well do the models perform at forecasting LGD? Model fit is weak, with R² = 0.11, but such low values are typical of modelling LGD. On a sample of 1,118 defaulted financial leases, De Laurentis and Riani [2005] report R² values from 0.20 to 0.45 after outliers were deliberately removed. On a data set of 374 defaulted loans to small and medium size firms, Dermine and Neto de Carvalho [2005] report a pseudo-R² value of 0.13 when considering a 12-month recovery period. The lower R² value we report is partly a consequence of the large sample size of our data.
In contrast, if we restrict our sample to just 500 randomly selected cases, we get a substantially higher R². This is more typical of other studies, but it is nevertheless misleading: the adjusted R² for this subsample is 0.13, which is closer to the value for the full sample. Therefore we report adjusted R² in our main results and suggest its use for comparison of LGD models across studies with different sample sizes. For forecasts, the simple model which effectively forecasts mean LGD from the training data set does very well and outperforms many of the complex models, as can be seen in Table 1. Nevertheless we still see a modest improvement in MSE


More information

UNDERSTANDING MULTIPLE REGRESSION

UNDERSTANDING MULTIPLE REGRESSION UNDERSTANDING Multiple regression analysis (MRA) is any of several related statistical methods for evaluating the effects of more than one independent (or predictor) variable on a dependent (or outcome)

More information

Nature of credit risk

Nature of credit risk Nature of credit risk Credit risk The risk that obligors (borrowers, counterparties and debtors) might default on their obligations, or having an impaired ability to repay them. How to measure the credit

More information

Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

More information

SEMINAR ON CREDIT RISK MANAGEMENT AND SME BUSINESS RENATO MAINO. Turin, June 12, 2003. Agenda

SEMINAR ON CREDIT RISK MANAGEMENT AND SME BUSINESS RENATO MAINO. Turin, June 12, 2003. Agenda SEMINAR ON CREDIT RISK MANAGEMENT AND SME BUSINESS RENATO MAINO Head of Risk assessment and management Turin, June 12, 2003 2 Agenda Italian market: the peculiarity of Italian SMEs in rating models estimation

More information

Financial-Institutions Management

Financial-Institutions Management Solutions 3 Chapter 11: Credit Risk Loan Pricing and Terms 9. County Bank offers one-year loans with a stated rate of 9 percent but requires a compensating balance of 10 percent. What is the true cost

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Module 6: Introduction to Time Series Forecasting

Module 6: Introduction to Time Series Forecasting Using Statistical Data to Make Decisions Module 6: Introduction to Time Series Forecasting Titus Awokuse and Tom Ilvento, University of Delaware, College of Agriculture and Natural Resources, Food and

More information

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

More information

Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management

Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management April 2009 By Dean Caire, CFA Most of the literature on credit scoring discusses the various modelling techniques

More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

WORKING PAPER NO. 12-18 CREDIT RISK ANALYSIS OF CREDIT CARD PORTFOLIOS UNDER ECONOMIC STRESS CONDITIONS. Piu Banerjee Federal Reserve Bank of New York

WORKING PAPER NO. 12-18 CREDIT RISK ANALYSIS OF CREDIT CARD PORTFOLIOS UNDER ECONOMIC STRESS CONDITIONS. Piu Banerjee Federal Reserve Bank of New York WORKING PAPER NO. 12-18 CREDIT RISK ANALYSIS OF CREDIT CARD PORTFOLIOS UNDER ECONOMIC STRESS CONDITIONS Piu Banerjee Federal Reserve Bank of New York José J. Canals-Cerdá Federal Reserve Bank of Philadelphia

More information

Modelling Bank Loan LGD of Corporate and SME Segments A Case Study

Modelling Bank Loan LGD of Corporate and SME Segments A Case Study Institute of Economic Studies, Faculty of Social Sciences Charles University in Prague Modelling Bank Loan LGD of Corporate and SME Segments A Case Study Radovan Chalupka Juraj Kopecsni IES Working Paper:

More information

MSCI US Equity Indices Methodology

MSCI US Equity Indices Methodology Index Construction Objectives and Methodology for the MSCI US Equity Indices Contents Section 1: US Equity Indices Methodology Overview... 5 1.1 Introduction... 5 1.2 Defining the US Equity Market Capitalization

More information

MORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel. +371 26387099, e-mail: irina.genriha@inbox.

MORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel. +371 26387099, e-mail: irina.genriha@inbox. MORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel. +371 2638799, e-mail: irina.genriha@inbox.lv Since September 28, when the crisis deepened in the first place

More information

ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION

ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? SAMUEL H. COX AND YIJIA LIN ABSTRACT. We devise an approach, using tobit models for modeling annuity lapse rates. The approach is based on data provided

More information

Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one

Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win $1 if head and lose $1 if tail. After 10 plays, you lost $2 in net (4 heads

More information

Interpreting Market Responses to Economic Data

Interpreting Market Responses to Economic Data Interpreting Market Responses to Economic Data Patrick D Arcy and Emily Poole* This article discusses how bond, equity and foreign exchange markets have responded to the surprise component of Australian

More information

Investment Statistics: Definitions & Formulas

Investment Statistics: Definitions & Formulas Investment Statistics: Definitions & Formulas The following are brief descriptions and formulas for the various statistics and calculations available within the ease Analytics system. Unless stated otherwise,

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Basel Committee on Banking Supervision. An Explanatory Note on the Basel II IRB Risk Weight Functions

Basel Committee on Banking Supervision. An Explanatory Note on the Basel II IRB Risk Weight Functions Basel Committee on Banking Supervision An Explanatory Note on the Basel II IRB Risk Weight Functions July 2005 Requests for copies of publications, or for additions/changes to the mailing list, should

More information

Section 6: Model Selection, Logistic Regression and more...

Section 6: Model Selection, Logistic Regression and more... Section 6: Model Selection, Logistic Regression and more... Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Model Building

More information

An Introduction to Ensemble Learning in Credit Risk Modelling

An Introduction to Ensemble Learning in Credit Risk Modelling An Introduction to Ensemble Learning in Credit Risk Modelling October 15, 2014 Han Sheng Sun, BMO Zi Jin, Wells Fargo Disclaimer The opinions expressed in this presentation and on the following slides

More information

Counterparty Credit Risk for Insurance and Reinsurance Firms. Perry D. Mehta Enterprise Risk Management Symposium Chicago, March 2011

Counterparty Credit Risk for Insurance and Reinsurance Firms. Perry D. Mehta Enterprise Risk Management Symposium Chicago, March 2011 Counterparty Credit Risk for Insurance and Reinsurance Firms Perry D. Mehta Enterprise Risk Management Symposium Chicago, March 2011 Outline What is counterparty credit risk Relevance of counterparty credit

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Predicting Recovery Rates for Defaulting Credit Card Debt

Predicting Recovery Rates for Defaulting Credit Card Debt Predicting Recovery Rates for Defaulting Credit Card Debt Angela Moore Quantitative Financial Risk Management Centre School of Management University of Southampton Abstract Defaulting credit card debt

More information

Regression with a Binary Dependent Variable

Regression with a Binary Dependent Variable Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Take-home final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,

More information

A macroeconomic credit risk model for stress testing the Romanian corporate credit portfolio

A macroeconomic credit risk model for stress testing the Romanian corporate credit portfolio Academy of Economic Studies Bucharest Doctoral School of Finance and Banking A macroeconomic credit risk model for stress testing the Romanian corporate credit portfolio Supervisor Professor PhD. Moisă

More information

AMF Instruction DOC Calculation of global exposure for authorised UCITS and AIFs

AMF Instruction DOC Calculation of global exposure for authorised UCITS and AIFs AMF Instruction DOC-2011-15 Calculation of global exposure for authorised UCITS and AIFs Reference texts : Articles 411-72 to 411-81 and 422-51 to 422-60 1 of the AMF General Regulation 1. General provisions...

More information

Trade-Off Study Sample Size: How Low Can We go?

Trade-Off Study Sample Size: How Low Can We go? Trade-Off Study Sample Size: How Low Can We go? The effect of sample size on model error is examined through several commercial data sets, using five trade-off techniques: ACA, ACA/HB, CVA, HB-Reg and

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

Memento on Eviews output

Memento on Eviews output Memento on Eviews output Jonathan Benchimol 10 th June 2013 Abstract This small paper, presented as a memento format, reviews almost all frequent results Eviews produces following standard econometric

More information

University of Wisconsin Oshkosh

University of Wisconsin Oshkosh University of Wisconsin Oshkosh Oshkosh Scholar Submission Volume IV Fall 2009 The Correlation Between Happiness and Inequality Kiley Williams, author Dr. Marianne Johnson, Economics, faculty adviser Kiley

More information

CONSULTATION PAPER P016-2006 October 2006. Proposed Regulatory Framework on Mortgage Insurance Business

CONSULTATION PAPER P016-2006 October 2006. Proposed Regulatory Framework on Mortgage Insurance Business CONSULTATION PAPER P016-2006 October 2006 Proposed Regulatory Framework on Mortgage Insurance Business PREFACE 1 Mortgage insurance protects residential mortgage lenders against losses on mortgage loans

More information

Financial Evolution and Stability The Case of Hedge Funds

Financial Evolution and Stability The Case of Hedge Funds Financial Evolution and Stability The Case of Hedge Funds KENT JANÉR MD of Nektar Asset Management, a market-neutral hedge fund that works with a large element of macroeconomic assessment. Hedge funds

More information

Equifax Risk Score 2.0

Equifax Risk Score 2.0 Equifax Risk Score 2.0 Customer Handbook Prepared by: Larry Macdonald, Sr. Product Manager 10-Jun-2014 Table of Contents Introducing the Equifax Risk Score 2.0... 3 1. Modeling Concepts... 3 1.1 Population

More information

Asymmetry and the Cost of Capital

Asymmetry and the Cost of Capital Asymmetry and the Cost of Capital Javier García Sánchez, IAE Business School Lorenzo Preve, IAE Business School Virginia Sarria Allende, IAE Business School Abstract The expected cost of capital is a crucial

More information

A Test Of The M&M Capital Structure Theories Richard H. Fosberg, William Paterson University, USA

A Test Of The M&M Capital Structure Theories Richard H. Fosberg, William Paterson University, USA A Test Of The M&M Capital Structure Theories Richard H. Fosberg, William Paterson University, USA ABSTRACT Modigliani and Miller (1958, 1963) predict two very specific relationships between firm value

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

AN ASSESSMENT OF CAPITAL REQUIREMENTS UNDER BASEL II: THE PORTUGUESE CASE*

AN ASSESSMENT OF CAPITAL REQUIREMENTS UNDER BASEL II: THE PORTUGUESE CASE* Articles Part II AN ASSESSMENT OF CAPITAL REQUIREMENTS UNDER BASEL II: THE PORTUGUESE CASE* Paula Antão** Ana Lacerda** 1. INTRODUCTION Capital requirements for banks are of foremost importance for financial

More information

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Abstract THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Dorina CLICHICI 44 Tatiana COLESNICOVA 45 The purpose of this research is to estimate the impact of several

More information

Chapter 8: Interval Estimates and Hypothesis Testing

Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8 Outline Clint s Assignment: Taking Stock Estimate Reliability: Interval Estimate Question o Normal Distribution versus the Student t-distribution:

More information

Determinants of Non Performing Loans: Case of US Banking Sector

Determinants of Non Performing Loans: Case of US Banking Sector 141 Determinants of Non Performing Loans: Case of US Banking Sector Irum Saba 1 Rehana Kouser 2 Muhammad Azeem 3 Non Performing Loan Rate is the most important issue for banks to survive. There are lots

More information

Integrated Resource Plan

Integrated Resource Plan Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 650-962-9670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

Università della Svizzera italiana. Lugano, 20 gennaio 2005

Università della Svizzera italiana. Lugano, 20 gennaio 2005 Università della Svizzera italiana Lugano, 20 gennaio 2005 Default probabilities and business cycle regimes: a forward-looking approach Chiara Pederzoli Università di Modena e Reggio Emilia 1 Motivation:

More information

A Proven Approach to Stress Testing Consumer Loan Portfolios

A Proven Approach to Stress Testing Consumer Loan Portfolios A Proven Approach to Stress Testing Consumer Loan Portfolios Interthinx, Inc. 2013. All rights reserved. Interthinx is a registered trademark of Verisk Analytics. No part of this publication may be reproduced,

More information

Causal Forecasting Models

Causal Forecasting Models CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental

More information

Consumer Confidence and UK House Price Inflation

Consumer Confidence and UK House Price Inflation Consumer Confidence and UK House Price Inflation Dean Garratt, Economist, Council of Mortgage Lenders Consumer confidence alone can explain up to two-thirds of the variation in the annual growth of house

More information