Modeling Loss Given Default in SAS/STAT



Similar documents
benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

Forecasting the Direction and Strength of Stock Market Movement

Vasicek s Model of Distribution of Losses in a Large, Homogeneous Portfolio

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

How To Calculate The Accountng Perod Of Nequalty

Reporting Forms ARF 113.0A, ARF 113.0B, ARF 113.0C and ARF 113.0D FIRB Corporate (including SME Corporate), Sovereign and Bank Instruction Guide

An Alternative Way to Measure Private Equity Performance

Portfolio Loss Distribution

CHAPTER 14 MORE ABOUT REGRESSION

How To Evaluate A Dia Fund Suffcency

Section 5.4 Annuities, Present Value, and Amortization

What is Candidate Sampling

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

Statistical Methods to Develop Rating Models

Credit Limit Optimization (CLO) for Credit Cards

Financial Mathemetics

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

1. Measuring association using correlation and regression

Transition Matrix Models of Consumer Credit Ratings

PRACTICE 1: MUTUAL FUNDS EVALUATION USING MATLAB.

Joe Pimbley, unpublished, Yield Curve Calculations

Macro Factors and Volatility of Treasury Bond Returns

The Application of Fractional Brownian Motion in Option Pricing

1 De nitions and Censoring

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Efficient Project Portfolio as a tool for Enterprise Risk Management

The OC Curve of Attribute Acceptance Plans

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Lecture 3: Force of Interest, Real Interest Rate, Annuity

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

An Empirical Study of Search Engine Advertising Effectiveness

Lecture 3: Annuity. Study annuities whose payments form a geometric progression or a arithmetic progression.

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

7.5. Present Value of an Annuity. Investigate

Regression Models for a Binary Response Using EXCEL and JMP

Analysis of Premium Liabilities for Australian Lines of Business

! # %& ( ) +,../ # 5##&.6 7% 8 # #...

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA*

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS

Forecasting and Stress Testing Credit Card Default using Dynamic Models

Single and multiple stage classifiers implementing logistic discrimination

Binomial Link Functions. Lori Murray, Phil Munz

Method for assessment of companies' credit rating (AJPES S.BON model) Short description of the methodology

Intra-year Cash Flow Patterns: A Simple Solution for an Unnecessary Appraisal Error

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

Using Series to Analyze Financial Situations: Present Value

BERNSTEIN POLYNOMIALS

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

STATISTICAL DATA ANALYSIS IN EXCEL

Variance estimation for the instrumental variables approach to measurement error in generalized linear models

Prediction of Disability Frequencies in Life Insurance

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES

Approximating Cross-validatory Predictive Evaluation in Bayesian Latent Variables Models with Integrated IS and WAIC

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

Brigid Mullany, Ph.D University of North Carolina, Charlotte

L10: Linear discriminants analysis

Risk Model of Long-Term Production Scheduling in Open Pit Gold Mining

Measurement of Farm Credit Risk: SUR Model and Simulation Approach

Simple Interest Loans (Section 5.1) :

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Discount Rate for Workout Recoveries: An Empirical Study*

Section 5.3 Annuities, Future Value, and Sinking Funds

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Decision Tree Model for Count Data

Multiple-Period Attribution: Residuals and Compounding

When Talk is Free : The Effect of Tariff Structure on Usage under Two- and Three-Part Tariffs

Testing Adverse Selection Using Frank Copula Approach in Iran Insurance Markets

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM

Time Value of Money. Types of Interest. Compounding and Discounting Single Sums. Page 1. Ch. 6 - The Time Value of Money. The Time Value of Money

Lecture 14: Implementing CAPM

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management

Marginal Returns to Education For Teachers

A Model of Private Equity Fund Compensation

Scale Dependence of Overconfidence in Stock Market Volatility Forecasts

Traffic-light a stress test for life insurance provisions

Forecasting Irregularly Spaced UHF Financial Data: Realized Volatility vs UHF-GARCH Models

A Multistage Model of Loans and the Role of Relationships

DO LOSS FIRMS MANAGE EARNINGS AROUND SEASONED EQUITY OFFERINGS?

PRIVATE SCHOOL CHOICE: THE EFFECTS OF RELIGIOUS AFFILIATION AND PARTICIPATION

Two Faces of Intra-Industry Information Transfers: Evidence from Management Earnings and Revenue Forecasts

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

The Cross Section of Foreign Currency Risk Premia and Consumption Growth Risk

1 Example 1: Axis-aligned rectangles

A Lyapunov Optimization Approach to Repeated Stochastic Games

Trivial lump sum R5.0

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

SIMPLE LINEAR CORRELATION

Transcription:

Paper 1593-014 Modelng Loss Gven Default n SAS/SA Xao Yao, he Unversty of Ednburgh Busness School, UK Jonathan Crook, he Unversty of Ednburgh Busness School, UK Galna Andreeva, he Unversty of Ednburgh Busness School, UK ABSRAC Predctng loss gven default (LGD) s playng an ncreasngly crucal role n quanttatve credt rsk modelng. In ths paper, we propose to apply mxed effects models to predct corporate bonds LGD, as well as other wdely used LGD models. he emprcal results show that mxed effects models are able to explan the unobservable heterogenety and to make better predctons compared wth lnear regresson and fractonal response regresson. All the statstcal models are performed n SAS/SA, SAS 9., usng specfcally PROC REG and PROC NLMIXED, and the model evaluaton metrcs are calculated n PROC IML. hs paper gves a detaled descrpton on how to use PROC NLMIXED to buld and estmate generalzed lnear models and mxed effects models. INRODUCION Loss gven default (LGD) measures the percentage of all exposure at the tme of default that can not be recovered. Recovery rate (RR) s defned as one mnus LGD. LGD/RR modelng attracts much less attenton compared wth the large volume of lterature on PD modelng. Wth the portfolo loss estmaton beng a major concern n modern rsk management system, ncreasng attenton s beng dedcated to LGD modelng as well as PD and LGD jont estmaton. In terms of the methodologes there are two man streams: one approach s to apply fxed effect regresson models ncludng lnear regresson and fractonal response regresson to predct LGD (Gupton and Sten, 00, Dermne et al, 006 and Bastos, 010). In emprcal studes LGD dstrbutons often present b-modal characterstcs bounded between the nterval [0, 1] based on ts defnton. Calabrese (01) proposes an nflated beta regresson model whch consdered the dependent varable as a mxture of a contnuous beta dstrbuton on (0, 1) and a dscrete Bernoull dstrbuton to model the probablty mass at the boundares 0 and 1, whch can be regarded as a specal type of generalzed lnear regresson model. A second approach s the use of the mxed effects models based on the Vascek s sngle factor framework (Vascek 1987) where LGD s assumed to be dependent on a systematc rsk factor n the predctors (Hamerle et al, 006). In the settng of a sngle factor model the unobservable systematc rsk factor works as a random effect term and the dosyncratc rsk factor s the resdual term. Consequently a sngle factor model wth the observable covarates s equvalent as a mxed effects model. However, Hamerle et al (006) only consder the tme-varyng latent factor and they do not make any further benchmarkng studes. hs project ams to fll n ths gap by applyng the random effect terms at multple levels and to conduct a comparson study of the commonly used models n lterature. We manly nvestgate to apply the mxed effects models to predct the US corporate bonds recovery rates wth the random effect terms specfed at oblgor, senorty and tme levels. We fnd that the ncluson of an oblgor-varyng random effect term effectvely explans the unobservable heterogenety and the related predctve accuraces are also much better than the others. All the models are bult up and realzed n SAS/SA and the predcton results are generated n PROC IML. he remander of ths paper s structured as follows. he next secton brefly ntroduces the data used n the emprcal study and s followed by an overvew of the models wth SAS programs provded. he emprcal results and analyss are then presented and the last secton concludes ths study. DAA AND SEUP he emprcal study s based on the US corporate bonds recovery rates nformaton from Moody s Ultmate Recovery Database (MURD). he unt of observaton n ths study s nstrument. hs database covers the recovery nformaton of more than 3000 nstruments untl date. he nstruments nclude bank loans, revolvers and corporate bonds. In ths 1

study we are only nterested n the corporate bonds and use recovery rate as the dependent varable nstead of LGD, and the fnal sample has 1413 observatons of defaulted bonds observed from 1986 to 01. here are fve dfferent senortes ncludng Junor Subordnated, Subordnated, Senor Subordnated, Senor Secured and Senor Unsecured. Each oblgor may ssue more than one nstrument wth dfferent senortes. In MURD only nstrument related nformaton s provded ncludng the debt characterstcs and recovery process nformaton. We ntegrate the accountng ratos from Compustat nto our sample by usng the common frm dentfer. US macroeconomc varables are also ncluded and extracted from the open sources onlne to capture features of the economc cycle. Here both Issue Sze and otal Asset are subjected to a log transformaton for scalng. Both accountng and macroeconomc covarates are ncorporated one year pror to default and the regresson model can be presented as follows Recovery Rate,t = Intercept + Recovery Characterstcs + Frm Characterstcs,t-1 + Macroeconomc Varables t-1 Fgure 1 presents the dstrbuton of recovery rates n our sample. We can observe the clustered observatons wth recovery rates equal to 0 and 1 from the fgure. able 1 presents the recovery rates dstrbuton by senorty and able gves the descrptons of the varables used n ths study. Fgure 1. Dstrbuton of recovery rates No. Mean Std Mn Max Junor Subordnate Bonds 8 0.168 0.634 0 1 Senor Secured Bonds 33 0.69 0.3688 0 1.198 Senor Subordnate Bonds 198 0.3150 0.3617 0 1.6978 Senor Unsecured Bonds 681 0.5100 0.3813 0 1.0499 Subordnate Bonds 174 0.317 0.3743 0 1.3691 otal 1413 0.4806 0.3915 0 1.6978 able 1. Recovery rates by senorty

Debt Characterstcs Var1: Collateral Rank Var: Percent Above Var3: Issue Sze Instruments are ranked related to each other based on the structure pror to default, takng nto consderaton collateral and nstrument type. Percentage of debt whch s contractually senor to the current nstrument. Face value of the relevant nstrument. Frm Characterstcs Var4: otal Asset Var5: EBIDA Var6: Leverage Var7: Debt Rato Var8: Book Value per Share Var9: Asset angblty Var10: Quck Rato otal assets of the oblgor Earnngs before nterest, taxes, deprecaton and amortzaton Rato of total debt and total assets Rato of current labltes and long term debt Book value of assets scaled by the total outstandng shares Rato between ntangble assets and tangble assets Sum of cash and short-term nvestment and total recevables dvded by the current labltes. Macroeconomc Varables Var11: Growth Rate Var1: -Bll Rate Var13: Aggregated Default Rates Var14: Unemployment Rate US annual GDP growth rate US three months reasury bll rate US annual ssuer-weghted corporate default rates US annual unemployment rate able. Lst of covarates MODELS AND SAS PROGRAMS hs secton gves an overvew of the models n ths study. he followng mathematcal notatons are used through ths paper. he recovery rate of nstrument s defned as y and the vector of covarates s gven as x. b 0 and β denote the ntercept term and the vector of parameters respectvely. In the followng SAS programs we name the dependent varable as RR, and the ndependent covarates are ndexed as Var1-Var14. he parameters of ntercept term and all the covarates are named as b0-b14, and the cleaned dataset s named as MyData. LINEAR REGRESSION Prevous studes show that lnear regresson models appear to be of comparable predctve accuraces as other more complcated statstcal models (Q and hao, 011; Bellott and Crook, 01) even though there s the potental rsk to make predctons out of the range between 0 and 1. he lnear regresson model s defned y e = b0 + β x + e. (1) ~ N(0, s ) 3

Program 1. Lnear regresson proc reg data=mydata outest=tran_estmate; model RR=Var1 Var Var14; qut; proc score data=mydata score=tran_estmate out=reg_output (keep=rr model1) predct type=parms; var Var1 Var Var14; Ordnary least squared estmaton method s used n PROC REG and the estmates of parameters are exported nto a new table called tran_estmate. he predcted values are computed by PROC SCORE whle we only keep the actual and predcted values named as RR and model1 n the output dataset reg_output. FRACIONAL RESPONSE REGRESSION Fractonal response regresson was frst proposed by Papke and Wooldrdge (1996) and has been wdely appled n LGD modelng (Dermne and Carvalho, 006; Bastos, 010, Bellott & Crook, 01). In ths model, the dependent varable s bounded between 0 and 1 by mposng a lnk functon such as Ey ( x) = Gb ( 0 + β x ) where G() denotes a lnk functon such as a logt or a complementary log-log transformaton functon and the quas maxmum lkelhood functon s gven as follows log L = å( y log G( b0 + β x) + (1 -y)log(1 - G( b0 + β x ))). () Program. Fractonal response regresson proc nlmxed data=mydata tech=newrap maxter=3000 maxfunc=3000 qtol=0.0001; parms b0-b14=0.0001; cov_mu=b0+b1*var1+b*var+ +b14*var14; mu=logstc(cov_mu); loglkefun=rr*log(mu)+(1-rr)*log(1-mu); model RR~general(loglkefun); predct mu out=frac_resp_output (keep=nstrument_d RR pred); Here the fractonal response regresson model s realzed n PROC NLMIXED and the log-lkelhood functon s defned as equaton () and optmzed by a Newton-Raphson method wth lne search specfed by the opton tech=newrap. Note that cov_mu denotes the term b 0 + β x and a logt form transformaton functon s selected by applyng a bult-n functon logstc defned n PROC NLMIXED. he predcted recovery rate mu s gven by usng the opton predct and exported to an output dataset named as frac_resp_output, where the predcted recovery rate s named as pred automatcally. Notce that both actual and predcted recovery rates are kept n the output dataset. INFLAED BEA REGRESSION Inflated beta regresson s proposed by Ospna and Ferrar (010) where the dependent varable s regarded as a mxture dstrbuton of a beta dstrbuton on (0, 1) and a Bernoull dstrbuton on boundares 0 and 1. he probablty densty functon s gven as ìï ïp(1 - y) f y = 0 b01(; y pymf,,, ) = ï ípy f y = 1, (3) ï(1 - p) fy ( ; m, f) fyî (0, 1) ïî where (;, ) f y mf s the beta densty functon and m and f are the mean and precson parameters. Here m can be reparameterzed by mposng a logt transformaton. he expectaton of the dependent varable s derved mmedately 4

such that Ey () = py + (1- p) m. (4) Let d = 1 f y = 0 and d = 0 f y Î (0, 1], c = 1 f y = 1 and c = 0 f y Î [0, 1), and the maxmum lkelhood functon s gven as follows å log L = d log( p(1 - y)) + c log( py) + (1 -d)(1 -c)(log(1 - p) + log G( f) - log G( mf) - log G((1 - m) f). (5) + ( mf - 1) log y + ((1 - m) f - 1) log(1 -y )) he unknown parameters ( b0, β, p, y, f ) are solved by standard optmzaton algorthms as above. More detals can be found n Ospna and Ferrar (010). Program 3. Inflated beta regresson proc nlmxed data=mydata tech=quanew maxter=3000 maxfunc=3000 qtol=0.0001; parms b0-b14=0.0001 pe=0. kesa=0.3 ph=; cov_mu=b0+b1*var1+b*var+ +b14*var14; mu=logstc(cov_mu); f RR=0 then loglkefun=log(pe)+log(1-kesa); f RR>=1 then loglkefun=log(pe)+log(kesa); f 0<RR<1 then loglkefun=log(1-pe)+lgamma(ph)-lgamma(mu*ph)-lgamma((1-mu)*ph) +(mu*ph-1)*log(rr)+((1-mu)*ph-1)*log(1-rr); predct pe*kesa+(1-pe)*mu out=inf_beta_output (keep=nstrument_d RR pred); model RR~general(loglkefun); In Program 3 we use an f-condton to defne the log-lkelhood functon of the nflated beta regresson model based on (3). he logt transformaton s gven by a logstc functon as above. Note that the predcted recovery rate s defned as equaton (4) nstead of mu n Program. Here lgamma s a bult-n functon n PROC NLMIXED meanng the log-gamma functon and a quas-newton optmzaton method (quanew) s employed to obtan the estmated parameters. he output data set s then named as Inf_beta_output. MIXED EFFECS MODEL We start wth the sngle factor framework n Hamerle et al (006) and rewrte t as follows y = b0 + β x + g1t + ge t = 1,...,, (6) N(0,1), e N(0,1) t where the random effect t s termed as a tme-varyng systematc rsk factor and e denotes the dosyncratc rsk factor. he sngle latent factor t s related to the unobservable heterogenety on economc condtons. In model (6) the cluster ntra-class recovery rate correlaton of any two dfferent nstruments and j s gven as 5

Cov( y, y ) = g Corr( y, y ) = j j 1 g1 g1 + g. (7) Model (6) s bult on the assumpton that the nstruments defaulted n the same year are dependent on a common latent economc state varable. In ths study, we defne the oblgor and senorty-varyng factor models by substtutng the tme-varyng latent factor t wth k, the k-th oblgor of the total of K oblgors and s, the s-th type of the total of S senortes. he estmaton procedure of ths model starts by dervng the condtonal probablty densty functon and then the random effect terms are ntegrated out to obtan the uncondtonal dstrbuton. Condtonng on the realzaton of, the condtonal dstrbuton of y s gven such as where 1 ( y - m) =, (8) ps s fy ( ) exp( ) m = Ey ( ) = b0 + β x + g1. s = g he uncondtonal probablty densty functon of y s gven by ntegratng out of the condtonal probablty densty functon such as + + 1 ( y - m ) f ( y) = f( y ) df ( ) = exp( ) df( ) - - ps s ò ò, (9) where F ( ) denotes the cumulatve dstrbuton functon of the standard normal varable. We are now able to defne the log lkelhood functon as log L( b, β, g, g ) = log f( y ). (10) 0 1 he estmates of parameters are generated by solvng the log lkelhood functon (10). Notce that the log lkelhood functon (10) nvolves calculatng a mult-dmensonal ntegral that s dffcult to evaluate, whch can be approxmated by a Gaussan quadrature method. Program 4. Mxed effects model %let level=oblgor_d; proc sort data=mydata; by &level; proc nlmxed data=mydata noad qponts=80 tech=quanew maxter=3000 maxfunc=3000 qtol=0.0001; parms b0-b14=0.0001 gamma1-gamma=0.4; cov_mu=b0+b1*var1+b*var+ +b14*var14; con_mu=cov_mu+gamma1*z; con_sgma=gamma**; model RR~normal(con_mu,con_sgma); random z~normal(0,1) subject=&level; predct con_mu out= mx_output_&level (keep=nstrument_d RR pred); Opton subject determnes when random effect term z s realzed whch s defned as a macro level by usng a %let statement for convenence. In ths example the oblgor_d ndcates an oblgor-varyng random effect s specfed and 6

Program 4 s correspondng to an oblgor-varyng mxed effects model. Senorty and tme-varyng models can be realzed by specfyng the macro level. Note that the whole sample data should be sorted by level to make sure the nput data set s clustered accordng to ths varable pror to callng PROC NLMIXED. Opton noad means that a nonadaptve Gaussan quadrature method s called wth the number of quadrature ponts specfed by opton qponts to approxmate the log-lkelhood functon whch s then optmzed by a quas-newton algorthm defned by opton tech. EMPIRICAL RESULS AND ANALYSIS o compare the model predctve accuraces there are three performance metrcs used ncludng the R-square (R ), root mean squared errors (RMSE) and mean absolute errors (MAE). he model ft performances are reported n able 3. All these performance metrcs are calculated n PROC IML. able 3 shows clearly that the oblgor-varyng mxed effects model gves remarkably better performances than the other methods wth R as hgh as 0.8964. When usng a tme-varyng random effect, the mxed effects model stll gves better predctons than the other fx effect regresson models. But the ncluson of a senorty-varyng random effect does not demonstrate any advantages compared wth the other regresson models. Among the fxed effect regresson models, fractonal response regresson shows slghtly better performances than lnear regresson and nflated beta regresson models. Such evdence s also consstent wth Q and hao (011). It s unexpected to see that the nflated beta regresson model does not show better predctons. Calabrese (01) studes bank loan recovery rates from he Bank of Italy and fnds that nflated beta regresson yelds better predcton accuraces than fractonal response regresson models. One possble explanaton s that although the nflated beta regresson s sutable to model the clustered samples on the boundares 0 and 1, t s not able to dscrmnate these observatons accurately from that n the nterval. Another reason probably s the beta dstrbuton may not be well ftted n our sample. able 4 shows the ntra-class covarance and correlaton of the mxed effects model based on equaton (7). Frst, t s clear to see that the oblgor level ntra-class correlaton s sgnfcantly hgher than that at the senorty or tme levels, ndcatng that the nstruments ssued by the oblgor have a sgnfcant hgh correlaton. It s also straghtforward to explan the low correlaton of the nstruments of the same senorty because they are not necessarly nfluenced by a common factor or sharng common characterstcs. Instruments that defaulted at the same year are affected by the same macroeconomc condtons so that the recovery rates correlaton s hgher than the senorty level but stll much lower than the oblgor level. Our fndng shows that t s necessary to nclude the ntra-class correlaton to explan the oblgor specfc unobservable heterogenety whch makes mxed effects models gve better predctons than the fxed effect models. Lnear mxed effects model Level R RMSE MAE Oblgor 0.8964 0.159 0.0839 Senorty 0.3383 0.3183 0.68 me 0.409 0.304 0.461 Ordnary lnear regresson - 0.3409 0.3177 0.65 Fractonal response regresson - 0.368 0.314 0.545 Inflated beta regresson - 0.3558 0.3141 0.630 able 3 Comparsons of model ft Covarance Correlaton Oblgor 0.0961 0.808 Senorty 0.0001 0.0009 me 0.018 0.114 able 4 Intra-class covarance and correlaton 7

Program 5. Performance metrcs %let output_data=mx_output_&level; proc ml; use &output_data; read all var {RR,pred} nto data; close &output_data; m=nrow(data); RR=data[,1]; RR_estmate=data[,]; start var(x); mean=x[:,]; countn=j(1,ncol(x)); do =1 to ncol(x); countn[]=sum(x[,]^=.); end; var=(x-mean)[##,]/(countn-1); return (var); fnsh; resd=rr_estmate-rr; SS_error=resd`*resd; SS_total=var(RR)#(m-1); R_square=1-SS_error/SS_total; RMSE=sqrt(resd`*resd/m); MAE=sum(abs(resd))/m; prnt R_square RMSE MAE; create output_measure var {R_square RMSE MAE}; append; qut; proc prnt data=output_measure; Program 5 llustrates how to calculate the performance metrcs of mxed effects models n PROC IML. Note that n the data set output_data the actual and predcted recovery rates are denoted as RR and pred respectvely. We read them nto PROC IML named as data. Computng R nvolves a calculaton of the varance of the actual recovery rates. However, PROC IML n SAS 9. does not provde a bult-n functon to calculate the varance for a vector or matrx. We defne a module named as var to calculate the varance * and export the performance metrcs nto a SAS data set output_measure. CONCLUSION In ths project we demonstrate that how to estmate both fxed and mxed effects models n SAS/SA. We also llustrate calculatng performance metrcs n PROC IML wth common matrx operators. hs study proposes to apply a lnear mxed effects model to predct the US corporate bonds recovery rates. he purpose of usng a mxed effects model s to better explan the unobservable heterogenety n an emprcal data set. Performances of the mxed effects models wth the random effect term specfed at oblgor, senorty and tme levels are examned. We buld up lnear regresson model n PROC REG, and realze fractonal response regresson and nflated beta regresson models n PROC NLMIXED. We provde detals on how to estmate mxed effects models n PROC NLMIXED wth a smple macro varable ncorporated for convenence. Emprcal evdence ndcates that an oblgor-varyng mxed effects model sgnfcantly outperforms the others n terms of all the performance metrcs we consdered, whch emphaszes the mportance of ncludng frm specfc random effect. We provde a new angle to model corporate bonds recovery rates and show the convenences to buld up credt rsk models n SAS/SA especally PROC NLMIXED. Further study wll be conducted based on the exstng fndngs. * he module program var s gven from Rck Wckln s SAS blog: http://blogs.sas.com/content/ml/011/04/07/computngthe-varance-of-each-column-of-a-matrx/ 8

REFERENCES Bastos, J. A. (010). Forecastng bank loans loss-gven-default, Journal of Bankng and Fnance, 34, 510-517. Bellott,. & Crook, J. (01). Loss gven default models ncorporatng macroeconomc varables for credt cards. Internatonal Journal of Forecastng, 8, 171-18. Dermne, J. & Neto De Carvalho, C. (006). Bank loan losses-gven-default: A case study. Journal of Bankng and Fnance, 30, 119-143. Gupton, G. & Sten, R. (00). LossCalc M : Model for predctng loss gven default (LGD), Moody s KMV. Hamerle, A., Knapp, M. & Wldenauer, N. (006). Modelng loss gven default: A pont n tme -approach, n he Basel II Rsk Parameters, Sprnger, Berln. Lu, Wensu. (01). Modelng Rates and Proportons n SAS-8. Access on May 16, 01 from http://blog.sna.com.cn/s/blog_a8fc8a0101ceb.html Papke, L. & Wooldrdge, J. (1996). Econometrc method for fractonal response varables wth an applcaton to the 401(K) plan partcpaton rates. Journal of Appled Econometrcs, 11, 619-63. Q, M. & hao, X. (011). Comparson of modelng methods for loss gven default, Journal of Bankng and Fnance, 35, 84-855. Ospna, R. and Ferrar, S. (010). Inflated beta dstrbutons, Stat Papers, 51, 111-16. SAS Insttute Inc (009). SAS 9. User s Gude, Cary, NC. Vascek, O. (1987). Probablty of loss on loan portfolo, KMV Corporaton. Wckln, R. Computng the varance of each column of a matrx. he DO Loop: Statstcal programmng n SAS wth an emphass on SAS/IML programs. Access on Aprl 7, 011 from http://blogs.sas.com/content/ml/011/04/07/computng-the-varance-of-each-column-of-a-matrx/ ACKNOWLEDGEMEN he author would lke to thank hs mentor, Stephane hompson, SAS Global Forum Mentorng Program, for her tremendous assstance, and other staff n SAS Insttute Inc. for ther knd support. he author would also lke to thank Rck Wckln and other SAS bloggers for ther nvaluable deas and tps on SAS programmng shared onlne. CONAC INFORMAION Your comments and questons are valued and encouraged. Contact the author at: Xao Yao he Unversty of Ednburgh Busness School 9 Buccleuch Place Ednburgh, EH8 9JS, UK Emal: yaoxao18@gmal.com SAS and all other SAS Insttute Inc. product or servce names are regstered trademarks or trademarks of SAS Insttute Inc. n the USA and other countres. ndcates USA regstraton. Other brand and product names are trademarks of ther respectve companes. 9