Measuring the Quality of Credit Scoring Models




Martin Řezáč, Dept. of Mathematics and Statistics, Faculty of Science, Masaryk University. CSCC XI, Edinburgh, August 2009

Content

1. Introduction
2. Good/bad client definition
3. Measuring the quality
4. Indexes based on distribution function
5. Indexes based on density function
6. Some results for normally distributed scores
7. Conclusions

Introduction

It is impossible to use a scoring model effectively without knowing how good it is. Usually one has several scoring models and needs to select just one: the best one. Before measuring the quality of models one should know (among other things):
- the good/bad definition
- the expected reject rate

Good/bad client definition

A good definition is the basic condition of an effective scoring model. The definition usually depends on:
- days past due (DPD)
- amount past due
- time horizon

Generally we consider the following types of client:
- Good
- Bad
- Indeterminate
- Insufficient
- Excluded
- Rejected

[Diagram: classification of customers]

Customer: Accepted or Rejected. Accepted customers are classified as:
- BAD: Default (60 or 90 DPD), split into
  - Fraud (first delayed payment, 90 DPD)
  - Early default (2-4 delayed payment, 60 DPD)
  - Late default (5 delayed payment, 60 DPD)
- GOOD: Not default
- INDETERMINATE
- Insufficient

Measuring the quality

Once the definition of good/bad client and the client's score are available, it is possible to evaluate the quality of this score. If the score is an output of a predictive model (scoring function), then we evaluate the quality of this model. We can consider two basic types of quality indexes.

First, indexes based on the cumulative distribution function, like:
- Kolmogorov-Smirnov statistics (KS)
- Gini index
- C-statistics
- Lift

Second, indexes based on the likelihood density function, like:
- Mean difference (Mahalanobis distance)
- Informational statistics/value ($I_{val}$)

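Indexes based on distribution function

Good/bad indicator: $D_K = 1$ if the client is good, $D_K = 0$ otherwise.

Number of good clients: $n$. Number of bad clients: $m$. Number of all clients: $N = n + m$.

Proportions of good/bad clients: $p_G = \frac{n}{n+m}$, $p_B = \frac{m}{n+m}$.

Empirical distribution functions, for $a \in [L, H]$:

$$F_{n.GOOD}(a) = \frac{1}{n}\sum_{i=1}^{N} I(s_i \le a \wedge D_K = 1)$$

$$F_{m.BAD}(a) = \frac{1}{m}\sum_{i=1}^{N} I(s_i \le a \wedge D_K = 0)$$

$$F_{N.ALL}(a) = \frac{1}{N}\sum_{i=1}^{N} I(s_i \le a)$$

where $s_i$ is the score of the $i$-th client and $I(A) = 1$ if $A$ is true, $I(A) = 0$ otherwise.

Kolmogorov-Smirnov statistics (KS):

$$KS = \max_{a \in [L, H]} \left| F_{m.BAD}(a) - F_{n.GOOD}(a) \right|$$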
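The KS definition can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names and the toy sample are hypothetical, and a higher score is assumed to mean a better client. Because empirical distribution functions are step functions, it is enough to evaluate the difference at the observed scores.

```python
def empirical_cdf(values, a):
    """Share of values that are <= a (empirical distribution function)."""
    return sum(1 for v in values if v <= a) / len(values)


def ks_statistic(scores, is_good):
    """KS = max over observed scores a of |F_BAD(a) - F_GOOD(a)|."""
    good = [s for s, d in zip(scores, is_good) if d == 1]
    bad = [s for s, d in zip(scores, is_good) if d == 0]
    return max(
        abs(empirical_cdf(bad, a) - empirical_cdf(good, a))
        for a in sorted(set(scores))
    )


# Hypothetical toy sample (assumption): D_K = 1 marks a good client.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
is_good = [0, 0, 1, 0, 1, 1, 1, 1]
print(ks_statistic(scores, is_good))
```

On this toy sample the maximum is reached at score 4, where all bad clients but only one of five good clients lie at or below the threshold.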

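Lorenz curve (LC):

$$x = F_{m.BAD}(a), \qquad y = F_{n.GOOD}(a), \qquad a \in [L, H]$$

Gini index:

$$Gini = \frac{A}{A + B} = 2A$$

where $A$ and $B$ are the areas marked in the Lorenz curve plot (the second equality uses $A + B = \frac{1}{2}$). Numerically,

$$Gini = 1 - \sum_{k=2}^{n+m} \left(F_{m.BAD_k} - F_{m.BAD_{k-1}}\right)\left(F_{n.GOOD_k} + F_{n.GOOD_{k-1}}\right)$$

where $F_{m.BAD_k}$ ($F_{n.GOOD_k}$) is the $k$-th vector value of the empirical distribution function of bad (good) clients.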
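As a hedged sketch, the trapezoid-type sum for the Gini index can be applied to band-level counts instead of individual clients. The function name is hypothetical, and using bands (ordered from the lowest score) rather than single observations is an assumption made for brevity; the counts are 100 clients per decile with 16, 12, 8, 5, 3, 2, 1, 1, 1, 1 bad clients, and a variant in which the first three bad counts are 8, 12, 16.

```python
def gini_from_bands(bad, good):
    """Gini = 1 - sum (F_BAD_k - F_BAD_{k-1}) * (F_GOOD_k + F_GOOD_{k-1}).

    bad, good: counts per score band, ordered from the lowest score band.
    """
    m, n = sum(bad), sum(good)
    f_bad = f_good = 0.0  # cumulative distribution values so far
    total = 0.0
    for b, g in zip(bad, good):
        new_bad, new_good = f_bad + b / m, f_good + g / n
        total += (new_bad - f_bad) * (new_good + f_good)
        f_bad, f_good = new_bad, new_good
    return 1.0 - total


bad_monotone = [16, 12, 8, 5, 3, 2, 1, 1, 1, 1]
bad_non_monotone = [8, 12, 16, 5, 3, 2, 1, 1, 1, 1]
for bad in (bad_monotone, bad_non_monotone):
    good = [100 - b for b in bad]
    print(round(gini_from_bands(bad, good), 2))
```

The two results round to the Gini values 0.55 and 0.48 quoted in the slides for the monotone and non-monotone decile tables.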

C-statistics:

$$c_{stat} = A + C = \frac{1 + Gini}{2}$$

It represents the likelihood that a randomly selected good client has a higher score than a randomly selected bad client, i.e.

$$c_{stat} = P\left(s_1 \ge s_2 \mid D_{K_1} = 1 \wedge D_{K_2} = 0\right)$$
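The probabilistic reading of the C-statistics can be checked by brute force over all (good, bad) pairs. This is only a sketch with a hypothetical function name and toy data; ties are counted as wins, following the "greater or equal" in the definition, so the relation $Gini = 2 c_{stat} - 1$ is exact only when good and bad scores do not tie.

```python
def c_statistic(scores, is_good):
    """Share of (good, bad) pairs in which the good client scores >= the bad one."""
    good = [s for s, d in zip(scores, is_good) if d == 1]
    bad = [s for s, d in zip(scores, is_good) if d == 0]
    wins = sum(1 for g in good for b in bad if g >= b)
    return wins / (len(good) * len(bad))


# Hypothetical toy sample (assumption): D_K = 1 marks a good client.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
is_good = [0, 0, 1, 0, 1, 1, 1, 1]
c = c_statistic(scores, is_good)
print(c, 2 * c - 1)  # c-statistics and the implied Gini index
```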

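Another possible indicator of the quality of a scoring model is the cumulative Lift, which says how many times, at a given level of rejection, the scoring model is better than random selection (random model). More precisely, it is the ratio of the proportion of bad clients among clients with a score of at most $a$, $a \in [L, H]$, to the proportion of bad clients in the general population. Formally, with $Y$ denoting the good/bad indicator ($Y = 0$ for a bad client), it can be expressed by:

$$Lift(a) = \frac{CumBadRate(a)}{BadRate} = \frac{\dfrac{\sum_{i=1}^{N} I(s_i \le a \wedge Y = 0)}{\sum_{i=1}^{N} I(s_i \le a)}}{\dfrac{\sum_{i=1}^{N} I(Y = 0)}{\sum_{i=1}^{N} I(Y = 0 \vee Y = 1)}} = \frac{\dfrac{\sum_{i=1}^{N} I(s_i \le a \wedge Y = 0)}{\sum_{i=1}^{N} I(s_i \le a)}}{\dfrac{m}{N}}$$

The absolute (non-cumulative) Lift of a band $a$ is

$$absLift(a) = \frac{BadRate(a)}{BadRate}$$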

Usually it is computed using a table with the numbers of all and bad clients in some bands (deciles).

decile  # clients  # bad  Bad rate  abs. Lift  cum. # bad  cum. Bad rate  cum. Lift
1       100        16     16.0%     3.20       16          16.0%          3.20
2       100        12     12.0%     2.40       28          14.0%          2.80
3       100        8      8.0%      1.60       36          12.0%          2.40
4       100        5      5.0%      1.00       41          10.3%          2.05
5       100        3      3.0%      0.60       44          8.8%           1.76
6       100        2      2.0%      0.40       46          7.7%           1.53
7       100        1      1.0%      0.20       47          6.7%           1.34
8       100        1      1.0%      0.20       48          6.0%           1.20
9       100        1      1.0%      0.20       49          5.4%           1.09
10      100        1      1.0%      0.20       50          5.0%           1.00
All     1000       50     5.0%

[Figure: abs. Lift and cum. Lift by decile (vertical axis: Lift value); Lorenz curve with base line, Gini = 0.55]
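The absolute and cumulative Lift columns of such a decile table can be recomputed from the counts alone. The following is a minimal sketch; the function name and the tuple layout of the result are assumptions made for illustration.

```python
def lift_table(clients, bad):
    """Return (abs. Lift, cum. Lift) per band from band-level counts."""
    overall_bad_rate = sum(bad) / sum(clients)
    rows = []
    cum_clients = cum_bad = 0
    for c, b in zip(clients, bad):
        cum_clients += c
        cum_bad += b
        abs_lift = (b / c) / overall_bad_rate
        cum_lift = (cum_bad / cum_clients) / overall_bad_rate
        rows.append((abs_lift, cum_lift))
    return rows


clients = [100] * 10
bad = [16, 12, 8, 5, 3, 2, 1, 1, 1, 1]
for decile, (abs_lift, cum_lift) in enumerate(lift_table(clients, bad), start=1):
    print(decile, f"{abs_lift:.2f}", f"{cum_lift:.2f}")
```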

When bad rates are not monotone:
- LC looks fine
- Gini is slightly lowered
- Lift looks strange

decile  # clients  # bad  Bad rate  abs. Lift  cum. # bad  cum. Bad rate  cum. Lift
1       100        8      8.0%      1.60       8           8.0%           1.60
2       100        12     12.0%     2.40       20          10.0%          2.00
3       100        16     16.0%     3.20       36          12.0%          2.40
4       100        5      5.0%      1.00       41          10.3%          2.05
5       100        3      3.0%      0.60       44          8.8%           1.76
6       100        2      2.0%      0.40       46          7.7%           1.53
7       100        1      1.0%      0.20       47          6.7%           1.34
8       100        1      1.0%      0.20       48          6.0%           1.20
9       100        1      1.0%      0.20       49          5.4%           1.09
10      100        1      1.0%      0.20       50          5.0%           1.00
All     1000       50     5.0%

[Figure: Lorenz curve with base line, Gini = 0.48; abs. Lift and cum. Lift by decile (vertical axis: Lift value)]

When the score is reversed, we obtain reversed figures. The original ordering is the first decile table (bad counts 16, 12, 8, 5, 3, 2, 1, 1, 1, 1; Gini = 0.55). With the score reversed:

decile  # clients  # bad  Bad rate  abs. Lift  cum. # bad  cum. Bad rate  cum. Lift
1       100        1      1.0%      0.20       1           1.0%           0.20
2       100        1      1.0%      0.20       2           1.0%           0.20
3       100        1      1.0%      0.20       3           1.0%           0.20
4       100        1      1.0%      0.20       4           1.0%           0.20
5       100        2      2.0%      0.40       6           1.2%           0.24
6       100        3      3.0%      0.60       9           1.5%           0.30
7       100        5      5.0%      1.00       14          2.0%           0.40
8       100        8      8.0%      1.60       22          2.8%           0.55
9       100        12     12.0%     2.40       34          3.8%           0.76
10      100        16     16.0%     3.20       50          5.0%           1.00
All     1000       50     5.0%

[Figure: abs. Lift and cum. Lift by decile for the reversed score (vertical axis: Lift value); Lorenz curve with base line, Gini = -0.55]

The Gini is not enough!

Two scorecards, SC 1 and SC 2, on 1000 clients with 100 bad clients each:

decile  # clients  SC 1: # bad  SC 1: Bad rate  SC 2: # bad  SC 2: Bad rate
1       100        35           35.0%           20           20.0%
2       100        16           16.0%           18           18.0%
3       100        8            8.0%            17           17.0%
4       100        8            8.0%            15           15.0%
5       100        7            7.0%            12           12.0%
6       100        6            6.0%            6            6.0%
7       100        6            6.0%            4            4.0%
8       100        5            5.0%            3            3.0%
9       100        5            5.0%            3            3.0%
10      100        4            4.0%            2            2.0%
All     1000       100          10.0%           100          10.0%

[Figure: empirical distribution functions of good and bad clients for each scorecard (SC 1: KS = 0.34, SC 2: KS = 0.36); Lorenz curves with base line (Gini = 0.42 for both scorecards)]

[Figure: abs. Lift and cum. Lift by decile for SC 1 and SC 2 (vertical axis: Lift value)]

           SC 1        SC 2
Lift 20%   2.55   >    1.90
Lift 50%   1.48   <    1.64

SC 2 is better if the reject rate is expected to be around 50%. SC 1 is much better if the reject rate is expected to be about 20%.

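Lift can be expressed and computed by the formulae:

$$Lift(a) = \frac{F_{m.BAD}(a)}{F_{N.ALL}(a)}, \qquad a \in [L, H]$$

$$Lift_q = \frac{F_{m.BAD}\left(F_{N.ALL}^{-1}(q)\right)}{F_{N.ALL}\left(F_{N.ALL}^{-1}(q)\right)} = \frac{1}{q} F_{m.BAD}\left(F_{N.ALL}^{-1}(q)\right)$$

where

$$F_{N.ALL}^{-1}(q) = \min\{a \in [L, H] : F_{N.ALL}(a) \ge q\}$$

For example,

$$Lift_{10\%} = 10 \cdot F_{m.BAD}\left(F_{N.ALL}^{-1}(0.1)\right)$$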
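A small sketch of $Lift_q$ computed directly from raw scores follows. The function name and the toy sample are hypothetical. The shortcut $\frac{1}{q} F_{m.BAD}(\cdot)$ is exact only when $F_{N.ALL}(F_{N.ALL}^{-1}(q)) = q$, which is assumed here (no tied scores and $qN$ an integer).

```python
import math


def lift_q(scores, is_good, q):
    """Lift_q = F_BAD(F_ALL^{-1}(q)) / q for raw scores."""
    ordered = sorted(scores)
    # F_ALL^{-1}(q) = min{a : F_ALL(a) >= q} is the ceil(q*N)-th smallest score;
    # the small epsilon guards against floating-point noise in q * N.
    k = max(1, math.ceil(q * len(ordered) - 1e-12))
    cutoff = ordered[k - 1]
    bad = [s for s, d in zip(scores, is_good) if d == 0]
    f_bad = sum(1 for s in bad if s <= cutoff) / len(bad)
    return f_bad / q


# Hypothetical toy sample (assumption): D_K = 1 marks a good client.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
is_good = [0, 0, 1, 0, 1, 1, 1, 1]
print(lift_q(scores, is_good, 0.25))
```

In the toy sample the lowest quarter of scores contains two of the three bad clients, so the cumulative bad rate there is 100% against an overall bad rate of 37.5%.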

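Indexes based on density function

Likelihood density functions: $f_{GOOD}(x)$, $f_{BAD}(x)$.

Kernel estimates, with $K_h(u) = \frac{1}{h} K\left(\frac{u}{h}\right)$:

$$\tilde f_{GOOD}(x, h) = \frac{1}{n}\sum_{i:\, D_K = 1} K_h(x - s_i), \qquad \tilde f_{BAD}(x, h) = \frac{1}{m}\sum_{i:\, D_K = 0} K_h(x - s_i)$$

Optimal bandwidth (maximal smoothing): $h_{OS,k}$, determined by $k$, $n$ and $\tilde\sigma$, where:
- $k$ is the order of the kernel function (e.g. 2 for the Epanechnikov kernel)
- $n$ is the number of actual cases
- $\tilde\sigma$ is an estimate of the standard deviation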

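Mean difference (Mahalanobis distance):

$$D = \frac{M_g - M_b}{S}$$

where $S$ is the pooled standard deviation:

$$S = \left(\frac{n S_g^2 + m S_b^2}{n + m}\right)^{1/2}$$

$M_g$, $M_b$ are the means of good (bad) clients; $S_g$, $S_b$ are the standard deviations of good (bad) clients.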
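The mean difference can be sketched as below. The function name and toy data are hypothetical, and the group variances are computed with divisors $n$ and $m$ (rather than $n-1$ and $m-1$), which is an assumption chosen to match the weights in the pooled formula.

```python
import math


def mean_difference(scores, is_good):
    """D = (M_g - M_b) / S with S the pooled standard deviation."""
    good = [s for s, d in zip(scores, is_good) if d == 1]
    bad = [s for s, d in zip(scores, is_good) if d == 0]
    n, m = len(good), len(bad)
    m_g, m_b = sum(good) / n, sum(bad) / m
    # Assumption: variances use divisors n and m (not n - 1 and m - 1).
    var_g = sum((s - m_g) ** 2 for s in good) / n
    var_b = sum((s - m_b) ** 2 for s in bad) / m
    pooled = math.sqrt((n * var_g + m * var_b) / (n + m))
    return (m_g - m_b) / pooled


# Hypothetical toy sample (assumption): D_K = 1 marks a good client.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
is_good = [0, 0, 1, 0, 1, 1, 1, 1]
print(mean_difference(scores, is_good))
```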

Information value ($I_{val}$), continuous case (Divergence):

$$I_{val} = \int_{-\infty}^{\infty} \left(f_{GOOD}(x) - f_{BAD}(x)\right) \ln\frac{f_{GOOD}(x)}{f_{BAD}(x)}\, dx$$

The integrand is the product of

$$f_{diff}(x) = f_{GOOD}(x) - f_{BAD}(x), \qquad f_{LR}(x) = \ln\frac{f_{GOOD}(x)}{f_{BAD}(x)}$$

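Information value ($I_{val}$), discretized continuous case:

Replace the density functions by their kernel estimates and compute the integral numerically (e.g. by the composite trapezoidal rule). Using the Epanechnikov kernel, given by

$$K(x) = \frac{3}{4}\left(1 - x^2\right) I(x \in [-1, 1]),$$

and the optimal bandwidth $h_{OS,2}$, we have

$$\tilde f_{IV}(x) = \left(\tilde f_{GOOD}(x, h_{OS,2}) - \tilde f_{BAD}(x, h_{OS,2})\right) \ln\frac{\tilde f_{GOOD}(x, h_{OS,2})}{\tilde f_{BAD}(x, h_{OS,2})}$$

For given equally spaced points $x_0, \dots, x_M$ we obtain

$$\tilde I_{val} = \frac{x_M - x_0}{2M}\left(\tilde f_{IV}(x_0) + 2\sum_{i=1}^{M-1}\tilde f_{IV}(x_i) + \tilde f_{IV}(x_M)\right)$$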
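The procedure above can be sketched as follows. Several choices here are assumptions rather than part of the slides: the function names, passing the bandwidths in explicitly instead of computing $h_{OS,2}$, taking $x_0$ and $x_M$ as the smallest and largest observed score, and setting the integrand to zero wherever one of the kernel estimates vanishes (the Epanechnikov kernel has bounded support, so the logarithm would otherwise be undefined there).

```python
import math


def epanechnikov(u):
    """K(u) = 3/4 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, sample, h):
    """Kernel estimate (1/n) * sum K_h(x - s_i) with K_h(u) = K(u/h) / h."""
    return sum(epanechnikov((x - s) / h) for s in sample) / (len(sample) * h)


def ival_kernel(good, bad, h_good, h_bad, M=200):
    """Composite trapezoidal rule for the kernel-based information value."""
    x0, xM = min(good + bad), max(good + bad)

    def f_iv(x):
        fg = kernel_density(x, good, h_good)
        fb = kernel_density(x, bad, h_bad)
        if fg <= 0.0 or fb <= 0.0:
            return 0.0  # assumption: skip points where the log is undefined
        return (fg - fb) * math.log(fg / fb)

    xs = [x0 + i * (xM - x0) / M for i in range(M + 1)]
    inner = sum(f_iv(x) for x in xs[1:-1])
    return (xM - x0) / (2 * M) * (f_iv(xs[0]) + 2.0 * inner + f_iv(xs[-1]))


# Hypothetical toy scores and bandwidths (assumptions).
good = [3.0, 5.0, 6.0, 7.0, 8.0]
bad = [1.0, 2.0, 4.0]
print(ival_kernel(good, bad, h_good=1.5, h_bad=1.5))
```

Dropping the regions where one estimate is zero understates the divergence, so in practice the bandwidths should be large enough for the two estimates to overlap over the score range of interest.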

Information statistics/value ($I_{val}$), discrete case:

Create intervals of score, typically deciles. The number of goods (bads) in the $i$-th interval is marked by $g_i$ ($b_i$). It must hold that $g_i > 0$, $b_i > 0$ for all $i$. Then we have

$$I_{val} = \sum_i \left(\frac{g_i}{n} - \frac{b_i}{m}\right) \ln\left(\frac{g_i\, m}{b_i\, n}\right)$$

score int.  # bad clients  # good clients  % bad [1]  % good [2]  [3] = [2] - [1]  [4] = [2] / [1]  [5] = ln [4]  [6] = [3] * [5]
1           1              10              2.0%       1.1%        -0.01            0.53             -0.64         0.01
2           2              15              4.0%       1.6%        -0.02            0.39             -0.93         0.02
3           8              52              16.0%      5.5%        -0.11            0.34             -1.07         0.11
4           14             93              28.0%      9.8%        -0.18            0.35             -1.05         0.19
5           10             146             20.0%      15.4%       -0.05            0.77             -0.26         0.01
6           6              247             12.0%      26.0%       0.14             2.17             0.77          0.11
7           4              137             8.0%       14.4%       0.06             1.80             0.59          0.04
8           3              105             6.0%       11.1%       0.05             1.84             0.61          0.03
9           1              97              2.0%       10.2%       0.08             5.11             1.63          0.13
10          1              48              2.0%       5.1%        0.03             2.53             0.93          0.03
All         50             950                                                                      Info. Value   0.68
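The discrete information value is a one-line sum once the counts per interval are available. The sketch below uses a hypothetical function name and the counts of a ten-interval example with 50 bad and 950 good clients in total; the result should round to the Info. Value of 0.68 quoted in the slides.

```python
import math


def information_value(good, bad):
    """I_val = sum (g_i/n - b_i/m) * ln((g_i * m) / (b_i * n)).

    good, bad: counts per score interval; every count must be positive.
    """
    n, m = sum(good), sum(bad)
    return sum(
        (g / n - b / m) * math.log((g * m) / (b * n))
        for g, b in zip(good, bad)
    )


good = [10, 15, 52, 93, 146, 247, 137, 105, 97, 48]
bad = [1, 2, 8, 14, 10, 6, 4, 3, 1, 1]
print(round(information_value(good, bad), 2))
```

Each term of the sum is non-negative, because the difference and the logarithm always share the same sign, which is why the cumulative contributions never decrease.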

Information value for our example of two scorecards:

SC 1:

decile  # clients  # bad  # good  % bad [1]  % good [2]  [3] = [2] - [1]  [4] = [2] / [1]  [5] = ln [4]  [6] = [3] * [5]  cum. [6]
1       100        35     65      35.0%      7.2%        -0.28            0.21             -1.58         0.44             0.44
2       100        16     84      16.0%      9.3%        -0.07            0.58             -0.54         0.04             0.47
3       100        8      92      8.0%       10.2%       0.02             1.28             0.25          0.01             0.48
4       100        8      92      8.0%       10.2%       0.02             1.28             0.25          0.01             0.49
5       100        7      93      7.0%       10.3%       0.03             1.48             0.39          0.01             0.50
6       100        6      94      6.0%       10.4%       0.04             1.74             0.55          0.02             0.52
7       100        6      94      6.0%       10.4%       0.04             1.74             0.55          0.02             0.55
8       100        5      95      5.0%       10.6%       0.06             2.11             0.75          0.04             0.59
9       100        5      95      5.0%       10.6%       0.06             2.11             0.75          0.04             0.63
10      100        4      96      4.0%       10.7%       0.07             2.67             0.98          0.07             0.70
All     1000       100    900                                                              Info. Value   0.70

SC 2:

decile  # clients  # bad  # good  % bad [1]  % good [2]  [3] = [2] - [1]  [4] = [2] / [1]  [5] = ln [4]  [6] = [3] * [5]  cum. [6]
1       100        20     80      20.0%      8.9%        -0.11            0.44             -0.81         0.09             0.09
2       100        18     82      18.0%      9.1%        -0.09            0.51             -0.68         0.06             0.15
3       100        17     83      17.0%      9.2%        -0.08            0.54             -0.61         0.05             0.20
4       100        15     85      15.0%      9.4%        -0.06            0.63             -0.46         0.03             0.22
5       100        12     88      12.0%      9.8%        -0.02            0.81             -0.20         0.00             0.23
6       100        6      94      6.0%       10.4%       0.04             1.74             0.55          0.02             0.25
7       100        4      96      4.0%       10.7%       0.07             2.67             0.98          0.07             0.32
8       100        3      97      3.0%       10.8%       0.08             3.59             1.28          0.10             0.42
9       100        3      97      3.0%       10.8%       0.08             3.59             1.28          0.10             0.52
10      100        2      98      2.0%       10.9%       0.09             5.44             1.69          0.15             0.67
All     1000       100    900                                                              Info. Value   0.67

Using the marks

$$I_{diff_i} = \frac{g_i}{n} - \frac{b_i}{m}, \qquad I_{LR_i} = \ln\frac{g_i\, m}{b_i\, n}$$

(with $g_i$, $b_i$ the numbers of good and bad clients in the $i$-th interval) we have, for SC 1 and SC 2:

[Figure: I_diff and I_LR by decile, and I_diff * I_LR with cum. I_diff * I_LR by decile, for each scorecard]

       KS    Gini  Lift 20%  Lift 50%  I_val  I_val 20%  I_val 50%
SC 1   0.34  0.42  2.55      1.48      0.70   0.47       0.50
SC 2   0.36  0.42  1.90      1.64      0.67   0.15       0.23

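Some results for normally distributed scores

Assume that the scores of good and bad clients are normally distributed, i.e. we can write their densities as

$$f_{GOOD}(x) = \frac{1}{\sigma_g\sqrt{2\pi}}\, e^{-\frac{(x - \mu_g)^2}{2\sigma_g^2}}, \qquad f_{BAD}(x) = \frac{1}{\sigma_b\sqrt{2\pi}}\, e^{-\frac{(x - \mu_b)^2}{2\sigma_b^2}}$$

Estimates of the parameters $\mu_g$, $\mu_b$, $\sigma_g$ and $\sigma_b$: $M_g$, $M_b$ are the means of good (bad) clients; $S_g$, $S_b$ are the standard deviations of good (bad) clients.

Pooled standard deviation:

$$S = \left(\frac{n S_g^2 + m S_b^2}{n + m}\right)^{1/2}$$

Estimates of the mean and standard deviation of scores for all clients ($\mu_{ALL}$, $\sigma_{ALL}$):

$$M = M_{ALL} = \frac{n M_g + m M_b}{n + m}, \qquad S_{ALL} = \left(\frac{n S_g^2 + m S_b^2 + n (M_g - M)^2 + m (M_b - M)^2}{n + m}\right)^{1/2}$$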

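Assume that the standard deviations are equal to a common value $\sigma$. Then

$$D = \frac{\mu_g - \mu_b}{\sigma}$$

$$KS = \Phi\left(\frac{D}{2}\right) - \Phi\left(-\frac{D}{2}\right) = 2\,\Phi\left(\frac{D}{2}\right) - 1$$

$$Gini = 2\,\Phi\left(\frac{D}{\sqrt{2}}\right) - 1$$

$$Lift_q = \frac{1}{q}\,\Phi\left(\frac{\sigma_{ALL}}{\sigma}\,\Phi^{-1}(q) + p_G D\right)$$

$$I_{val} = D^2$$

where $\Phi$ is the standardized normal distribution function, $\Phi_{\mu,\sigma^2}$ is the normal distribution function with parameters $\mu$, $\sigma^2$, and $\Phi^{-1}(\cdot)$ is the standard normal quantile function.

With the estimates in place of the parameters:

$$D = \frac{M_g - M_b}{S}, \qquad Lift_q = \frac{1}{q}\,\Phi\left(\frac{S_{ALL}}{S}\,\Phi^{-1}(q) + p_G D\right)$$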
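Under the equal-variance assumption all four indexes follow from $D$, $p_G$ and $q$, so they are easy to tabulate. The sketch below uses the standard library `statistics.NormalDist`; the function name, the dictionary layout and the example parameters are hypothetical, and $\sigma_{ALL}$ is computed as the standard deviation of the two-component mixture, $\sigma_{ALL}^2 = \sigma^2 + p_G p_B (\mu_g - \mu_b)^2$.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def indexes_equal_variance(mu_g, mu_b, sigma, p_good, q):
    """KS, Gini, Lift_q and I_val for normal scores with a common sigma."""
    d = (mu_g - mu_b) / sigma
    # Standard deviation of all scores (mixture of the two normal groups).
    sigma_all = math.sqrt(sigma ** 2 + p_good * (1.0 - p_good) * (mu_g - mu_b) ** 2)
    return {
        "ks": 2.0 * PHI.cdf(d / 2.0) - 1.0,
        "gini": 2.0 * PHI.cdf(d / math.sqrt(2.0)) - 1.0,
        "lift": PHI.cdf(sigma_all / sigma * PHI.inv_cdf(q) + p_good * d) / q,
        "ival": d ** 2,
    }


# Hypothetical parameters (assumptions): D = 1, 95% good clients, q = 10%.
print(indexes_equal_variance(mu_g=1.0, mu_b=0.0, sigma=1.0, p_good=0.95, q=0.1))
```

With $D = 0$ the model cannot separate the groups, and the sketch returns KS = 0, Gini = 0, I_val = 0 and a Lift of 1, as expected for a random model.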

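Generally (i.e. without the assumption of equality of standard deviations):

$$KS = \Phi\left(\frac{a}{b}\,\sigma_b D^* - \frac{1}{b}\,\sigma_g\sqrt{a^2 D^{*2} + 2bc}\right) - \Phi\left(\frac{a}{b}\,\sigma_g D^* - \frac{1}{b}\,\sigma_b\sqrt{a^2 D^{*2} + 2bc}\right)$$

where

$$D^* = \frac{\mu_g - \mu_b}{\sqrt{\sigma_g^2 + \sigma_b^2}}, \qquad a = \sqrt{\sigma_g^2 + \sigma_b^2}, \qquad b = \sigma_b^2 - \sigma_g^2, \qquad c = \ln\frac{\sigma_b}{\sigma_g}$$

The arguments of $\Phi$ are the standardized positions, within the bad and the good distribution, of the point between the two means where the densities intersect. Replacing the parameters by their estimates $M_g$, $M_b$, $S_g$, $S_b$ gives $D^* = \frac{M_g - M_b}{\sqrt{S_g^2 + S_b^2}}$ and the corresponding estimate of KS.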

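Generally (i.e. without the assumption of equality of standard deviations):

$$Gini = 2\,\Phi(D^*) - 1$$

$$Lift_q = \frac{1}{q}\,\Phi_{\mu_b, \sigma_b^2}\left(\Phi^{-1}_{\mu_{ALL}, \sigma_{ALL}^2}(q)\right) = \frac{1}{q}\,\Phi\left(\frac{\sigma_{ALL}\,\Phi^{-1}(q) + \mu_{ALL} - \mu_b}{\sigma_b}\right)$$

$$I_{val} = (A + 1)\, D^{*2} + A - 1, \qquad A = \frac{1}{2}\left(\frac{\sigma_g^2}{\sigma_b^2} + \frac{\sigma_b^2}{\sigma_g^2}\right)$$

With the estimates in place of the parameters:

$$Lift_q = \frac{1}{q}\,\Phi\left(\frac{S_{ALL}\,\Phi^{-1}(q) + M - M_b}{S_b}\right), \qquad I_{val} = (A + 1)\, D^{*2} + A - 1, \quad A = \frac{1}{2}\left(\frac{S_g^2}{S_b^2} + \frac{S_b^2}{S_g^2}\right)$$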
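The general-case expressions can be sanity-checked numerically. The sketch below evaluates KS, Gini and $I_{val}$ for unequal standard deviations, with $a = \sqrt{\sigma_g^2 + \sigma_b^2}$, $b = \sigma_b^2 - \sigma_g^2$, $c = \ln(\sigma_b / \sigma_g)$ and $D^* = (\mu_g - \mu_b)/a$, and compares KS with a brute-force maximum of $F_{BAD} - F_{GOOD}$ on a grid. Function names, the grid and the example parameters are assumptions; the KS expression divides by $b$, so it requires $\sigma_g \ne \sigma_b$.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def indexes_general(mu_g, sigma_g, mu_b, sigma_b):
    """KS, Gini and I_val for normal scores with unequal standard deviations."""
    a = math.sqrt(sigma_g ** 2 + sigma_b ** 2)
    b = sigma_b ** 2 - sigma_g ** 2  # must be non-zero
    c = math.log(sigma_b / sigma_g)
    d_star = (mu_g - mu_b) / a
    root = math.sqrt(a * a * d_star ** 2 + 2.0 * b * c)  # b*c >= 0, so real
    ks = (PHI.cdf((a * sigma_b * d_star - sigma_g * root) / b)
          - PHI.cdf((a * sigma_g * d_star - sigma_b * root) / b))
    big_a = 0.5 * (sigma_g ** 2 / sigma_b ** 2 + sigma_b ** 2 / sigma_g ** 2)
    return {
        "ks": ks,
        "gini": 2.0 * PHI.cdf(d_star) - 1.0,
        "ival": (big_a + 1.0) * d_star ** 2 + big_a - 1.0,
    }


def ks_brute_force(mu_g, sigma_g, mu_b, sigma_b, step=0.001, width=8.0):
    """Grid maximum of F_BAD(x) - F_GOOD(x), used only as a cross-check."""
    good, bad = NormalDist(mu_g, sigma_g), NormalDist(mu_b, sigma_b)
    count = int(2 * width / step)
    return max(bad.cdf(-width + i * step) - good.cdf(-width + i * step)
               for i in range(count + 1))


# Hypothetical parameters (assumptions).
params = dict(mu_g=1.0, sigma_g=1.2, mu_b=0.0, sigma_b=1.0)
print(indexes_general(**params), ks_brute_force(**params))
```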

[Figures: KS and Gini as functions of $\mu_g$ and $\sigma_g^2$, with $\mu_b = 0$, $\sigma_b^2 = 1$]

KS and the Gini react much more to a change of $\mu_g$ and are almost unchanged in the direction of $\sigma_g^2$.

Gini > KS

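[Figures: $Lift_{10\%}$ and $I_{val}$ as functions of $\mu_g$ and $\sigma_g^2$, with $\mu_b = 0$, $\sigma_b^2 = 1$]

In the case of $Lift_{10\%}$ there is an evident strong dependence on $\mu_g$ and a significantly higher dependence on $\sigma_g^2$ than in the case of KS and Gini.

For $I_{val}$ there is again a strong dependence on $\mu_g$. Furthermore, the value of $I_{val}$ rises very quickly to infinity when $\sigma_g^2$ tends to zero.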

Conclusions

- It is impossible to use a scoring model effectively without knowing how good it is.
- It is necessary to judge scoring models according to their strength in the score range where the cutoff is expected.
- The Gini is not enough!
- Results concerning Lift and Information value can be used to obtain the best available scoring model.
- Results for normally distributed scores can help with the computation of the referred indexes. Furthermore, they can help to understand how those indexes behave.