Validation of Internal Rating and Scoring Models




Validation of Internal Rating and Scoring Models. Dr. Leif Boegelein, Global Financial Services Risk Management, Leif.Boegelein@ch.ey.com, 07.09.2005. 2005 EYGM Limited. All Rights Reserved.

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Motivation: Regulatory Compliance. F-IRB / A-IRB regulatory requirements on the validation of internal estimates (Basel II, paragraphs 500-505): "Banks must have a robust system in place to validate the accuracy and consistency of rating systems, processes, and the estimation of all relevant risk components" [Basel II, 500]. "A bank must demonstrate to its supervisor that the internal validation process enables it to assess the performance of internal rating and risk estimation systems consistently and meaningfully" [Basel II, 500]. "Banks must regularly compare realised default rates with estimated PDs for each grade and be able to demonstrate that the realised default rates are within the expected range for that grade" [Basel II, 501]. "Banks must also use other quantitative validation tools and comparisons with relevant external data sources. The analysis must be based on data that are appropriate to the portfolio, are updated regularly, and cover a relevant observation period" [Basel II, 502]. UK waiver application (CP189, CP05/03): discriminatory power, calibration, action plan, continuous improvement, independent staff, benchmarking.

Regulatory developments: Europe, 2005-2008 (timeline). Key milestones: Basel II Final Accord (June 2004); CRD released (July 2004); CEBS guidance on Pillar 2 (May 2004); FSA CP 05/03 and the Trading Book Review consultative paper (April 2005); final CRD expected in the 4th quarter of 2005; CRD legislated; FSA Prudential Sourcebook and further FSA consultation papers; QIS 4/5 recalibration of the Accord; parallel running under Basel I; FSA application for accreditation. Over the period, credit risk approaches progress from the standardised approach through Retail IRB and FIRB to Advanced IRB, and operational risk from the Basic Indicator and Standardised approaches to AMA.

Model Validation in the context of the UK waiver application (CP05/03). Section A, Overview of Structure & Governance Framework; challenges: the roll-out plan (the key decision), a single point of contact (the right person at the right level), and an effective governance framework. Section B, Self Assessment; challenges: covering Basel II, CRD, CEBS and FSA requirements, a comprehensive assessment, and sign-off. Section C, Summary of Firm's Approach in Key Areas; challenges: this is the section subject to most scrutiny, covering validation of credit risk models and evidencing governance and the use test. Section D, Capital Impact; challenges: the quantitative impact study. Section E, Detail on Rating System; challenges: gathering comprehensive information per model and evidencing compliance with requirements. Section F, CEO or Equivalent Sign-off; challenges: senior management is held accountable; how to involve them to the level required?

Motivation: Competitive Risk Management and Business Value. Credit risk management processes rely heavily on the quality of credit risk models, and model error influences all phases of the credit lifecycle, from transaction origination to portfolio management. Systematically inaccurate estimates of risk will lead to inefficient pricing and portfolio management decisions. The affected lifecycle components are risk measurement (internal rating system, scoring models, LGD and EAD models), transaction decision-making (underwriting and approval process, credit authorities, risk-based pricing tools), credit portfolio management (capital allocation and performance management), monitoring (credit reporting, credit monitoring and administration, model performance) and reporting (internal management reporting, external stakeholder reporting). Impact: 1. deterioration of portfolio quality due to adverse selection effects; 2. erosion of the capital base due to unexpected losses, inadequate pricing and provisioning.

The Model Validation Process: Components. Data quality: completeness (coverage of all transactions and portfolios, missing-data protocols); appropriateness (historical data representative of the current portfolio); accuracy (input data validation, frequent information refresh to keep data up to date); consistency (common identification and data definitions across the organization). Methodology & quantification: developmental evidence (model development and refinement stages); benchmarking (comparison of results with other models); backtesting (comparison of realized experience with that predicted by the model); refinement (review of key risk drivers). Reporting: inclusion in reporting to the target audience of key decision-makers and reviewers; content matched to the audience in depth, granularity, length and sufficiency; quality and accuracy appropriate for the designated use; frequency appropriate for the audience and designated use. Governance and use test: consistency of application by end users; transparency of model development, approval and validation processes; independence of model operation from the transaction/position originator; board responsibilities, committee structure and senior management oversight; ownership of models and of the validation process; approval processes for new models; documentation standards for policies and procedures, models and methodologies, and the validation process. Data security & controls: audit trails for inputs and outputs; business continuity procedures; access and change controls.

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Quantitative Validation: Discriminatory Power. [Figure: frequency of model score for the complete (validation) sample; conditional frequency of model score for non-defaulters; conditional frequency of model score for defaulters; score from 0 to 100 on the horizontal axis.] Frequency plots for the validation sample of a retail scorecard (N = 1000) indicate that the system is moderately successful in separating defaulters from non-defaulters. Conditional frequencies give us the likelihood of observing a certain score given information about default / non-default. This is not the sought-after PD, i.e. the likelihood of observing default for a given score! Beyond visual inspection, we need a metric for the degree of discrimination between defaulters and non-defaulters.
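
To make the conditional-frequency versus PD distinction concrete, here is a minimal sketch in Python with a purely hypothetical sample (the sample size, score orientation and default-generating formula are assumptions, not the deck's data). It tabulates both the conditional score frequency P(score bin | default) and the empirical default rate per score bin P(default | score bin):

```python
"""Hypothetical toy sample: lower scores are assumed to indicate higher risk."""
import numpy as np

rng = np.random.default_rng(0)
scores = rng.uniform(0, 100, 1000)                    # N = 1000, as in the example
pd_true = 1.0 / (1.0 + np.exp(0.08 * (scores - 40)))  # default risk decreasing in score
defaults = rng.binomial(1, pd_true)                   # simulated default indicators

bins = np.arange(0, 101, 10)
bin_idx = np.digitize(scores, bins) - 1
for b in range(len(bins) - 1):
    in_bin = bin_idx == b
    # P(score bin | default): share of all defaulters whose score falls in this bin
    p_bin_given_default = defaults[in_bin].sum() / defaults.sum()
    # P(default | score bin): observed default rate within the bin, i.e. the PD we want
    p_default_given_bin = defaults[in_bin].mean()
    print(f"score {bins[b]:3d}-{bins[b+1]:3d}: "
          f"P(bin|default) = {p_bin_given_default:.2f}, "
          f"P(default|bin) = {p_default_given_bin:.2f}")
```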

Quantitative Validation: Discriminatory Power. [Figure: model score distribution for the complete sample, F(Score); model score distribution for defaulters, G(Score); resulting CAP curve with perfect model, model CAP, prior PD and random model.] The CAP is the curve (p, G(F⁻¹(p))) for p in [0, 1], i.e. the QQ plot of the two score distributions, where F is the score distribution of the complete sample and G the score distribution of the defaulters. The Accuracy Ratio (AR) is defined as the area between the model CAP and the random model, divided by the area between the perfect model and the random model. AR is typically reported between 0 (random model) and 1 (perfect model); possible values lie in [-1, 1], with negative values indicating better-than-random discrimination but in reverse rank order.
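
Below is a minimal sketch of one standard construction of the CAP and AR, reusing the hypothetical `scores` and `defaults` arrays from the sketch above; ties in scores are ignored and lower scores are again assumed to mean higher risk:

```python
import numpy as np

def cap_curve(scores, defaults):
    """CAP: cumulative fraction of defaulters captured vs. cumulative fraction
    of obligors, ordered from riskiest (lowest score, by assumption) to safest."""
    order = np.argsort(scores)
    d = np.asarray(defaults, dtype=float)[order]
    x = np.concatenate(([0.0], np.arange(1, d.size + 1) / d.size))  # obligor fraction
    y = np.concatenate(([0.0], np.cumsum(d) / d.sum()))             # defaulters captured
    return x, y

def accuracy_ratio(scores, defaults):
    """AR = (area between model CAP and random model) /
            (area between perfect model and random model)."""
    x, y = cap_curve(scores, defaults)
    area_model = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0  # trapezoidal area under CAP
    default_rate = np.mean(defaults)
    area_perfect = 1.0 - default_rate / 2.0   # perfect model ranks all defaulters first
    return (area_model - 0.5) / (area_perfect - 0.5)

print("AR =", round(accuracy_ratio(scores, defaults), 3))
```

The same ranking information can equivalently be summarised via the ROC curve as AR = 2·AUC − 1 (Engelmann, Hayden and Tasche, 2002).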

Quantitative Validation: Discriminatory Power (Example). [Figure: CAP curves for three models developed on a retail dataset, validation sample of 1000 cases: neural network 19-19-1 (AR = 0.63), strong logit model (AR = 0.58), weak logit model (AR = 0.26).] The CAP and Accuracy Ratio measure how well the model is able to separate defaulters from non-defaulters. The example models produce AR values between 0.26 and 0.63; two of the three models appear to have discriminatory power. Validation of discriminatory power must be based on a representative sample that has not been used for model development!

Quantitative Validation: Discriminatory Power (Example). [Figure: bootstrapped logit CAP with 95% confidence bounds (fraction of defaulters vs. fraction of total obligors) and the bootstrapped vs. asymptotic distribution of AR.] Default events are random: even if we could predict all default probabilities without error, the measures could still indicate low discriminatory power for periods with atypical default realisations. By simulating defaults for the portfolio and calculating AR in each scenario we gain an impression of the variability of AR itself. For small samples, low-default portfolios and models developed for pools of homogeneous obligors this issue becomes critical. AR is dependent on the sample and can therefore not provide an objective comparison between models built on different samples!
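
One way to gauge the variability of AR on a given sample, sketched below under the same assumptions as before, is a simple nonparametric bootstrap (reusing `accuracy_ratio` and the hypothetical data from the earlier sketches):

```python
import numpy as np

def bootstrap_ar(scores, defaults, n_boot=2000, seed=0):
    """Resample obligors with replacement and recompute AR on each resample."""
    rng = np.random.default_rng(seed)
    scores, defaults = np.asarray(scores), np.asarray(defaults)
    ars = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, scores.size, scores.size)
        # NOTE: for genuine low-default portfolios, resamples containing no
        # defaulters would need to be handled explicitly.
        ars[b] = accuracy_ratio(scores[idx], defaults[idx])
    return ars

ars = bootstrap_ar(scores, defaults)
lo, hi = np.percentile(ars, [2.5, 97.5])
print(f"AR 95% bootstrap interval: [{lo:.3f}, {hi:.3f}]")
```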

Quantitative Validation: Discriminatory Power (Example). [Figure: simulated distribution of AR for the random model and for the retail scoring model, with asymptotic and bootstrap approximations.] For small (validation and development) sample sizes and low-default portfolios, the influence of statistical noise on Accuracy Ratios becomes more pronounced. AR analysis should include a test versus the random model with AR = 0. In the example, the distribution of the model AR lies comfortably far to the right of the distribution of the random-model AR. Based on the simulation results (validation sample size 200), we can estimate the probability of wrongly rejecting the hypothesis that the model is a random model (a Type I error), which is well below 10⁻³ in the example.
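
A sketch of such a check against the random model, continuing the hypothetical example: simulate the AR distribution under the null hypothesis of a random model by replacing the scores with an uninformative random ranking, and read off how often the null reaches the observed AR:

```python
import numpy as np

def ar_under_random_model(defaults, n_sim=5000, seed=1):
    """Distribution of AR when the ranking carries no information about default."""
    rng = np.random.default_rng(seed)
    defaults = np.asarray(defaults)
    ars = np.empty(n_sim)
    for s in range(n_sim):
        random_ranking = rng.permutation(defaults.size)   # uninformative scores
        ars[s] = accuracy_ratio(random_ranking, defaults)
    return ars

ar_obs = accuracy_ratio(scores, defaults)
ar_null = ar_under_random_model(defaults)
p_value = np.mean(ar_null >= ar_obs)   # one-sided simulated p-value
print(f"observed AR = {ar_obs:.3f}, simulated p-value vs. random model = {p_value:.4f}")
```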

Quantitative Validation: Discriminatory Power. The Accuracy Ratio is a random variable itself. It depends on the portfolio structure, the number of defaulters, the number of rating grades, etc. The metric is influenced by what it is measuring and should not be interpreted without knowledge of the underlying portfolio. Accuracy Ratios are not comparable across models developed from different samples.

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Quantitative Validation: Calibration. The calibration of the rating system influences the complete credit lifecycle, from pricing to portfolio management decisions. In practice, validation of PD accuracy is often considered a difficult exercise due to data constraints and the lack of a proper definition of the rating philosophy. Validation of PD accuracy covers all rating classes / retail pools; this differs from the traditional focus on accuracy at the cut-off point in retail scoring. Simple statistical methods can be utilized to illustrate the uncertainty associated with PD estimates.

Quantitative Validation: Calibration. The ultimate goal of rating models is to predict the probability of default events. PDs are either produced by the model or obtained via mapping of internal grades to external default experience; the accuracy of these estimates needs to be validated. [Figure: forecast vs. actual default probability per rating class; problem with the model, or statistical noise?] Realized default rates will deviate from estimated ones. Validation procedures need to examine whether the deviation is substantial and should lead to a review of the model, or can be attributed to statistical noise. Focus: 1. significance of deviations; 2. monotonicity of PDs with regard to risk. PD accuracy: observed default rates versus predicted, via the binomial test / normal test: determine the probability of observing the realized default rate under the hypothesis that the estimated value is correct for each rating class, and produce an overall measure of calibration accuracy for the rating system.

Quantitative Validation: Binomial Test. A simple model to analyze calibration quality, assuming independent default events: assume that our estimate of the PD for the reference period in a particular rating class k is p̂_k, and that we count n_k obligors in this rating class. If default events happen independently, the number of defaults d_k in this rating class during the reference period is binomially distributed with parameters n_k and p̂_k, i.e. d_k ~ B(n_k, p̂_k). Under this assumption, we can calculate the probability of observing a given range of the default rate. Alternatively, we can fix a certain level of confidence and calculate the upper and lower bounds on the observed default rate under the hypothesis that p̂_k equals the true PD p_k. Limitations: default rates are typically influenced by common factors; unconditional default rates therefore appear to be correlated. The closely related normal test incorporates correlation between default rates / default events. The chosen rating philosophy determines the volatility of PD estimates and should be considered when interpreting the results, as we will discuss later on.
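
A minimal sketch of the test under the independence assumption, with hypothetical rating classes, PD estimates and default counts (not the deck's figures): for each class, a two-sided acceptance region for the number of defaults is read off the binomial distribution B(n_k, p̂_k) at a chosen confidence level:

```python
from scipy.stats import binom

# per rating class: (estimated PD, number of obligors, observed defaults) -- hypothetical
classes = {
    "1": (0.0010, 1200, 3),
    "2": (0.0050,  900, 9),
    "3": (0.0200,  700, 25),
}

alpha = 0.05
for grade, (pd_hat, n, d_obs) in classes.items():
    # two-sided acceptance region for the default count at confidence 1 - alpha
    lower = binom.ppf(alpha / 2, n, pd_hat)
    upper = binom.ppf(1 - alpha / 2, n, pd_hat)
    verdict = "accept" if lower <= d_obs <= upper else "reject"
    print(f"grade {grade}: {d_obs} defaults, acceptance region "
          f"[{int(lower)}, {int(upper)}] -> {verdict} H0 that PD = {pd_hat:.2%}")
```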

Quantitative Validation: Calibration. For 5 of the 7 rating classes we reject the hypothesis that the PD equals the long-term average, based on a two-sided 95% interval and the assumption of independent defaults. [Figure: observed default rates per rating class (0% to 30%) with two-sided 95% bounds.]

Quantitative Validation: Calibration. The results from the binomial test look at each rating class in isolation. The Hosmer-Lemeshow statistic is an example of how the goodness-of-fit information can be condensed into one figure. The Hosmer-Lemeshow statistic is distributed chi-square with K degrees of freedom (if we assume that the p_k are estimated out of sample). Lower p-values for HL indicate decreasing goodness-of-fit, i.e. the hypothesis that the observed default rates equal the assumed values becomes increasingly unlikely. HL p-values allow us to compare rating models with different numbers of classes. For the example, the HL statistic is 44.7, which yields a p-value < 10⁻⁶ for 7 degrees of freedom; we would therefore reject H0 that our estimated PDs are in line with realized defaults across rating classes. The test is based on the normal approximation and the independence assumption for the observed default rates.
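
A sketch using the standard grouped-data form of the statistic, HL = sum over k of (d_k - n_k*p̂_k)^2 / (n_k*p̂_k*(1 - p̂_k)), evaluated against a chi-square distribution with K degrees of freedom; the class-level inputs below are hypothetical and do not reproduce the deck's value of 44.7:

```python
import numpy as np
from scipy.stats import chi2

pd_hat = np.array([0.001, 0.005, 0.02, 0.05, 0.10, 0.18, 0.30])  # estimated PDs per class
n      = np.array([1200,   900,  700,  500,  300,  150,   80])   # obligors per class
d_obs  = np.array([   3,     9,   25,   35,   45,   40,   35])   # observed defaults

hl = np.sum((d_obs - n * pd_hat) ** 2 / (n * pd_hat * (1 - pd_hat)))
p_value = chi2.sf(hl, df=len(pd_hat))   # K degrees of freedom (out-of-sample PDs)
print(f"HL = {hl:.1f}, p-value = {p_value:.2e}")
```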

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Rating Philosophy. [Figure: PD (%) over time (0 to 8 years) for a point-in-time PD, a typical hybrid approach and a through-the-cycle PD.] The volatility of observed PDs within rating grades and the migration frequency of obligors across rating grades are determined by the adopted rating philosophy. When interpreting the previously discussed PD accuracy measures, one has to take into account the philosophy underlying the estimations. Point-in-time: the rating and PD estimate are based on the current condition of the obligor's risk characteristics, typically including the economic environment; this should result in high rating migration frequency, while default rates within rating classes should remain stable. Through-the-cycle: obligor ratings are not influenced by short-term risk characteristics; ratings do not change frequently, but default rates vary. Hybrid approach: a mixture of both pure philosophies; the rating system incorporates current-condition elements of obligor idiosyncratic and systematic risks, and data limitations force a medium-term PD forecast. In summary: point-in-time means low default rate volatility and high rating migration frequency; through-the-cycle means high default rate volatility and low rating migration frequency.

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Quantitative Validation: Retail. Example of a medium-sized retail bank's model infrastructure: retail banking book (goods: 40,000; bads: 5,000), split into new retail product 1 (goods: 1,000; bads: 20), new retail product 2 (goods: 5,000; bads: 350), old retail product 1 (goods: 20,000; bads: 3,000) and old retail product 2 (goods: 14,000; bads: 1,630), and further into sub-portfolio 1 (goods: 18,000; bads: 2,000), sub-portfolio 2 (goods: 1,350; bads: 500), sub-portfolio 3 (goods: 650; bads: 500), sub-portfolio 4 (goods: 6,000; bads: 250) and sub-portfolio 5 (goods: 8,000; bads: 1,130). As we have seen, uncertainty of measurements and constraints in application result mostly from data restrictions (low default counts, small samples, model assumptions, etc.). While data is plentiful at the overall retail banking book level, low-default and small-sample issues may become important when portfolio splits are conducted to increase the specificity of the models.
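
To illustrate the effect of such splits, the following sketch (treating the bad rate as a default rate purely for illustration) compares exact Clopper-Pearson confidence intervals for the observed rate at banking-book level and for the sparsest new product in the slide:

```python
from scipy.stats import beta

def clopper_pearson(defaults, total, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    lo = beta.ppf(alpha / 2, defaults, total - defaults + 1) if defaults > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, defaults + 1, total - defaults)
    return lo, hi

for name, goods, bads in [("Retail banking book", 40_000, 5_000),
                          ("New retail product 1", 1_000, 20)]:
    total = goods + bads
    lo, hi = clopper_pearson(bads, total)
    print(f"{name}: rate {bads / total:.3%}, 95% CI [{lo:.3%}, {hi:.3%}]")
```

The relative width of the interval for the small product illustrates why validation metrics become much noisier as portfolios are split.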

Agenda 1. Motivation & Objectives 2. Measuring Discriminatory Power 3. Measuring the Quality of PD Calibration 4. Model Validation and Rating Philosophies 5. Relevance for Retail Credit Scoring 6. Summary

Model Validation: Summary. Validation is a rigorous process carried out by a bank to ensure that parameter estimates result in an accurate characterization of risk to support capital adequacy assessment. Validation is the responsibility of banks, not supervisors. It is not a purely statistical exercise: the appropriateness of metrics will depend on the particular institution's portfolios and data availability. However, quantitative measures are considered by institutions and regulators to play an integral part in the validation process. In practice, data is often insufficient and validation will focus on developmental evidence in model design, data quality and benchmarking. In practice, model validation is often not implemented as an actionable and independent process; it often lacks a formal policy with definitions of responsibilities, tests to be performed, metrics and benchmarks to use, thresholds for acceptable quality, and actions to be taken if these are breached.

References. Basel Committee on Banking Supervision, International Convergence of Capital Measurement and Capital Standards: A Revised Framework, June 2004. Basel Committee on Banking Supervision, Studies on the Validation of Internal Rating Systems, Working Paper No. 14, Basel, February 2005. B. Engelmann, E. Hayden, and D. Tasche, Measuring the Discriminatory Power of Rating Systems, November 2002. FSA, Report and First Consultation on the Implementation of the New Basel and EU Capital Adequacy Standards, CP189, July 2003. FSA, Strengthening Capital Standards, CP05/03, January 2005. R. Rauhmeier, Validierung und Performancemessung von bankinternen Ratingsystemen, Dissertation, July 2003. L.C. Thomas, D.B. Edelman, and J.N. Crook, Credit Scoring and Its Applications, SIAM Monographs on Mathematical Modeling and Computation, 2002.