Scaling Models for the Severity and Frequency of External Operational Loss Data




Hela Dahen*
Department of Finance and Canada Research Chair in Risk Management, HEC Montreal, Canada

Georges Dionne*
Department of Finance and Canada Research Chair in Risk Management, HEC Montreal, Canada
CREF and CIRRELT

January 2008

Submitted to the Conference "Performance Measurement in the Financial Services Sector: Frontier Efficiency Methodologies and Other Innovative Techniques", July 4-5, 2008, Tanaka Business School, Imperial College London, London, UK

We thank F. Bellavance, S. Christoffersen, and B. Rémillard for their helpful comments and recommendations. We gratefully acknowledge financial support from CREF and IFM2.

* Please send all correspondence to: HEC Montréal, Canada Research Chair in Risk Management, 3000, Chemin de la Côte-Sainte-Catherine, Montreal, Quebec, Canada, H3T 2A7. Phone: +514 340-6596. Fax: +514 340-5019. E-mail: hela.dahen@hec.ca, georges.dionne@hec.ca

Summary

According to Basel II criteria, the use of external data is absolutely indispensable to the implementation of an advanced method for calculating operational capital. This article investigates how the severity and frequencies of external losses are scaled for integration with internal data. We set up an initial model designed to explain the loss severity. This model takes into account firm size, location, and business lines as well as risk types. It also shows how to calculate the internal loss equivalent to an external loss, which might occur in a given bank. OLS estimation results show that the above variables have significant power in explaining the loss amount. They are used to develop a normalization formula. A second model based on external data is developed to scale the frequency of losses over a given period. Two regression models are analyzed: the truncated Poisson model and the truncated negative binomial model. Variables estimating the size and geographical distribution of the banks' activities have been introduced as explanatory variables. The results show that the negative binomial distribution outperforms the Poisson distribution. The scaling is done by calculating the parameters of the selected distribution based on the estimated coefficients and the variables related to a given bank. Frequencies of losses of more than $1 million are generated over a specific horizon.

Key words: Operational risk in banks, scaling, severity distribution, frequency distribution, truncated count data regression models.

JEL classification: G21, G28, C30, C35.

1. Introduction

In recent years, financial institutions have shown increasing interest in identifying losses associated with operational risk. This is due both to regulatory considerations under the Basel II accord and to the recent occurrence of huge operational losses. We can mention two examples of enormous operational losses sustained by the financial sector: the $2.4 billion lawsuit brought against CIBC by the shareholders of Enron and a $690 million loss caused by rogue trading activities at Allied Irish Banks. Add to these the case of Barings, the UK's oldest bank, which went bankrupt following rogue trading activities that occasioned a loss of $1.3 billion. These examples show the scope of this risk. They also serve as an imperative warning signal to financial institutions, which must define, measure, and manage this risk.

Besides the huge losses it can cause, operational risk also threatens all the activities and operations of an institution. Operational events can be linked to people, processes, systems, and external events. However, operational risk has a varying degree of impact on all units within the institution. Given its scope and complexity, the management of operational risk has become a necessity.

Aware of its disruptive potential, regulatory authorities opened a debate in June 1999 on the development of a management framework attuned to operational risk. Such a framework would, among other things, provide for the identification of operational losses and the measurement of operational regulatory capital. The authorities seek to improve on the existing rules by aligning regulatory capital requirements more closely with the underlying risks that banks face. In other words, they want to ensure that there is sufficient capital to cover unexpected losses.

Research in this field is still in its embryonic stage. Hence, any and all developments will be of great use in helping financial institutions meet their short-term demands, and they will also benefit other industries in their pursuit of medium-term goals.

In the Basel II agreement, one of the approaches proposed for quantifying operational risk capital is the advanced method. The development of such a method requires a large database. The data can be drawn from different sources. Internal data are very useful in reflecting the real level of exposure to operational risk. Ideally, they should be the only source of statistical information. However, most banks' internal data collection processes are still in their infancy, and there are not enough data, especially on rare,¹ high-impact losses, to estimate the unexpected loss. Recourse to external operational loss data is therefore essential in order to supplement internal data, especially with the tail events that are generally missing from internal data. It is thus justifiable to combine these possible severe losses with the losses in a bank's internal database, so as to reduce their surprise effect (the unexpected) and to calculate adequate operational risk capital. Obviously, we cannot predict the exact amount of extreme losses that have not yet occurred. However, based on the losses recorded in the banking sector, we can make a projection for a particular bank, if our scaling takes certain factors into account.

Given this context and the fact that an external database is needed for the calculation of operational risk capital with an advanced approach, the objective of this paper is to develop a robust method that can use external data to predict the severity as well as the frequency of the losses to which a bank is exposed. Several factors will be considered in explaining the number of losses and their severities for a given period. It is thus a matter of projecting the external losses that have occurred in the industry to the level of just one bank. The method developed in this article has been tested on the data of an external base containing operational losses of more than $1 million. However, this method is applicable to any external database. A combination of scaled external losses with the internal data of a particular bank makes it possible to measure that bank's exposure to operational risk.

This article is organized as follows. The second section describes the different approaches used to measure capital, gives the sources and characteristics of the data, and takes a brief look at the scaling methods reported in the literature. A description of the data in the external base is presented in the third section. The fourth section states the model's hypotheses.
The model to be used in scaling the severity of losses is then developed in the fifth section. The next-to-last section develops the model for scaling frequencies. Finally, the study ends with a conclusion and a discussion of possible avenues of further research.

¹ A rare loss is defined as one which results from a highly unlikely event.

2. Context

2.1 Regulatory framework

In 2001, the Basel Committee defined operational risk as the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events. Legal risk is also included, but the definition does not take into account strategic and reputational risk. With the Basel II accord, there is now a requirement for an amount of additional regulatory capital to cover operational risk. Regulatory authorities have identified three different methods of calculating this capital. The most advanced of these three methods shows greater sensitivity in its detection of risk.

In this article, we use the advanced approach in dealing with our research problem. This more advanced and sophisticated approach relies on internal capital-calculation procedures adopted by banks. Regulatory authorities are very flexible concerning the method chosen, provided it adequately combines qualitative and quantitative criteria (internal data, relevant external data, scenario analysis, and business environment and internal control factors). The method selected must reflect the financial institution's level of exposure to operational risk and must measure the unexpected loss correctly. Three options are proposed under the advanced approach: (1) the internal measurement approach (IMA); (2) the loss distribution approach (LDA); and (3) the Scorecard approach. In this study we focus on the loss distribution approach (LDA), which estimates unexpected loss or the operational value at risk by modeling the amounts and frequencies of operational losses. The correct combination of internal and external loss data is thus an important step to be considered in an advanced approach.

2.2 Sources of external data and their potential biases

Once scaled, external data can be combined with internal data to generate a database representing a particular bank's risk profile. This plays a part in implementing the advanced approach. The sources of external data are still quite limited, but we can cite at least three:

Public data obtained from reports in the media and magazines on losses of over $1 million. There are two bases of external data of this kind on the market (such as Fitch's). The problem with this type of data is that the base only contains very severe losses that have occurred in large financial institutions. Recourse to this type of base will not make up for the scarcity of data on certain types of risk (business disruptions and system failures), but it does supplement the base with data on extreme losses, which occur rarely. Such losses will form distribution tails, since the internal data of most financial institutions contain no historical record of the large losses that might occur. When combining internal and external data, special treatment is needed to correct data-linked biases.

Data provided by insurance brokers (such as Willis, Aon and Marsh), which have to do with losses claimed by financial institutions. The major advantage of this source is its reliability. Since data of this sort are collected directly from financial institutions, there is minimum selection bias. However, this source has the disadvantage of containing different collection thresholds, sometimes unobservable, which hinge on variations in insurance policy deductibles. The second limitation of this source resides in the specific nature of the types of risk collected. In fact, only insurable losses will be included in this base.

Non-public data obtained by compiling internal data from banks which have agreed to share their information, thus constituting a consortium, like ORX (Operational Riskdata eXchange Association). However, given the confidentiality of the information shared, only the statistics and analyses pertaining to the losses are available to participants. The advantage of this source of data is its reliability. The collection threshold is much lower than that of the preceding sources. This makes the loss amounts more comparable, especially when the member banks are of almost the same size. However, the major disadvantage of this source of data is that it does not allow event-by-event access to the losses. Therefore, these data cannot be used to construct a base combining internal and external data.

External data contain many biases, such as:

Selection bias: Only very large losses are published. This bias is linked to the nature of the databases available and is thus difficult to correct.

Control bias: Losses come from banks with different control environments. There are unfortunately no variables capable of estimating the quality of control for the banks found in external bases. So, it is not possible to correct this bias with the information available.

Collection bias: When data are drawn from different sources, variations in thresholds may cause biases. Frachot and Roncalli (2002) and Baud, Frachot, and Roncalli (2002) describe how internal data can be compared with external data having different collection thresholds.

Scale bias: Losses come from banks of different sizes (assets, revenues, number of employees, etc.) located in different countries. Our research is concerned with correcting this bias.

2.3 Literature review

Little research has been done to find a solution to the scale problem. Shih, Samad-Khan, and Medapa (2000) introduced the institution's size as the main scaling factor. They showed that the relation between operational losses and firm size is non-linear. In effect, the relation between the logarithm of the scale factor and the loss amounts is stronger than the one between losses and the gross scale variable. Besides, a bank twice as large as another will not, on average, suffer losses two times higher than those of the smaller bank. Shih, Samad-Khan, and Medapa (2000) effectively suppose the relation to be as follows:

\[ L = R^{\alpha} \, F(\theta) \]

where:
\(L\): loss amounts;
\(R\): total revenue of the firm where the loss occurred;
\(\alpha\): a scaling factor;
\(\theta\): a vector representing all the risk factors not explained by \(R\).

\(F(\theta)\) is thus a multiplicative residual term which is not explained by fluctuations in revenue. Taking the logarithm of this equation, we obtain a linear relation, \(\ln L = \alpha \ln R + \ln F(\theta)\). It is thus possible to estimate the scaling factor \(\alpha\), with the logarithm of the function of the other factors, \(\ln F(\theta)\), entering through the regression's constant term.
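To make the non-linearity concrete, the short sketch below evaluates the loss ratio that the power relation implies for two banks sharing the same F(θ). The exponent values are purely illustrative assumptions, not estimates from Shih, Samad-Khan, and Medapa (2000), and the function names are hypothetical.

```python
import math


def implied_loss_ratio(revenue_ratio: float, alpha: float) -> float:
    """Loss ratio between two banks with the same F(theta), under L = R**alpha * F(theta)."""
    return revenue_ratio ** alpha


def log_loss(revenue: float, alpha: float, log_f_theta: float) -> float:
    """Log form of the same relation: ln L = alpha * ln R + ln F(theta)."""
    return alpha * math.log(revenue) + log_f_theta


if __name__ == "__main__":
    # Hypothetical exponents; any alpha below 1 gives a ratio below 2
    # when revenue doubles.
    for alpha in (0.25, 0.5, 1.0):
        ratio = implied_loss_ratio(2.0, alpha)
        print(f"alpha={alpha}: doubling revenue multiplies the loss by {ratio:.3f}")
```

With any exponent below one, doubling revenue less than doubles the implied loss, which is the sense in which a bank twice as large does not suffer losses twice as high.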

Total revenue, which estimates firm size, is the only risk factor included in the model. Most of the variability in losses is thus probably caused by other factors such as type of business line, quality of management, and effectiveness of the control environment. In this same study, it has been shown that size explains only a small portion (about 5%) of the loss amounts.

Along the same lines, Hartung (2004) has developed a normalization formula, which makes it possible to calculate the equivalent of an external loss for a given bank. The formula used is

\[ \mathrm{Loss}_{adj} = \mathrm{Loss}_{org} \left[ 1 + a \left( \left( \frac{\mathrm{Scal.Param}(\mathrm{Loss}_{adj})}{\mathrm{Scal.Param}(\mathrm{Loss}_{org})} \right)^{b} - 1 \right) \right] \]

where:
\(\mathrm{Loss}_{adj}\): the loss amount adjusted for a given bank;
\(\mathrm{Loss}_{org}\): the original loss amount for a reference bank;
\(\mathrm{Scal.Param}(\mathrm{Loss}_{adj})\): a scaling parameter for a given bank;
\(\mathrm{Scal.Param}(\mathrm{Loss}_{org})\): a scaling parameter for a reference bank;
\(a, b\): adjustment factors such that \(a \in [-1; 1]\), \(b \in [0; 1]\).

The scaling parameter was assigned based on the cause of the event. Examples of this parameter are revenues, number of employees or quality of risk management. Hypotheses have been formulated concerning the value of the adjustment factors in relation to the scaling parameter. This scaling model's limitations consist, on the one hand, in the lack of any theoretical justification of the formula used and, on the other hand, in the absence of a suitable method for estimating the adjustment factors.

According to the study by Na (2004), the loss amount can be broken down into a common component and an idiosyncratic component. The component common to all the banks or business lines captures all the changes in the macroeconomic, geopolitical, and cultural environment, whereas the idiosyncratic component covers all the factors specifically linked to the line of business or the loss event. A power relationship is assumed between this last component and a size estimator. A normalization formula has been developed to find, for lines of business B1 and B2 of a given bank, the equivalent loss amount of a loss taken as a reference point. The formula is as follows:

\[ \frac{L_{T,B_1}}{\left( (R_{idiosyncratic})_{T,B_1} \right)^{\lambda}} = \frac{L_{T,B_2}}{\left( (R_{idiosyncratic})_{T,B_2} \right)^{\lambda}} \]

where:
\(L_{T,B}\): a loss amount having occurred at date T at the bank or in the business line B;
\((R_{idiosyncratic})_{T,B}\): the revenue of the bank or the business line at date T, constituting the only estimator of the idiosyncratic component;
\(\lambda\): a scaling factor.

This model can be improved by introducing scaling factors other than firm size. Our model takes into consideration size, location, business line, and risk type.

Once severity has been scaled, it is also important to determine the frequency with which normalized losses occur over a particular time horizon. Very few studies have attempted this type of scaling. Some studies have indeed developed normalization models for severity but without considering any scaling for frequencies (Shih, Samad-Khan, and Medapa, 2000; Hartung, 2004). Hartung (2004) groups the frequency of losses in four banks along a nine-year horizon. The bank used as a reference will have a distribution identical to that of the four banks grouped together. These banks are not necessarily comparable. Several factors enter into play when determining the frequency of losses.

Na (2004) has developed a model for scaling frequencies which is equivalent to the one used to scale severity. This model stipulates that the frequency of losses can be broken down into a common component and an idiosyncratic component estimated by size. He concludes that size is a significant factor in explaining the variability of the number of losses. However, the model's main limitation is that it does not account for the discrete character of the frequency data.

In this study, we develop a count data regression model. A model of this type can take the discrete and non-negative character of the data into account. Two models will thus be tested: the Poisson and the negative binomial (Klugman, Panjer, and Willmot, 1998; Cruz, 2001). The regression component contained in the model allows us to take into account certain factors related to the scaling procedure.
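Returning to the severity formula of Na (2004) quoted above, solving it for the loss in one business line gives a one-line scaling rule. The sketch below is only an illustration of that algebra: the function name, the revenue figures, and the value of the exponent are hypothetical and are not taken from Na's study.

```python
def scale_loss_na(loss_ref: float, revenue_ref: float,
                  revenue_target: float, lam: float) -> float:
    """Equivalent loss for the target unit under L1 / R1**lam == L2 / R2**lam.

    loss_ref and revenue_ref describe the reference unit (B2);
    revenue_target is the revenue of the unit being scaled to (B1).
    """
    return loss_ref * (revenue_target / revenue_ref) ** lam


if __name__ == "__main__":
    # Hypothetical inputs: a $10M reference loss, revenues of 400 and 100,
    # and an illustrative exponent of 0.3.
    scaled = scale_loss_na(loss_ref=10.0, revenue_ref=400.0,
                           revenue_target=100.0, lam=0.3)
    print(f"Scaled loss: {scaled:.3f}")
```

By construction the ratio of loss to revenue raised to the power λ is the same before and after scaling, so the rule leaves a loss unchanged when the two units have equal revenue.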

In the models used to describe discrete variables in the literature (Cox and Lewis, 1966; El Sayyad, 1973; Frome, Kutner, and Beauchamp, 1973; Hausman, Hall, and Griliches, 1984; Gouriéroux, Monfort, and Trognon, 1984), the endogenous variables are supposed to follow a Poisson regression distribution. The parameter of this distribution is a function of the values of the exogenous variables. The choice of this model is justified when the dependent variable counts the occurrence of a given event over a specific period and when the usual hypotheses of the Poisson distribution are satisfied.

Several applications of this model appear in the literature. It has been used to model such quantities as the number of patents received by a firm (Hausman, Hall, and Griliches, 1984), the number of visits to a doctor (Cameron, Trivedi, Milne, and Piggott, 1988) or the number of automobile or plane accidents (Dionne and Vanasse, 1989 and 1992; Dionne et al., 1997). The application by Dionne and Vanasse (1989) is the first to introduce a regression component in the insurance field, a field showing many similarities with operational risk. The number of accidents per individual is supposed to follow a Poisson distribution whose parameter varies from one exposure unit to the next. This parameter actually depends on the characteristics of the units exposed. As discussed by Maddala (1983) and Cameron and Trivedi (1986), the coefficients of these variables are estimated using the maximum likelihood method.

The Poisson regression model supposes equidispersion (equality between the conditional mean and variance). This restriction may not be compatible with operational loss data. Recourse to a negative binomial distribution² compensates for this problem, since it allows overdispersion. The studies done by Dionne and Vanasse (1989 and 1992) and by Boyer, Dionne, and Vanasse (1991) have shown the superiority of the negative binomial regression model over the Poisson regression model when treating automobile accidents. The negative binomial regression model is now frequently used in the insurance literature.
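The contrast between equidispersion and overdispersion can be checked with a small simulation. The sketch below draws Poisson counts and counts from a Poisson distribution whose rate is gamma distributed (the mixture described in the footnote on the negative binomial), then compares the variance-to-mean ratios. The mean, shape, sample size, and seed are arbitrary assumptions chosen for illustration, not values from the paper.

```python
import math
import random


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) count with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(rng: random.Random, mean: float, shape: float) -> int:
    """Poisson count whose rate follows a gamma law with the given mean and shape."""
    rate = rng.gammavariate(shape, mean / shape)  # gammavariate(shape, scale)
    return sample_poisson(rng, rate)


def dispersion_index(counts) -> float:
    """Variance-to-mean ratio: close to 1 under equidispersion."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n
    return variance / mean


def simulate(n: int = 20000, mean: float = 3.0, shape: float = 2.0, seed: int = 0):
    """Return the dispersion indices of simulated Poisson and gamma-mixed counts."""
    rng = random.Random(seed)
    poisson_counts = [sample_poisson(rng, mean) for _ in range(n)]
    negbin_counts = [sample_negative_binomial(rng, mean, shape) for _ in range(n)]
    return dispersion_index(poisson_counts), dispersion_index(negbin_counts)


if __name__ == "__main__":
    d_poisson, d_negbin = simulate()
    print(f"Poisson dispersion index: {d_poisson:.2f}")
    print(f"Gamma-mixed Poisson dispersion index: {d_negbin:.2f}")
```

For the gamma-mixed counts the theoretical variance-to-mean ratio is 1 + mean/shape, so it exceeds one whenever the rate genuinely varies across exposure units.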
Once certain conditions have been met, it is possible to use maximum likelihood in estimating distribution parameters. But if the density is poorly specified, the estimators found with maximum likelihood will not be good. Gouriéroux, Monfort, and Trognon (1984a and 1984b) have proposed other methods to counteract this problem, such as the pseudo maximum likelihood (PML) and the quasi-generalized pseudo maximum likelihood (QGPML). They have specified the conditions under which these PML and QGPML estimators from the linear exponential family of models will be consistent when applied to non-truncated models. However, if the density of the negative binomial is correctly specified, the maximum-likelihood estimators will be more efficient than the PML and QGPML estimators (Dionne and Vanasse, 1992).

In modeling the number of operational losses, we shall apply these models, which have gained great popularity in the literature. These models make it possible to introduce information on the financial institution where the loss occurred. Exogenous variables reflecting the firm's location and geographical distribution will help account for scaling. To our knowledge, this is the first time these models have been applied to operational risk and, more precisely, used to scale the frequency of operational losses. However, the frequencies we observe are always greater than zero. We thus develop Poisson and negative binomial regression models truncated at zero. The truncated densities of these models have been presented by Cameron and Trivedi (1998) and Gurmu (1991). Gurmu and Trivedi (1992) have developed overdispersion tests for the same models.

² This is a Poisson distribution whose random parameter follows the gamma distribution.

3. Description of external data

Fitch's OpVaR database is made up of operational losses of US $1 million and over. This database contains losses from all industries. Since our only target is banks, the database was first screened to select only operational losses connected with financial institutions. The database contains the following types of information.

1. Type of event, level 1: Types of risk defined by regulatory authorities. Under this heading we find:
   External fraud
   Internal fraud
   Clients, products, and business practices
   Employment practices and workplace safety
   Execution, delivery and process management
   Damage to physical assets
   Business disruption and system failures
Also available are types of events at levels 2 and 3, which offer greater precision and granularity. For example, discrimination and diversity is a sub-event of the employment practices and workplace safety risk type. As type-3 events under the diversity and discrimination subtype we have, for example, discrimination due to age, sex, race, sexual orientation, and sexual harassment.
2. The name of the parent company and that of the subsidiary;
3. A detailed description of the loss event;
4. The loss amount in local currency, in American dollars, and its real value (accounting for inflation);
5. Date of the event. Since we are not always sure of the exact date (day and month), we use only the year of the event;
6. Industry: either financial services or public administration;
7. Business unit, levels 1, 2, and 3: The first level makes the distinction between financial and non-financial institutions. In our case, we are only interested in the financial sector. Level 2 is concerned with financial institutions and makes the distinction between banks, insurance firms, investment banks, and other institutions. In level-3 business units, we find the lines of business defined by the Basel Committee;
8. The country where the loss occurred;
9. An identification code for each loss;
10. Information on the institution where the loss occurred: total assets, total equity, total deposits, total revenues, and number of employees.

It is worth mentioning that the firm-related information needed is missing for some observations. Since this information will be of key use in the scaling model later on, we are obliged to select only the loss data for which specific information on the firm is available. Moreover, 1.8% of losses occurred between 1981 and 1994 and averaged $130.31 M per event, whereas the losses which occurred later averaged $67.15 M per event. We thus remove the events having occurred before 1994 from the external base because of a collection bias. Hence, 1,056 observations of losses of more than a million American dollars remain in the database.

4. Hypotheses of the model

We apply a theoretical model designed to scale severities and frequencies. The empirical application of scaling models to external data is of great help in showing the model's simplicity and in producing results. It is clear that the database used is open to criticism for the reasons mentioned above. Since, for the moment, no better data sources exist, we will be guided by the following hypotheses in carrying out our analysis. The methodology used would admittedly be applicable to other bases, provided they contain the information required.

We suppose that the loss amounts recorded in the base as reported in the media are exact and factual. The evaluation of losses is thus based neither on rumours nor predictions.

We suppose that all types of losses are equally likely to be recorded in the base; there is thus no media effect related to certain types of risk.

We suppose that the external base provides all the losses of more than a million dollars for the financial institutions contained in it.

We suppose that there is no correlation between the amount of the loss and the probability of its being reported. The severity and frequency distributions are thus supposed to be independent.
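Section 2.3 notes that the observed loss frequencies always exceed zero, which motivates count models truncated at zero. As a minimal sketch of what truncation does, the code below rescales the Poisson and negative binomial probabilities by the probability of a positive count. The negative binomial is written in a mean-dispersion parameterization (variance = mean + dispersion × mean²), which is an assumption made here for illustration; the function names and parameter values are hypothetical as well.

```python
import math


def poisson_pmf(y: int, lam: float) -> float:
    """Untruncated Poisson probability, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zero_truncated_poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y | Y > 0): the Poisson pmf divided by 1 - P(Y = 0)."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, lam) / -math.expm1(-lam)  # -expm1(-lam) == 1 - exp(-lam)


def negbin_pmf(y: int, mean: float, dispersion: float) -> float:
    """Untruncated negative binomial pmf with variance mean + dispersion * mean**2."""
    r = 1.0 / dispersion
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(r / (r + mean)) + y * math.log(mean / (r + mean)))
    return math.exp(log_p)


def zero_truncated_negbin_pmf(y: int, mean: float, dispersion: float) -> float:
    """P(Y = y | Y > 0) for the negative binomial above."""
    if y < 1:
        return 0.0
    return negbin_pmf(y, mean, dispersion) / (1.0 - negbin_pmf(0, mean, dispersion))


if __name__ == "__main__":
    lam = 2.0  # hypothetical rate
    truncated_mean = sum(y * zero_truncated_poisson_pmf(y, lam) for y in range(1, 200))
    print(f"Untruncated mean: {lam:.3f}, zero-truncated mean: {truncated_mean:.3f}")
```

The truncated mean is larger than the untruncated rate, which is why fitting an ordinary Poisson or negative binomial model to strictly positive counts would tend to overstate the underlying frequency parameter.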

5. Scaling model for external-loss amounts

5.1 Theoretical scaling model

The scaling mechanism depends on three fundamental hypotheses. The first is that the monetary loss can be broken down into two components: common and idiosyncratic (or specific). The second stipulates a non-linear relation between the idiosyncratic component and the different factors composing it. The third and last hypothesis states that, aside from the factors controlled for the purpose of scaling, all the other non-observable factors (quality of control environment, etc.) are supposed to remain the same for all banks.

Concerning the first hypothesis, we can suppose that the operational loss can be broken down into two components (Na, 2004; Na et al., 2006): a component common to all banks and an idiosyncratic component specific to each loss. The common component contains all the factors which, being independent of any specific bank's activities, can have the same impact on all banks, thus making it a constant component for all loss events. It refers to the macroeconomic, geopolitical or cultural environment or even to human nature in general. The idiosyncratic component refers to the specific risk facing the financial institution or line of business. Some elements of this component are observable: bank size, type of risk, line of business or location of loss event. These could therefore be quantified or measured. But there are non-observable elements related to the control environment, which are difficult to quantify. These elements are not studied in this article. We can thus identify a loss amount as a function of these two components:

\[ \mathrm{Loss}_i = f\big( \mathrm{Comp}_{common}, (\mathrm{Comp}_{idiosyncratic})_i \big). \qquad (1) \]

The second hypothesis stipulates that the function \(f\) is non-linear. Na (2004) supposes that the function \(f\) is the product of a function of the common component and a function of the idiosyncratic component. Now, since the common component is constant, we can model it with a constant parameter:

\[ \mathrm{Loss}_i = \mathrm{Comp}_{common} \times g\big( (\mathrm{Comp}_{idiosyncratic})_i \big). \qquad (2) \]

As for function g, we draw on the study by Shih, Samad-Khan, and Medapa (2000), which supposes a power relationship between the loss amount and firm size. However, size (estimated by total assets) is not the only factor we use to determine the severity of losses. We add other factors, expressed through the function h:

\[ g\big((\text{Comp}_{\text{idiosyncratic}})_i\big) = \text{Assets}_i^{\,a} \times h(\text{Factors}_i). \]

We can thus rewrite (2) as follows:

\[ \text{Loss}_i = \text{Comp}_{\text{common}} \times \text{Assets}_i^{\,a} \times h(\text{Factors}_i). \]

To simplify the analysis, we suppose that:

\[ h(\text{Factors}_i) = \exp\Big(\sum_j b_j\,\text{Factors}_{ij}\Big). \]

Thus,

\[ \log(\text{Loss}_i) = \log(\text{Comp}_{\text{common}}) + a\,\log(\text{Assets}_i) + \sum_j b_j\,\text{Factors}_{ij}. \tag{3} \]

In order to explain the variability of the losses and to construct the scaling model, the different elements of the idiosyncratic component must be identified, since they act as factors explaining the severity of losses.

5.2 Description of the variables

The dependent variable is the logarithm of the operational losses. The statistics in Table 1a show that the average loss per event is evaluated at $67 million, with a standard deviation of $521 million. The maximum loss is $16 billion. The loss amounts thus vary widely, from quite substantial to catastrophic.

The explanatory variables to be included in the model designed to explain the variation in the logarithm of losses are described below. Table 1 presents descriptive statistics of the losses in terms of these variables.

Size: The base contains variables characterizing firm size. According to the results of the study by Shih, Samad-Khan, and Medapa (2000), size is weakly linked to the loss amount. Other variables must explain this variability. Several measures of size are available, such as total revenues, total assets, total deposits, number of employees, and total equity. However, since all these variables are correlated, we have chosen total assets (the variable most correlated with losses) as the estimator for size. Financial institutions having sustained losses reported in the database differ greatly in size, varying from the smallest bank (with total assets of $43 million) to the largest institution (with assets of $1,533,036 million). Average total assets are evaluated at $270,681 M.

In Table 1a, we present the number of events, the average, and the standard deviation for losses according to size. We have classified the banks into three size categories: those with assets under $400 billion (small and medium size); those with assets between $400 and $800 billion (large size); and those whose assets exceed $800 billion (very large size). The results in the table show that average losses are much higher in the very large banks than in the others. But the average loss for large financial institutions is lower than that for small and medium size banks.

We expect losses to increase with the size of the financial institution. A natural catastrophe could, for example, cause more serious damage (losses) to a bank whose total assets are higher than those of another bank. Size could thus have a positive impact on the severity of losses.

Location: As losses do not all occur in the same country, a variable capturing the effect of location must be incorporated. Given the differences in environment, legislation, etc., we expect this variable to be significantly linked to loss amounts. It is worth noting that 60% of the losses occurred in the United States, as compared to only 4% in Canada. This variation can be explained by the fact that the number of banks in the United States greatly exceeds that in Canada. The remaining proportion of losses is divided between the countries of Europe and the rest of the world.

Table 1b presents statistics for losses according to location. We note that the average operational loss differs according to the location where it occurs, thus reflecting the different environments. It is worth mentioning that the average loss in the United States is higher than that in Canada ($38 M vs. $9 M). The environment designated Other (countries other than Canada, the United States and those in Europe) reports the highest average loss ($163 M), making it the riskiest.

Seeking to define the link between the size of the financial institution and location, we present, in Table 1c, statistics on the size of institutions according to the location where the losses occurred. We notice that average total assets are lower in Canada than in the United States and Europe. However, though the environment in countries designated Other is riskier (highest average losses), the institutions having suffered those losses are on average smaller than those located in the three other environments. Thus, there is no direct link between the financial institution's size and the location where the losses occurred.

Dummy variables (United States, Canada, Europe, and Others) capture the effect of the location of losses in one of the countries or continents mentioned above.

Line of business: We expect the business line to have an impact on the severity of losses. Certain units register higher losses than others, on average. Identifying the line of business where the loss occurred can help explain the severity of extreme losses. According to the statistics presented for seven business lines in Table 1d, two of them, commercial banking (25%) and retail banking (33%), account for 58% of the losses. The average loss is also much higher for commercial banking than for the other units.
Based on the information collected by LDCE and QIS-4 [3] from 27 financial institutions, the retail banking line of business accounts for 44% of the operational losses among the 177 losses of over $1 M collected over the 2001-2004 period, whereas the commercial banking line of business accounts for only 9%. This gap can be explained either by the collection period (the LDCE study covers 4 years of losses, whereas the external base covers 11 years of reporting) or by the number of different financial institutions where the losses occurred. The dichotomous variables for each line of business will thus capture the effect of the nature of its activity when determining loss amounts.

[3] Loss Data Collection Exercise (LDCE) and Quantitative Impact Study-4 (QIS-4): two studies conducted by the U.S. federal bank and thrift regulatory agencies to evaluate the impact of Basel II on the minimum regulatory capital required.

Types of risk: Certain risk types are infrequent but extremely severe, whereas others are very frequent but of relatively low severity. Table 1e shows that 44% of losses are of the clients, products, and business practices type and that more than 40% of losses are divided between internal and external fraud. However, less than 0.5% of losses are of the damage to physical assets type and 0.5% are of the business disruptions and system failures type. The average loss is highest for the damage to physical assets risk type ($115 M), whereas it is lowest for business disruptions and system failures ($5 M).

The results of the LDCE and QIS-4 studies show a very different distribution. There, 49% of losses are of the execution, delivery and process management type; 31% are of the clients, products, and business practices type; 7% of the external fraud type; and 3% of the internal fraud type. As for the risk types damage to physical assets and business disruptions and system failures, the proportion of losses is just as low as in the external base.

Introducing dichotomous variables to capture the risk type can be relevant in explaining the variability of the loss amounts. Thus, 7 variables will capture the risk-type effect in our model.

5.3 Linear regression

To explain the degree of variability of external losses, we estimate the coefficients of the regression below. This will allow us to evaluate the common and specific components for each loss amount. The following regression follows from equation (3):

\[ Y_i = a_0 + a_1\,\text{Size}_i + a_2\,\text{US}_i + a_3\,\text{Canada}_i + a_4\,\text{Europe}_i + \sum_{j=5}^{11} a_j\,\text{BL}_{ij} + \sum_{j=12}^{17} a_j\,\text{RT}_{ij} + e_i \tag{4} \]

with:
Y_i: log(losses_i);

a_0: common component;
Size_i: log(assets_i);
US_i: binary variable assuming the value 1 if the loss occurred in the United States, otherwise 0;
Canada_i: binary variable assuming the value 1 if the loss occurred in Canada, otherwise 0;
Europe_i: binary variable assuming the value 1 if the loss occurred in Europe, otherwise 0; the category omitted is Others;
BL_ij: binary variable assuming the value 1 if the loss occurred in business unit j, otherwise 0; the category omitted is payment and settlement;
RT_ij: binary variable assuming the value 1 if the loss is of risk type j, otherwise 0; the category omitted is the business disruptions and system failures type of risk;
e_i: deviation variable representing the non-observable specific component, which is supposed to follow a normal distribution with parameters \((0, \sigma^2)\).

5.4 Results of the regression

The Ordinary Least Squares (OLS) method is used to estimate the parameters. The results of this estimation are presented in Table 2. The adjusted coefficient of determination \(R^2_{adj}\) is 10.63%. Though the value is low, it is better than the 5% found in the literature to date (Shih et al., 2000). Remember that it is difficult to capture certain non-observable factors, which are not present in the external base.

Estimated by the logarithm of total assets, the size variable is significantly different from 0. The coefficient is positive, confirming that the larger the firm, the higher its level of losses.

The coefficients of the binary variables US and Canada are significantly different from 0. The negative sign must be interpreted in relation to the category Others (the omitted variable). Comparing the coefficients for US and Canada, we note that the United States environment is riskier than Canada's. It is also worth noting the even higher losses that occurred in the rest of the world, where financial environments are less regulated than those in the United States, Canada, and Europe.

The commercial banking variable is the only business-line variable that shows a significantly non-null impact at the 99% confidence level. The coefficient is positive, showing that losses in this business line are higher than in the others.

Finally, the clients, products, and business practices variable has significant explanatory power. This shows that the losses associated with this type of risk are higher than those associated with the others.

5.5 Robustness tests for the size variable

We start with a simple regression including only the size variable, a model similar to the one used by Shih et al. (2000). The other categories of variables are then added to this model one by one in order to capture any possible additional effects and to test the stability of the parameters.

The results show that size plays a very small part in explaining the level of losses. Model 1 in Table 3 shows an \(R^2_{adj}\) of only 0.6%. This statistic improves sharply (to 4.32%) once the variables associated with location are introduced. The estimated coefficients remain stable and significantly different from 0 when compared to the basic model.

Model 3 adds variables estimating the impact of the line of business where the loss occurred. The adjusted coefficient of determination jumps from 4.32% (model 2) to 7.16% (model 3). The commercial banking variable remains significantly different from 0. Each category of variables thus has significant power in explaining the severity of operational losses, and the coefficients of the variables are relatively stable.

We next select only those variables which are statistically non-null at the 90% confidence level, and regress the log of losses on the 5 remaining variables. The model thus constructed allows us to test whether the variables shown to be significant in the basic model keep their explanatory power when tested alone. The results presented in model 4 of Table 3 show that all the variables remain significantly non-null at the 90% confidence level. The adjusted coefficient of determination is on the order of 9.38%. The signs and magnitudes of the coefficients do not change. These 5 significant variables will be used in developing the normalization formula.
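The estimation steps of sections 5.3 to 5.5 can be illustrated with a small sketch. It generates synthetic log-losses from a reduced form of equation (3), with a size variable and two location dummies, fits the coefficients by OLS and computes the adjusted coefficient of determination. All data and coefficient values below are hypothetical and chosen only for illustration; they are not the paper's estimates, and NumPy's least-squares solver is an assumption standing in for whatever statistical package the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Synthetic explanatory variables (hypothetical, not the external base).
log_assets = rng.normal(11.0, 2.0, n)        # Size_i = log(Assets_i)
region = rng.integers(0, 3, n)               # 0 = Others (omitted), 1 = US, 2 = Canada
us = (region == 1).astype(float)
canada = (region == 2).astype(float)

# Hypothetical "true" coefficients: a0 (common component), a1, a2, a3.
true_coef = np.array([1.0, 0.10, -0.6, -1.1])
X = np.column_stack([np.ones(n), log_assets, us, canada])
y = X @ true_coef + rng.normal(0.0, 1.5, n)  # Y_i = log(Loss_i) with noise e_i

# OLS estimate of the coefficients.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# R^2 and adjusted R^2 (k = number of regressors excluding the intercept).
resid = y - X @ coef
ss_res = float(resid @ resid)
ss_tot = float(((y - y.mean()) ** 2).sum())
r2 = 1.0 - ss_res / ss_tot
k = X.shape[1] - 1
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

print("estimated coefficients:", coef)
print("R2:", r2, "adjusted R2:", adj_r2)
```

With a large unexplained component, as in the synthetic noise term here, the adjusted coefficient of determination stays low even though the coefficients themselves are recovered reasonably well, which is the situation the section describes.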

5.6 Normalization formula

Suppose we are studying a given bank A and want to find the equivalent value of a loss that occurred in another financial institution B. A normalization formula allows us to put a loss that occurred in bank B on the same scale as one in bank A. According to equation (2), a loss is the product of a common component and a function of the specific component. The regression analysis performed above allowed us to identify these two components:

\[ \log(\text{Loss}_i) = \underbrace{a_0}_{\log(\text{Comp}_{\text{common}})} + \underbrace{a_1\,\text{Size}_i + a_2\,\text{US}_i + a_3\,\text{Canada}_i + a_4\,\text{CB}_i + a_5\,\text{CPBP}_i + e_i}_{\log g\left((\text{Comp}_{\text{idiosyncratic}})_i\right)} \]

where:
CB: refers to the commercial banking business line;
CPBP: refers to the clients, products, and business practices risk type.

As the common component is constant for all loss amounts, it is possible to rewrite equation (2) as follows:

\[ \text{Comp}_{\text{common}} = \frac{\text{Loss}_A}{g(\text{Comp}_{\text{idio}})_A} = \frac{\text{Loss}_B}{g(\text{Comp}_{\text{idio}})_B} = \dots = \frac{\text{Loss}_N}{g(\text{Comp}_{\text{idio}})_N}. \tag{5} \]

Suppose that we have a loss which occurred in bank B and that we want to know its equivalent value had it occurred in bank A. Based on the analysis above, we can determine the idiosyncratic component of loss B as well as that of loss A. We multiply the coefficients already estimated by the corresponding values of the different variables to find the idiosyncratic or specific component:

\[ \text{Loss}_A = \frac{g(\text{Comp}_{\text{idio}})_A}{g(\text{Comp}_{\text{idio}})_B} \times \text{Loss}_B \tag{6} \]

with:

\[ g(\text{Comp}_{\text{idio}})_A = \exp\big(\hat{a}_1\,\text{Size}_A + \hat{a}_2\,\text{US}_A + \hat{a}_3\,\text{Canada}_A + \hat{a}_4\,\text{CB}_A + \hat{a}_5\,\text{CPBP}_A\big). \]
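Substituting the exponential form of g into equation (6) may make the role of each coefficient easier to see. The following is a worked restatement under the same assumptions, not an additional result; it assumes that g for bank B takes the same form as the expression given for bank A:

\[ \frac{\text{Loss}_A}{\text{Loss}_B} = \exp\big(\hat{a}_1(\text{Size}_A - \text{Size}_B) + \hat{a}_2(\text{US}_A - \text{US}_B) + \hat{a}_3(\text{Canada}_A - \text{Canada}_B) + \hat{a}_4(\text{CB}_A - \text{CB}_B) + \hat{a}_5(\text{CPBP}_A - \text{CPBP}_B)\big). \]

In particular, if the two losses differ only in the size of the institution, then, since Size = log(Assets),

\[ \text{Loss}_A = \text{Loss}_B \times \left(\frac{\text{Assets}_A}{\text{Assets}_B}\right)^{\hat{a}_1}, \]

so that \(\hat{a}_1\) acts as the elasticity of the loss amount with respect to total assets.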

Equation (6) supposes that, beyond the variables selected to perform the scaling, the unexplained part of the regression model (attributable to unobservable qualitative factors such as management quality, control environment, etc.) is the same for Loss A and Loss B (third hypothesis).

So, to calculate the equivalent of a loss sustained by a given bank in the banking industry, the idiosyncratic components of the two losses must first be calculated with the preceding equation. Next, we apply formula (6) to find the equivalent loss for bank A. By applying this same method to the whole external base, we obtain a base of extreme losses that occurred in other banking institutions but are scaled to a given bank. The severity of losses has thus been adjusted by taking into account several factors such as size, location, business line, and risk type.

5.7 Validation of the scaling severity model

To illustrate the scaling model concretely, we have chosen the American bank Merrill Lynch [4] from the external base. This bank shows 52 loss events over the 1994-2004 period. We first scale the operational losses in the external base to this bank. We then compare the statistics on the losses actually observed to those found after the scaling procedure.

Our first step is to determine the equivalent of the 1,056 loss events in the external base for Merrill Lynch. We thus calculate the loss amount which could occur at Merrill Lynch for the same type of risk, in the same line of business, and in the same year as the one in the external base. However, we take Merrill Lynch's total assets in the year of the event and apply them to all the losses. We also take the United States as the place where all the external losses occur, since all the losses observed for Merrill Lynch did occur in the United States. Once the explanatory variables for regression (4) have been identified, we calculate the equivalent idiosyncratic component for each loss recorded in the external base (as shown in Appendix 1), using the coefficients previously estimated. The normalization formula presented in the preceding section allows us to scale the loss to the bank in question.

[4] We have chosen this bank from the external base because it is the one with the maximum number of losses of more than $1 million over the 1994-2004 period.
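The scaling step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the coefficient values in `COEF`, the `LossEvent` record and the function names are hypothetical and are not taken from the paper (whose estimates appear in model 4 of Table 3). The sketch keeps the business line and risk type of the external event and swaps in the target bank's assets and location, as the procedure describes.

```python
import math
from dataclasses import dataclass, replace

# Hypothetical coefficients, for illustration only (not the paper's estimates).
COEF = {"size": 0.08, "us": -0.6, "canada": -1.1, "cb": 0.65, "cpbp": 0.63}


@dataclass(frozen=True)
class LossEvent:
    loss: float    # loss amount, in $M
    assets: float  # total assets of the institution in the year of the event, in $M
    us: int        # 1 if the loss occurred in the United States
    canada: int    # 1 if the loss occurred in Canada
    cb: int        # 1 if the business line is commercial banking
    cpbp: int      # 1 if the risk type is clients, products, and business practices


def log_g(e: LossEvent) -> float:
    """Log of the idiosyncratic component g(Comp_idio) for one event."""
    return (COEF["size"] * math.log(e.assets)
            + COEF["us"] * e.us
            + COEF["canada"] * e.canada
            + COEF["cb"] * e.cb
            + COEF["cpbp"] * e.cpbp)


def scale_to_bank(event: LossEvent, target_assets: float,
                  target_us: int, target_canada: int) -> float:
    """Equation (6): Loss_A = g_A / g_B * Loss_B, with B the external event."""
    as_target = replace(event, assets=target_assets,
                        us=target_us, canada=target_canada)
    return event.loss * math.exp(log_g(as_target) - log_g(event))


# Usage: scale a small hypothetical external base to a US target bank.
external_base = [
    LossEvent(loss=10.0, assets=50_000.0, us=0, canada=1, cb=0, cpbp=1),
    LossEvent(loss=25.0, assets=400_000.0, us=1, canada=0, cb=1, cpbp=0),
]
scaled = [scale_to_bank(e, 200_000.0, 1, 0) for e in external_base]
print(scaled)
```

Working with the difference of the log components, rather than the ratio of two exponentials, is a design choice that avoids overflow when assets are very large.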

We next compare the statistics calculated on the sample of the 52 losses actually observed at Merrill Lynch to the statistics calculated on the 1,050 [5] scaled losses from the external base which could occur at the same bank. These statistics are presented in Table 4. They show that the averages of the two samples are quite close. A hypothesis test confirms this and shows that the two averages are not statistically different at the 95% confidence level. It should also be noted that the standard deviations of the two samples are close (83.1 vs. 84.3).

[5] We have excluded 6 losses from the analysis because they represent outliers.

In the second step of the analysis, we look at the impact the scaling variables have on the loss amounts obtained after normalization. Table 5 presents the example of a loss event at the Bank of New York drawn from the external base, along with loss amounts scaled to fictional banks. First, we modify one characteristic of the event at a time in order to see its monetary impact on the loss. We then analyze the aggregate effect of several variables on the loss amount.

We note that, if the event took place in one of the larger banks (all other factors being equal), the loss amount would be slightly higher (rising from $8.26 M to $9.27 M, while total assets have more than quadrupled). However, if the same event took place in Canada rather than the United States, the loss would be smaller (it would go from $8.26 M to $4.97 M). As we have already shown, Canada's environment is less risky than that of the United States.

We also note that the commercial banking line of business is more likely to produce heavy losses than the other lines of business: the loss moves from $8.26 M to $16.06 M if it takes place in the commercial banking business line rather than the retail banking business line. The type of risk also has a big impact on the size of losses. If the loss is of the clients, products, and business practices type, the amount moves to $15.56 M, given that this type of risk has a significant impact on the severity of losses, as already concluded in section 5.4.

Finally, in the last three lines of Table 5, we present the aggregated impact of two or more variables. We scale by modifying the size of the bank and the location of the event. Note that even if the bank is a larger one, the impact of location (Canada) dominates and the loss amount produced by scaling is lower than that of the original event. When the line of business and type of risk are also modified, we find that the loss more than doubles. It is worth noting that the resulting loss is very different from the one found when size alone is modified.

This analysis thus shows that the size effect is quite weak compared to the other scaling factors. Our model is thus an improvement over models existing in the literature, which are based solely on size.

6. Scaling model for frequency of external losses

Remember that our objective is to correct the scaling bias so that a combination of internal and external data can be used to measure operational risk capital. In the preceding section, we worked out the scaling for loss amounts. It is thus possible to find several extreme losses likely to occur in our reference bank A. The question still to be asked is: how frequently will a bank sustain these losses?

The scaling of frequencies is a notion which rarely surfaces in the literature. Some researchers have developed models to scale severity, but the number of external losses which should be combined with internal data has not yet been modeled. In what follows, we propose a model which allows us to adjust the number of external losses per bank and to scale it to a given bank A.

6.1 Description of the model

With the model developed in this section, it is possible to scale the number of external losses and to determine which theoretical distribution fits the frequencies. We expect that, over a given horizon, the number of losses per financial institution will depend on certain factors describing the characteristics of the financial institution. The institution's size can indeed play an important role in determining the number of losses: the larger the bank, the more exposed it is to operational risks. If a bank does more transactions and has more assets, employees, and revenues than another, it will probably have more operational losses of various types (fraud, damage to assets, etc.). And the geographical distribution of the institution's activities can give us an idea of the effectiveness of its controls.
The more widely dispersed a bank's activities and, consequently, its measures of control, the less effective these measures will be.

It is thus possible to explain the number of losses by a regression on the different variables mentioned above. However, since we are dealing with frequencies, and thus discrete variables, these numbers can be more suitably modeled with count data distributions such as the Poisson and the negative binomial. A count data regression model is therefore appropriate in this context. The advantage of these models is that they both identify the theoretical distribution that fits the frequency data and provide flexible parameters suited to each observation. In other words, the distribution's parameters depend on the variables identifying the characteristics of the financial institution where the loss occurred. Once the coefficients have been estimated, it is possible to calculate the parameters belonging to a given bank.

Since the only institutions to which we have access are those which have sustained losses, the frequencies are non-null. This bias must be corrected by using distributions truncated at zero.

In what follows, we first describe the variables which will be included in the model and then present and test each of the two models: the truncated Poisson regression model and the truncated negative binomial regression model.

6.2 Description of the variables

We create a variable describing the number of losses of over $1 M per financial institution over the 1994-2004 period [6]. This gives us a sample of 323 financial institutions having sustained losses of over $1 M which have been reported in the external base.

[6] We select this period because the collection of losses is most exhaustive over it.

Frequency will be explained based on a bank's size and on the geographical distribution of its activities. We expect the number of losses to increase with bank size, and we expect control costs to grow and controls to become less effective as a financial institution expands the geographical distribution of its activities.

Size will be estimated by the logarithm of the firm's average total assets over the 1994-2004 period. The geographical distribution will be estimated by the following binary variables:

US: binary variable assuming the value 1 if the institution has had losses in the United States over the 1994-2004 period, otherwise 0.

Canada: binary variable assuming the value 1 if the institution has had losses in Canada over the 1994-2004 period, otherwise 0.
Europe: binary variable assuming the value 1 if the institution has had losses in Europe over the 1994-2004 period, otherwise 0.
Others: binary variable assuming the value 1 if the institution has had losses in another country over the 1994-2004 period, otherwise 0.

Unlike those in the model for severity, these variables are not mutually exclusive.

Table 6a presents the descriptive statistics for the number of losses per bank as well as total assets per financial institution over the 1994-2004 period. The average number of losses of over $1 M is 3.3 events per institution over an 11-year horizon, with a maximum of 52. The financial institutions vary greatly in size, average total assets being $123,174 M. With regard to the geographical distribution of the banks' activities, we find that losses are more concentrated in the United States and in other countries (other than Canada, the United States, and Europe).

Table 6b shows that banks of small and medium size (average assets under $400,000 M) suffer fewer losses over the 1994-2004 period than do large banks (average assets of between $400,000 M and $800,000 M). Very large banks, with assets of over $800,000 M, have a higher number of losses (19) than the other banks. These statistics show a link between the financial institution's size and the number of losses of over $1 M.

Table 6c presents statistics on the number of losses per bank according to the geographical distribution of the activities of the institution in question. The results show that if activities are concentrated in a single country, the average number of losses is 2 per bank. But when activities are spread over two or three countries, the average varies between 8 and 10. The average number of losses per banking institution jumps to 28 when activities are very widely dispersed geographically. In modeling frequency, it is thus worthwhile to take the geographical distribution of activities into account.
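The construction of the frequency variables described in this section can be sketched as below. The event records, institution names and the function `build_frequency_rows` are hypothetical. As a simplifying assumption, the sketch averages the assets recorded at each loss event, whereas the text defines size from the firm's average total assets over the whole 1994-2004 period.

```python
import math
from collections import defaultdict

REGIONS = ("US", "Canada", "Europe", "Others")

# Hypothetical event-level records: (institution, year, region, total assets in $M).
events = [
    ("Bank A", 1996, "US", 100_000.0),
    ("Bank A", 2001, "Europe", 140_000.0),
    ("Bank A", 2003, "US", 150_000.0),
    ("Bank B", 1999, "Canada", 20_000.0),
]


def build_frequency_rows(records):
    """Aggregate event-level losses into one row per institution."""
    grouped = defaultdict(list)
    for name, _year, region, assets in records:
        grouped[name].append((region, assets))

    rows = {}
    for name, recs in grouped.items():
        mean_assets = sum(assets for _, assets in recs) / len(recs)
        regions_seen = {region for region, _ in recs}
        row = {
            "n_losses": len(recs),          # dependent variable (always >= 1)
            "size": math.log(mean_assets),  # log of average total assets
        }
        # Location dummies are not mutually exclusive: an institution with
        # losses in two regions gets a 1 for both.
        for region in REGIONS:
            row[region] = int(region in regions_seen)
        rows[name] = row
    return rows


rows = build_frequency_rows(events)
for name, row in rows.items():
    print(name, row)
```

Because only institutions with at least one recorded loss appear in the event list, every row has a count of at least 1, which is the zero truncation that the frequency models must correct for.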

6.3 Truncated Poisson regression model

If \(Y_i\), the number of losses sustained by company i over the 1994-2004 period, follows a Poisson distribution, then the probability of having \(y_i\) losses is:

\[ P(Y_i = y_i) = \frac{e^{-\lambda}\,\lambda^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \dots \ \text{and}\ \lambda > 0, \]

where \(\lambda\) is the Poisson parameter. The main characteristic of this distribution is \(E(Y_i) = \mathrm{Var}(Y_i) = \lambda\).

However, as we are in the presence of zero-truncated data, the probability of the number of losses must be conditioned on the fact that the frequencies observed are strictly greater than zero. The conditional probability is:

\[ P(Y_i = y_i \mid Y_i > 0) = \frac{e^{-\lambda}\,\lambda^{y_i}}{y_i!\,\big(1 - e^{-\lambda}\big)}, \qquad y_i = 1, 2, \dots \ \text{and}\ \lambda > 0. \]

We can, moreover, allow the parameter \(\lambda\) to vary from one observation to the next. Let \(\lambda_i = \exp(X_i \beta)\), where \(X_i\) is a \((1 \times m)\) vector of exogenous variables (characteristics of the firm where the loss occurred) and \(\beta\) an \((m \times 1)\) vector of coefficients. The exponential function ensures that the parameter \(\lambda_i\) is positive. In our context, the parameter \(\lambda_i\) takes the form:

\[ \lambda_i = \exp\big(\beta_0 + \beta_1 \ln(\text{Assets}_i) + \beta_2\,\text{US}_i + \beta_3\,\text{Canada}_i + \beta_4\,\text{Europe}_i + \beta_5\,\text{Others}_i\big). \tag{7} \]

Therefore, the probability that a financial institution would have \(y_i\) losses over an 11-year horizon (when its specific characteristics are known) is:

\[ P(Y_i = y_i \mid Y_i > 0;\ X_i) = \frac{e^{-\exp(X_i \beta)}\,\exp(X_i \beta)^{y_i}}{y_i!\,\big(1 - e^{-\exp(X_i \beta)}\big)}. \tag{8} \]
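Equations (7) and (8) can be checked numerically with a short sketch. It computes the zero-truncated Poisson probability in log space and the log-likelihood that a maximum-likelihood routine would maximize. The function names and the coefficient values in the usage lines are hypothetical; the paper's estimation procedure is not reproduced here.

```python
import math


def poisson_rate(x, beta):
    """lambda_i = exp(X_i beta), as in equation (7); x starts with 1 for the intercept."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def ztp_log_pmf(y, lam):
    """Log of P(Y = y | Y > 0) for a zero-truncated Poisson, y = 1, 2, ..."""
    if y < 1:
        raise ValueError("zero-truncated counts start at 1")
    # -expm1(-lam) equals 1 - exp(-lam) and stays accurate for small lam.
    return (-lam + y * math.log(lam) - math.lgamma(y + 1)
            - math.log(-math.expm1(-lam)))


def ztp_pmf(y, lam):
    """P(Y = y | Y > 0); zero for y < 1."""
    return 0.0 if y < 1 else math.exp(ztp_log_pmf(y, lam))


def ztp_mean(lam):
    """E(Y | Y > 0) = lambda / (1 - exp(-lambda))."""
    return lam / -math.expm1(-lam)


def ztp_log_likelihood(beta, X, ys):
    """Sum of log-probabilities from equation (8) over all institutions."""
    return sum(ztp_log_pmf(y, poisson_rate(x, beta)) for x, y in zip(X, ys))


# Usage with hypothetical coefficients and two hypothetical institutions.
# Columns: intercept, ln(Assets), US, Canada, Europe, Others.
beta = [-1.0, 0.15, 0.3, 0.1, 0.2, 0.4]
X = [
    [1.0, math.log(130_000.0), 1, 0, 1, 0],
    [1.0, math.log(20_000.0), 0, 1, 0, 0],
]
ys = [3, 1]
print(ztp_log_likelihood(beta, X, ys))
```

The truncated mean exceeds \(\lambda\) for every \(\lambda > 0\), which is why fitting an untruncated Poisson to a sample that contains no zero counts would overstate the underlying loss frequency.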