
ASSESSING INDIVIDUAL UNEXPLAINED VARIATION IN NON-LIFE INSURANCE

BY OLA HÖSSJER, BENGT ERIKSSON, KAJSA JÄRNMALM AND ESBJÖRN OHLSSON

ABSTRACT

We consider variation of observed claim frequencies in non-life insurance, modeled by Poisson regression with overdispersion. In order to quantify how much of the variation between insurance policies is captured by the rating factors, one may use the coefficient of determination, $R^2$, the estimated proportion of total variation explained by the model. We introduce a novel coefficient of individual determination (CID), which excludes noise variance and is defined as the estimated fraction of total individual variation explained by the model. We argue that CID is a more relevant measure of explained variation than $R^2$ for data with Poisson variation. We also generalize previously used estimates and tests of overdispersion and introduce new coefficients of individual explained and unexplained variance. Application to a Swedish three year motor TPL data set reveals that only 0.5% of the total variation and 11% of the total individual variation is explained by a model with seven rating factors, including interaction between sex and age. Even though the amount of overdispersion is small (4.4% of the noise variance) it is still highly significant. The coefficients of variation of explained and unexplained individual variation are 29% and 81% respectively.

KEYWORDS

Claim frequency variation, coefficient of determination, coefficient of individual determination, unexplained individual variation, overdispersion, Poisson regression, rating factors.

Astin Bulletin 39(1), 49-73. doi: 10.2143/AST.39.1.2038064 © 2009 by Astin Bulletin. All rights reserved.

1. INTRODUCTION

The idea behind modern non-life insurance rating is that each customer should pay a premium as close as possible to the expected value of the cost that he or she causes the company. Consequently, the pure premium (the premium without loading for expenses and cost of capital) should be close to the expected value of the claim cost for each insurance policy. In practice, the actuary tries to fulfill this goal by finding rating factors that describe the variation in the expected cost between the policies. These factors are chosen so that the actuarial model will capture as much as possible of the variation in expectation between customers. On the other hand, the risk, i.e. the deviation of the claim cost from its expectation, is of course transferred to the company; in particular, it is not the goal to reduce the variance of the claim cost to zero.

A tariff analysis is most often carried out with the aid of Generalized Linear Models (GLMs), the theory of which is well summarized in McCullagh and Nelder (1989). Application of GLMs to non-life insurance has been considered, among others, by Brockman and Wright (1992). The tariff analysis is usually made separately for claim frequency and average claim severity, using multiplicative models (Jung, 1968). For GLMs, this corresponds to using a log-link function.

In this paper we focus on claim frequency under a multiplicative model. Let $Y_i$ be the observed claim frequency for policy $i$ and let $\gamma_{ij}$ denote the price relativity for rating factor number $j$ for this policy compared to a reference policy, $j = 1, \ldots, q$. The claim frequency of the multiplicative model can then be written

$$\lambda_i = E(Y_i) = \lambda_0 \gamma_{i1} \gamma_{i2} \cdots \gamma_{iq}, \tag{1}$$

where $\lambda_0$ is $\lambda_i$ for the reference policy. The price relativities are connected to the GLM regression parameters $\beta = (\beta_1, \ldots, \beta_p)$ through the log-link, $\lambda_i = \exp(\beta^T x_i)$, where $x_i = (x_{i1}, \ldots, x_{ip})$ is a vector of 0-1 dummy variables (covariates) indicating which particular parameters apply to policy $i$.

In practice, there is always some variation left above the multiplicative model: two policies in the same tariff cell, i.e. with the same values on the rating factors, still have some residual difference in their expectation, unexplained by the multiplicative model. Our aim here is to present measures of explained and unexplained variation. This serves two purposes: (i) it is an aid in choosing rating factors for the model, cf. the use of $R^2$ in linear regression; (ii) it gives an indication of whether there is a need for experience rating (bonus/malus systems) at the individual level or not.

Several authors have suggested the use of credibility models for so called optimal bonus/malus rating, see Lemaire (1995) for an overview. As explained in Ohlsson and Johansson (2006) and Ohlsson (2008), credibility models can be viewed as random effect models, which is particularly convenient in a GLM context. The multiplicative model above then becomes, if $U_i$ denotes the random effect for contract $i$,

$$E(Y_i \mid U_i) = \lambda_0 \gamma_{i1} \gamma_{i2} \cdots \gamma_{iq} U_i = \lambda_i U_i. \tag{2}$$

Here $E(U_i) = 1$, and $\mathrm{Var}(U_i)$ can be used as the basis for a measure of the amount of unexplained individual variation, as explained below. Without reference to GLMs, Bühlmann and Gisler (2005, Chapter 4.13) discuss similar models under the name credibility models with a priori differences. In their Chapter 9, Bühlmann and Gisler (2005) also discuss evolutionary credibility models, which allow the $U_i$ for different observational years to have less than the 100% correlation implicitly assumed above.

While the total number of claims is what drives the cost of the insurance company, its variation is not an appropriate starting point for measuring the performance of the chosen rating factors. The rating factors determine how the total premium is distributed among the policy holders, but do not directly affect the number of claims, the cost or the premium income of the company in a given portfolio of policies. The goal of a tariff analysis is not to reduce the cost of the company, other things equal, but to get the right price on a competitive market. The latter will, of course, in the end increase the revenue of the company, while the wrong price will result in adverse selection of customers.

To this end we suggest taking an individual perspective when measuring the performance of the chosen tariff, defining the total variation of a portfolio of insurance policies as the average mean square error of prediction, where average refers to choosing a policy at random, see Section 2.2. We use a decomposition of this total variation in the portfolio into three parts: explained individual variation, unexplained individual variation and noise. This is similar to a decomposition defined by Johnson and Hey (1971) and Brockman and Wright (1992, Appendix D), who refer to explained and unexplained individual variation as between cell variance and within cell variance respectively.

The coefficient of determination, $R^2$, is defined as the estimated fraction of total variance explained by the model.
However, the noise part of the total variance, which is the Poisson variance in a model where there is nothing more to explain ($\mathrm{Var}(U_i) = 0$), can never be explained. This suggests that a more relevant index is the coefficient of individual determination (CID), defined as the estimated proportion of the total individual variance explained by the model. It excludes noise variance and is (close to) one if we manage to explain (almost) all variation between policy means.

In non-life actuarial applications, the likelihood based deviance is often employed for model selection. In the same spirit, coefficients of determination may be defined using likelihood methods and deviance rather than variance decompositions, see for instance Maddala (1983), Cox and Snell (1989, pp. 208-209), Magee (1990) and Nagelkerke (1991). However, we believe that a variance decomposition of the response variable (in our case claim frequency) is of particular interest to the experimenter, providing an intuitive explanation of the fraction of total variation that can be explained.

It is also of interest to test whether there is more variation left to explain or not. We present tests that generalize those of Venezian (1981, 1990), who only considers the special case with no covariates and constant duration. The tests might be used as an indication of the need for bonus/malus systems and/or a search for additional rating factors. We also present an estimate of the relative amount of overdispersion, $\phi$, that differs slightly from the traditional one based on Pearson's $\chi^2$-statistic, in that policies are weighted based on time duration, not estimated claim frequency. We also define coefficients of variation for the explained and unexplained individual variation, as well as for the noise.

The paper is organized as follows. In Section 2 we define the model and variance decomposition in more detail. Parameter estimation is considered in Subsection 3.1, including definitions of $R^2$ and CID. Tests of excess variance are discussed in Subsection 3.2 and our findings are applied to Motor TPL (Third Party Liability) insurance in Section 4. We demonstrate, for a tariff with three year durations, that only 0.5% of the total variation ($R^2$) and 11% of the total individual variation (CID) in claim frequencies is explained. The explained and unexplained individual variation have coefficients of variation 29% and 81% respectively. Further discussion of the results is provided in Section 5 and more technical details are gathered in the appendix.

2. VARIANCE DECOMPOSITION AND UNEXPLAINED INDIVIDUAL VARIATION

2.1. A Mixed Poisson Model

Consider a portfolio of $n$ insurance policies. For $i = 1, \ldots, n$, let $N_i$ be the observed number of claims during a period of time, $t_i$, so that $Y_i = N_i / t_i$. It is assumed that conditional on $U_i$, $N_i$ follows a Poisson distribution with expectation $t_i \lambda_i U_i$, and so the unconditional distribution is a mixed Poisson distribution, i.e.

$$N_i \sim \mathrm{Po}(t_i \Lambda_i), \tag{3}$$

where $\Lambda_i = \lambda_i U_i$ and $\lambda_i$ is given by (2). The Poisson model is frequently used in non-life insurance, see for instance Chapter 2 of Beard et al. (1984). The unexplained individual variation is captured by the random variable $\Lambda_i$; in Motor TPL insurance this variable can be said to capture the accident proneness of the driver, with mean

$$E(\Lambda_i) = \lambda_i. \tag{4}$$

We assume a variance function

$$\mathrm{Var}(\Lambda_i) = \zeta \lambda_i^{\alpha}, \tag{5}$$

for the accident proneness, for some $\alpha > 0$ and $\zeta \geq 0$. When $\zeta = 0$, (2)-(4) define a generalized linear model with log link function.
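The mixed Poisson model (3)-(5) can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the accident proneness $U_i$ is taken to be gamma distributed with mean 1 (the model itself only fixes the first two moments), and the function names and parameter values are hypothetical. It checks that the simulated counts have mean $t\lambda$ and variance $t\lambda + t^2 \zeta \lambda^{\alpha}$, which follows from (4)-(5) by conditioning on $\Lambda_i$.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_claims(lam, t, zeta, alpha, n_draws, rng):
    """Simulate N ~ Po(t * Lambda) with E(Lambda) = lam and Var(Lambda) = zeta * lam**alpha.

    Lambda = lam * U, where U is gamma distributed with mean 1. The gamma
    choice is an assumption made for this sketch only.
    """
    var_u = zeta * lam ** (alpha - 2.0)  # Var(U) = Var(Lambda) / lam**2
    draws = []
    for _ in range(n_draws):
        # gammavariate(shape, scale) has mean shape*scale = 1 and variance var_u
        u = rng.gammavariate(1.0 / var_u, var_u) if var_u > 0 else 1.0
        draws.append(sample_poisson(t * lam * u, rng))
    return draws


# Hypothetical parameter values, chosen large enough for the overdispersion to be visible.
lam, t, zeta, alpha = 2.0, 1.0, 0.25, 2.0
rng = random.Random(2009)
claims = simulate_claims(lam, t, zeta, alpha, 50_000, rng)

mean = sum(claims) / len(claims)
var = sum((c - mean) ** 2 for c in claims) / (len(claims) - 1)
print("sample mean", round(mean, 3), "theory", t * lam)
print("sample variance", round(var, 3), "theory", t * lam + t**2 * zeta * lam**alpha)
```

With $\zeta = 0$ the same function produces pure Poisson counts, for which the sample variance should be close to the sample mean.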
For non-life insurance, $\alpha = 2$ is most well known, since then $\mathrm{Var}(U_i) = \zeta$ is independent of the covariate $x_i$ and $\zeta$ becomes the squared coefficient of variation of $\Lambda_i \mid x_i$, a parameter independent of the chosen unit of time. The extension to time varying random effects (see the appendix) is also most natural for $\alpha = 2$. When $\alpha = 1$, $\zeta = \mathrm{Var}(\Lambda_i) / E(\Lambda_i)$ is the relative increase of variance caused by the overdispersion. We regard $\alpha$ as a constant and $\zeta$ as an unknown parameter. However, to keep the variance function more flexible, we will not restrict $\alpha$ in advance. See Pocock et al. (1981), Hinde (1982), Breslow (1984) and Lawless (1987) for more details on parameter estimation and choice of variance functions for overdispersed Poisson regression.

2.2. Variance Decomposition

For our purpose of measuring explained and unexplained variation, we first need a measure of the total variability in the portfolio. As explained above, the relevant measure here is not the variance of the total cost for the company, but rather an average of the variance for the individuals. For relevance, the average should be weighted with the time duration $t_i$, so that a one-year policy has the same impact as two half-year policies. Conceptually, this may be viewed as if we drew a policy at random from the portfolio, giving each policy a probability proportional to its $t_i$. The mean claim frequency (of a randomly drawn policy) is then

$$\bar{\lambda} = \sum_i t_i \lambda_i \Big/ \sum_i t_i, \tag{6}$$

where $\sum_i$ is short for $\sum_{i=1}^{n}$. The benchmark for measuring the effect of choosing a tariff should be a tariff where all policies are assigned the average $\bar{\lambda}$. (Note that $\bar{\lambda}$ differs from $\lambda_0$ in (1), which is the claim frequency of a reference policy, chosen to have a price relativity of one for all rating factors.) The average mean squared error of prediction (AMSEP) for all $\{Y_i\}$ is

$$\sigma^2 = \sum_i t_i E\big((Y_i - \bar{\lambda})^2\big) \Big/ \sum_i t_i. \tag{7}$$

Now, since $E(Y_i \mid \Lambda_i) = \Lambda_i$ and $E(\Lambda_i) = \lambda_i$, we have the simple decomposition

$$E\big((Y_i - \bar{\lambda})^2\big) = (\lambda_i - \bar{\lambda})^2 + E\big((\Lambda_i - \lambda_i)^2\big) + E\big((Y_i - \Lambda_i)^2\big).$$

Hence, since $E((Y_i - \Lambda_i)^2) = E(\mathrm{Var}(Y_i \mid \Lambda_i)) = \lambda_i / t_i$, we can write $\sigma^2$ as a sum of three terms

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 = \frac{\sum_i t_i (\lambda_i - \bar{\lambda})^2}{\sum_i t_i} + \zeta \frac{\sum_i t_i \lambda_i^{\alpha}}{\sum_i t_i} + \frac{\sum_i \lambda_i}{\sum_i t_i}. \tag{8}$$
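As a numerical illustration of (6) and (8), the following sketch evaluates the three variance components for a given set of expected claim frequencies and durations. The function name and the two-policy portfolio are hypothetical and only serve to show how the terms are formed; doubling every duration leaves the first two components unchanged and halves the third.

```python
def variance_decomposition(lam, t, zeta, alpha):
    """Return (lam_bar, sigma1_sq, sigma2_sq, sigma3_sq) following (6) and (8).

    lam  : expected claim frequencies lambda_i, one per policy
    t    : time durations t_i, one per policy
    zeta : dispersion parameter in Var(Lambda_i) = zeta * lambda_i**alpha
    """
    total_t = sum(t)
    lam_bar = sum(ti * li for ti, li in zip(t, lam)) / total_t                  # (6)
    sigma1_sq = sum(ti * (li - lam_bar) ** 2 for ti, li in zip(t, lam)) / total_t
    sigma2_sq = zeta * sum(ti * li ** alpha for ti, li in zip(t, lam)) / total_t
    sigma3_sq = sum(lam) / total_t
    return lam_bar, sigma1_sq, sigma2_sq, sigma3_sq


# Hypothetical two-policy portfolio.
lam = [0.05, 0.10]
one_period = variance_decomposition(lam, [1.0, 1.0], zeta=0.5, alpha=2.0)
two_periods = variance_decomposition(lam, [2.0, 2.0], zeta=0.5, alpha=2.0)
print(one_period)
print(two_periods)
```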

The first term of (8), $\sigma_1^2$, quantifies explained individual variation, the second term $\sigma_2^2$ is unexplained individual variation and the third term $\sigma_3^2$ represents noise, i.e. the variance in a Poisson model without overdispersion. We will refer to

$$\sigma_{\mathrm{unexp}}^2 = \sigma_2^2 + \sigma_3^2 = \sum_i t_i E\big((Y_i - \lambda_i)^2\big) \Big/ \sum_i t_i \tag{9}$$

as the total unexplained variance. Following Venezian (1990), we also use the word excess variance for $\sigma_2^2$, since it quantifies the total excess of variance for all $Y_i$ compared to what is expected under a pure Poisson model.

We have argued that the insurance company does not aim at predicting the customers' $Y_i$, but rather $\Lambda_i$. Hence, it would be more relevant to consider the AMSEP for $\{\Lambda_i\}$ rather than $\{Y_i\}$, i.e.

$$\sigma_{\mathrm{ind}}^2 = \sum_i t_i E\big((\Lambda_i - \bar{\lambda})^2\big) \Big/ \sum_i t_i, \tag{10}$$

which could also be called the total individual variation; note that $\sigma_{\mathrm{ind}}^2 = \sigma_1^2 + \sigma_2^2$.

A variance decomposition similar to (8) is defined by Johnson and Hey (1971) and Brockman and Wright (1992) when $\alpha = 2$. The difference is mainly that they sum over tariff cells rather than policies and use a discrete approximation of accident proneness within each cell. With our approach we can handle continuous as well as discrete covariates.

Traditionally, the total variance is decomposed into explained and unexplained variance components, and the explained variance is further divided into various sources of variation. The special feature of (8) is that the unexplained variance is split into two terms representing individual variation and noise. It is a special case of a more general variance decomposition introduced by Hössjer (2008) for a large class of mixed regression models, including Poisson, logistic and linear regression.

2.3. Coefficients of Determination

To quantify the proportion of variance explained by the covariates, a traditional $R^2$-type quantity would be the fraction of the total variation,

$$\rho^2 = \frac{\sigma_1^2}{\sigma^2},$$

while we have argued that it would be more relevant to use the fraction of the total individual variance,

$$\rho_{\mathrm{ind}}^2 = \frac{\sigma_1^2}{\sigma_{\mathrm{ind}}^2},$$

which excludes the noise variance. Recall here that $\sigma_{\mathrm{ind}}^2 = \sigma_1^2 + \sigma_2^2$. The upper bound of $\rho_{\mathrm{ind}}^2$ is 1, corresponding to all relevant covariates being used in the model. The upper bound 1 of $\rho^2$ requires, in addition, that the noise variance has been eliminated, which can only be achieved for very long time durations $t_i$. Indeed, it is easy to see that $\rho_{\mathrm{ind}}^2$ is unaffected if all time durations are, for instance, doubled, whereas $\rho^2$ is increased.

2.4. Measures of Overdispersion and Coefficients of Variation

To assess the amount of unexplained variance several possible quantities could be used, such as $\zeta$ or $\sigma_2^2$. A more intuitive choice is perhaps $1 - \rho_{\mathrm{ind}}^2 = \sigma_2^2 / \sigma_{\mathrm{ind}}^2$, which gives the proportion of total individual variance not explained by the covariates. Alternatively,

$$\phi = \frac{\sigma_{\mathrm{unexp}}^2}{\sigma_3^2} = 1 + \frac{\sigma_2^2}{\sigma_3^2} \tag{11}$$

quantifies, in relative terms, the amount of excess of the total unexplained variance over the noise variance. A value larger than one indicates unexplained individual variation. However, $\phi$ shares the drawback of $\rho^2$ in not being invariant with respect to magnified time durations. Alternatively we may use the coefficient of unexplained individual variation

$$\mathrm{CUIV} = \frac{\sigma_2}{\bar{\lambda}} = \frac{\sqrt{(\phi - 1)\sigma_3^2}}{\bar{\lambda}} \overset{t_i \equiv 1}{=} \sqrt{\frac{\phi - 1}{\bar{\lambda}}} \tag{12}$$

as a measure of overdispersion. It quantifies the individual unexplained standard deviation in relation to the mean and is invariant with respect to magnified time durations. We argue that CUIV is a more intuitive and relevant measure of overdispersion than $\phi$ for actuarial applications. The coefficient of explained individual variation

$$\mathrm{CEIV} = \frac{\sigma_1}{\bar{\lambda}} \tag{13}$$

and the coefficient of noise variation

$$\mathrm{CNV} = \frac{\sigma_3}{\bar{\lambda}} \overset{t_i \equiv 1}{=} \frac{1}{\sqrt{\bar{\lambda}}} \tag{14}$$

are two other quantities of interest. In fact, the variance decomposition (8) may be restated by decomposing the squared coefficient of variation

$$\mathrm{CV}^2 = \frac{\sigma^2}{\bar{\lambda}^2} = \mathrm{CEIV}^2 + \mathrm{CUIV}^2 + \mathrm{CNV}^2$$

into three sources of variation.

3. STATISTICAL INFERENCE

3.1. Parameter Estimation

The unknown parameters, $\alpha$, $\beta$ and $\zeta$, can be estimated using full maximum likelihood. This requires specification of the distribution of all $\Lambda_i$ and yields quite complicated parameter estimates. We will use a simpler approach, where first $\beta$ is estimated separately by maximum likelihood from a generalized linear model without overdispersion ($\zeta = 0$). This facilitates use of standard software and moreover, it can be shown that $\hat{\beta}$ is a consistent and asymptotically normal estimator of $\beta$ even when $\zeta > 0$, see e.g. White (1982). The next step is to estimate $\alpha$, as explained in the appendix for the car accidents data set. Given $\hat{\beta}$ and $\hat{\alpha}$, we then estimate $\zeta$ by

$$\hat{\zeta} = \frac{\sum_i \big(t_i (Y_i - \hat{\lambda}_i)^2 - \hat{\lambda}_i\big)}{\sum_i t_i \hat{\lambda}_i^{\alpha}}, \tag{15}$$

where $\hat{\lambda}_i = \exp(\hat{\beta}^T x_i)$ and $\alpha$ is regarded as known (for instance the estimated $\hat{\alpha}$, but not necessarily so). It is shown in Hössjer (2008) that asymptotically, in the limit of large samples $n$, $\hat{\zeta}$ has a normal distribution with mean $\zeta$ when $\alpha$ is regarded as a known constant. An explicit formula for the standard error is also provided there.

The empirical version of the AMSEP for predicting claim frequency with the constant $\bar{\lambda}$ in (7) is, using (8),

$$\hat{\sigma}^2 = \hat{\sigma}_1^2 + \hat{\sigma}_2^2 + \hat{\sigma}_3^2 = \frac{\sum_i t_i (\hat{\lambda}_i - \hat{\bar{\lambda}})^2}{\sum_i t_i} + \frac{\sum_i \big(t_i (Y_i - \hat{\lambda}_i)^2 - \hat{\lambda}_i\big)}{\sum_i t_i} + \frac{\sum_i \hat{\lambda}_i}{\sum_i t_i}, \tag{16}$$

where $\hat{\bar{\lambda}} = \sum_i t_i \hat{\lambda}_i / \sum_i t_i$. It gives rise to the coefficient of determination

$$R^2 = \hat{\rho}^2 = \frac{\hat{\sigma}_1^2}{\hat{\sigma}^2}. \tag{17}$$

Note that $R^2$ can be interpreted as the relative decrease in AMSEP we obtain for $\{Y_i\}$ by going from a constant claim frequency $\bar{\lambda}$ to a tariff of $\lambda_i$'s, since its denominator is an estimate $\hat{\sigma}^2$ of the AMSEP with $\bar{\lambda}$ and the numerator is $\hat{\sigma}^2$ minus an estimate of the AMSEP with the $\lambda_i$, i.e. of $\sigma_{\mathrm{unexp}}^2$ in (9).

The empirical version of the more relevant AMSEP for prediction of $\Lambda_i$ in (10) is

$$\hat{\sigma}_{\mathrm{ind}}^2 = \hat{\sigma}_1^2 + \hat{\sigma}_2^2, \tag{18}$$

leading to an alternative to $R^2$ which we call the coefficient of individual determination,

$$\mathrm{CID} = \hat{\rho}_{\mathrm{ind}}^2 = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_{\mathrm{ind}}^2}. \tag{19}$$

Like $R^2$, CID can be interpreted as the relative decrease in AMSEP for $\{\Lambda_i\}$ resulting from introducing varying $\lambda_i$'s. The denominator is an estimate $\hat{\sigma}_{\mathrm{ind}}^2$ of the AMSEP with $\bar{\lambda}$, see (10), while the numerator is the difference of $\hat{\sigma}_{\mathrm{ind}}^2$ and an estimator of the AMSEP with $\lambda_i$, i.e. of

$$\sigma_2^2 = \sum_i t_i E\big((\Lambda_i - \lambda_i)^2\big) \Big/ \sum_i t_i. \tag{20}$$

When all time durations are equal, the $R^2$ here is an analogue of the classical $R^2$ used for (univariate) linear regression models, whereas CID has no such analogue. The reason is that $\rho_{\mathrm{ind}}^2$ cannot be estimated for linear models, since the two components of the unexplained variance, $\sigma_2^2$ and $\sigma_3^2$, cannot be identified. On the contrary, CID is computable for both mixed Poisson and logistic regression models, as well as for multivariate linear regression models, see Hössjer (2008), where also standard errors of both $R^2$ and CID are provided.

In order to estimate $\phi$ from data, we use

$$\hat{\phi} = \frac{\sum_i t_i (Y_i - \hat{\lambda}_i)^2}{\sum_i Y_i}. \tag{21}$$

A slightly different version of $\hat{\phi}$ has $\sum_i \hat{\lambda}_i$ in the denominator instead. When all time durations are equal, it follows from the GLM likelihood estimating equations that the two versions are identical (see e.g. McCullagh and Nelder, 1989). A formula for the standard error of $\hat{\phi}$ is provided in the appendix. In the special case of constant time duration and no covariates ($\lambda_i \equiv \lambda$), $\hat{\phi}$ is essentially the index of dispersion, i.e. the ratio of the sample variance and sample mean. In Section 5, we show that (21) is closely related to the Pearson estimate of $\phi$ used in GLM theory.
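The estimators (15)-(19) and (21) are simple sums over policies once the fitted frequencies $\hat{\lambda}_i$ are available. The sketch below is a minimal illustration, assuming the $\hat{\lambda}_i$ come from a separately fitted log-link Poisson GLM (the fitting step is not shown); the function name and the toy data are hypothetical. Note that $\hat{\sigma}_2^2$ can be negative in small samples, in which case CID is not meaningful.

```python
def estimate_components(y, t, lam_hat, alpha=1.0):
    """Empirical variance decomposition (16) with zeta-hat (15), R^2 (17), CID (19), phi-hat (21).

    y       : observed claim frequencies Y_i = N_i / t_i
    t       : time durations t_i
    lam_hat : fitted claim frequencies from a Poisson GLM (taken as given here)
    alpha   : exponent of the variance function (5), regarded as known
    """
    total_t = sum(t)
    lam_bar = sum(ti * li for ti, li in zip(t, lam_hat)) / total_t
    weighted_sq_resid = sum(ti * (yi - li) ** 2 for yi, ti, li in zip(y, t, lam_hat))
    excess = weighted_sq_resid - sum(lam_hat)        # numerator of (15)

    s1 = sum(ti * (li - lam_bar) ** 2 for ti, li in zip(t, lam_hat)) / total_t
    s2 = excess / total_t                            # may be negative in small samples
    s3 = sum(lam_hat) / total_t
    return {
        "lam_bar": lam_bar,
        "sigma1_sq": s1,
        "sigma2_sq": s2,
        "sigma3_sq": s3,
        "zeta": excess / sum(ti * li ** alpha for ti, li in zip(t, lam_hat)),
        "R2": s1 / (s1 + s2 + s3),
        "CID": s1 / (s1 + s2),
        "phi": weighted_sq_resid / sum(y),
    }


# Toy data: two tariff cells with two policies each, all durations equal to one.
# The fitted values equal the cell means, as they would for a saturated two-cell model.
y = [0.0, 2.0, 0.0, 4.0]
t = [1.0, 1.0, 1.0, 1.0]
lam_hat = [1.0, 1.0, 2.0, 2.0]
result = estimate_components(y, t, lam_hat, alpha=1.0)
print(result)
```

In this toy example the durations are equal and the fitted values sum to the observed ones, so $\hat{\phi} - 1$ coincides with $\hat{\sigma}_2^2 / \hat{\sigma}_3^2$, in line with (11).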

The coefficients of variation (12)-(14) are estimated in the natural way, replacing $\sigma_j$ by $\hat{\sigma}_j$ and $\bar{\lambda}$ by $\hat{\bar{\lambda}} = \sum_i t_i \hat{\lambda}_i / \sum_i t_i$.

3.2. Testing Excess Variance

In order to test for excess variance, we formulate the null hypothesis $H_0$ of no excess variance against the alternative $H_1$ of a positive excess variance, i.e.

$$H_0: \zeta = 0, \qquad H_1: \zeta > 0, \tag{22}$$

which is equivalent to testing $\sigma_2^2 = 0$ against $\sigma_2^2 > 0$ or $\phi = 1$ against $\phi > 1$. When the distribution of all $\Lambda_i$ is specified, one may employ a likelihood ratio test to carry out (22). We will use a simpler approach based on excess variance, which only involves the first two moments of $Y_i$. Our starting point is the excess variance statistic $\sum_i \big(t_i (Y_i - \hat{\lambda}_i)^2 - Y_i\big)$, which agrees with the numerator of (15), except that $\hat{\lambda}_i$ is replaced by $Y_i$. (Again, the latter two sums are identical when all time durations are equal.) It is shown in the appendix that the standardized excess variance statistic

$$T = \frac{\sum_i \big(t_i (Y_i - \hat{\lambda}_i)^2 - Y_i\big)}{\sqrt{2 \sum_i \hat{\lambda}_i^2}} \tag{23}$$

has an approximate standard normal distribution for large samples. Hence, a test with approximate significance level $a$ rejects $H_0$ when $T \geq \lambda_a$, where $\lambda_a$ is the $(1 - a)$-quantile of a standard normal distribution. Since $T = c(\hat{\phi} - 1)$, with $c = \sum_i Y_i \big/ \sqrt{2 \sum_i \hat{\lambda}_i^2}$, we may also regard $T$ as a standardized version of $\hat{\phi}$.

For constant time duration and no covariates, (23) amounts to testing overdispersion of stationary count data. Then the denominator of (23) simplifies to $\sqrt{2n}\,\hat{\lambda}$. This test has been used by Venezian (1981, 1990) for car accident data. An asymptotically equivalent approach, based on a $\chi^2$-approximation of $\sum_i (Y_i - \hat{\lambda})^2 / \hat{\lambda}$, has been considered by Fisher (1950) and Rao and Chakravarti (1956).

4. CAR-ACCIDENT DATA

We will analyze Swedish car accident data from If P&C Insurance Company. A detailed description of the data set can be found in Järnmalm (2006). Car accidents are registered for customers having an uninterrupted 3 year period in between January 1, 2002 and December 31, 2005. Shorter durations, in total approximately 30% of the total portfolio, are thus excluded. The rating factors are defined at the beginning of the risk period. The age depending factors, e.g. Age of car, are for this reason not as accurate as possible; the advantage on the other hand is that each individual's characteristics are kept in one data record. Hence, although our methodology in principle handles varying duration, the present data set has $t_i \equiv 1$, measuring time in three year intervals. The size of the data set is $n = 43983$, and customers report a total of 9405 accidents during the three year period.

The seven rating factors of the model are presented in Table 1. Although our variance decomposition handles continuous covariates, we have followed the current practice at If P&C and discretized the continuous covariates. This implies that rating factor $j$ is divided into $k_j$ classes. For $j = 1, \ldots, 5$, each class within the given rating factor has a distinct regression coefficient $\beta_r$, except for the class of the reference policy, which is chosen to have a fixed regression coefficient, 0, not included in $\beta$. We model interaction between sex and age (rating factors 6 and 7), giving $k_6 k_7$ combined classes, of which one is chosen as reference. The covariates are chosen as $x_{i1} = 1$ (the intercept) and, for $r > 1$, $x_{ir} = 1$ if individual $i$ belongs to the given (combined) class and 0 otherwise. The total number of regression coefficients is

$$p = 1 + \sum_{j=1}^{5} (k_j - 1) + (k_6 k_7 - 1) = 65.$$

TABLE 1. The rating factors used for the car accidents data set.

| Rating factor $j$ | Class | Variable | $k_j$ | Description |
|---|---|---|---|---|
| 1: Customer years | 1 | 0-2 | 4 | No. of years a customer has been insured in the company. |
| | 2 | 3-5 | | |
| | 3 | 6-10 | | |
| | 4 | 11- | | |
| 2: Geographic zone | 0-18 | | 19 | A division of Sweden into 19 geographical zones. |
| 3: Age of car | 1 | 0-2 | 6 | |
| | 2 | 3-5 | | |
| | 3 | 6-8 | | |
| | 4 | 9-12 | | |
| | 5 | 13-16 | | |
| | 6 | 17- | | |
| 4: Premium class | 0-9 | | 10 | Premium class is determined by type of car. |
| 5: Driving distance | 1-5 | | 5 | Five intervals of reported driving distances. A larger class index corresponds to a longer distance. |
| 6: Sex | | | 2 | The sex of the customer. |
| 7: Age | 1 | 0-24 | 13 | The age of the customer. The classes 4-12 have five year intervals, 30-34, ..., 70-74. |
| | 2 | 25-26 | | |
| | 3 | 27-29 | | |
| | 13 | 75- | | |

Table 2 shows results of a standard GLM analysis, including parameter estimates $\hat{\beta}_r$ and the associated confidence intervals of some selected regression coefficients.

TABLE 2. Estimated relative increase of the accident rate, $\exp(\hat{\beta}_r)$, for selected rating factor classes and corresponding Wald confidence intervals (CIs) with (approximate) coverage probability 95%. For each (combined) rating factor, we have only included the two classes with minimal and maximal $\exp(\hat{\beta}_r)$. The CIs are calculated with the help of standard software (PROC GENMOD in SAS) for GLM log-link Poisson regression ML-estimation. Hence the overdispersion is modeled slightly differently than for the mixed Poisson distribution (3). This does not change the parameter estimates $\hat{\beta}_r$, but the CIs are slightly affected. The difference is however negligible, since the amount of overdispersion $\zeta$ is small.

| Rating factor | Class | $\exp(\hat{\beta}_r)$ | $\exp(I_{\beta_r})$ |
|---|---|---|---|
| Intercept | | 0.0590 | (0.0470, 0.0742) |
| Customer years | 1 | 1.2364 | (1.1922, 1.2822) |
| | 4 | 1.0000 | (1.0000, 1.0000) |
| Geographic zone | 2 | 0.5475 | (0.5056, 0.5925) |
| | 16 | 1.0021 | (0.9468, 1.0607) |
| Age of car | 2 | 1.0641 | (1.0242, 1.1055) |
| | 6 | 0.7279 | (0.6811, 0.7779) |
| Premium class | 1 | 0.3988 | (0.2484, 0.6371) |
| | 6 | 1.5873 | (1.2815, 1.9662) |
| Driving distance | 1 | 0.8203 | (0.7898, 0.8520) |
| | 5 | 1.2545 | (1.1739, 1.3407) |
| Sex/age | Female/13 | 1.4593 | (1.3475, 1.5471) |
| | Female/10 | 0.8740 | (0.8125, 0.9400) |

The average estimated claim frequency is

$$\hat{\bar{\lambda}} = n^{-1} \sum_{i=1}^{n} \hat{\lambda}_i = n^{-1} \sum_{i=1}^{n} Y_i = 0.0669$$

per three year interval.

The next step is to test for overdispersion. For the data set analyzed by Venezian (1990), the overdispersion is highly significant. Our conclusion is the same, since the test statistic for excess variance is

$$T = 19.89, \tag{24}$$

so that the null hypothesis of no excess variance is rejected at level 0.001. As a comparison, $T = 23.30$ for a model with no covariates. Hence the rating factors only decrease the significance of excess variance marginally.

To assess more explicitly the impact of the rating factors, we estimated the three components of the empirical variance decomposition (16) as

$$\hat{\sigma}_1^2 = 0.000373, \qquad \hat{\sigma}_2^2 = 0.0030, \qquad \hat{\sigma}_3^2 = \hat{\bar{\lambda}} = 0.0669.$$

Inserting these values into (17) and (19), we get surprisingly low coefficients of determination

$$R^2 = 0.0053, \qquad \mathrm{CID} = 0.110. \tag{25}$$

Only about 0.5% of the total variation and 11% of the total individual variation is thus explained by the rating factors.

In Figure 1 both $R^2$ and CID are plotted as functions of time, assuming all policies in the portfolio have the same time duration of $t$ years. The two individual variances are constant, whereas the noise variance $\hat{\sigma}_3^2$ is inversely proportional to $t$. Hence $R^2$ increases with $t$ whereas CID is constant. We notice that $R^2 = 0.18\%$ if $t = 1$ and that $t = 60.2$ is required in order for $R^2$ to reach $0.5\,\mathrm{CID}$. Of course, in practice, the time duration $t$ cannot be varied in this way. An insurance contract usually lasts for one year. On the other hand, it is still of interest to consider claims over several years, and then several policies may remain unchanged for at least, say, five years. In any case, Figure 1 illustrates that noise is by far the dominating source of variation for time durations used in practice and that unrealistically long durations would be required in order to reduce noise variance significantly.

The relative excess variance is estimated as

$$\hat{\phi} = 1.044 \tag{26}$$

and the coefficients of variation as

$$\widehat{\mathrm{CEIV}} = 0.289, \qquad \widehat{\mathrm{CUIV}} = 0.813, \qquad \widehat{\mathrm{CNV}} = 3.866, \qquad \widehat{\mathrm{CV}} = 3.96.$$
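The dependence of $R^2$ on the time duration can be reproduced from the three reported variance components. The sketch below is only a back-of-the-envelope check, assuming (as in Figure 1) that the two individual components do not change with $t$ and that the noise component, reported per three year interval, scales as $3\hat{\sigma}_3^2 / t$ for a duration of $t$ years. The constant and function names are hypothetical, and the inputs are the rounded values quoted in the text, so the results agree with the reported figures only up to rounding.

```python
# Rounded variance components as reported for the car accident data.
SIGMA1_SQ = 0.000373   # explained individual variation
SIGMA2_SQ = 0.0030     # unexplained individual variation
SIGMA3_SQ = 0.0669     # noise, per three-year interval


def cid():
    """Coefficient of individual determination (19); does not depend on t."""
    return SIGMA1_SQ / (SIGMA1_SQ + SIGMA2_SQ)


def r_squared(t_years):
    """Coefficient of determination (17) when every policy lasts t_years years."""
    noise = 3.0 * SIGMA3_SQ / t_years
    return SIGMA1_SQ / (SIGMA1_SQ + SIGMA2_SQ + noise)


def years_until_half_cid():
    """Duration at which R^2 equals CID / 2, i.e. noise equals individual variance."""
    return 3.0 * SIGMA3_SQ / (SIGMA1_SQ + SIGMA2_SQ)


print("CID            :", round(cid(), 3))
print("R^2 at 3 years :", round(r_squared(3.0), 4))
print("R^2 at 1 year  :", round(r_squared(1.0), 4))
print("years for R^2 = CID/2:", round(years_until_half_cid(), 1))
```

With these rounded inputs, $R^2$ is about 0.5% for three years and about 0.18% for one year, and roughly 60 years of exposure would be needed before $R^2$ reaches half of CID.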

FIGURE 1: Plot of $R^2$ (solid line) and CID (dotted line) versus $t$ for the car accident data set, assuming all policies remain in the portfolio for $t$ years, with $\sigma_1^2(t) = \sigma_1^2$, $\sigma_2^2(t) = \sigma_2^2$ and $\sigma_3^2(t) = 3\sigma_3^2/t$.

Hence, the explained standard deviation is about 29% and the unexplained individual standard deviation about 81% of the average claim frequency. The noise coefficient of variation is much larger, 387%. As explained above, this is due to the short time durations.

In order to give confidence intervals for (selected) parameter estimates, we need to estimate the variance function, which requires estimation of $\alpha$ and $\zeta$. A regression method, explained in the appendix, yields

$$(\hat\alpha, \hat\zeta_0) = (1.3051, 0.0946). \tag{27}$$

For the final estimate (15) of the dispersion parameter $\zeta$ we use two different values of $\alpha$, which is regarded as known, and obtain

$$\hat\zeta = \begin{cases} 0.044, & \text{if } \alpha = 1, \\ 0.0979, & \text{if } \alpha = 1.3. \end{cases}$$

Here $\alpha = 1.3$ is taken from the initial regression analysis (27) and $\alpha = 1$ is chosen to yield a simple overdispersion model.
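The time dependence plotted in Figure 1 can be traced with a few lines of code. The sketch below assumes, as in the figure caption, that the two individual variances do not depend on $t$ while the noise variance estimated from three-year data scales as $3\sigma_3^2/t$. It also assumes $R^2 = \sigma_1^2/(\sigma_1^2+\sigma_2^2+\sigma_3^2)$ and $\mathrm{CID} = \sigma_1^2/(\sigma_1^2+\sigma_2^2)$, which are not restated in this section. The function names are hypothetical and the inputs are the rounded values printed in the text.

```python
def r2_at_duration(t_years, s1_sq, s2_sq, s3_sq_3yr):
    """R^2 if every policy were observed for t_years instead of three years."""
    noise = 3.0 * s3_sq_3yr / t_years   # noise variance is inversely proportional to t
    return s1_sq / (s1_sq + s2_sq + noise)

def duration_for_half_cid(s1_sq, s2_sq, s3_sq_3yr):
    """Duration at which R^2 equals CID/2.

    R^2 = CID/2 holds exactly when the noise variance equals the total
    individual variance s1_sq + s2_sq, which gives the closed form below.
    """
    return 3.0 * s3_sq_3yr / (s1_sq + s2_sq)

S1, S2, S3 = 0.000373, 0.0030, 0.0669       # rounded values printed in the text
cid = S1 / (S1 + S2)
r2_one_year = r2_at_duration(1.0, S1, S2, S3)
t_half = duration_for_half_cid(S1, S2, S3)
print(f"R2 at t=1: {r2_one_year:.4%}, t for R2 = CID/2: {t_half:.1f} years")
```

With the rounded inputs the computed duration lands close to, but not exactly at, the figure quoted in the text, which was presumably obtained from unrounded estimates.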

We report 95% Wald confidence intervals in Table 2 for selected regression parameters and in Table 4 for $\rho^2$, $\rho^2_{\mathrm{ind}}$, $\phi$ and $\zeta$. We notice that $I_{\rho^2}$, $I_{\rho^2_{\mathrm{ind}}}$ and $I_\phi$ are very insensitive to the choice of $\alpha$ (1 or 1.3). This is due to the small amount of excess variance in the data, making the exact model of overdispersion less crucial. Hence we recommend using the simpler model with $\alpha = 1$. Two versions of $I_\phi$ are reported, based on standard errors defined in the appendix. The parametric model assumes a gamma distributed accident proneness, whereas the nonparametric model only includes the first four moments of $\Lambda_i \mid x_i$. They give essentially the same confidence intervals.

5. DISCUSSION

In this paper, we have defined a general framework for quantifying explained and unexplained variation of claim frequencies in non-life insurance, including a new coefficient of determination, CID, generalizations of previously used estimates of relative overdispersion ($\hat\phi$) and test statistics for overdispersion ($T$), as well as new coefficients of explained and unexplained individual variation, CEIV and CUIV respectively. Our purpose is not to solve new actuarial problems, but rather to provide new, intuitive quantities that hopefully give new insights as regards the quality of the chosen model. Our hope is that CID, CEIV and CUIV all become valuable and much used tools when the actuary selects rating factors and classes within each rating factor.

An application to a Swedish car accident data set reveals that the amount of overdispersion is highly significant, but yet small in relation to noise variance. This manifests itself by CID being much larger than $R^2$ and is explained by the fact that time durations in Motor TPL are very short in relation to claim frequencies. Similar analyses for other countries (see Järnmalm, 2006) show that although the amount of overdispersion varies, it is persistently significant but yet small in relation to noise variance. Surprisingly, the proportion of explained variance is still very small after removing noise variance.
We obtained CID = 11.2%, whereas higher, but still low values CID = 35.8% and CID = 31.2% can be deduced from the variance decompositions of Johnson and Hey (1971) and Brockman and Wright (1992) respectively. The low value of CID obtained for our data set and model may have several reasons:

1) The multiplicative risk assumption is only approximately correct. In particular, our model only includes interaction between two of the seven rating factors in Table 1.

2) The number of classes within some rating factor could be increased or replaced by continuous covariates.

3) The true claim frequencies $\lambda_i$ may be time varying, not constant. See the appendix for more discussion on this topic.

TABLE 3

MEAN AND EXCESS VARIANCE FOR PREMIUM GROUPS (SEE APPENDIX).

$I_j$            $\hat\lambda_j$   $\hat\sigma^2_{\mathrm{excess},j}$   $|I_j|$   Accidents
(0.000, 0.015)   0.0113            0.0014                               1513      13
(0.015, 0.025)   0.0200            0.0009                               67        13
(0.025, 0.035)   0.0317            0.00091                              746       6
(0.035, 0.045)   0.0410            0.00144                              35494     1448
(0.045, 0.055)   0.0505            0.0013                               78455     3933
(0.055, 0.065)   0.0600            0.0033                               100717    609
(0.065, 0.075)   0.0697            0.00345                              84785     5945
(0.075, 0.085)   0.0796            0.00350                              5708      4518
(0.085, 0.095)   0.0896            0.00541                              35490     339
(0.095, 0.105)   0.0993            0.00196                              20130     196
(0.105, 0.115)   0.1093            0.0037                               994       1057
(0.115, 0.125)   0.1193            0.00483                              4345      517
(0.125, 0.135)   0.1294            0.00430                              1818      0
(0.135, 0.145)   0.1394            0.0091                               836       1
(0.145, 0.155)   0.1493            0.03867                              391       61
(0.155, 0.165)   0.1590            0.14955                              16        3
(0.165, 0.175)   0.1696            0.14359                              4         9
(0.175, 0.185)   0.1806            0.06495                              15        5
(0.185, 0.195)   0.1888            0.0047                               8
(0.195, 0.205)   0.1988            0.1597                               3         0
(0.205, 0.215)   0.2097            0.16571                              0

TABLE 4

WALD CONFIDENCE INTERVALS $I_\theta = (\hat\theta - \lambda_{a/2}\, d_\theta,\ \hat\theta + \lambda_{a/2}\, d_\theta)$ OF VARIOUS PARAMETERS $\theta$. THE ASYMPTOTIC COVERAGE PROBABILITY IS 95% ($a = 0.05$) AND $d_\theta$ IS THE STANDARD ERROR OF $\hat\theta$. THE ASSUMED $\alpha$ IS EITHER 1 OR 1.3 AND $I_\phi$ IS EITHER NONPARAMETRIC (NP) OR PARAMETRIC (P).

$\theta$                   $\alpha$   $I_\theta$
$\rho^2$                   1          (0.0049, 0.0058)
$\rho^2$                   1.3        (0.0049, 0.0058)
$\rho^2_{\mathrm{ind}}$    1          (0.0967, 0.174)
$\rho^2_{\mathrm{ind}}$    1.3        (0.0967, 0.174)
$\phi$                     1          (1.0383, 1.0500) NP
$\phi$                     1.3        (1.0383, 1.0500) NP
$\phi$                     1          (1.0384, 1.0500) P
$\zeta$                    1          (0.0383, 0.0500)
$\zeta$                    1.3        (0.0849, 0.1108)
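The Wald intervals of Table 4 have the generic form estimate plus or minus a normal quantile times the standard error. A minimal sketch of that construction follows, using only the Python standard library. The function names are hypothetical, and the standard error in the usage line is an illustrative value, not one reported in the paper.

```python
from statistics import NormalDist

def wald_interval(estimate, std_error, level=0.95):
    """Two-sided Wald interval: estimate +/- z * std_error.

    z is the standard normal quantile for the requested coverage,
    about 1.96 when level = 0.95.
    """
    a = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - a / 2.0)
    return (estimate - z * std_error, estimate + z * std_error)

def implied_std_error(lower, upper, level=0.95):
    """Back out the standard error from a reported Wald interval."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return (upper - lower) / (2.0 * z)

# Illustrative only: 0.003 is a made-up standard error.
lower, upper = wald_interval(1.044, 0.003)
print(f"({lower:.4f}, {upper:.4f})")
```

The second helper inverts the first, which can be convenient when only an interval, and not the standard error itself, is reported.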

4) A number of unknown individual characteristics are not included in the model. For instance, the annual driving distance is self-reported and may differ from the true one. Car drivers use different roads with varying risks, and this variation is only to some extent captured by geographical zone. The individual ability to drive safely is only to some extent explained by sex/age. Other factors, such as psychological make-up and drinking habits, cannot be included in the model.

5) Inclusion of customers with time duration less than three years in the portfolio may increase CID. These drivers typically have higher claim frequencies than average.

Since individual variation of claim frequencies is very complex, we do not state that 1-5) are enough to guarantee a CID of 100%, simply that they to some extent explain the low CID found in our data set. For more discussion on this theme we refer to Haight (2001), Lemaire (1995) and Brockman and Wright (1992).

Our work can be extended in several ways. A first extension is to consider time varying covariates (see item 4 above) and random effects, as described in more detail in the appendix.

A second extension is to use overdispersed Poisson distributions (ODP) rather than mixed Poisson distributions. For ODPs, the parameter $\phi$ is defined directly in terms of the variance function;

$$v_i = \mathrm{Var}(Y_i) = \phi \lambda_i / t_i, \tag{28}$$

for all policies $i = 1, \ldots, n$, see McCullagh and Nelder (1989). In general, for mixed Poisson distributions, (28) does not hold, and the definition of $\phi$ in (11) cannot be reformulated in terms of the variance function of individual policies. An exception is $\alpha = 1$ and $t_i \equiv t$, in which case (28) is satisfied for mixed Poisson distributions as well, with $\phi = 1 + \zeta t$. Formally, the variance decompositions (8) and (16) can be defined for ODPs, provided we change the interpretation of $v_i$ to that of (28). This in turn provides us with $R^2$ and CID for ODPs. The interpretation of unexplained individual variance and CID is less clear though, since ODP is not a mixed model, having no random effects.
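The relation between the two variance functions can be checked numerically. The sketch below uses the mixed Poisson variance $v_i = \lambda_i/t_i + \zeta\lambda_i^\alpha$ from the appendix and computes, policy by policy, the value of $\phi$ that the ODP relation would require. The function names and the parameter values ($\zeta = 0.05$, $t = 3$, and the three claim frequencies) are hypothetical and serve only as an illustration.

```python
def mixed_poisson_variance(lam, t, zeta, alpha):
    """Var(Y_i) = lam/t + zeta * lam**alpha for the mixed Poisson model."""
    return lam / t + zeta * lam ** alpha

def implied_phi(lam, t, zeta, alpha):
    """The phi that would make the ODP relation v = phi * lam / t hold for one policy."""
    return mixed_poisson_variance(lam, t, zeta, alpha) * t / lam

lams = [0.03, 0.06, 0.12]        # hypothetical claim frequencies
zeta, t = 0.05, 3.0              # hypothetical dispersion and common duration

# alpha = 1 and equal durations: the implied phi is the same for every policy.
phi_alpha_1 = [implied_phi(lam, t, zeta, 1.0) for lam in lams]
# alpha = 1.3: the implied phi changes with lam, so no single phi fits all policies.
phi_alpha_13 = [implied_phi(lam, t, zeta, 1.3) for lam in lams]

print(phi_alpha_1)
print(phi_alpha_13)
```

In the first case every policy yields $1 + \zeta t$. In the second the implied value grows with $\lambda_i$, which is why a single $\phi$ cannot be read off the individual variance functions.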
A third extension, when $p/n$ is non-negligible, is to account for reduced degrees of freedom when defining $R^2$ and CID (see Hössjer, 2008), as well as $\hat\phi$ and $T$. For our data set, this adjustment has a minor effect, since $p/n = 1.48 \cdot 10^{-4}$.

A fourth extension is to replace $t_i$ by other weights $w_i$ when defining $\bar\lambda$, $\sigma^2$, the variance decomposition (8), $\rho^2$, $\rho^2_{\mathrm{ind}}$ and $\phi$. Various weighting schemes are discussed in Hössjer (2008). One possibility is inverse variance weighting $w_i = t_i/\lambda_i$. This choice of weights results in all policies having approximately the same contribution to the unexplained part of $\sigma^2$, since

$$w_i \mathrm{Var}(Y_i) \approx 1,$$

where the approximation is exact in absence of overdispersion. Since these weights involve unknown parameters, we use estimated weights

$$w_i = t_i / \hat\lambda_i \tag{29}$$

to compute $\hat{\bar\lambda}$, $\hat\sigma^2$, the empirical variance decomposition (16), $R^2$, CID and $\hat\phi$. We may also generalize the version of $T$ with $\hat\lambda_i$ instead of $Y_i$ in the numerator to

$$T = \frac{\sum_i w_i (Y_i - \hat\lambda_i)^2 - \sum_i w_i \hat\lambda_i / t_i}{\sqrt{2 \sum_i w_i^2 (\hat\lambda_i / t_i)^2}} \overset{(29)}{=} \frac{\chi^2 - n}{\sqrt{2n}} = 19.864, \tag{30}$$

where

$$\chi^2 = \sum_i t_i \frac{(Y_i - \hat\lambda_i)^2}{\hat\lambda_i} = 457\,901$$

is the unscaled Pearson statistic (Pearson, 1900) for Poisson regression. The version of $\hat\phi$ with $\hat\lambda_i$ in the denominator is generalized to

$$\hat\phi = \frac{\sum_i w_i (Y_i - \hat\lambda_i)^2}{\sum_i w_i \hat\lambda_i / t_i} \overset{(29)}{=} \frac{\chi^2}{n} = 1.044, \tag{31}$$

which agrees with the Pearson definition of $\hat\phi$, except for using $n$ instead of $n - p$ in the denominator. We notice that (30) and (31) only differ marginally from (24) and (26). Hence, for our data set, it seems that the choice of weights is not crucial. This is probably due to the fact that all time durations are equal and the estimated claim frequencies $\hat\lambda_i$ vary quite little for the majority of policies. For other tariffs, this may not be the case and then it is of interest to compare how various weighting schemes affect the coefficients of determination, test of excess variance, estimated overdispersion and coefficients of variation in terms of efficiency and power.

A fifth extension would be to include claim severity. Assuming $X_{ij}$ is the $j$th claim severity of the $i$th policy we may variance decompose the observed cost rates

$$Z_i = \sum_{j=1}^{N_i} X_{ij} / t_i$$

with weights $w_i = t_i$. An alternative approach is to treat claim severity separately and condition on the observed $N_i = n_i$. This leads to variance decomposition of the average claim costs

$$Z_i = \sum_{j=1}^{n_i} X_{ij} / n_i,$$

for all policies with $n_i > 0$, using weights $w_i = n_i$.

APPENDIX

Estimating the variance parameter $\alpha$. We divide the estimated individual claim frequencies $\hat\lambda_i$ into 21 intervals (see Table 3), henceforth denoted as premium groups. Let $I_j$ be the $j$th premium group ($j = 1, \ldots, 21$) and

$$\hat\lambda_j = \sum_{i \in I_j} \hat\lambda_i / |I_j|, \qquad \hat\sigma^2_{\mathrm{excess},j} = \sum_{i \in I_j} (Y_i - \hat\lambda_i)^2 / |I_j| - \hat\lambda_j$$

the estimated average premium and excess variance within $I_j$. Assuming a power relation $E(\hat\sigma^2_{\mathrm{excess},j}) = \zeta \lambda_j^{\alpha}$, a weighted linear regression of $\log(\hat\sigma^2_{\mathrm{excess},j})$ against $\log(\hat\lambda_j)$ is employed, with weights proportional to $|I_j|$. Since $\hat\sigma^2_{\mathrm{excess},j}$ is unreliable (and sometimes negative) for small premium groups, we only include $I_3, \ldots, I_{21}$ in the regression analysis, resulting in (27), where $\hat\zeta_0$ is different from (15), which assumes $\alpha$ to be known. The estimate (27) is quite stable. Further exclusion of i) $I_{21}$ gives $(\hat\alpha, \hat\zeta_0) = (1.334, 0.0997)$ and ii) $I_3$ and $I_{21}$ gives $(\hat\alpha, \hat\zeta_0) = (1.971, 0.0930)$. In Figure 2, the pairs $(\hat\lambda_j, \hat\sigma^2_{\mathrm{excess},j})$, $j = 3, \ldots, 21$, are plotted together with fitted variance curves based on (27) and a second curve with $\alpha = 1$ and only $\zeta$ being estimated.

Asymptotic normality of $\hat\zeta$ and the numerator of (23). Define

$$S = (S_1, S_2) = \Big(\sum_i \lambda_i,\ \sum_i t_i v_i\Big), \qquad \hat S = (\hat S_1, \hat S_2) = \Big(\sum_i Y_i,\ \sum_i t_i (Y_i - \hat\lambda_i)^2\Big),$$

where $v_i = \mathrm{Var}(Y_i)$. We will prove that asymptotically, in the limit of large samples, $\hat S$ has a bivariate normal distribution with mean $S$ and covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}.$$
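The weighted log-log regression used above to estimate the variance parameter $\alpha$ can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, groups with non-positive excess variance are simply skipped (the paper instead excludes the smallest premium groups by hand), and the example data are synthetic points generated from an exact power law, not the values of Table 3.

```python
import math

def fit_power_variance(group_means, excess_vars, group_sizes):
    """Weighted least squares fit of log(excess) = log(zeta0) + alpha * log(mean).

    Weights are proportional to the group sizes. Groups whose excess
    variance is not positive cannot be log-transformed and are skipped.
    Returns (alpha, zeta0).
    """
    rows = [(math.log(m), math.log(s), float(w))
            for m, s, w in zip(group_means, excess_vars, group_sizes) if s > 0]
    sw = sum(w for _, _, w in rows)
    xbar = sum(w * x for x, _, w in rows) / sw
    ybar = sum(w * y for _, y, w in rows) / sw
    sxx = sum(w * (x - xbar) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - xbar) * (y - ybar) for x, y, w in rows)
    alpha = sxy / sxx                       # slope of the log-log line
    zeta0 = math.exp(ybar - alpha * xbar)   # intercept, back on the original scale
    return alpha, zeta0

# Synthetic example: exact power law with alpha = 1.3 and zeta = 0.1.
means = [0.03, 0.05, 0.07, 0.09, 0.11]
sizes = [700, 78000, 85000, 35000, 9900]    # made-up group sizes
excess = [0.1 * m ** 1.3 for m in means]
alpha_hat, zeta0_hat = fit_power_variance(means, excess, sizes)
print(alpha_hat, zeta0_hat)
```

On noise-free input the fit recovers the generating parameters. With real premium groups the small groups carry little weight but can still have very noisy excess variances, which is the reason given in the text for leaving some of them out.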

FIGURE 2: Plot of $\hat\sigma^2_{\mathrm{excess},j}$ against $\hat\lambda_j$ for premium groups $j = 3, \ldots, 21$, together with fitted variance curves based on estimates (27) (dotted line), and on $\alpha = 1$ with estimated $\zeta_0$ (solid line).

From this, asymptotic normality of $\hat\phi$ follows. Indeed,

$$\hat\phi = g(\hat S), \qquad \phi = g(S),$$

where $g(S_1, S_2) = S_2/S_1$. Let $G = g'(S) = (-\phi, 1)/S_1$. Then, by Taylor expanding $g$ around $S$, it follows that $\hat\phi$ is asymptotically normal with mean $\phi$ and variance

$$\sigma^2_{\hat\phi} = G \Sigma G^T = (\sigma_{22} - 2\phi\sigma_{12} + \phi^2\sigma_{11})/S_1^2. \tag{A.1}$$

Similarly, let $\hat C$ be the numerator of (23) and write

$$\hat C = \sum_i \big(t_i (Y_i - \hat\lambda_i)^2 - Y_i\big) = k(\hat S), \qquad C = \sum_i (t_i v_i - \lambda_i) = k(S),$$

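where $k(S) = S_2 - S_1$. Putting $K = k'(S) = (-1, 1)$ we find that $\hat C$ is an asymptotically normal estimator of $C$ with asymptotic variance

$$\sigma^2_{\hat C} = K \Sigma K^T = \sigma_{22} - 2\sigma_{12} + \sigma_{11}. \tag{A.2}$$

Following the lines of proof in Hössjer (2008), one verifies that

$$\hat S = S + \sum_i \big(Y_i - \lambda_i,\ t_i (Y_i - \lambda_i)^2 - t_i v_i\big) + o_p(n^{1/2}), \tag{A.3}$$

where the last term is small in probability compared to $n^{1/2}$ and hence asymptotically negligible. A consequence of (A.3) is that replacing $\lambda_i$ by $\hat\lambda_i$ in the definition of $\hat S$ has no effect on the asymptotic distribution. It follows from (A.3) that

$$\sigma_{11} = v_\cdot, \qquad \sigma_{12} = \tau, \qquad \sigma_{22} = \kappa, \tag{A.4}$$

where $v_\cdot = \sum_i v_i$, $\tau = \sum_i t_i E((Y_i - \lambda_i)^3)$ and $\kappa = \sum_i t_i^2 E\big(((Y_i - \lambda_i)^2 - v_i)^2\big)$. Inserting (A.4) into (A.1) and (A.2) we obtain

$$\sigma^2_{\hat\phi} = S_1^{-2} (\kappa - 2\phi\tau + \phi^2 v_\cdot), \qquad \sigma^2_{\hat C} = \kappa - 2\tau + v_\cdot. \tag{A.5}$$

To compute standard errors, we replace $S_1$, $\phi$, $v_\cdot$, $\tau$ and $\kappa$ by estimates and obtain

$$d^2_{\hat\phi} = \Big(\sum_i Y_i\Big)^{-2} \Big(\hat\kappa - 2\hat\phi\hat\tau + \hat\phi^2 \sum_i (Y_i - \hat\lambda_i)^2\Big), \qquad d^2_{\hat C} = \hat\kappa - 2\hat\tau + \sum_i (Y_i - \hat\lambda_i)^2. \tag{A.6}$$

One option is to proceed nonparametrically and put

$$\hat\tau = \sum_i t_i \big((Y_i - \hat\lambda_i)^3 - \hat v_i (Y_i - \hat\lambda_i)\big), \qquad \hat\kappa = \sum_i t_i^2 \big((Y_i - \hat\lambda_i)^2 - \hat v_i\big)^2, \qquad \hat v_i = \hat\lambda_i / t_i + \hat\zeta \hat\lambda_i^{\alpha}.$$

We added the second term $\hat v_i (Y_i - \hat\lambda_i)$ in the definition of $\hat\tau$ in order to guarantee that $\hat\Sigma$ is positive (semi)definite and thus $d^2_{\hat\phi}$ and $d^2_{\hat C}$ are non-negative.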
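The nonparametric standard errors in (A.6) can be computed directly from the observed claim frequencies, the fitted values and the durations. The following sketch assumes the reading of (A.1) to (A.6) in which $\hat\phi = \hat S_2/\hat S_1$ and $\hat C = \hat S_2 - \hat S_1$. The function name and the toy data are hypothetical.

```python
import math

def dispersion_with_se(y, lam_hat, t, zeta_hat, alpha=1.0):
    """Return (phi_hat, d_phi, c_hat, d_c) using nonparametric plug-in moments.

    y: observed claim frequencies, lam_hat: fitted frequencies,
    t: durations, zeta_hat and alpha: variance function parameters.
    """
    v_hat = [l / ti + zeta_hat * l ** alpha for l, ti in zip(lam_hat, t)]
    resid = [yi - li for yi, li in zip(y, lam_hat)]
    s1 = sum(y)                                          # S1-hat
    s2 = sum(ti * r * r for ti, r in zip(t, resid))      # S2-hat
    phi_hat = s2 / s1
    c_hat = s2 - s1
    v_sum = sum(r * r for r in resid)                    # estimate of sigma_11
    tau = sum(ti * (r ** 3 - vi * r) for ti, r, vi in zip(t, resid, v_hat))
    kappa = sum(ti ** 2 * (r * r - vi) ** 2 for ti, r, vi in zip(t, resid, v_hat))
    var_phi = (kappa - 2.0 * phi_hat * tau + phi_hat ** 2 * v_sum) / s1 ** 2
    var_c = kappa - 2.0 * tau + v_sum
    # Both variances are sums of squares in exact arithmetic; max() only
    # guards against tiny negative values caused by rounding.
    return phi_hat, math.sqrt(max(var_phi, 0.0)), c_hat, math.sqrt(max(var_c, 0.0))

# Toy data: eight policies, unit durations, constant fitted frequency.
y_obs = [0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0]
fitted = [0.4] * 8
durations = [1.0] * 8
phi_hat, d_phi, c_hat, d_c = dispersion_with_se(y_obs, fitted, durations, zeta_hat=0.05)
print(phi_hat, d_phi, c_hat, d_c)
```

Writing $a_i = Y_i - \hat\lambda_i$ and $b_i = t_i(a_i^2 - \hat v_i)$, the plug-in variance of $\hat\phi$ equals $\sum_i (b_i - \hat\phi a_i)^2 / \hat S_1^2$, which makes the non-negativity mentioned in the text explicit.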

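Alternatively, a parametric approach is to assume a gamma distribution for all $\Lambda_i$. For instance, if $\alpha = 1$, $\Lambda_i \sim \Gamma(\lambda_i/\zeta, \zeta)$, where $\Gamma(a, b)$ has density

$$f(x) = \frac{1}{b^a \Gamma(a)} x^{a-1} e^{-x/b}, \qquad x > 0.$$

Hence $t_i \Lambda_i \sim \Gamma(\lambda_i/\zeta, t_i\zeta)$, and $N_i = t_i Y_i$ has a negative binomial distribution $\mathrm{NBin}(\lambda_i/\zeta, 1/(1 + t_i\zeta))$. From moments of negative binomial distributions we obtain

$$\begin{aligned}
v_\cdot &= \sum_i v_i = \sum_i t_i^{-1} \lambda_i (1 + \zeta t_i), \\
\tau &= \sum_i t_i^{-1} \lambda_i (1 + 3\zeta t_i + 2\zeta^2 t_i^2), \\
\kappa &= 2 \sum_i \lambda_i^2 (1 + \zeta t_i)^2 + \sum_i t_i^{-1} \lambda_i (1 + 7\zeta t_i + 12\zeta^2 t_i^2 + 6\zeta^3 t_i^3),
\end{aligned} \tag{A.7}$$

and their estimated analogues by plugging in $\hat\lambda_i$ and $\hat\zeta$. Since $\zeta$ is often very small for non-life insurance data, the higher order powers of $\zeta$ make little contribution to the standard errors. When $\zeta = 0$, we obtain the denominator of (23) from (A.5) and (A.7).

Extending the variance decomposition to time-varying covariates and random effects. For simplicity, assume that time is counted in units of years and that all $t_i$ are integer valued and extend (2) and (3) to

$$\lambda_{ij} = \exp(\beta^T x_{ij}), \qquad N_i = \sum_{j=1}^{t_i} Y_{ij}, \qquad Y_{ij} \mid U_{ij} \sim \mathrm{Po}(\lambda_{ij} U_{ij}), \tag{A.8}$$

where $\lambda_{ij}$, $x_{ij} = (x_{ij1}, \ldots, x_{ijp})$ and $U_{ij}$ are the time varying price relativity, covariate vector and random effect of policy $i$ during year $j$ respectively. We assume that $Y_{ij}$ and $Y_{lk}$ are conditionally independent given $U_{ij}$ and $U_{lk}$, that $U_{ij}$ and $U_{lk}$ are independent when $i \neq l$ and that

$$E(U_{ij}) = 1, \qquad \mathrm{Cov}(U_{ij}, U_{ik}) = \zeta \lambda_{ij}^{\alpha/2 - 1} \lambda_{ik}^{\alpha/2 - 1} \rho_{|k-j|} \tag{A.9}$$

for some autocorrelation function $\rho$.

Let $\lambda_i = \sum_{j=1}^{t_i} \lambda_{ij}/t_i$ and $\Lambda_i = \sum_{j=1}^{t_i} \lambda_{ij} U_{ij}/t_i$ be the average price relativity and accident proneness of contract $i$. Then (A.9) gives an extension

$$\mathrm{Var}(\Lambda_i) = \zeta \tilde\lambda_i^{\alpha}, \tag{A.10}$$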

of the variance function (5), where

$$\tilde\lambda_i^{\alpha} = t_i^{-2} \sum_{j,k=1}^{t_i} \lambda_{ij}^{\alpha/2} \lambda_{ik}^{\alpha/2} \rho_{|k-j|}.$$

We notice that $\tilde\lambda_i \le \lambda_i$ when $\alpha \le 2$, with equality if and only if $\rho_k \equiv 1$ and either $\alpha = 2$ or $\lambda_{ij} \equiv \lambda_i$. Hence, the variance of the averaged accident proneness is reduced when the random effects or price relativities are time-varying.

There are at least two ways to extend the variance decomposition of Section 2. The first option is to retain (7) with $Y_i = N_i/t_i$, as before. This yields a variance decomposition for which the explained variance $\sigma_1^2$ and the noise variance $\sigma_3^2$ agree with (8), whereas the unexplained individual variance

$$\sigma_2^2 = \zeta \sum_i t_i \tilde\lambda_i^{\alpha} \Big/ \sum_i t_i$$

is decreased as soon as $\tilde\lambda_i < \lambda_i$ for at least one policy. The conclusion is that both $R^2$ and CID are increased when random effects within each policy are time varying.

The second option is to consider the variance of the annual claim frequencies,

$$\sigma^2 = \sum_{i=1}^{n} \sum_{j=1}^{t_i} E(Y_{ij} - \bar\lambda)^2 \Big/ \sum_i t_i.$$

By similar calculations as in Section 2, this yields quite a different variance decomposition with

$$\sigma_1^2 = \Big(\sum_i t_i (\lambda_i - \bar\lambda)^2 + \sum_{i,j} (\lambda_{ij} - \lambda_i)^2\Big) \Big/ \sum_i t_i, \qquad \sigma_2^2 = \zeta \sum_{i,j} \lambda_{ij}^{\alpha} \Big/ \sum_i t_i, \qquad \sigma_3^2 = \sum_i t_i \lambda_i \Big/ \sum_i t_i.$$

The noise variance $\sigma_3^2$ is increased for a variance decomposition based on annual claim frequencies (by a factor $t$ if $t_i \equiv t$), in agreement with the discussion in Section 4. The explained variance, $\sigma_1^2$, is enlarged as well, since we are able to explain not only $\lambda_i$ but also the variation of $\lambda_{ij}$ around $\lambda_i$. The unexplained individual variance, $\sigma_2^2$, also increases when $\alpha \ge 1$, since then $\sum_j \lambda_{ij}^{\alpha} \ge t_i \tilde\lambda_i^{\alpha}$. The conclusion is that in general, $R^2$ decreases if annual claim frequencies $Y_{ij}$ are considered rather than the averaged ones $Y_i$, due to the increased noise variance. On the other hand, CID may increase or decrease, depending on how

variable the price relativities and random effects are within each policy. If the random effects are constant ($\rho_k \equiv 1$), then CID typically increases when annual claim frequencies are considered.

ACKNOWLEDGEMENT

Ola Hössjer's work was supported by the Swedish Research Council, contract number 61-005-810. The authors wish to thank two anonymous referees, whose comments significantly increased the quality of the manuscript.

REFERENCES

BEARD, R.E., PENTIKAINEN, T. and PESONEN, E. (1984) Risk Theory, The Stochastic Basis of Insurance (3rd edition). Chapman and Hall.
BRESLOW, N.E. (1984) Extra-Poisson variation in log-linear models. Appl. Statist. 33(1), 38-44.
BROCKMAN, M.J. and WRIGHT, T.S. (1992) Statistical motor rating: Making effective use of your data. Journal of the Institute of Actuaries 119(111), 457-543.
BÜHLMANN, H. and GISLER, A. (2005) A Course in Credibility Theory and its Applications. Springer Universitext.
COX, D.R. and SNELL, E.J. (1989) The analysis of binary data, 2nd ed., Chapman and Hall, London.
FISHER, R.A. (1950) The significance of deviations from expectation in a Poisson series. Biometrics 6, 17-24.
HAIGHT, F.A. (2001) Accident proneness: The history of an idea. Institute of Transportation Studies, University of California, Irvine, USA.
HINDE, J. (1982) Compound Poisson regression models. In GLIM 82: Proceedings of the International Conference in Generalized Linear Models (R. Gilchrist, ed.), pp. 199-11, Springer, Berlin.
HÖSSJER, O. (2008) On the coefficient of determination for mixed regression models. Journal of Statistical Planning and Inference 138, 3022-3038.
JOHNSON, P.D. and HEY, G.B. (1971) Statistical studies in motor insurance. Journal of the Institute of Actuaries 97, 199.
JUNG, J. (1968) On automobile insurance ratemaking. ASTIN Bulletin 5, 41.
JÄRNMALM, K. (2006) Measures of the remaining systematic variance between individuals when divided into individual premium groups in non-life insurance. Master Thesis, Mathematical Statistics, Stockholm University, Report 2006:15. (In Swedish.)
LAWLESS, J.F. (1987) Negative binomial and mixed Poisson regression. Canadian J. Statist. 15(3), 209-225.
LEMAIRE, J. (1995) Bonus-Malus Systems in Automobile Insurance. Springer.
MADDALA, G.S. (1983) Limited-Dependent and Qualitative Variables in Econometrics. Cambridge University Press.
MAGEE, L. (1990) R² measures based on Wald and likelihood ratio joint significance tests. Am. Statistician 44, 250-253.
MCCULLAGH, P. and NELDER, J.A. (1989) Generalized Linear Models, second edition, Chapman and Hall.
NAGELKERKE, N.J.D. (1991) A note on a general definition of the coefficient of determination. Biometrika 78(3), 691-692.
OHLSSON, E. (2008) Combining generalized linear models and credibility models in practice. Scandinavian Actuarial Journal 2008(4), 301-314.
OHLSSON, E. and JOHANSSON, B. (2006) Exact credibility and Tweedie models. ASTIN Bulletin 36(1), 121-133.
PEARSON, K. (1900) On a criterion that a given system of deviations from the probable in case of a correlated system of variables is such that it can be reasonably supposed to have arisen from a random sampling. Phil. Mag. 50(5), 157-175.

POCOCK, S.J., COOK, D.G. and BERESFORD, S.A.A. (1981) Regression of area mortality rates on explanatory variables: what weighting is appropriate? Appl. Statist. 30, 286-295.
RAO, C.R. and CHAKRAVARTI, I.M. (1956) Some small sample tests for significance for a Poisson distribution. Biometrics 12, 264-282.
VENEZIAN, E.C. (1981) Good drivers and bad drivers - a Markov model of accident proneness. Proceedings of the Casualty Actuarial Society, LXVII, 65-85.
VENEZIAN, E.D. (1990) The distribution of automobile accidents - are relativities stable over time? Proceedings of the Casualty Actuarial Society, LXXVII, 309-336.
WHITE, H. (1982) Maximum likelihood under misspecified models. Econometrica 50, 1-25.

OLA HÖSSJER (corresponding author)
Department of Mathematics
Stockholm University, S-106 91 Stockholm, Sweden
E-Mail: ola@math.su.se
Fax: +46-8-61 67 17, URL: www.math.su.se/~ola/