Past and present trends in aggregate claims analysis
Gordon E. Willmot
Munich Re Professor of Insurance
Department of Statistics and Actuarial Science, University of Waterloo
1st Quebec-Ontario Workshop on Insurance Mathematics
January 28, 2011, Montreal, Quebec
goal of talk is a discussion of modelling and analysis of aggregate claims on a portfolio of business
historical perspective; the techniques and complexity of models have changed over time
models are interdisciplinary
  credit risk
  operational risk
  profit analysis
  queueing theory
modelling incorporates two basic components
  random number of events of interest (frequency)
  each event generates a random quantity of interest (severity)
main goal is to aggregate these quantities
  over a fixed period of time: aggregate claims analysis
  tracking of behaviour over time: surplus analysis
complexity of models has increased greatly in recent years
  attempt to model quantities realistically
  enhanced ability due to computational and analytic advances
historical description of aggregate model
interested in the evaluation of the aggregate claims distribution function (df)
  $G(y) = 1 - \bar{G}(y) = \Pr(S \le y)$, where $S = Y_1 + Y_2 + \cdots + Y_N$
  $N$ = number of claims, $Y_i$ = amount of the $i$-th claim
traditionally, $\{Y_1, Y_2, \ldots\}$ is assumed to be an iid sequence, independent of $N$
also of interest is the stop-loss random variable $(S-y)_+ = \max(S-y, 0)$
  stop-loss premium: $R_1(y) = E\{(S-y)_+\} = \int_y^\infty \bar{G}(t)\,dt$
  stop-loss moments: $R_k(y) = E\{(S-y)_+^k\} = \int_y^\infty (t-y)^k\,dG(t)$
original approaches to evaluation involved parametric approximations applied to $G(y)$ directly
  easy to use; typically require only simple quantities such as moments
  questionable accuracy, particularly in the right tail
  difficult to incorporate changes in individual policy characteristics such as deductibles and maximums
  less commonly used since 1980
commonly used approximations
normal-based
  normal approximations give a light right tail
  normal power, Haldane's, and Wilson-Hilferty all assume $h(S)$ is normally distributed for some $h(\cdot)$
gamma-based
  Beekman-Bowers, translated gamma
  use gamma rather than normal to allow for skewness
exponential approximations
  motivated by ruin theory (compound geometric)
  include the Cramér-Lundberg asymptotic formula, Tijms, De Vylder's method
  light right tail
subexponential approximations
  heavy right tail
  often based on extreme value arguments
  most common are regularly varying types (e.g. Pareto)
Esscher's method
  surprisingly good accuracy
  gave rise to the Esscher transform (applied probability and mathematical finance; change of measure)
  adopted by the statistical community for approximating the distribution of sample statistics which involve sums of independent random variables
  often referred to as saddlepoint approximations or exponential tilting
numerical procedures
simulation
  used in the 1970s
  advantage: complicated models may be used
  disadvantage: right tail may be inaccurate unless many values are used
  disadvantage: difficult to modify assumptions at the individual claim level
transform inversion techniques
FFT in the discrete case
  complex inversion based on the aggregate pgf
  "black box"
  may require discretization of the claim size distribution
continuous inversion approaches
  Heckman-Meyers (characteristic function, piecewise constant density)
  Laplace transform inversion (much recent progress in the queueing community)
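recursions
computation of the (discretized) probability mass function of $S$ recursively, beginning with $\Pr(S = 0)$
let $p_n = \left(a + \frac{b}{n}\right) p_{n-1}$ for $n = 2, 3, \ldots$
Panjer-type recursion, for $y = 1, 2, \ldots$:
  $\Pr(S = y) = \{p_1 - (a+b)p_0\}\Pr(Y = y) + \sum_{x=0}^{y}\left(a + \frac{b x}{y}\right)\Pr(Y = x)\Pr(S = y - x)$
(the $x = 0$ term involves $\Pr(S = y)$ itself, so in practice it is moved to the left-hand side)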
includes most of the basic compound models, e.g., Poisson, binomial, negative binomial, logarithmic series
extensions to other models as well
simple to understand and use
compound Poisson case due to Euler, Adelson in a queueing context, Panjer (1981) in an actuarial context
no restrictions on the functional form of the distribution of $Y$
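The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than a production routine: the function name `panjer_pmf`, the example severity, and the Poisson parameter are assumptions made here, and the caller must supply $\Pr(S = 0)$, which equals the frequency pgf evaluated at $\Pr(Y = 0)$.

```python
import math

def panjer_pmf(a, b, p0, p1, severity, g0, y_max):
    """Return [Pr(S = 0), ..., Pr(S = y_max)] for a discretized severity.

    severity[x] = Pr(Y = x); g0 = Pr(S = 0) is supplied by the caller.
    The frequency satisfies p_n = (a + b/n) p_{n-1} for n = 2, 3, ...
    """
    def f(x):
        return severity[x] if x < len(severity) else 0.0

    g = [g0]
    for y in range(1, y_max + 1):
        total = (p1 - (a + b) * p0) * f(y)
        for x in range(1, y + 1):
            total += (a + b * x / y) * f(x) * g[y - x]
        # The x = 0 term of the sum contains Pr(S = y) itself,
        # so it is moved to the left-hand side here.
        g.append(total / (1.0 - a * f(0)))
    return g

# Poisson frequency with mean lam: a = 0, b = lam, p_n = e^{-lam} lam^n / n!
lam = 2.0
severity = [0.0, 0.5, 0.3, 0.2]            # hypothetical claim size pmf on {0,1,2,3}
p0 = math.exp(-lam)
g0 = math.exp(-lam * (1.0 - severity[0]))  # Poisson pgf evaluated at Pr(Y = 0)
pmf = panjer_pmf(0.0, lam, p0, lam * p0, severity, g0, 80)
```

For the Poisson case the first term vanishes because $p_1 = (a+b)p_0$, which is why the classical (a, b, 0) recursion has no such term.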
individual policy modifications
deductibles, maximums, and coinsurance on each claim
easy to incorporate with numerical procedures such as recursions
statistically:
  deductibles involve left truncation of loss sizes and thinning of loss numbers
  maximums involve right censoring of loss sizes
  coinsurance results in scale changes of loss sizes
trends in aggregate loss modelling in the last quarter century
removal of independent and/or identically distributed assumptions
  possible due to mathematical and computational advances
claim count dependencies
  time series models
  dependence through latent variables, as in mixed Poisson and MAP models
claim size dependencies
  MAP models, mixtures as in credibility
  incorporated via copula models
strong dependency concepts such as comonotonicity
dependence between claim sizes and numbers
  in dependent Sparre Andersen (renewal) risk models
removal of identically distributed assumption
  discounted aggregate claims incorporating inflation
  claim sizes independent but dependent on time of occurrence
  (mixed) Poisson process allows reduction to the iid case
  renewal risk process is difficult, but good progress has been made
stronger inter-disciplinary influences
phase-type assumptions borrowed from queueing theory
  greater flexibility even for simple models
  very useful for complex models (fluid flow techniques)
Gerber-Shiu techniques for option pricing and Esscher transform analysis in mathematical finance
wide variety of probabilistic, statistical, and applied mathematical tools used in risk analysis
use of semiparametric distributional assumptions
phase-type distributions, combinations of exponentials, mixtures of Erlangs
  all are dense in the class of distributions on $\mathbb{R}_+$, hence flexible
  use of these models involves a hybrid of analytic and numerical approaches
  semiparametric nature makes estimation nontrivial
there can be numerical root-finding difficulties with
  phase-type distributions (location of eigenvalues for calculation of matrix exponentials)
  combinations of exponentials (partial fraction expansions of Laplace transforms)
phase-type distributions
absorption time in a time-homogeneous Markov chain
particularly useful for fairly complex stochastic models (advantage over the other two classes) as well as for simple models
calculation of most quantities of interest is straightforward
disadvantages
  1) knowledge of matrix calculus needed
  2) often necessary to assume that all components of the model are of phase-type
mixed Erlang distributions
huge class of distributions
  includes the class of phase-type distributions
  includes many distributions whose membership in the class is not obvious from the definition
extremely useful for simple risk models
  model for claim sizes
  all quantities of interest computed easily using infinite series (even finite time ruin probabilities)
  no root finding needed
  only requires use of simple algebra
present discussion mainly from Willmot and Woo (2007) and Willmot and Lin (2011)
mathematical introduction
Erlang-$j$ pdf, for $j = 1, 2, \ldots$ and $\beta > 0$:
  $e_j(y) = \frac{\beta(\beta y)^{j-1} e^{-\beta y}}{(j-1)!}$, $y > 0$
mixed Erlang pdf:
  $f(y) = \sum_{j=1}^{\infty} q_j\, e_j(y)$, $y > 0$
where $\{q_1, q_2, \ldots\}$ is a discrete counting measure
includes Erlang-$j$ as the special case $q_j = 1$, and the exponential as the special case $q_1 = 1$
for many class members, $\{q_j;\ j = 1, 2, 3, \ldots\}$ is most easily expressed through its probability generating function (pgf)
let $Q(z) = \sum_{j=1}^{\infty} q_j z^j$ be the pgf; then the mixed Erlang Laplace transform (LT) is
  $\tilde{f}(s) = \int_0^\infty e^{-sy} f(y)\,dy = Q\!\left(\frac{\beta}{\beta+s}\right)$,
implying that $f(y)$ is itself a compound pdf with an exponential secondary pdf, or "exponential phases" in queueing terminology
if the LT may be put in this form, then the distribution is a mixed Erlang
loss model properties
tail $\bar{F}(y) = \int_y^\infty f(x)\,dx$ is given by
  $\bar{F}(y) = e^{-\beta y}\sum_{k=0}^{\infty} \bar{Q}_k \frac{(\beta y)^k}{k!} = \sum_{k=1}^{\infty} \bar{Q}_{k-1}\frac{e_k(y)}{\beta}$, where $\bar{Q}_k = \sum_{j=k+1}^{\infty} q_j$
for value at risk (VaR) or quantiles: at level $p$, VaR $= v_p$ where $\bar{F}(v_p) = 1 - p$, and $v_p$ is easily obtained numerically
asymptotic Lundberg-type formula available for $\bar{F}(x)$ via the compound distribution representation
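A minimal sketch of the tail formula and of the numerical VaR calculation, assuming a finite mixture stored as a list with `q[j-1]` holding $q_j$. The function names and the bisection solver are illustrative choices made here, not something prescribed by the talk.

```python
import math

def mixed_erlang_tail(q, beta, y):
    """Tail e^{-beta y} * sum_k Qbar_k (beta y)^k / k! for a finite mixture."""
    total = 0.0
    qbar = sum(q)      # Qbar_0 = q_1 + q_2 + ...
    term = 1.0         # (beta * y)^k / k!, starting at k = 0
    for k in range(len(q)):
        total += qbar * term
        qbar -= q[k]                 # Qbar_{k+1} = Qbar_k - q_{k+1}
        term *= beta * y / (k + 1)
    return math.exp(-beta * y) * total

def mixed_erlang_var(q, beta, p, tol=1e-10):
    """Solve tail(v_p) = 1 - p by bisection (the tail is decreasing in y)."""
    lo, hi = 0.0, 1.0
    while mixed_erlang_tail(q, beta, hi) > 1.0 - p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixed_erlang_tail(q, beta, mid) > 1.0 - p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: equal mixture of Erlang-1 and Erlang-3 with beta = 2
v95 = mixed_erlang_var([0.5, 0.0, 0.5], 2.0, 0.95)
```

Bisection is used only because it needs nothing beyond the tail itself; any one-dimensional root finder would do.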
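moments:
  $\int_0^\infty y^k f(y)\,dy = \beta^{-k}\sum_{j=1}^{\infty} q_j \frac{(k+j-1)!}{(j-1)!}$
excess loss (residual lifetime) pdf (payment per payment basis with a deductible of $x$) is still of mixed Erlang form:
  $f_x(y) = \frac{f(x+y)}{\bar{F}(x)} = \sum_{j=1}^{\infty} q_{j,x}\, e_j(y)$
with
  $q_{j,x} = \frac{\sum_{i=j}^{\infty} q_i \frac{(\beta x)^{i-j}}{(i-j)!}}{\sum_{m=0}^{\infty} \bar{Q}_m \frac{(\beta x)^m}{m!}}$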
force of mortality (failure or hazard rate) $\mu(y) = f(y)/\bar{F}(y)$ satisfies
  $\mu(0) = \beta q_1$, $\mu(\infty) = \beta\left(1 - \frac{1}{z_0}\right)$, where $z_0$ is the radius of convergence of $Q(z)$ ($\mu(\infty) = \beta$ for finite mixtures)
  $\mu(y) \le \beta$ (dominates the exponential in failure rate order and hence in stochastic order)
equilibrium distribution (useful in tail classification and ruin theory) is still of mixed Erlang form:
  $f_e(y) = \frac{\bar{F}(y)}{\int_0^\infty \bar{F}(x)\,dx} = \sum_{j=1}^{\infty} q_j^*\, e_j(y)$, with $q_j^* = \frac{\bar{Q}_{j-1}}{\sum_{k=1}^{\infty} k q_k}$
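mean excess loss (mean residual lifetime) is $r(y) = \int_0^\infty t\, f_y(t)\,dt$ (also the reciprocal of the equilibrium failure rate $\mu_e$):
  $r(y) = \frac{1}{\mu_e(y)} = \frac{\sum_{j=0}^{\infty} \bar{Q}_j^* \frac{(\beta y)^j}{j!}}{\beta \sum_{j=0}^{\infty} q_{j+1}^* \frac{(\beta y)^j}{j!}}$, with $\bar{Q}_j^* = \sum_{k=j+1}^{\infty} q_k^*$
also, $r(0) = \int_0^\infty y f(y)\,dy = \sum_{j=1}^{\infty} j q_j / \beta$ is the mean, $r(\infty) = z_0/\{\beta(z_0 - 1)\}$, with $r(\infty) = 1/\beta$ if $z_0 = \infty$, and $r(y) \ge 1/\beta$
TVaR calculations are easy using $v_p + r(v_p)$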
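The mean excess loss formula can be evaluated directly from the equilibrium weights. The sketch below assumes a finite mixture with `q[j-1]` holding $q_j$; the helper names are hypothetical, and TVaR at a given $v_p$ is then simply `v_p + mean_excess(q, beta, v_p)`.

```python
def equilibrium_weights(q):
    """Return [q*_1, q*_2, ...] with q*_j = Qbar_{j-1} / sum_k k q_k."""
    mean_index = sum((j + 1) * qj for j, qj in enumerate(q))
    # sum(q[j:]) is Qbar_j = q_{j+1} + q_{j+2} + ..., i.e. the numerator of q*_{j+1}
    return [sum(q[j:]) / mean_index for j in range(len(q))]

def mean_excess(q, beta, y):
    """r(y) as a ratio of two series in (beta y)^j / j!."""
    q_star = equilibrium_weights(q)
    num = 0.0
    den = 0.0
    term = 1.0                            # (beta * y)^j / j!
    for j in range(len(q_star)):
        num += sum(q_star[j:]) * term     # Qbar*_j = q*_{j+1} + q*_{j+2} + ...
        den += q_star[j] * term           # q*_{j+1}
        term *= beta * y / (j + 1)
    return num / (beta * den)

# Erlang-2 with beta = 1: r(y) = (2 + y) / (1 + y), so r(0) is the mean 2
r0 = mean_excess([0.0, 1.0], 1.0, 0.0)
r1 = mean_excess([0.0, 1.0], 1.0, 1.0)
```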
aggregate claims with mixed Erlang claim sizes
let $\{c_0, c_1, c_2, \ldots\}$ have the compound pgf
  $C(z) = \sum_{n=0}^{\infty} c_n z^n = P\{Q(z)\}$, where $P(z) = E(z^N) = \sum_{n=0}^{\infty} p_n z^n$
$\{c_0, c_1, c_2, \ldots\}$ is itself a compound distribution which may often be computed recursively
aggregate claims LT:
  $\int_0^\infty e^{-sy}\,dG(y) = P\{\tilde{f}(s)\} = P\left\{Q\!\left(\frac{\beta}{\beta+s}\right)\right\} = C\!\left(\frac{\beta}{\beta+s}\right)$
still of mixed Erlang form, with $q_n$ replaced by $c_n$
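for stop-loss moments ($k = 1$ gives the stop-loss premium):
  $R_k(y) = e^{-\beta y}\sum_{n=0}^{\infty} r_{n,k}\frac{(\beta y)^n}{n!} = \sum_{n=1}^{\infty} r_{n-1,k}\frac{e_n(y)}{\beta}$
where
  $r_{n,k} = \beta^{-k}\sum_{j=1}^{\infty} c_{n+j}\frac{\Gamma(k+j)}{\Gamma(j)}$
(valid for all $k \ge 0$, and $R_0(y) = \bar{G}(y)$)
for TVaR:
  $E(S \mid S > x) = x + \frac{\sum_{j=0}^{\infty} \bar{C}_j^* \frac{(\beta x)^j}{j!}}{\beta\sum_{j=0}^{\infty} c_{j+1}^* \frac{(\beta x)^j}{j!}}$
where $c_j^* = \bar{C}_{j-1}/\sum_{k=1}^{\infty} k c_k$, $\bar{C}_j = \sum_{k=j+1}^{\infty} c_k$, and $\bar{C}_j^* = \sum_{k=j+1}^{\infty} c_k^*$
also simpler asymptotic formulas (as $x \to \infty$) for VaR and TVaR using the Lundberg light-tailed approach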
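To make the aggregate formulas concrete, the sketch below assumes a Poisson claim count with mean `lam` (an assumption made here for illustration; the slides allow any $P(z)$), computes the $c_n$ by the Poisson case of the Panjer recursion, truncates them at `n_max`, and evaluates $\bar{G}(y) = R_0(y)$ and the stop-loss premium $R_1(y)$.

```python
import math

def compound_poisson_weights(lam, q, n_max):
    """c_0..c_{n_max} for C(z) = exp(lam * (Q(z) - 1)), with q[j-1] = q_j."""
    c = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        s = sum(j * q[j - 1] * c[n - j] for j in range(1, min(n, len(q)) + 1))
        c.append(lam * s / n)
    return c

def tail_and_stop_loss(c, beta, y):
    """Return (R_0(y), R_1(y)) using r_{n,0} and r_{n,1} from truncated c."""
    n_max = len(c) - 1
    tail = 0.0
    stop_loss = 0.0
    term = 1.0                                   # (beta * y)^n / n!
    for n in range(n_max):
        r0 = sum(c[n + 1:])                      # sum_{j>=1} c_{n+j}
        r1 = sum(j * c[n + j] for j in range(1, n_max - n + 1)) / beta
        tail += r0 * term
        stop_loss += r1 * term
        term *= beta * y / (n + 1)
    scale = math.exp(-beta * y)
    return scale * tail, scale * stop_loss

# Hypothetical example: lam = 2, claims an equal mix of Erlang-1 and Erlang-2, beta = 1
c = compound_poisson_weights(2.0, [0.5, 0.5], 120)
tail0, premium0 = tail_and_stop_loss(c, 1.0, 0.0)   # Pr(S > 0) and E(S)
```

The truncation point `n_max` is a tuning choice; it should be large enough that the discarded $c_n$ are negligible for the retention levels of interest.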
nontrivial examples of Erlang mixtures
many distributions are of mixed Erlang form, after changing the scale parameter
identity for Laplace transforms, for $\beta_1 < \beta$:
  $\frac{\beta_1}{\beta_1+s} = \frac{\beta}{\beta+s}\cdot\frac{\beta_1/\beta}{1 - \left(1 - \frac{\beta_1}{\beta}\right)\frac{\beta}{\beta+s}}$
this expresses the well-known result that a zero-truncated geometric sum of exponential random variables is again exponential
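The identity can be verified in two lines. The derivation below is a sketch using only the symbols on the slide, with $z = \beta/(\beta+s)$ introduced here as shorthand; the last line expands the geometric series to exhibit the zero-truncated geometric weights.

```latex
\begin{align*}
1 - \Bigl(1 - \tfrac{\beta_1}{\beta}\Bigr)\frac{\beta}{\beta+s}
  &= \frac{(\beta+s) - (\beta-\beta_1)}{\beta+s}
   = \frac{\beta_1+s}{\beta+s}, \\
\frac{\beta}{\beta+s}\cdot
\frac{\beta_1/\beta}{1 - \bigl(1-\tfrac{\beta_1}{\beta}\bigr)\frac{\beta}{\beta+s}}
  &= \frac{\beta}{\beta+s}\cdot\frac{\beta_1}{\beta}\cdot\frac{\beta+s}{\beta_1+s}
   = \frac{\beta_1}{\beta_1+s}, \\
\frac{(\beta_1/\beta)\,z}{1 - \bigl(1-\tfrac{\beta_1}{\beta}\bigr)z}
  &= \sum_{j=1}^{\infty} \frac{\beta_1}{\beta}
     \Bigl(1-\frac{\beta_1}{\beta}\Bigr)^{j-1} z^{j},
  \qquad z = \frac{\beta}{\beta+s}.
\end{align*}
```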
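Example 1 (mixture of two exponentials)
suppose that (without loss of generality) $\beta_1 < \beta_2$, $0 < p < 1$, and
  $f(y) = p\beta_1 e^{-\beta_1 y} + (1-p)\beta_2 e^{-\beta_2 y}$, $y > 0$
then
  $\tilde{f}(s) = p\frac{\beta_1}{\beta_1+s} + (1-p)\frac{\beta_2}{\beta_2+s}$,
and using the identity with $\beta$ replaced by $\beta_2$, it follows that
  $\tilde{f}(s) = \frac{\beta_2}{\beta_2+s}\left\{(1-p) + p\,\frac{\beta_1/\beta_2}{1 - \left(1 - \frac{\beta_1}{\beta_2}\right)\frac{\beta_2}{\beta_2+s}}\right\}$
that is, $\tilde{f}(s) = Q\!\left(\frac{\beta_2}{\beta_2+s}\right)$ where
  $Q(z) = z\left\{(1-p) + p\,\frac{\beta_1/\beta_2}{1 - \left(1 - \frac{\beta_1}{\beta_2}\right)z}\right\}$
i.e., $q_1 = (1-p) + p\left(\frac{\beta_1}{\beta_2}\right)$, and $q_j = p\left(\frac{\beta_1}{\beta_2}\right)\left(1 - \frac{\beta_1}{\beta_2}\right)^{j-1}$ for $j = 2, 3, \ldots$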
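A quick numerical check of Example 1: the sketch below builds the weights $q_j$ (truncated at a hypothetical `j_max`), evaluates the mixed Erlang tail with scale $\beta_2$, and compares it with the tail of the two-exponential mixture computed directly. The parameter values are illustrative assumptions.

```python
import math

def two_exponential_weights(p, beta1, beta2, j_max):
    """Return [q_1, ..., q_{j_max}] for Example 1 (requires beta1 < beta2)."""
    rho = beta1 / beta2
    q = [(1.0 - p) + p * rho]
    q += [p * rho * (1.0 - rho) ** (j - 1) for j in range(2, j_max + 1)]
    return q

def mixed_erlang_tail(q, beta, y):
    """Tail e^{-beta y} * sum_k Qbar_k (beta y)^k / k! for truncated weights."""
    total = 0.0
    qbar = sum(q)
    term = 1.0
    for k in range(len(q)):
        total += qbar * term
        qbar -= q[k]
        term *= beta * y / (k + 1)
    return math.exp(-beta * y) * total

p, beta1, beta2, y = 0.3, 1.0, 2.0, 1.5
q = two_exponential_weights(p, beta1, beta2, 80)
via_erlang = mixed_erlang_tail(q, beta2, y)
direct = p * math.exp(-beta1 * y) + (1.0 - p) * math.exp(-beta2 * y)
```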
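Example 2 (countable mixture of Erlangs)
suppose that
  $f(y) = \sum_{i=1}^{n}\sum_{k=1}^{\infty} p_{ik}\frac{\beta_i(\beta_i y)^{k-1} e^{-\beta_i y}}{(k-1)!}$
assuming that $\beta_i < \beta_n$ for $i < n$, the identity may be used with $\beta_1$ replaced by $\beta_i$ and $\beta$ by $\beta_n$ for each $i = 1, 2, \ldots, n$, to express the Laplace transform
  $\tilde{f}(s) = \sum_{i=1}^{n}\sum_{k=1}^{\infty} p_{ik}\left(\frac{\beta_i}{\beta_i+s}\right)^k$
in the form $\tilde{f}(s) = Q\!\left(\frac{\beta_n}{\beta_n+s}\right)$, with
  $q_j = \sum_{i=1}^{n}\sum_{k=1}^{j} p_{ik}\binom{j-1}{k-1}\left(\frac{\beta_i}{\beta_n}\right)^k\left(1 - \frac{\beta_i}{\beta_n}\right)^{j-k}$, $j = 1, 2, \ldots$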
in the following example, the distribution is not necessarily of phase-type or a combination of exponentials, and there is no simple representation for the $q_j$'s in general, but they may be obtained numerically in a straightforward manner
Example 3 (a sum of gammas)
consider the Laplace transform of $f(y)$ given by
  $\tilde{f}(s) = \prod_{i=1}^{n}\left(\frac{\beta_i}{\beta_i+s}\right)^{\alpha_i}$,
corresponding to the distribution of the sum $X_1 + X_2 + \cdots + X_n$, with the $X_i$'s being independent random variables, where $X_i$ has the gamma pdf $\beta_i(\beta_i y)^{\alpha_i - 1} e^{-\beta_i y}/\Gamma(\alpha_i)$
we assume that the $\alpha_i$'s are positive (not necessarily integers), but the sum $m = \sum_{i=1}^{n}\alpha_i$ is assumed to be a positive integer
assuming that $\beta_i < \beta_n$ for $i < n$, it follows that $\tilde{f}(s) = Q\!\left(\frac{\beta_n}{\beta_n+s}\right)$ where
  $Q(z) = z^m\prod_{i=1}^{n-1}\left\{\frac{\beta_i/\beta_n}{1 - \left(1 - \frac{\beta_i}{\beta_n}\right)z}\right\}^{\alpha_i}$
the probabilities $\{q_1, q_2, \ldots\}$ correspond to convolutions of negative binomial probabilities, shifted to the right by $m$
simple analytic formulas for $\{q_1, q_2, \ldots\}$ may be derived in some cases, such as when $\alpha_i = 1$ for all $i$ or when $n = 2$
in general, however, it follows that $q_j = 0$ for $j < m$, $q_m = \prod_{i=1}^{n-1}(\beta_i/\beta_n)^{\alpha_i}$, and $\{q_{m+1}, q_{m+2}, \ldots\}$ may be computed using the Panjer-type recursion
  $q_j = \frac{1}{j-m}\sum_{k=1}^{j-m}\left\{\sum_{i=1}^{n-1}\alpha_i\left(1 - \frac{\beta_i}{\beta_n}\right)^k\right\} q_{j-k}$, $j = m+1, m+2, \ldots$
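The recursion for the sum-of-gammas weights is short enough to sketch directly. The function name and the example parameters are assumptions made here; the code expects the largest rate to be listed last and the shape parameters to sum to a positive integer.

```python
import math

def sum_of_gammas_weights(alphas, betas, j_max):
    """Return a list q with q[j] = q_j for j = 0..j_max (q[0] is unused).

    betas[-1] must be the largest rate; sum(alphas) must be a positive integer m.
    """
    m = round(sum(alphas))
    beta_n = betas[-1]
    ratios = [b / beta_n for b in betas[:-1]]
    shapes = alphas[:-1]
    # s[k] = sum_{i<n} alpha_i * (1 - beta_i / beta_n)^k
    s = [0.0] + [sum(a * (1.0 - r) ** k for a, r in zip(shapes, ratios))
                 for k in range(1, j_max - m + 1)]
    q = [0.0] * (j_max + 1)
    q[m] = math.prod(r ** a for a, r in zip(shapes, ratios))
    for j in range(m + 1, j_max + 1):
        q[j] = sum(s[k] * q[j - k] for k in range(1, j - m + 1)) / (j - m)
    return q

# Sum of two exponentials with rates 1 and 2: q_j = 0.5^(j-1) for j >= 2
q_exp = sum_of_gammas_weights([1.0, 1.0], [1.0, 2.0], 10)

# Non-integer shapes 0.5 and 1.5 (m = 2), rates 1 and 2
q_gam = sum_of_gammas_weights([0.5, 1.5], [1.0, 2.0], 120)
```

In the second example the mean of the sum is $0.5/1 + 1.5/2 = 1.25$, which should match $\sum_j j q_j / \beta_n$.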
applications in ruin and surplus analysis
Sparre Andersen (renewal) risk model
mixed Erlang claim sizes
let $h_\delta(x)$ be the discounted (with parameter $\delta \ge 0$) density of the surplus immediately prior to ruin with zero initial surplus
geometric parameter $\phi_\delta = \int_0^\infty h_\delta(x)\,dx$, $0 < \phi_\delta < 1$
ladder height pdf
  $b_\delta(y) = \int_0^\infty f_x(y)\left\{\frac{h_\delta(x)}{\phi_\delta}\right\}dx$
is a mixture over the deductible $x$ of the excess loss pdf $f_x(y)$
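Laplace transform of the time of ruin (the ruin probability is the special case $\delta = 0$) is the compound geometric tail
  $\bar{G}_\delta(x) = \sum_{n=1}^{\infty}(1-\phi_\delta)\,\phi_\delta^n\,\bar{B}_\delta^{*n}(x)$, $x \ge 0$,
where $\bar{B}_\delta^{*n}$ is the tail of the $n$-fold convolution of the ladder height df
in the mixed Erlang claim size case with $f(y) = \sum_{j=1}^{\infty} q_j e_j(y)$, $f_x(y)$ is also mixed Erlang, in turn implying that the ladder height pdf
  $b_\delta(y) = \sum_{j=1}^{\infty} q_j(\delta)\, e_j(y)$
is still mixed Erlang, with LT $\tilde{b}_\delta(s) = Q_\delta\!\left(\frac{\beta}{\beta+s}\right)$, where $Q_\delta(z) = \sum_{j=1}^{\infty} q_j(\delta) z^j$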
hence, define the discrete compound geometric pgf
  $C_\delta(z) = \sum_{n=0}^{\infty} c_n(\delta) z^n = \frac{1-\phi_\delta}{1-\phi_\delta Q_\delta(z)}$,
and the previous results imply that
  $\bar{G}_\delta(x) = e^{-\beta x}\sum_{j=0}^{\infty}\bar{C}_j(\delta)\frac{(\beta x)^j}{j!} = \sum_{j=1}^{\infty}\bar{C}_{j-1}(\delta)\frac{e_j(x)}{\beta}$, where $\bar{C}_j(\delta) = \sum_{n=j+1}^{\infty} c_n(\delta)$
explicit expressions for the mixed Erlang mixing weights are available for some interclaim time distributions (e.g. Coxian), in which case recursive numerical evaluation is straightforward
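As an illustration only, the sketch below specialises to the classical compound Poisson model with $\delta = 0$, which is an assumption not made on the slide. In that special case $\phi_0 = 1/(1+\theta)$ for a relative security loading $\theta$, and the ladder height weights $q_j(0)$ are the equilibrium weights $q_j^*$. The compound geometric weights follow from $C(z)\{1 - \phi Q^*(z)\} = 1 - \phi$, and the ruin probability from the series for $\bar{G}_0(x)$. The function name, `theta`, and `n_max` are hypothetical.

```python
import math

def ruin_probability(q, beta, theta, x, n_max=400):
    """Ruin probability in the classical Poisson model (assumed here),
    with mixed Erlang claims q[j-1] = q_j and relative loading theta."""
    mean_index = sum((j + 1) * qj for j, qj in enumerate(q))
    # Equilibrium (ladder height) weights: q_star[j-1] = q*_j = Qbar_{j-1} / sum k q_k
    q_star = [sum(q[j:]) / mean_index for j in range(len(q))]
    phi = 1.0 / (1.0 + theta)
    # Compound geometric weights: c_0 = 1 - phi, c_n = phi * sum_j q*_j c_{n-j}
    c = [1.0 - phi]
    for n in range(1, n_max + 1):
        c.append(phi * sum(q_star[j - 1] * c[n - j]
                           for j in range(1, min(n, len(q_star)) + 1)))
    total = 0.0
    cbar = 1.0 - c[0]          # Cbar_0 = sum_{n>=1} c_n
    term = 1.0                 # (beta * x)^j / j!
    for j in range(n_max):
        total += cbar * term
        cbar -= c[j + 1]       # Cbar_{j+1} = Cbar_j - c_{j+1}
        term *= beta * x / (j + 1)
    return math.exp(-beta * x) * total

# Exponential claims: the known closed form is exp(-theta*beta*x/(1+theta)) / (1+theta)
psi = ruin_probability([1.0], 1.0, 0.2, 5.0)
```

For other interclaim time distributions the weights $q_j(\delta)$ and $\phi_\delta$ differ, and only the final two steps (the geometric recursion and the series) carry over unchanged.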
deficit at ruin given initial surplus $x$ (relevant quantity for risk management decisions), denoted by $|U_T|$, has mixed Erlang pdf (given that ruin occurs)
  $h_x(y) = \sum_{m=1}^{\infty} p_{m,x}\, e_m(y)$
where the distribution $\{p_{1,x}, p_{2,x}, \ldots\}$ is given by
  $p_{m,x} = \frac{\sum_{j=m}^{\infty} q_j(0)\,\tau_{j-m}(\beta x)}{\sum_{j=1}^{\infty} q_j(0)\sum_{i=0}^{j-1}\tau_i(\beta x)}$
and
  $\tau_n(x) = \sum_{i=0}^{\infty} c_i(0)\frac{x^{i+n}}{(i+n)!}$
more generally, it can be shown that
  $E\left\{e^{-\delta T} w(|U_T|)\, I(T < \infty) \mid U_0 = x\right\} = \sum_{m=1}^{\infty} R_{m,\delta}\, e_m(x)$
where $U_t$ is the surplus at time $t$, and the $R_{m,\delta}$'s are constants