A distribution-based stochastic model of cohort life expectancy, with applications
David McCarthy
Demography and Longevity Workshop, CEPAR, Sydney, Australia
26th July 2011
Literature review
Traditional stochastic mortality models work by specifying the distribution of the central rate of death at each age and time.
The starting point is the Lee-Carter (1992) model: $m_{x,t} = \exp(a_x + b_x k_t)$.
This assumes that all uncertainty affecting mortality occurs in the year of death.
If $k_t$ is normally distributed, $m_{x,t}$ is log-normal and the distribution of $q_{x,t}$ is nonstandard.
All $m_{x,t}$ are perfectly correlated at each $t$, and survival probabilities have awkward auto-correlation properties.
Haberman and Renshaw (2004) incorporate a cohort effect.
Cairns, Blake and Dowd (2005) have a 2-factor model.
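To make the Lee-Carter structure concrete, here is a minimal Python sketch. The random walk with drift for $k_t$ is a common modelling choice but is an assumption here, as are the function names and every parameter value; none of them come from the slides. Because $\log m_{x,t}$ is linear in the single index $k_t$, every age moves together in each year, which is the perfect correlation noted above.

```python
import math
import random

def lee_carter_rates(a, b, k_t):
    """Central death rates m_{x,t} = exp(a_x + b_x * k_t) for every age x at one time t."""
    return [math.exp(a_x + b_x * k_t) for a_x, b_x in zip(a, b)]

def simulate_k(k0, drift, vol, years, rng):
    """Random walk with drift for the period index k_t (assumed process, for illustration)."""
    path = [k0]
    for _ in range(years):
        path.append(path[-1] + drift + vol * rng.gauss(0.0, 1.0))
    return path

# Hypothetical parameters for three ages; not estimates from any data set.
a = [-4.0, -3.0, -2.0]
b = [0.02, 0.015, 0.01]
rng = random.Random(0)
k_path = simulate_k(k0=0.0, drift=-1.0, vol=0.5, years=5, rng=rng)
rates_by_year = [lee_carter_rates(a, b, k) for k in k_path]
```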
Shortcomings of traditional models
Cannot estimate the statistical distribution of life insurance, pension and annuity liabilities without intensive numerical work, owing to the nonstandard distribution and awkward auto-correlation properties of survival probabilities and probabilities of death.
Can be difficult to allow for cohort effects (e.g. the Haberman and Renshaw model is difficult to estimate).
A distribution-based model
Instead, we ignore annual probabilities of death.
Regard the lifespan of an individual, conditional on reaching a certain age, as a draw from a statistical distribution.
In its basic form, we can regard the parameters as fixed when the individual reaches the conditioning age (implying that all uncertainty in the parameters is resolved by that time, which is admittedly unrealistic).
Since lifespan is the outcome of a large number of random factors, each of which is (broadly) independent, we expect an approximately normal distribution (empirical work confirms this).
So rather than modelling a series of annual mortality probabilities, we just choose the parameters of the normal distribution appropriately, and we are done!
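As an illustration of treating lifespan as a draw from a normal distribution truncated at the conditioning age, the sketch below computes the conditional CDF of age at death and the remaining life expectancy. The function names and the example values (mean 80, standard deviation 10, conditioning age 60) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def truncated_cdf(x, mu, sigma, lower=60.0):
    """Pr(age at death < x | alive at `lower`) under a normal lifespan distribution."""
    if x <= lower:
        return 0.0
    dist = NormalDist(mu, sigma)
    return (dist.cdf(x) - dist.cdf(lower)) / (1.0 - dist.cdf(lower))

def remaining_life_expectancy(mu, sigma, lower=60.0):
    """Mean of the normal truncated below at `lower`, minus the conditioning age."""
    std = NormalDist()
    z = (lower - mu) / sigma
    inverse_mills = std.pdf(z) / (1.0 - std.cdf(z))
    return mu + sigma * inverse_mills - lower

# Hypothetical cohort: mean age at death 80, standard deviation 10.
prob_dead_by_90 = truncated_cdf(90.0, 80.0, 10.0)
expected_years_left = remaining_life_expectancy(80.0, 10.0)
```

Two parameters are enough to produce the whole survival curve from age 60, which is what makes the liability calculations later in the talk tractable.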
Fit is quite close...
To persuade you, I have fitted truncated normal distributions to the age at death of different UK male population cohorts, and plotted both the empirical and fitted CDFs below.
[Figure: Pr(Age at death < x | X >= 60), empirical and fitted, for the cohorts of 1862, 1899, 1915 and 1929. Horizontal axis: Age (X), 60 to 110. Vertical axis: 0.0 to 1.0.]
Financial consequences of deviations are quite small
So although this approach may get mortality probabilities in individual years quite wrong (especially at older ages), the financial significance of these deviations is small, especially for UK males.
[Figure: KS (males), Delta (Ann) (males), KS (females) and Delta (Ann) (females) by cohort birth year, 1860 to 1930. Vertical axis: 0.00% to 3.50%.]
Fitting cohorts individually (using K-S procedure)
Parameters look like this, fitted to UK population male and female cohorts born from 1862 to 1929 (about half of the 1929 cohort was still alive in 2009).
[Figure: Mu (left-hand axis, 70 to 90) and Sigma (right-hand axis, 0 to 16) by cohort birth year, 1860 to 1930.]
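The slides do not spell out the K-S procedure. One plausible reading, assumed here, is to choose the pair (mu, sigma) that minimises the Kolmogorov-Smirnov distance between the empirical and fitted truncated-normal CDFs. The sketch below does this by plain grid search on synthetic data; the grids, function names and the target values 78 and 9 are all hypothetical.

```python
from statistics import NormalDist

def truncated_cdf(x, mu, sigma, lower=60.0):
    """Pr(age at death < x | alive at `lower`) under a normal lifespan distribution."""
    dist = NormalDist(mu, sigma)
    base = dist.cdf(lower)
    return max(0.0, (dist.cdf(x) - base) / (1.0 - base))

def ks_distance(ages, empirical_cdf, mu, sigma):
    """Largest absolute gap between the empirical and fitted CDFs."""
    return max(abs(e - truncated_cdf(x, mu, sigma)) for x, e in zip(ages, empirical_cdf))

def fit_by_ks(ages, empirical_cdf, mu_grid, sigma_grid):
    """Grid search for the (mu, sigma) pair with the smallest K-S distance."""
    best = min(
        (ks_distance(ages, empirical_cdf, m, s), m, s)
        for m in mu_grid
        for s in sigma_grid
    )
    return best[1], best[2], best[0]

# Synthetic check: build a "cohort" CDF from known parameters and recover them.
ages = list(range(61, 111))
target = [truncated_cdf(x, 78.0, 9.0) for x in ages]
mu_grid = [70 + 0.5 * i for i in range(41)]    # 70.0 to 90.0
sigma_grid = [4 + 0.5 * i for i in range(25)]  # 4.0 to 16.0
mu_hat, sigma_hat, ks = fit_by_ks(ages, target, mu_grid, sigma_grid)
```

A real fit would replace `target` with the observed cumulative proportion of the cohort dead by each age, and would likely use a proper optimiser rather than a grid.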
Incorporating stochastic mortality into the model
We just impose a stochastic process/distribution onto the parameters of the underlying normal distribution.
Two sensible possibilities:
Bayesian theory suggests the normal-scaled inverse gamma (NSIG) distribution for $\mu$ and $\sigma^2$ (the conjugate prior of the normal with unknown mean and variance; a 4-parameter distribution, cov = 0).
In this case, lifespan conditional on the meta-parameters will have a Student's t distribution (approximately normal for high degrees of freedom)...
... but specifying a process generating this distribution is hard.
We also consider a bivariate normal (BVN) distribution for $\mu$ and $\sigma$ (a 5-parameter distribution).
This allows a natural process for the time series of the parameters for different cohorts...
... but the unconditional distribution of lifespan is not easy to calculate.
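The two candidate distributions can be sampled as in the sketch below. The NSIG parameter names (`mu0`, `lam`, `alpha`, `beta`) follow one common textbook parameterisation and are an assumption, since the slides do not name the meta-parameters. The bivariate normal draw uses a hand-written Cholesky factor. All numerical values are hypothetical.

```python
import math
import random

def draw_nsig(mu0, lam, alpha, beta, rng):
    """One draw of (mu, sigma^2): sigma^2 ~ Inv-Gamma(alpha, beta), mu | sigma^2 ~ N(mu0, sigma^2 / lam)."""
    # random.gammavariate takes (shape, scale), so scale = 1 / beta gives rate beta.
    sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
    mu = rng.gauss(mu0, math.sqrt(sigma2 / lam))
    return mu, sigma2

def draw_bvn(mean_mu, mean_sigma, var_mu, var_sigma, cov, rng):
    """One draw of (mu, sigma) from a bivariate normal via its Cholesky factor."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    sd_mu = math.sqrt(var_mu)
    slope = cov / sd_mu
    mu = mean_mu + sd_mu * z1
    sigma = mean_sigma + slope * z1 + math.sqrt(var_sigma - slope ** 2) * z2
    return mu, sigma

rng = random.Random(1)
# Hypothetical meta-parameters, for illustration only.
mu_nsig, sigma2_nsig = draw_nsig(mu0=85.0, lam=10.0, alpha=50.0, beta=49.0 * 81.0, rng=rng)
mu_bvn, sigma_bvn = draw_bvn(85.0, 9.0, var_mu=1.0, var_sigma=0.25, cov=0.2, rng=rng)
```

The BVN version takes five inputs (two means, two variances, one covariance), matching the 5-parameter count above, and permits a non-zero covariance between the two parameters.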
BVN vs NSIG: fitted UK male population born 1950
[Figure: two panels, "Bivariate normal" and "Normal-scaled inverse-gamma". Vertical axes: 0 to 0.003.]
Modelling the lifespan of a cohort of individuals
Conditional on a realisation of $\mu$ and $\sigma$, each individual in the cohort's remaining lifespan is assumed to be independently (and truncated-normally) distributed.
So the present value of a life insurance policy (or annuity) on their life will have a truncated log-normal (or scaled, shifted truncated log-normal) distribution.
And the sum of the present values of many (>50, say) of these liabilities will have an approximately normal distribution, by the central limit theorem.
All lives in a given cohort, though, share one realisation of $\mu$ and $\sigma$, which induces correlation between them and makes the distribution of the sum of liabilities non-normal (or, as we shall see, actually a heavier-tailed normal distribution).
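A small Monte Carlo sketch shows how sharing one realisation of the parameters across a cohort inflates the variance of total liabilities. For brevity it makes several simplifying assumptions that are not in the slides: only mu is random (sigma is held at 9), annuities pay 1 per year continuously from age 60, and the force of interest is 3%. All names and values are hypothetical.

```python
import math
import random
from statistics import NormalDist, pvariance

def portfolio_pv(n, mu, sigma, rng, delta=0.03, start=60.0):
    """Total PV of n continuous annuities of 1 per year; lives are i.i.d. given (mu, sigma)."""
    dist = NormalDist(mu, sigma)
    p_start = dist.cdf(start)
    total = 0.0
    for _ in range(n):
        # Inverse-CDF draw from the normal truncated below at the start age.
        age_at_death = dist.inv_cdf(p_start + rng.random() * (1.0 - p_start))
        total += (1.0 - math.exp(-delta * (age_at_death - start))) / delta
    return total

def simulate_totals(n, sims, rng, sd_mu):
    """Each simulation draws one mu shared by the whole cohort (sd_mu=0 switches this off)."""
    return [portfolio_pv(n, rng.gauss(85.0, sd_mu), 9.0, rng) for _ in range(sims)]

rng = random.Random(42)
idiosyncratic_only = simulate_totals(n=200, sims=300, rng=rng, sd_mu=0.0)
with_cohort_risk = simulate_totals(n=200, sims=300, rng=rng, sd_mu=1.5)
variance_ratio = pvariance(with_cohort_risk) / pvariance(idiosyncratic_only)
```

In this setup the shared-parameter portfolio should show a markedly larger variance, because the common draw of mu does not diversify away as n grows.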
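Distribution of liabilities
If $L$ is the total present value of all liabilities on a cohort, then it has this distribution if we use the NSIG distribution for $\mu$ and $\sigma^2$:
$$f(L \mid \theta) = \iint f(L \mid \mu, \sigma^2)\, f(\mu, \sigma^2 \mid \theta)\, d\mu\, d\sigma^2$$
where $\theta$ denotes the four NSIG meta-parameters, $f(L \mid \mu, \sigma^2)$ is the (approximately) normal conditional distribution of the PV of liabilities, and $f(\mu, \sigma^2 \mid \theta)$ is the NSIG pdf;
and this distribution if $(\mu, \sigma)$ has a bivariate normal distribution:
$$f(L \mid \phi) = \iint f(L \mid \mu, \sigma)\, f(\mu, \sigma \mid \phi)\, d\mu\, d\sigma$$
where $\phi$ denotes the five bivariate normal meta-parameters, $f(L \mid \mu, \sigma)$ is the (approximately) normal conditional distribution of the PV of liabilities, and $f(\mu, \sigma \mid \phi)$ is the bivariate normal pdf.
(These both need to be calculated numerically: painful but straightforward.)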
Empirical work: UK mortality 1922-2009
Obtained population mortality data for UK males and females over the age of 60 between 1922 and 2009 from www.mortality.org.
Also obtained:
the proportion of 60-year-old males and females who smoked each year from 1928 to 2009, from Forey et al (2002);
occupation-based social class measures for each cohort from birth year 1895 onwards, separately for males and females, from Heath and Payne (1999);
per-capita GDP growth in the year before birth for each cohort, from Maddison (2006).
We need to estimate the trend in the parameters, and the range of uncertainty in the parameters.
Demographic data
Variables are highly trended (except GDP growth).
So we have to estimate males and females jointly for trended variables, using the difference between male and female mortality to provide accurate estimates of the effects of the data.
[Figure: demographic data by year, 1850 to 1960. Left-hand axis (percentage, 0 to 100): Smokers, M, nearest to age 60; Smokers, F, nearest to age 60; Social classes I+II, M, 35+; Social classes I+II, F, 35+. Right-hand axis (-0.15 to 0.15): PC GDP growth.]
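Estimating the trend: maximum likelihood procedure
We calculated the annual probability of death for each cohort at each age from the normal distribution assumption as follows:
$$\mu_t = \mu_0 + a\,(t - \mathrm{YEAR}_0) + \beta' Z_t$$
$$\sigma_t = \sigma_0 + b\,(t - \mathrm{YEAR}_0) + \gamma' Z_t, \qquad t = \mathrm{YEAR}_0, \ldots, 1949$$
$$q_{x,t} = \frac{\Phi(x+1;\, \mu_t, \sigma_t) - \Phi(x;\, \mu_t, \sigma_t)}{1 - \Phi(60;\, \mu_t, \sigma_t)}$$
where $Z_t$ is the vector of covariates for cohort $t$, $\beta$ and $\gamma$ are the corresponding coefficient vectors, and $\Phi(\cdot\,;\, \mu, \sigma)$ is the normal CDF with mean $\mu$ and standard deviation $\sigma$.
And then used the multinomial distribution to calculate the log-likelihood as:
$$\ell(\{D_{x,t}, N_{x,t}\}) = \sum_x \sum_t \mathbf{1}_{\{x + t \le 2009\}} \left[ (N_{x,t} - D_{x,t}) \log(1 - q_{x,t}) + D_{x,t} \log(q_{x,t}) \right]$$
(This assumes that there is no immigration or emigration; the Poisson alternative produced similar estimates.)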
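The likelihood calculation can be sketched in a few lines of Python. This follows the expressions on this slide as far as they can be read: the death probability uses the difference of normal CDFs at ages x+1 and x divided by the survival probability at age 60, and the log-likelihood sums the terms (N - D) log(1 - q) + D log(q) over observed cells. Treating t as the cohort birth year (so that x + t is the calendar year) is an assumption, as are the dictionary layout and all names. Cohort parameters are passed in directly rather than built from the trend and covariates.

```python
import math
from statistics import NormalDist

def death_prob(x, mu, sigma, lower=60.0):
    """Probability of death between ages x and x+1, given survival to `lower`."""
    dist = NormalDist(mu, sigma)
    return (dist.cdf(x + 1) - dist.cdf(x)) / (1.0 - dist.cdf(lower))

def log_likelihood(deaths, exposures, mu_by_cohort, sigma_by_cohort, last_year=2009):
    """Sum of (N - D) log(1 - q) + D log(q) over cells keyed by (age x, cohort birth year t)."""
    total = 0.0
    for (x, t), d in deaths.items():
        if x + t > last_year:  # cell not yet observed (assumes t is the birth year)
            continue
        q = death_prob(x, mu_by_cohort[t], sigma_by_cohort[t])
        n = exposures[(x, t)]
        total += (n - d) * math.log(1.0 - q) + d * math.log(q)
    return total

# Synthetic single-cohort example with hypothetical numbers.
exposures = {(x, 1900): 1000.0 for x in range(60, 100)}
deaths = {(x, 1900): 1000.0 * death_prob(x, 80.0, 9.0) for x in range(60, 100)}
ll_at_truth = log_likelihood(deaths, exposures, {1900: 80.0}, {1900: 9.0})
ll_elsewhere = log_likelihood(deaths, exposures, {1900: 70.0}, {1900: 9.0})
```

In a full fit, `mu_by_cohort` and `sigma_by_cohort` would be generated from the trend coefficients and covariates, and the log-likelihood would be maximised over those coefficients.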
Parameter estimates
Fitted these variables jointly for men and women.
Estimates are not stable for sub-periods (a regression-on-trended-variables problem?).
Fitted parameters vs. individual KS cohort estimates
ML procedure provides quite accurate fits for females in-sample (less accurate for males).
[Figure: two panels, MALES and FEMALES, each showing Mu (left-hand axis) and Sigma (right-hand axis) by cohort birth year, 1870 to 1960.]
We then assumed that all factors not allowed for in the model followed a random walk, so fitted a bivariate random walk (RW) to the residuals.
Used the mean value implied by model projections on 1900-2009 data, and the distribution implied by the fitted RW, starting in 1929 (half of this cohort was dead by 2009).
Resulting meta-parameters for future cohorts of annuitants
Procedure gives the following values for meta-parameters (assuming that $(\mu, \sigma)$ follows a bivariate normal distribution).
(Parameter values reflect the better in-sample fit of the model to the data for females.)
[Table: values of the five meta-parameters: expected value of age at death; standard deviation of age at death; variance of expected age at death; variance of standard deviation of age at death; covariance of the two parameters.]
99% VaR for annuity portfolios
Used projected meta-parameters to estimate the 99% VaR for annuity portfolios of different sizes on different cohorts (VaR above the mean, expressed as a percentage of the mean value of the portfolio).
(VaRs for females are lower, reflecting the better in-sample fit of the model.)
With idiosyncratic risk only, VaR falls in proportion to 1/sqrt(n) as portfolio size n increases.
With cohort mortality risk present, VaR falls much more slowly.
Approximation
The NSIG distribution allows us to express the mean and variance of the unconditional distribution of lifespan as follows:
$$E(X \mid \theta) = E(\mu)$$
$$\mathrm{Var}(X \mid \theta) = \mathrm{var}(\mu) + \mathrm{var}(\sigma) + E(\sigma)^2$$
where $\theta$ denotes the four NSIG meta-parameters.
Applying these same results to one annuity allows us to approximate the distribution of the present value of a portfolio of $n$ annuities as a normal distribution with the following parameters:
$$M = n\,E(\mu^*)$$
$$\Sigma^2 = n^2\,\mathrm{var}(\mu^*) + n\,\mathrm{var}(\sigma^*) + n\,E(\sigma^*)^2$$
(Starred quantities refer to the mean and standard deviation of the present value of one annuity if underlying mortality is allowed to vary.)
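The normal approximation is easy to evaluate directly. The sketch below assumes the portfolio variance follows from the law of total variance, with the systematic term scaling as n squared because all n lives share the same realisation of the parameters, and the idiosyncratic terms scaling as n. The function names and the numerical inputs are hypothetical.

```python
from statistics import NormalDist

def portfolio_moments(n, mean_mu_star, var_mu_star, mean_sigma_star, var_sigma_star):
    """Mean and variance of the PV of n annuities under the normal approximation."""
    mean = n * mean_mu_star
    # Shared (cohort) risk scales with n^2; idiosyncratic risk scales with n.
    variance = n ** 2 * var_mu_star + n * (var_sigma_star + mean_sigma_star ** 2)
    return mean, variance

def var_above_mean_pct(n, *moments, level=0.99):
    """VaR above the mean at the given level, as a fraction of the portfolio mean."""
    mean, variance = portfolio_moments(n, *moments)
    return NormalDist().inv_cdf(level) * variance ** 0.5 / mean

# Hypothetical one-annuity inputs: E(mu*), var(mu*), E(sigma*), var(sigma*).
idiosyncratic = (15.0, 0.0, 4.0, 0.0)
with_cohort_risk = (15.0, 0.04, 4.0, 0.01)
for n in (100, 1000, 10000):
    print(n, var_above_mean_pct(n, *idiosyncratic), var_above_mean_pct(n, *with_cohort_risk))
```

With the systematic variance set to zero, the relative VaR shrinks with 1/sqrt(n); with it switched on, the relative VaR levels off at a floor set by var(mu*), which is the slow decline described for portfolios exposed to cohort mortality risk.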
The approximation is remarkably accurate...
Males: approximate VaR is never more than 0.4% out.
Females: approximate VaR is never more than 0.2% out.
The implication is that even in the presence of stochastic mortality, portfolio values are very closely approximated by a normal distribution, but with a higher variance to incorporate the effect of stochastic mortality.
Conclusion
Developed a stochastic mortality model based on fitting the distribution of the age at death of individuals in a cohort, rather than modelling the distribution of annual probabilities of death.
This allows accurate calculation of the distribution of mortality-contingent liabilities, incorporating the effects of stochastic mortality (as well as idiosyncratic risk).
Fitted the model to UK cohort mortality, and used it to estimate the 99% VaR of annuity portfolios on various cohorts and of various sizes.
Demonstrated that the PV of annuity liabilities is approximately normal, but with a higher variance, even in the presence of stochastic mortality, and derived an accurate approximation of the parameters of this distribution.