Survival Analysis Methods in Insurance: Applications to Car Insurance Contracts
Abder OULIDI (1, 2), Jean-Marie MARION (1), Hérvé GANACHAUD (3)
(1) Institut de Mathématiques Appliquées (IMA), Angers, France
(2) Institut de Statistiques et d'Economie Appliquées (INSEA), Rabat, Morocco
(3) Mutuelles du Mans Assurances (MMA), Le Mans, France
Context
- Solvency II directives require insurers to map and identify their own risks, and to analyse and model them.
- Car insurance is a mature market: competition is expanding (banks and insurers) while the population of insurable motor vehicles is nearly stable.
- Insurers are therefore led to develop optimal models for monitoring and managing their portfolios.
Plan
1. Introduction: definitions and notations
2. Survival models
2.1 Non parametric models
2.2 Parametric models
2.3 Semi parametric models
3. Application
3.1 Data set
3.2 Results
4. Conclusion and perspectives
1- INTRODUCTION: Applied Fields Statutory Mortality Tables. Experience Mortality Tables. Insurance Contracts.
1- INTRODUCTION: Definitions and notations
- $T$: survival time, from the starting point until cancellation of a contract.
- $f$: probability density function of $T$; $F$: its cumulative distribution function.
- $S(t) = P(T > t)$: survival function.
- $\lambda(t)$: hazard function, defined by
$$\lambda(t) = \frac{f(t)}{S(t)} = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P\bigl(t \le T < t + \Delta t \mid T \ge t\bigr)$$
1- INTRODUCTION: Definitions and notations
- $A(t)$: cumulative hazard function, defined by
$$A(t) = \int_0^t \lambda(s)\, ds$$
- $S(t) = \exp\bigl(-A(t)\bigr)$, since $S(0) = 1$.
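The link between hazard, cumulative hazard and survival function can be checked numerically. The sketch below is only an illustration: the Weibull hazard and its parameters are hypothetical choices, not taken from the slides, and the integral is approximated with a simple midpoint rule.

```python
import math

def weibull_hazard(t, shape, scale):
    """Hazard of a Weibull distribution (illustrative choice, not from the slides)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def cumulative_hazard(hazard, t, steps=10_000):
    """A(t): integral of the hazard over [0, t], approximated by the midpoint rule."""
    h = t / steps
    return sum(hazard((j + 0.5) * h) for j in range(steps)) * h

def survival(hazard, t):
    """S(t) = exp(-A(t))."""
    return math.exp(-cumulative_hazard(hazard, t))

shape, scale = 1.5, 10.0  # hypothetical parameters
hazard = lambda s: weibull_hazard(s, shape, scale)

s_numeric = survival(hazard, 5.0)
s_closed = math.exp(-(5.0 / scale) ** shape)  # closed-form Weibull survival function
```

The two values agree up to the integration error, which illustrates that the hazard function alone determines the survival function.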
2- SURVIVAL MODELS: Non parametric models
- Kaplan-Meier estimator
- Peterson estimator
- Nelson estimator
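As a rough illustration of two of these estimators, the sketch below computes the Kaplan-Meier survival estimate and the Nelson cumulative hazard estimate under right censoring. The function name and the toy durations are hypothetical; they are not the portfolio data.

```python
def kaplan_meier_and_nelson(times, events):
    """Return [(t, S_KM(t), A_Nelson(t))] at each distinct cancellation time.

    times  : observed durations
    events : 1 if the contract was cancelled at that time, 0 if right censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, cum_haz = 1.0, 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = sum(e for (u, e) in data if u == t)   # cancellations at time t
        m = sum(1 for (u, _) in data if u == t)   # all exits (events + censorings) at t
        if d > 0:
            surv *= 1.0 - d / n_at_risk           # Kaplan-Meier product term
            cum_haz += d / n_at_risk              # Nelson increment
            out.append((t, surv, cum_haz))
        n_at_risk -= m
        i += m
    return out

# Toy durations in years (hypothetical)
times = [2.0, 3.5, 3.5, 5.0, 7.0, 8.0]
events = [1, 1, 0, 1, 0, 1]
table = kaplan_meier_and_nelson(times, events)
```

Censored contracts never trigger a step in either estimate; they only shrink the number at risk for later cancellation times.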
2- SURVIVAL MODELS: Parametric models
$(t_1, \dots, t_n)$ is a possibly right- and left-censored set of observations from:
$$\ln t_i = \beta' z_i + \varepsilon_i$$
The distribution of the error term $\varepsilon_i$ can be specified as exponential, Weibull, log-normal or log-logistic.
2- SURVIVAL MODELS: Semi-parametric models
Cox model with time-fixed covariates:
$$\lambda(t \mid z) = \lambda_0(t)\, \exp(\beta' z)$$
- $\beta$: a vector of regression parameters
- $z$: a vector of covariate values
- $\lambda_0$: an unspecified baseline hazard function
2- SURVIVAL MODELS: Semi-parametric models
The Cox regression model is a proportional hazards model: the «hazard ratio»
$$\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)} = \exp\bigl(\beta'(z_1 - z_2)\bigr)$$
is independent of $t$.
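The independence from $t$ follows in one line once the Cox form of the hazard is substituted. The symbols $\lambda_0$ (baseline hazard) and $\beta$ (coefficient vector) are the notation assumed here for the model of this slide:

```latex
\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)}
  = \frac{\lambda_0(t)\,\exp(\beta' z_1)}{\lambda_0(t)\,\exp(\beta' z_2)}
  = \exp\bigl(\beta'(z_1 - z_2)\bigr)
```

The baseline hazard cancels, so two contracts with fixed covariates keep the same relative cancellation risk at every duration.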
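2- SURVIVAL MODELS: Semi-parametric models
$t_1, \dots, t_n$ is a sample of ordered observations. To estimate $\beta$ we use the «partial likelihood function»:
$$L(t_1, \dots, t_n; \beta) = \prod_{i=1}^{n} \frac{\exp(\beta' z_i)}{\sum_{k \in R(t_i)} \exp(\beta' z_k)}$$
where $R(t_i)$ is the set of contracts at risk at time $t_i$.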
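A minimal sketch of the log partial likelihood is given here for illustration. It assumes time-fixed covariates, no tie correction, and that only observed cancellations contribute a factor (the usual convention under right censoring, which the slide does not spell out). The function name and toy data are hypothetical.

```python
import math

def cox_log_partial_likelihood(beta, times, events, covariates):
    """Log partial likelihood for time-fixed covariates (no tie correction).

    beta       : list of regression coefficients
    times      : observed durations
    events     : 1 for an observed cancellation, 0 for right censoring
    covariates : list of covariate vectors z_i, one per contract
    """
    def linear_predictor(z):
        return sum(b * x for b, x in zip(beta, z))

    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored contracts only appear inside risk sets
        # Risk set R(t_i): contracts still in force just before t_i
        risk = [k for k, t_k in enumerate(times) if t_k >= t_i]
        denom = sum(math.exp(linear_predictor(covariates[k])) for k in risk)
        loglik += linear_predictor(covariates[i]) - math.log(denom)
    return loglik

# Toy data (hypothetical): one binary covariate
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 1, 0, 1]
covariates = [[1.0], [0.0], [1.0], [0.0]]
ll_at_zero = cox_log_partial_likelihood([0.0], times, events, covariates)
```

At $\beta = 0$ every contract in a risk set has the same weight, so the value reduces to minus the sum of the log risk-set sizes at the cancellation times.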
2- SURVIVAL MODELS: Semi-parametric models
How to test the proportional hazards assumption?
- Plots of the log cumulative hazard rate.
- Scaled Schoenfeld residuals.
An alternative to proportional hazards is time-varying coefficients:
$$\beta(t) = \beta + \theta\, g(t)$$
If $\theta \neq 0$, the «hazard ratio» is not constant with respect to time $t$.
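To make the residual-based diagnostic more concrete, here is a sketch of unscaled Schoenfeld residuals for a single time-fixed covariate. The scaled version additionally rescales by an estimate of the residual variance, which is omitted here. The function name and toy data are hypothetical, and ties are not treated specially.

```python
import math

def schoenfeld_residuals(beta, times, events, z):
    """Unscaled Schoenfeld residuals for one time-fixed covariate.

    Returns [(t_i, r_i)] for each observed cancellation, where r_i is the
    covariate of the cancelled contract minus the weighted mean of the
    covariate over the risk set, with weights exp(beta * z_k).
    """
    out = []
    for t_i, d_i, z_i in zip(times, events, z):
        if not d_i:
            continue  # residuals are defined at event times only
        risk = [k for k, t_k in enumerate(times) if t_k >= t_i]
        weights = [math.exp(beta * z[k]) for k in risk]
        mean_z = sum(w * z[k] for w, k in zip(weights, risk)) / sum(weights)
        out.append((t_i, z_i - mean_z))
    return out

# Toy data (hypothetical)
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 1, 0, 1]
z = [1.0, 0.0, 1.0, 0.0]
residuals = schoenfeld_residuals(0.0, times, events, z)
```

Under proportional hazards these residuals should show no trend when plotted against time; a visible slope points towards a time-varying coefficient.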
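2- SURVIVAL MODELS: Semi-parametric models
Alternative models:
A- Cox model with time-dependent covariates:
$$\lambda(t \mid z(t)) = \lambda_0(t)\, \exp\bigl(\beta' z(t)\bigr)$$
The «partial likelihood function» is defined by:
$$L(t_1, \dots, t_n; \beta) = \prod_{i=1}^{n} \frac{\exp\bigl(\beta' z_i(t_i)\bigr)}{\sum_{k \in R(t_i)} \exp\bigl(\beta' z_k(t_i)\bigr)}$$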
2- SURVIVAL MODELS
B- Non parametric Aalen's additive regression model:
$$\lambda(t \mid Z(t)) = \beta_0(t) + \beta'(t)\, Z(t)$$
Our data, based on a sample of size $n$, consist of the triples $(T_i, \delta_i, Z_i(t))$, where $\delta_i$ is the event indicator for the $i$th contract.
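2- SURVIVAL MODELS
Aalen's additive regression model
We define:
$$N(t) = \bigl(N_i(t)\bigr)_{1 \le i \le n} \quad \text{with} \quad N_i(t) = \mathbf{1}\{T_i \le t;\ \delta_i = 1\}$$
$$Y(t) = \bigl(Y_i(t)\bigr)_{1 \le i \le n} \quad \text{with} \quad Y_i(t) = \mathbf{1}\{T_i \ge t\} \quad \text{(observation at risk at } t^-\text{)}$$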
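2- SURVIVAL MODELS
Aalen's additive regression model
The additive hazard model can be written in matrix form:
$$dN(t) = Y(t)\, dB(t) + dM(t)$$
- $Y(t)$ is the design matrix (multiplicative intensity model), built from the at-risk indicators $Y_i(t)$ and the covariates $Z_i(t)$
- $M(t)$ is a mean zero martingale
- $B(t) = \bigl(B_k(t)\bigr)_{1 \le k \le p}$ with $B_k(t) = \int_0^t \beta_k(s)\, ds$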
2- SURVIVAL MODELS
Aalen's additive regression model
The least squares estimator for $B(t)$ is given by:
$$\hat{B}(t) = \sum_{i:\ T_i \le t} \bigl[Y'(T_i)\, Y(T_i)\bigr]^{-1}\, Y'(T_i)\, \mathbf{1}(T_i)$$
where $\mathbf{1}(T_i)$ is a vector with $i$th element equal to 1 if contract $i$ is cancelled at $T_i$.
An estimate of $\beta_k(t)$ is given by the slope of the estimate $\hat{B}_k(t)$, or by using smoothing techniques.
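The least squares step can be sketched for the simplest case of an intercept plus one time-fixed covariate, where the 2 x 2 matrix $Y'(T_i)Y(T_i)$ can be inverted by hand. This is an illustration under assumptions not stated on the slide: the rows of the design matrix are taken to be $(1, z_k)$ for contracts at risk and zero otherwise, and the function name and toy data are hypothetical.

```python
def aalen_cumulative_coefficients(times, events, z):
    """Least squares estimate of (B_0(t), B_1(t)) for one covariate.

    Returns [(t, B0_hat, B1_hat)] at each cancellation time, stopping as
    soon as the matrix Y'(t)Y(t) becomes singular.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    b0 = b1 = 0.0
    path = []
    for i in order:
        if not events[i]:
            continue  # censored contracts add no increment
        at_risk = [k for k in range(len(times)) if times[k] >= times[i]]
        # Entries of Y'Y when the rows of Y are (1, z_k) for contracts at risk
        s0 = float(len(at_risk))
        s1 = sum(z[k] for k in at_risk)
        s2 = sum(z[k] ** 2 for k in at_risk)
        det = s0 * s2 - s1 * s1
        if abs(det) < 1e-12:
            break
        # Increment (Y'Y)^{-1} Y' 1(T_i), where Y' 1(T_i) = (1, z_i)'
        b0 += (s2 - s1 * z[i]) / det
        b1 += (s0 * z[i] - s1) / det
        path.append((times[i], b0, b1))
    return path

# Toy data (hypothetical): one binary covariate
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 0, 1, 1]
z = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
path = aalen_cumulative_coefficients(times, events, z)
```

With a binary covariate, each increment of the intercept equals one over the number of contracts at risk in the reference group, which gives a quick sanity check on the output.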
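2- SURVIVAL MODELS
Aalen's additive regression model
The estimator of the covariance matrix of $\hat{B}(t)$ is:
$$\widehat{\mathrm{Var}}\bigl(\hat{B}(t)\bigr) = \sum_{i:\ T_i \le t} \bigl[Y'(T_i)\, Y(T_i)\bigr]^{-1}\, Y'(T_i)\, \mathbf{1}(T_i)\, \mathbf{1}'(T_i)\, Y(T_i)\, \bigl[Y'(T_i)\, Y(T_i)\bigr]^{-1}$$
The hypothesis of no regression effect for one or more covariates is tested by:
$$(H_0):\ B_k(t) = 0$$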
Dataset
Dataset from a French insurance company: 1461 car insurance contracts created between June 13th, 1974 and December 28th, 1995.
- Cancellation of a contract could only be observed after January 1st, 1996.
- If a contract was cancelled before February 7th, 2006, we consider the duration between conclusion and cancellation of the contract (otherwise the observation is right censored).
Dataset
Lifetime variable: lifespan of the car insurance contract (Durvie)
- If cancellation is before February 7th, 2006: Durvie = contract cancellation date - contract conclusion date
- If cancellation is after February 7th, 2006: Durvie = February 7th, 2006 - contract conclusion date (fixed right-censoring date)
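The construction of Durvie and of the associated event indicator can be sketched as follows. The function name, the conversion with a 365.25-day year and the handling of a cancellation falling exactly on the censoring date are assumptions; the slides do not specify them.

```python
from datetime import date

CENSORING_DATE = date(2006, 2, 7)  # fixed right-censoring date given on the slide

def durvie(conclusion, cancellation=None):
    """Return (duration in years, event indicator) for one contract.

    cancellation is None when no cancellation has been recorded.
    Assumption: a year is counted as 365.25 days.
    """
    if cancellation is not None and cancellation < CENSORING_DATE:
        end, event = cancellation, 1      # observed cancellation
    else:
        end, event = CENSORING_DATE, 0    # right censored at the fixed date
    return (end - conclusion).days / 365.25, event

# Hypothetical contracts
observed = durvie(date(1990, 1, 1), date(2000, 1, 1))
censored = durvie(date(1995, 12, 28))
```

Every contract therefore yields a pair (duration, indicator), which is the input format expected by the survival estimators discussed earlier.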
Dataset
Covariates:
Age of vehicle (AgeVehic)
- AgeVehic ≤ 1: AgeVehic1
- 1 < AgeVehic ≤ 4: AgeVehic2
- 4 < AgeVehic ≤ 8: AgeVehic3
- AgeVehic > 8: AgeVehic4
Type of insurance (Formule)
- Tierce Intégrale (comprehensive, all-risks cover): Formule1
- Tierce Maxi (third-party liability + damage cover): Formule2
- Tierce Simple (third-party liability only): Formule3
Dataset
Bonus-Malus variable (BM)
- Bonus-Malus = 0.5 (bonus of 50 %): BM1
- 0.5 < Bonus-Malus ≤ 0.7 (30 % ≤ bonus < 50 %): BM2
- Bonus-Malus > 0.7 (bonus < 30 %, or malus): BM3
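The banding of vehicle age and of the bonus-malus coefficient can be written as two small functions. This is a sketch: the function names are hypothetical, the closed upper bounds are an assumption (the inequality signs are partly lost in the slides), and values below 0.5 are mapped to BM1 on the assumption that 0.5 is the floor of the scale.

```python
def age_class(age_vehic):
    """Map a vehicle age to the bands AgeVehic1 to AgeVehic4."""
    if age_vehic <= 1:
        return "AgeVehic1"
    if age_vehic <= 4:
        return "AgeVehic2"
    if age_vehic <= 8:
        return "AgeVehic3"
    return "AgeVehic4"

def bm_class(bonus_malus):
    """Map a bonus-malus coefficient to the classes BM1 to BM3."""
    if bonus_malus <= 0.5:   # assumption: 0.5 is the lowest possible coefficient
        return "BM1"
    if bonus_malus <= 0.7:
        return "BM2"
    return "BM3"

# Hypothetical contract: 6-year-old vehicle, coefficient 0.64
example = (age_class(6), bm_class(0.64))
```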
Results
Number of contracts

              All   Censored   Cancelled
All          1461      537        924
BM1           569      266        303
BM2           387      142        245
BM3           505      129        376
Formule1      343      140        203
Formule2      589      222        367
Formule3      529      175        354
AgeVehic1      49       12         37
AgeVehic2     260      111        149
AgeVehic3     449      161        288
AgeVehic4     703      253        450
Results
Mean of DurVie (in years) on January 1st, 1996

              All    Censored   Cancelled
All          10.24    14.79       7.59
BM1          12.63    15.94       9.73
BM2           9.77    14.11       7.26
BM3           7.89    13.18       6.09
Formule1      9.19    12.80       6.70
Formule2     10.58    14.92       7.96
Formule3     10.52    16.22       7.71
AgeVehic1     6.20    10.62       4.77
AgeVehic2     8.20    11.78       5.53
AgeVehic3     8.84    13.17       6.43
AgeVehic4    12.16    17.34       9.25
Results
Cox model

            coef    exp(coef)  se(coef)     z        p
BM          0.419     1.520     0.0400    10.47   0.0e+000
Formule     0.181     1.199     0.0577     3.14   1.7e-003
Agevehic   -0.326     0.722     0.0522    -6.25   4.2e-010

Rsquare = 0.112
Likelihood ratio test = 174 on 3 df, p=0
Wald test = 174 on 3 df, p=0
Score (logrank) test = 179 on 3 df, p=0
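The columns of the Cox output are linked by simple arithmetic, which the sketch below reproduces from the reported coefficients and standard errors. The approximate 95 % intervals are not on the slide; they are added here as an illustration and rest on the usual normal approximation. Reading each hazard ratio as the effect of moving up one class assumes the covariates were entered as ordinal scores, which the 3 degrees of freedom suggest but the slide does not state.

```python
import math

# (coef, se) as reported in the Cox model output
cox_fit = {
    "BM": (0.419, 0.0400),
    "Formule": (0.181, 0.0577),
    "Agevehic": (-0.326, 0.0522),
}

def summarize(coef, se, z_quantile=1.96):
    """Hazard ratio, Wald z statistic and an approximate 95 % interval."""
    hazard_ratio = math.exp(coef)
    wald_z = coef / se
    interval = (math.exp(coef - z_quantile * se), math.exp(coef + z_quantile * se))
    return hazard_ratio, wald_z, interval

summary = {name: summarize(coef, se) for name, (coef, se) in cox_fit.items()}
```

Under that reading, a hazard ratio above 1 (BM, Formule) means a higher cancellation risk for higher classes, while the ratio below 1 for Agevehic means contracts on older vehicles are cancelled less often.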
Results [Figure: survival function by Bonus-Malus class (BM1, BM2, BM3); x-axis: time in years; y-axis: survival function]
Results [Figure: survival function by vehicle age class (Agevehic1 to Agevehic4); x-axis: time in years; y-axis: survival function]
Results [Figure: survival function by type of insurance (Formule1, Formule2, Formule3); x-axis: time in years; y-axis: survival function]
Results [Figure: log-log survival function by Bonus-Malus class (BM1, BM2, BM3); x-axis: time in years, log scale; y-axis: log-log survival function]
Results [Figure: log-log survival function by vehicle age class (Agevehic1 to Agevehic4); x-axis: time in years, log scale; y-axis: log-log survival function]
Results [Figure: log-log survival function by type of insurance (Formule1, Formule2, Formule3); x-axis: time in years, log scale; y-axis: log-log survival function]
Results
Proportional hazards test: $\beta(t) = \beta + \theta\, g(t)$
Test of $H_0: \theta = 0$ with $g(t) = t$

             rho      chisq       p
BM         -0.0849     6.33    1.19e-02
Formule    -0.1174    13.46    2.43e-04
Agevehic    0.0987     9.19    2.43e-03
GLOBAL        NA      24.10    2.38e-05
Results
Schoenfeld residuals: [figures]
Results
Additive Aalen Model

Test for non-significant effects
               Supremum-test of significance   p-value H_0: B(t)=0
(Intercept)               4.75                          0
Agevehic                  6.17                          0
Formule                   4.79                          0
BM                        9.30                          0

Test for time invariant effects
               Kolmogorov-Smirnov test   p-value H_0: B(t)=b t
(Intercept)            0.808                    0.009
Agevehic               0.252                    0.001
Formule                0.119                    0.022
BM                     0.195                    0.000

               Cramer von Mises test     p-value H_0: B(t)=b t
(Intercept)            1.240                    0.081
Agevehic               0.263                    0.003
Formule                0.112                    0.002
BM                     0.206                    0.000
4- Conclusion and perspectives
- Cox models with time-dependent covariates are not easy to understand or visualize.
- The Aalen model for failure time analysis allows the inclusion of time-dependent covariates as well as the variation of covariate effects over time.
- Comparison with other models («duplication» models).
- Tests on another large dataset with new time-dependent covariates.
Some References
1. Aalen, O.O. (1989). A linear regression model for the analysis of life times. Statistics in Medicine 8, 907-925.
2. Cox, D.R. (1972). Regression models and life tables. J. R. Statist. Soc. B 34, 187-220.
3. Grambsch, P. and Therneau, T.M. (1994). Proportional hazards tests and diagnostics based on weighted residuals.
4. Therneau, T.M. and Grambsch, P. (1990). Martingale-based residuals for survival models. Biometrika 77(1), 147-160.