PREDICTIVE MODELS IN LIFE INSURANCE Date: 17 June 2010 Philip L. Adams, ASA, MAAA Agenda 1. Predictive Models Defined 2. Predictive models past and present 3. Actuarial perspective 4. Application to Life Insurance: ILEC 02-04 and 05-07 data 1
PREDICTIVE MODELS WHAT THEY ARE (AND AREN T) Title of presentation and name 05/06/2010 of speaker3 What is predictive modeling? Predictive modeling is the use of modern statistical methods to identify and measure past associations for the sake of inferring associations about the past, the present and the future. Usually, this is expressed through probabilities, and it has a very strong Bayesian flavor. -Your humble speaker This sounds a lot like what actuaries have been doing for a century (except for the modern statistical methods part). 2
What it s not Panacea Best thing since sliced bread The tool to end all tools 05/06/2010 5 Nuts and Bolts of Predictive Modeling Data mining: Use statistical methods to find patterns Correlation does not imply causation but we often have to assert causation Some associations cannot be measured using correlation Correlation is measure of statistical collinearity Correlation implies dependency, but not conversely Example: If X is uniform on (-1,1), then X 2 is uncorrelated with X Take action with those patterns Build a better product or service Incorporate into risk management 05/06/2010 6 3
PREDICTIVE MODELS PAST AND PRESENT Title of presentation and name 05/06/20107 of speaker PREDICTIVE MODELS: The Past Horoscopes 4
PREDICTIVE MODELS: In Fiction The Witch in Monty Python s Holy Grail Experience taught knight that witches float, so compare candidate to other things that float Peasants suggested comparisons: wood, very small rocks, churches, water fowl Analogous to a decision tree 05/06/2010 9 PREDICTIVE MODELS: Real Data, Bad Analysis Pirates and Global Warming Number of pirates has decreased since 1860 Globe has consistently warmed Ergo, Somalia is at the forefront of the fight against global warming 05/06/2010 10 5
PREDICTIVE MODELS: Dubious conclusions More pirates! 05/06/2010 11 PREDICTIVE MODELS: The Present Mortality tables Lapse tables Trends Credit scoring Underwriting manuals UCS Fraud detection 05/06/2010 12 6
Comparisons Life Actuarial Modeling Predictive Modeling Manual, ad hoc Inefficient use of data Inefficient models Modeler biases strong (often hidden) but gets the job done Formal, statistically sound systems Extracts a lot of information Models efficiently capture relationships Modeler biases must be explicit remains mysterious 05/06/2010 13 Analogues Life Actuarial Modeling Predictive Modeling Whitaker-Henderson Graduation Analysis of A/E ratios Underwriting rules Credibility (basic) Additive models with splines Regression models with offsets Decision trees Credibility (advanced) 05/06/2010 14 7
APPLICATIONS TO LIFE INSURANCE Title of presentation and name 05/06/2010 of speaker 15 Modeling the ILEC 02-04 Data Death claims and exposures on fully underwritten lives from 35 companies collected between 2002 and 2004 Nearly 700,000 claims overall Almost 500,000 in durations 21+ For this presentation, we confine our attention to a particular subset Attained ages 20-84, durations 1-10 Face amounts at least $10,000 and summarized into two bands Left with 37,087 death claims We anchor to PPR ultimate model from Edwalds, Craighead, Adams paper 8
Modeling the ILEC 02-04 Data Modeling the ILEC 02-04 Data 9
Modeling the ILEC 02-04 Data Modeling the ILEC 02-04 Data 10
Modeling the ILEC 02-04 Data Modeling with GAMs Linear Model Standard regression model using least squares minimization of some response against some collection of predictor variables x y 0 1 Generalized Linear Model Expand universe of response variables to probabilities, counts, etc. y x g 0 1 Generalized Additive Model Expand universe of predictors to include smooth functions y f x g 0 05/06/2010 22 11
Modeling with GAMs Splines underly smooth functions Cubic splines Penalized splines Thin-plate regression splines Tensor product splines 05/06/2010 23 Fitting GAMs to ILEC 02-04: Non-smoker Surface 12
Fitting GAMs to ILEC 02-04: Smoker Surface Fitting GAMs to ILEC 02-04: Model Parameters Parameters (offset from ultimate) Estimate Std. Error p-value e.d.f. Intercept -0.317 0.062 0.00% 1 Males -0.059 0.012 0.00% 1 Face Amounts $50,000+ -0.399 0.012 0.00% 1 Non-smokers -0.298 0.062 0.00% 1 Fixed Smokers 0.488 0.065 0.00% 1 Non-smokers in Duration 1-0.823 0.078 0.00% 1 Non-smokers in Duration 2-0.199 0.039 0.00% 1 Smokers in Duration 1 0.000 1 Smokers in Duration 2-0.173 0.055 0.16% 1 Smooth Non-smokers 63 parameters 0.00% 18.06 Smokers 63 parameters 0.00% 9.81 13
Fitting GAMs to ILEC 02-04: Smoker Penalty But can it predict? Attach to ILEC 05-07 Dataset Confined to data similar to fitting set Data summarized, so judgment used for choice of age and duration Table projected forward to 2008 using 1% per annum flat Compare to 2008 VBT Represents actuarial approach 05/06/2010 28 14
2008 VBT and GAM Compared Gender and Smoker Status A/E by Count A/E by Amount Claims 08VBT Primary Predictive Model 08VBT Primary Predictive Model Female 11,402 110.1% 112.8% 95.4% 105.9% Non-Smoker 8,812 108.8% 110.1% 95.9% 105.3% Smoker 2,590 115.1% 123.1% 92.4% 109.9% Male 20,654 107.8% 106.6% 87.1% 88.2% Non-Smoker 15,663 106.3% 105.0% 86.4% 86.9% Smoker 4,991 113.1% 112.0% 92.1% 96.7% Overall 32,056 108.7% 108.7% 89.2% 92.3% Bottom line results are comparable. 05/06/2010 29 2008 VBT and GAM Compared Face Amount Band A/E by Count A/E by Amount Claims 08VBT Primary Predictive Model 08VBT Primary Predictive Model 10,000-24,999 4,062 150.7% 121.8% 149.7% 120.5% 25,000-49,999 4,212 135.5% 106.6% 134.9% 106.2% 50,000-99,999 5,531 121.4% 136.6% 120.7% 135.9% 100,000+ 18,251 95.3% 100.6% 87.0% 90.3% Overall 32,056 108.7% 108.7% 89.2% 92.3% GAM has better control for face amount variation. 05/06/2010 30 15
2008 VBT and GAM Compared Attained Age Slope 05/06/2010 31 2008 VBT and GAM Compared Duration Slope 05/06/2010 32 16
References Edwalds, T., Craighead, S., Adams, P. (2008) Estimating Mortality Risk Using Predictive Modeling. Enterprise Risk Management Symposium. Lowrie, Walter B. (1982) An Extension of the Whittaker-Henderson Method of Graduation, Transactions of the Society of Actuaries, Vol. 34, pp. 329-372 R Development Core Team (2009). R: A Language and environment for statistical computing. Vienna, Austria. R Foundation for Statistical Computing Wood, S. (2006) Generalized Additive Models: An Introduction with R. Chapman & Hall, London. Image Credits Wikimedia Commons Monty Python Veganza.com Wenger CDC 05/06/2010 34 17