SOA 2013 Life & Annuity Symposium, May 6-7, 2013
Session 30 PD, Predictive Modeling Applications for Life and Annuity Pricing and Underwriting
Moderator: Barry D. Senensky, FSA, FCIA, MAAA
Presenters: Jonathan P. Polon, FSA; Qichun (Richard) Xu, FSA, Ph.D.
Primary Competency: Technical Skills and Analytical Problem Solving
Predictive Modeling for Life and Annuity Pricing and Underwriting
Life and Annuity Symposium, Session 30, May 6, 2013
Jonathan Polon, FSA

Overview
- One Key Takeaway
- Traditional Actuarial Techniques vs. Predictive Modeling
- Credibility vs. Validation
- Benefits of Predictive Modeling
- Data Considerations
- Performing the Analysis
One Key Takeaway
- Mortality experience takes several years to develop
- Your ability to analyze internal mortality experience in the future will be affected by the way you capture data today
- Invest the time and resources today to develop and implement a data collection strategy:
  - What data to collect
  - Where to collect the data from
  - How to structure the data storage to facilitate analysis
Data Quality vs. Data Quantity
- The Law of Diminishing Marginal Returns applies
- Select the data elements that you believe are most important
- Focus on accuracy, completeness and structure for the key data elements rather than simply maximizing the number of data elements; this will:
  - Minimize the cost of data storage and capture
  - Greatly decrease the time needed to create models
  - Increase the interpretability of the models
  - Reduce the risk of overfitting
  - Probably improve model accuracy

Traditional Actuarial Techniques vs. Predictive Modeling
Traditional Actuarial Techniques
- The purpose is not to classify individual risks; rather, it is to determine the average cost for each class of risk
- The objective is to be accurate at the class level in aggregate, not at the level of the individual case
- Typically applied in a low number of dimensions (e.g., age, gender, smoking, underwriting class, duration)
- Techniques include tables and classical statistics

Predictive Modeling
- The purpose is to make predictions at the individual case level
- The output for each case could be a risk class, a number of debits or a qx vector
- Each case is unique and has its own combination of characteristics
- Typically applied in a high number of dimensions
- Techniques include machine learning and numerical analysis, iteratively improving the fit of the model to the historic data
Credibility vs. Validation

Credibility
- Actuaries apply credibility theory to ensure their mortality analysis is based upon a sufficient number of observations
- From the CIA Educational Note on Expected Mortality, July 2002:
  - The goal of credibility theory is to provide a framework for combining data from different sources: typically company data, which may not be fully credible, and industry data, which is assumed to be fully credible
  - The Normalized Method is the preferred credibility method, and 3,007 is the suggested number of deaths needed for full credibility
- "Credibility is as much an art as it is a science." (Barry Senensky, FSA, FCIA, MAAA, April 2013)
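The 3,007-death figure can be reproduced under the classical limited-fluctuation framework. The sketch below is an assumption consistent with that number (Poisson claim count, 90% confidence, ±3% tolerance); the CIA note's Normalized Method itself is more involved, and the square-root rule shown for partial credibility is the common textbook version, not necessarily the method the note prescribes.

```python
import math

def full_credibility_claims(prob: float = 0.90, tol: float = 0.03) -> float:
    """Claims needed for full credibility under the classical
    limited-fluctuation (Poisson) model: n = (z / tol)^2."""
    # Two-sided z-values for common confidence levels
    z = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}[prob]
    return (z / tol) ** 2

def partial_credibility(n_claims: float, n_full: float) -> float:
    """Square-root rule: Z = min(1, sqrt(n / n_full))."""
    return min(1.0, math.sqrt(n_claims / n_full))

n_full = full_credibility_claims()      # approx 3,007 deaths
Z = partial_credibility(750, n_full)    # approx 0.50 for 750 deaths
```

With 90% confidence and a ±3% tolerance, (1.645 / 0.03)^2 is about 3,007, matching the quoted full-credibility standard.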
Validation
- Credibility is not independent of dimensionality:
  - If the number of observations is small, one can still model the primary predictors
  - As the number of observations increases, models can be expanded to include predictors of secondary and tertiary importance
- A danger of using predictive modeling is overfitting: modeling noise rather than signal
- Validation is applied to protect against overfitting
- Validate using out-of-sample data to ensure models are robust: withhold 10-20% of the data until models are deemed complete

Benefits of Predictive Modeling
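Withholding 10-20% of the data can be as simple as a random index split. A minimal sketch (the 15% fraction and seed are illustrative choices, not from the presentation):

```python
import numpy as np

def train_validate_split(n_rows: int, holdout_frac: float = 0.15, seed: int = 42):
    """Randomly withhold a fraction of rows as an out-of-sample
    validation set; the remainder is used for training/testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)            # shuffle row indices
    n_holdout = int(round(n_rows * holdout_frac))
    return idx[n_holdout:], idx[:n_holdout]  # (train_idx, valid_idx)

train_idx, valid_idx = train_validate_split(10_000)
```

The validation rows stay untouched until the model is deemed complete; looking at them earlier defeats the purpose of the hold-out.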
Benefits of Predictive Modeling
- The potential benefits of predictive modeling vs. traditional techniques:
  1. Improved accuracy
  2. Reduced time to decision
  3. Lower expense

Improved Accuracy
- Improved accuracy is most important for larger risks, e.g., large amounts or impaired lives
- These cases are typically fully underwritten and a lot of information is generated, e.g., application, lab results, APS, financial underwriting
- Predictive models can provide more accurate estimates of the risk of each applicant
- This can be used as a decision support tool for the underwriting decision
Reduced Time to Decision and Lower Expense
- Quicker turnaround and lower cost of underwriting decisions can be especially important in the middle markets
- Can increase close rates and reduce issue expenses
- Why can't an insurance policy be sold in real time, whether at the agent/broker office or direct online?
  - No medical exam or fluid samples required
  - Base the underwriting decision on other sources of information, such as prescription drug history

Data Considerations
Big Data
- "Big data" is the catch phrase of 2013
- Not only are we creating data in new ways (social media, cell phone GPS, web clicks), but the information created by our interactions with the world is being stored electronically to a greater and greater extent
- Sources of data for life pricing and underwriting:
  - An insurer's internal data
  - Data for sale from external data aggregators
  - Webscraping

Internal Data Sources
- Application for insurance
- Lab results
- Attending physician statements
- Underwriter's notes
- Data generated from other product lines
External Data Sources
- MIB
- MVR
- Prescription drug history
- Credit score
- Public records
- Consumer data
- There may be regulatory, legal and reputational risk involved with the use of some external data sources; be sure to research before using

Webscraping
- Crawling the internet to uncover information about an entity
- More difficult to perform for an individual than for a business:
  - Names are unlikely to be unique
  - Personal Facebook, Twitter and other social media accounts are often set to private
- There may be greater reputational risk to an insurer that is webscraping for information about an individual as opposed to searching for information about a business
Vital Status
- The analysis should probably include all applicants for insurance, not just written cases
- May need to determine the vital status of non-written cases
- In the US, the SS DMF is a good start, but it is not complete
- Companies that aggregate public records may be of help
- Can validate these data sources against insured lives and develop assumptions to account for the missing deaths

Performing the Analysis
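Determining vital status for non-written cases is essentially a record-linkage step: join the applicant file against a death-record extract and flag the matches. A sketch with hypothetical field names (the `ssn` key, `written` flag and `death_year` column are illustrative, not from the presentation; real SS DMF matching is fuzzier than an exact key join):

```python
import pandas as pd

# Hypothetical applicant file and death-record extract
applicants = pd.DataFrame({
    "ssn": ["111", "222", "333", "444"],
    "written": [True, False, False, True],
})
deaths = pd.DataFrame({"ssn": ["222", "444"], "death_year": [2010, 2012]})

# Left-join deaths onto ALL applicants (written or not);
# the merge indicator flags which rows matched a death record
matched = applicants.merge(deaths, on="ssn", how="left", indicator=True)
matched["deceased"] = matched["_merge"] == "both"
```

Comparing the `deceased` flag between written and non-written cases is then a starting point for the missing-deaths adjustment described above.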
Steps
1. Define objective
2. Identify data sources
3. Acquire and clean data
4. Analyze data and train models
5. Validate models

Define Objective
- Sounds trivial but is of critical importance
- Will drive all other steps of the modeling process
- What should the model output be? For example:
  - Replicate underwriting decisions (doesn't require vital status)
  - Risk classification (e.g., preferred, standard, substandard)
  - Number of debits to apply to the base table
  - Applicant-specific mortality rates (qx) for the first several years
Identify Data Sources
- Must have the target (dependent) variable available in the historic data
- What predictor data is available?
  - Internal sources
  - External sources
  - Webscraping

Acquire and Clean Data
- May be 80% of the total effort required for the project
- Data collected from different sources must be linked
- Raw data is seldom in a form appropriate for modeling:
  - Text mine documents, such as an APS or underwriter notes
  - Perform some basic calculations, like age or BMI
  - Some data elements will need to be transformed to optimize modeling, depending on the modeling techniques to be applied
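The "basic calculations like age or BMI" step might look like the following. The column names and values are hypothetical; the formulas (age in whole years at application, BMI = kg / m^2) are standard:

```python
import pandas as pd

# Illustrative raw application extract; field names are assumptions
raw = pd.DataFrame({
    "birth_date": ["1970-03-15", "1985-11-02"],
    "app_date": ["2012-06-01", "2012-06-01"],
    "height_cm": [175.0, 162.0],
    "weight_kg": [80.0, 55.0],
})

df = raw.assign(
    birth_date=pd.to_datetime(raw["birth_date"]),
    app_date=pd.to_datetime(raw["app_date"]),
)
# Age at application (in whole years) and BMI = kg / m^2
df["age"] = ((df["app_date"] - df["birth_date"]).dt.days // 365.25).astype(int)
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2
```

Further transformations (capping, binning, log scaling) would follow depending on the modeling technique chosen.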
Analyze Data and Train Models
- Can begin with an analysis of the current basis: identify types of applications where actual outcomes are similar to or different from expected outcomes
- Train the new models
  - Iterative process: will require testing of various modeling techniques and data transformations
- Evaluate new models, typically on a hold-out sample of testing data
  - Consider: goodness-of-fit metrics, univariate analysis, model complexity vs. interpretability, consistency with expectations

Model Validation
- Final test on out-of-sample ("validation") data
- Can really only be performed once; after that, the data is no longer unseen
- Goodness-of-fit should, at a minimum, be improved relative to the current basis
- Requires a goodness-of-fit metric, such as mean squared error
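The validation comparison amounts to computing the chosen goodness-of-fit metric for both the current basis and the new model on the same held-out data. A sketch using mean squared error, the metric named above (all rate values are illustrative):

```python
import numpy as np

def mean_squared_error(actual, predicted) -> float:
    """Goodness-of-fit metric: average squared residual."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.mean((actual - predicted) ** 2))

# Compare new model vs. current basis on the same validation data
actual = [0.010, 0.020, 0.015, 0.030]
mse_current = mean_squared_error(actual, [0.020, 0.020, 0.020, 0.020])
mse_new     = mean_squared_error(actual, [0.012, 0.019, 0.016, 0.028])
improved = mse_new < mse_current
```

Because the validation data can honestly be used only once, this comparison is the last step, not part of the iterative training loop.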
Predictive Modeling Applications: Case Study
Richard Xu, Global R&D, RGA
LAS, May 2013

Contents
- UW Model: GLM model
- Experience Study: GLM model
- Pricing Model: CART model
- Client Segmentation: Clustering
PM Applications
- Underwriting: identify best risks; be fast & consistent; prioritize cases; reduce not-taken rates
- Claims: predict claim frequency; identify claim severity; prioritize resources; identify claims most likely fraudulent/rescinded
- Pricing/Reserves: improve pricing accuracy; identify deviation of pricing variables; more accurate reserves; compute reserve variance
- Experience Analysis: identify drivers in experience; handle low-credibility data; create own mortality/lapse tables
- Sales & Marketing: make effective campaigns; recommend products; select new agents; monitor existing agents
- In Force Business: client segmentation; predict lapses; design retention strategies; offer other products

Generalized Linear Model (GLM)
- Components: random, systematic, and link function; OLS (LM) assumes the Normal distribution only, while GLM allows various distributions
- Inclusion of most distributions related to insurance data: Normal, binomial, Poisson, Gamma, inverse-Gaussian, etc.
- Ordinary Least Squares (OLS) is a special case of GLM
- Great flexibility in variance structure; weights & offsets add further flexibility
- Multiplicative model: intuitive & consistent with insurance practice
- Easy to understand & communicate
Case Study 1: UW Model
- Goal: to predict UW decisions on existing customers
  - Bancassurance in Asia with a large customer pool, but low penetration in life products
  - Identify certain pre-qualified existing customers, and offer guaranteed issue (GI) or simplified issue (SI) without medical UW
  - Acquisition costs will be significantly reduced; market penetration will be deeper, and sales will increase
- Bancassurance is unique for PM: financial/demographic information about customers is available
- Major challenge: very limited data
  - A total of about 8k-9k full UW cases
  - Target variable is the UW decision, with very few declined/rated cases (~3.0%)
  - Many missing values due to the age of the records, especially for sub-standard cases
  - Not all information was collected at the time of UW

Key Variables
- GLM with binomial distribution and logistic link function
- About a dozen predictor variables that are statistically significant for prediction and readily available in the client database
- Key predictor variables ("Positive" means the probability of being STD increases as the value goes up; otherwise "Negative"):

  Name             | Type        | Note
  Age_At_Entry     | Numeric     | Negative; less likely to qualify for STD as age goes up
  Branch           | Categorical | Proxy for geographic location
  AUM              | Numeric     | Positive; more likely to qualify for STD with large AUM
  Customer_Segment | Categorical | Positive for Premier, negative for non-Premier
  Nationality      | Categorical | Positive for domestic; negative for certain others
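A binomial GLM with logit link of the kind described can be fitted by iteratively reweighted least squares (IRLS), the standard GLM fitting algorithm. The self-contained sketch below uses synthetic toy data: the positive AUM effect mirrors the table's sign, but every number, and the single-predictor setup, is an illustration, not the case-study model.

```python
import numpy as np

def fit_logistic_glm(X, y, n_iter=25):
    """Fit a binomial GLM with logit link by iteratively
    reweighted least squares (IRLS)."""
    X = np.column_stack([np.ones(len(X)), X])   # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                           # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))          # inverse logit
        W = mu * (1.0 - mu)                      # binomial variance weights
        z = eta + (y - mu) / np.maximum(W, 1e-12)  # working response
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# Toy data: probability of a standard UW decision rising with AUM
rng = np.random.default_rng(0)
aum = rng.uniform(0, 10, 500)
p_true = 1.0 / (1.0 + np.exp(-(-2.0 + 0.6 * aum)))
y = rng.binomial(1, p_true)
beta = fit_logistic_glm(aum[:, None], y)   # roughly [-2.0, 0.6]
```

In practice one would use a statistical package rather than hand-rolled IRLS, and a dozen predictors rather than one, but the fitting logic is the same.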
Model Results
- In-sample results (lift plot: declined/rated rates by decile of sorted model output) show model performance under optimal conditions, and may over-fit the data
  - Average non-std rate: 3.0%; about 0.5% of sub-std cases in the top 30% of model output
- Validation results are a better test of model performance in real business
  - About 0.6% sub-std in the top 30% of model outputs: roughly an 80% reduction relative to the 3.0% average
- [Figures: lift plots of declined and rated rates by decile of sorted model output, in-sample and validation]

Model Results: Gain Curve
- The gain curve is another way to understand the model's ability to differentiate STD from sub-std
  - The best 30% of model outputs contains about 5% of total non-std cases
  - The lowest 30% captures about 75% of the bad risks
- [Figure: model gain curve for in-sample results, validation results, and random selection]
- Model implementation
  - Results delivered to the client; final implementation stage
  - Final control on offers remains with the insurer
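A lift plot of this kind is built by sorting cases on the model score and tabulating the sub-standard rate in each decile. A sketch on synthetic data (the 3% average rate echoes the slide; the score distribution and the convention that a low score means a better risk are assumptions):

```python
import numpy as np

def decile_lift(score, is_substd):
    """Sort cases by model score (low score = better risk, by assumption)
    and report the sub-standard rate within each decile."""
    order = np.argsort(score)
    buckets = np.array_split(np.asarray(is_substd)[order], 10)
    return [float(np.mean(b)) for b in buckets]

# Synthetic portfolio: ~3% average sub-standard rate, concentrated
# in the high-score deciles when the model discriminates well
rng = np.random.default_rng(1)
score = rng.uniform(size=2000)
is_substd = rng.uniform(size=2000) < 0.06 * score
rates = decile_lift(score, is_substd)
# rates rise roughly monotonically from decile 1 to decile 10
```

The gain curve is the cumulative version of the same tabulation: the running share of all sub-standard cases captured as one moves through the sorted deciles.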
Case Study 2: Experience Study
- PM vs. the traditional actuarial approach:
  - True multivariate approach vs. univariate, which can under- or over-estimate
  - Captures the impact of interaction terms on the target
  - More efficient use of data, and can handle low-credibility data
- Establish own assumptions based on experience data
- Types of studies: mortality, lapse, claim severity, incidence rate, continuance table
- Major challenge: Data! Data! Data!
  - Understanding the business; cleaning data; mapping of data; data legacy; missing values; timing, etc.
  - Either not enough credible data, or too much data, in big data territory

Term Tail Lapse Rates
- Tail lapse rates for a 10-year term product
- Variables: duration, premium jump, face amount, UW class, issue age, gender, etc.
- [Figures: post-level-period lapse rates vs. duration and vs. premium jump]
- Formula-based results with uncertainty estimated
- Business insights
CART Model
- Classification And Regression Tree (CART)
  - Handles both classification and regression
  - Non-parametric approach (no assumption about the data structure)
- A CART tree is generated by repeated partitioning of the data set
  - Data is split into two partitions (binary partition)
  - Partitions can also be split into sub-partitions (recursive)
  - Until the data in each end node (leaf) is (more or less) homogeneous
- Results are very intuitive: identify specific groups that deviate in the target variable
- Yet the algorithm is very sophisticated

Case Study 3: LTD Pricing
- Business: US group Long-Term Disability (LTD)
  - About 13k policies, with lives per policy ranging from 10 to 30k
  - Current pricing variables: about 30-40
  - Experience data from the past 5 years with >80 variables
  - Major pricing variables: age, gender, industry, location, benefit structure
- Objective:
  - To determine additional pricing variables and possible interaction terms (for pricing)
  - To identify groups with experience deviating from pricing assumptions (for UW)
- Client has experience with PM: minimal effort needed on business & data understanding
- Profit margin as the target variable
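The core step CART repeats recursively is the binary split: for each candidate threshold, compare the error of the two resulting partitions and keep the best. A sketch of that single step for one numeric feature with a squared-error criterion (the age-45 break in the toy data is invented for illustration; real CART also searches across features, handles categoricals, and prunes):

```python
import numpy as np

def best_binary_split(x, y):
    """Find the split point on one numeric feature that minimises the
    summed squared error of the two partitions."""
    order = np.argsort(x)
    x_sorted, y_sorted = np.asarray(x)[order], np.asarray(y)[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(x_sorted)):
        if x_sorted[i] == x_sorted[i - 1]:
            continue                      # cannot split between tied values
        left, right = y_sorted[:i], y_sorted[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_threshold, best_sse = (x_sorted[i - 1] + x_sorted[i]) / 2, sse
    return best_threshold, best_sse

# Toy example: profit margin shifts sharply at age 45
age = np.arange(20, 70)
margin = np.where(age < 45, 0.05, -0.02) + 0.001 * np.sin(age)
threshold, _ = best_binary_split(age, margin)   # near 44.5
```

Applying the same search recursively to each resulting partition, until the leaves are roughly homogeneous, yields the tree described above.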
CART Model Results
- Easy to develop, interpret and understand; provides business insights
- Limitations: not efficient for linear functions; sensitive to noise; prone to over-fitting

CART Model Results (continued)
- Results improve profit margin and pricing accuracy
- A useful tool for both pricing and UW of group LTD business
- Model implementation: the client is very interested in the model results; approved by the management team; implemented in Q1 2013

  Quartile | # of cases | Actual EPM | Model Predicted EPM
  1        | 3230       | (0.28)     | (0.32)
  2        | 3230       | (0.088)    | (0.060)
  3        | 3230       | 0.063      | 0.020
  4        | 3230       | 0.017      | 0.14
Data Clustering
- Find similarities in data according to features found in the data, and group similar objects into clusters
- Unsupervised (no pre-defined classification), non-parametric
- Requires a measure of similarity/dissimilarity, e.g. distance, for numeric, categorical, and ordinal variables
- Algorithms: partitioning (k-means), hierarchical, density-based, etc.

Case Study 4: Client Segmentation
- Existing client segmentation is based on geographic location: a self-serving approach, oriented to the company's own benefit rather than the market and its needs
- Objective:
  - To better understand the client base, identifying knowledge gaps
  - To capture tacit knowledge; create structured data on clients and a tool for client analysis and strategic decision-making purposes on an ongoing basis
  - To identify opportunities to better serve clients' needs and grow the business
  - To help better optimize resourcing requirements
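The partitioning (k-means) algorithm named above can be sketched in a few lines of Lloyd's iteration. The two well-separated "client" groups are synthetic, and the deterministic initialisation is a simplification; production implementations use smarter random seeding (e.g. k-means++):

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Minimal Lloyd's algorithm: alternately assign each point to its
    nearest centroid, then recompute each centroid as its cluster mean."""
    # Simple deterministic init: k points spread across the data order
    centroids = X[np.linspace(0, len(X) - 1, k, dtype=int)].copy()
    for _ in range(n_iter):
        # Euclidean distance from every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated synthetic client groups (illustrative data)
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

Hierarchical and density-based methods differ in how they define a cluster, but all of them hinge on the same choice of a dissimilarity measure.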
Client Segmentation
- Business team survey data, with three main data categories:
  - Description of clients
  - Behavior when facing risks
  - Needs in dealing with risks
- Clustering algorithm & principal component analysis
  - Algorithms find clusters such that clients in the same cluster are more similar to each other than to those in other clusters
  - Unsupervised algorithm without a target variable
  - Data is dominated by categorical variables

Clustering Model Results
- Results on 5 clusters; the number of clusters is a free parameter
- Example: opportunity
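Because the survey data is dominated by categorical variables, a common preprocessing step before distance-based clustering or PCA is to expand each categorical field into 0/1 indicator columns. A minimal sketch (the `channel` field and its values are hypothetical survey answers, not from the presentation):

```python
import numpy as np

def one_hot(values):
    """Encode a categorical field as 0/1 indicator columns so that
    distance-based methods (clustering, PCA) can use it."""
    cats = sorted(set(values))
    encoded = np.array([[1.0 if v == c else 0.0 for c in cats] for v in values])
    return encoded, cats

# Hypothetical survey responses
channel = ["direct", "broker", "direct", "bank"]
encoded, cats = one_hot(channel)
# one column per category; each row has exactly one 1
```

PCA can then be run on the encoded matrix to reduce its dimensionality before the clustering step, though for heavily categorical data specialised dissimilarity measures are sometimes preferred.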
Clustering Model Example: Two High-Level Clusters
- Direct distribution and Living Benefits related products
- Data quality is very important: prefer objective variables to subjective variables
- Clusters (% by NB volume):
  - Cluster 5 (12%): Want Direct
  - Cluster 32 (25%): Want Direct
  - Cluster 8 (7%): Want Direct e-sales
  - Cluster 16 (5%): Want Direct Traditional
  - Cluster 17 (16%): Want Living Benefits related
  - Cluster 13 (11%): Want Living Benefits
  - Cluster 4 (5%): Want Combination
  - Cluster 3 (1%): Want Direct to in-force

Conclusion
- PM is a skill for actuaries in the future
- PM is about finding knowledge in data so that we can understand it and gain an advantage
- "Everything should be made as simple as possible, but not simpler." (Albert Einstein)