Session 42 PD, Predictive Analytics for Actuaries: Building an Effective Predictive Analytics Team
Moderator: Courtney Nashan
Presenters: Ian G. Duncan, FSA, FCIA, FIA, MAAA; Andy Ferris, FSA, MAAA; Christine Irene Hofbeck, FSA, MAAA; Courtney Nashan
Building an Effective Predictive Modeling Team
Christine Hofbeck, FSA, MAAA
Ian Duncan, FSA, FIA, FCIA, FCA, MAAA
Andy Ferris, FSA, FCA, CFA, MAAA
Courtney Nashan
October 12, 2015
Overall Approach
To comprehensively incorporate predictive analytics into a core operational business process, we follow four phases:
1. Phase 1 - Planning
2. Phase 2 - Data Assembly and Model Build
3. Phase 3 - Technical Implementation
4. Phase 4 - Business Implementation
Phase 1
Phase 1 - Planning
1. Assembling a Team
2. Laying the Foundation
3. Selecting a Project
Phase 1 - Assembling a Team
Consider skillsets both individually and collectively:
1. Ability to manipulate large datasets (SAS, R, SQL)
2. Modeling expertise
3. Business acumen
4. Ability to explain highly technical information to a nontechnical audience
5. Ability to represent results graphically for ease of communication
6. Consider mix of prior experience
7. Charisma
Those who prepare the data are as important as those who build the model, who are as important as your business partners who provide subject matter expertise.
Phase 1 - Laying the Foundation
Building a predictive modeling capability is not only about hiring a team. Consider:
1. Technology
2. Legal commitments to customers
3. Data privacy and compliance
4. Objective
5. Change management
6. Cross functional support
7. Budget
Consider the cultural and political impacts of this change, not only the strategic.
Phase 1 - Selecting a Project
Your first project will get a lot of attention, so select it wisely:
1. Large enough that it can make a true business impact
2. Not so large that it takes a year or more to build (your colleagues will be anxious to see results!)
3. Available data
4. Projects which may have been unsolvable in the past with current methods
5. The business wants to implement (use) it to improve decision making
6. What are my competitors doing? Where should I invest the effort?
Remember that predictive modeling makes an impact only when the model is implemented and better-informed decisions are made.
Phase 2
Phase 2 - Data Assembly & Model Build
There are two important challenges to keep in mind with modeling:
1. How to organize the data for efficient interrogation; and
2. How to organize the data for replicability (remember that at some point, your model is going to go into production).
Phase 2 - Data Assembly & Model Build
How to organize the data for efficient interrogation
Here is an example of a data management and warehousing problem from healthcare:
We know that diagnoses are an important contributing factor to illness, health risk and cost.
There are about 17,000 diagnosis codes currently in use (ICD-9). With ICD-10 this number grows to 140,000 (from October 2015!).
There are 100,000 CPT (procedure) codes, and the National Drug Code directory contains hundreds of thousands of drug codes (updated daily!).
Obviously this creates an unmanageable set of codes for analysis purposes.
In healthcare we have solved this problem with the use of grouper models. Grouper models group like diagnosis codes into diagnostic categories. Drug codes are similarly grouped into therapeutic classes.
For a lot of analytical work, grouper models are all that is required. The SOA has studied the predictive accuracy of these models in three studies (1994-2007); a fourth study is in preparation.
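The grouping idea can be sketched in a few lines of Python. This is only an illustration, not a real grouper: the three-entry prefix map and the function names are hypothetical, and a production grouper covers the full code set with far richer logic.

```python
from collections import Counter

# Hypothetical, deliberately tiny prefix-to-category map (ICD-9 style codes).
# A real grouper model covers the whole code set; this is illustration only.
PREFIX_TO_CATEGORY = {
    "250": "Diabetes",
    "401": "Hypertension",
    "493": "Asthma",
}


def group_diagnosis(code: str) -> str:
    """Map one diagnosis code to a diagnostic category via its 3-character prefix."""
    prefix = code.replace(".", "").strip()[:3]
    return PREFIX_TO_CATEGORY.get(prefix, "Other")


def category_counts(codes):
    """Collapse a member's list of diagnosis codes into counts per category."""
    return Counter(group_diagnosis(code) for code in codes)


member_codes = ["250.00", "250.02", "401.9", "786.50"]
print(category_counts(member_codes))
```

Collapsing thousands of codes into a small number of categories is what makes the data practical to use as model inputs.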
Phase 2 - Data Assembly & Model Build
How to organize the data for replicability
Grouper-type models, or models that assign a categorical value to a continuous variable, are very valuable in modeling because they can be built into a warehousing process. They will then be used in the practical application of the model in production.
Another example from healthcare: Body Mass Index (BMI) is defined as weight (in kg) / height² (in m²). Obviously, this is a continuous variable. But clinicians have provided categories, as follows, which give a useful guide to the status of a particular patient:

Category          BMI
Underweight       < 18.0
Normal weight     18.0 - 25.0
Heavy weight      25.0 - 30.0
Obese             30.0 - 40.0
Morbidly Obese    40.0+
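A minimal sketch of how the BMI categories above could be coded once and reused in both model build and production. It assumes height is supplied in metres (the standard BMI definition) and that each band includes its lower bound and excludes its upper bound; the slide does not say how boundary values are treated, so that convention is an assumption, as are the function names.

```python
from bisect import bisect_right

# Cut points and labels taken from the category table above.
CUT_POINTS = [18.0, 25.0, 30.0, 40.0]
LABELS = ["Underweight", "Normal weight", "Heavy weight", "Obese", "Morbidly Obese"]


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Assign a BMI value to a band (assumed: lower bound inclusive, upper exclusive)."""
    return LABELS[bisect_right(CUT_POINTS, value)]


print(bmi_category(bmi(70.0, 1.75)))
```

Keeping the cut points in one shared definition means the warehouse, the model build, and the production scoring process all assign the same category to the same patient.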
Phase 2 - Data Assembly & Model Build
A few quotes to keep us grounded:
"The year 1930, as a whole, should prove at least a fairly good year." -- Harvard Economic Service, December 1929
"All models are wrong but some are useful." -- George E.P. Box, Professor of Statistics, University of Wisconsin-Madison
Phase 2 - Data Assembly & Model Build
Frequently used software: SAS, R, internally developed software, other commercially available models
Not as popular: Python, SPSS, Salford Systems
From SOA Sections Survey: Predictive Analytics 2015
Phase 2 - Data Assembly & Model Build
Frequently used models: OLS regression, GLM, time series, decision trees, clustering
Not as popular: neural networks, Bayes
From SOA Sections Survey: Predictive Analytics 2015
Phase 3
Phase 3 - Technical Implementation
At this point in the overall approach
What we have accomplished:
We have a mathematical equation.
What we have not accomplished:
No real-time scoring engine to enable use of the equation
Objective of this phase:
A real-time flow of data inputs from multiple internal and external sources to the scoring engine
A real-time flow of model output (score, reason codes, etc.) to business unit operations
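To make the gap between "an equation" and "a scoring engine" concrete, here is a rough sketch of the scoring step alone. Everything in it is assumed for illustration: the logistic form, the intercept, the coefficients, the field names, and the rule that reason codes are the largest positive contributions. The slides do not specify the actual equation.

```python
import math

# Hypothetical fitted model: a logistic regression with made-up coefficients.
INTERCEPT = -2.0
COEFFICIENTS = {"age": 0.03, "bmi_obese": 0.8, "smoker": 1.1}


def score(record: dict, top_n: int = 2) -> dict:
    """Return a score and reason codes for one incoming record.

    Raises KeyError if a required field is missing, which is exactly the
    kind of data issue a production engine must detect and handle.
    """
    contributions = {
        name: coef * float(record[name]) for name, coef in COEFFICIENTS.items()
    }
    linear = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-linear))
    # Assumed rule: reason codes are the fields pushing the score up the most.
    positive = [name for name, value in contributions.items() if value > 0]
    reasons = sorted(positive, key=lambda name: contributions[name], reverse=True)
    return {"score": probability, "reason_codes": reasons[:top_n]}


print(score({"age": 50, "bmi_obese": 1, "smoker": 0}))
```

The arithmetic is the easy part. The work of this phase is getting every input field to this function in real time from the source systems, and getting the output back to business operations.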
Phase 3 - Technical Implementation
Common challenges of this phase
Lack of early engagement of IT staff in planning
Lack of sufficient dedicated IT resources
Format of data received (scanned images, etc.) in current environment
Collecting data fields in real-time business production from multiple internal systems (administrative system, agent licensing system, illustration, etc.)
Sensitive data fields that prior phase found to be predictive
Fixed system release dates conflict with desired program rollout
Phase 3 - Technical Implementation
Hints in overcoming common challenges
Engage IT resources early in the project
Plan in advance to discover more data challenges than you initially expect
Avoid reputational risk by carefully considering how each data field will be used in the new business process
Consider temporarily outsourcing the scoring engine if needed
Phase 4
Phase 4 - Business Implementation
At this point in the overall approach
What we have accomplished:
We can deliver model output in real time to a business unit
What we have not accomplished:
We have not changed any core business operations to take advantage of the model output
Objective of this phase:
Classic business process change exercise
Change an existing business process to save time, save money, be more efficient, etc.
Phase 4 - Business Implementation
Common challenges of this phase
Lack of Early Engagement - by business unit in how algorithm will be used; how/why business process will change
Lack of Sufficient Communication - with business stakeholders (other departments, customers, producers) on changes in operational procedures
Unrealistic Expectations - by business stakeholders in impact of predictive modeling and associated changes to business processes
Reputation Risk - Are you comfortable explaining on 60 Minutes the data sources used by your business process in making decisions on individual customers?
Implementing tools and metrics to monitor the ongoing impact of the new business process
Of all four phases, the business implementation phase is consistently the most challenging for most organizations.
Phase 4 - Business Implementation
Hints in overcoming common challenges
Engage business unit early to ensure large model development effort will be deployed in tangible business process change
Design change management plan, including any impacts to operating model and org design, as well as communications plan for program rollout
Manage expectations to communicate what the new process will NOT do
Carefully consider how any new data sources may be perceived as sensitive in future state business process
Implement tools and metrics to monitor the ongoing impact of the algorithm on the business process
As previously mentioned, predictive modeling makes a business impact only when the model is implemented and more informed decisions are made.
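One possible monitoring metric, offered only as an illustration since the slides do not name specific tools: compare the model's average predicted score with the observed outcome rate within score bands. The function name, the equal-count banding, and the sample data below are all assumptions.

```python
def actual_vs_expected(scores, outcomes, n_bands=5):
    """Return (count, mean predicted score, observed rate) per score band.

    Records are sorted by score and split into n_bands groups of roughly
    equal size; outcomes are expected to be 0 or 1.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal length")
    pairs = sorted(zip(scores, outcomes))
    n = len(pairs)
    bands = []
    for b in range(n_bands):
        chunk = pairs[b * n // n_bands:(b + 1) * n // n_bands]
        if not chunk:
            continue  # more bands than records
        expected = sum(s for s, _ in chunk) / len(chunk)
        actual = sum(o for _, o in chunk) / len(chunk)
        bands.append((len(chunk), expected, actual))
    return bands


# Made-up data: low scores with no events, high scores with events.
for band in actual_vs_expected([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_bands=2):
    print(band)
```

A widening gap between predicted and observed rates in any band over time would be one signal that the model or the business process around it needs review.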
SOA Support of Members, Candidates and Students in Predictive Analytics
Expanding Opportunities for Actuaries
Cultivate opportunities for SOA members in relevant fields for actuaries through:
Identifying the opportunities
Building relationships with decision makers
Marketing and publicizing the skills of actuaries in new roles with traditional employers and new industries
Informing the membership and sharing pioneer stories
Predictive Analytics Focus
Growth and timing
With proliferation of big data, use of analytics is growing
Opportunity to expand roles for actuaries in predictive analytics
Need to mobilize quickly or actuaries will not be considered for these roles
Strategic Direction
Strategy
Generate supply of trained actuaries
Initiate multi-phase marketing communications campaign to generate demand and interest among members, candidates, and employers
Tactics
ASA Education
FSA Education
Professional Development
Research
Sections
Marketing
Q&A