Predictive Analytics for Life Insurance: How Data and Advanced Analytics are Changing the Business of Life Insurance Seminar May 23, 2012
Session 2: How to Build a Risk Based Analytical Model for Life Insurance
Presenter: James C. Guszcza, FCAS, MAAA
Predictive Modeling From A to B
Society of Actuaries, Los Angeles
James Guszcza, PhD, FCAS, MAAA, Deloitte Consulting LLP
May 23, 2011

Analyzing Analytics
Three thoughts capture a major reason why predictive analytics is ubiquitous:
"The whole of science is a refinement of everyday thinking." (Albert Einstein)
"Decision-making is at the heart of administration." (Herbert Simon)
"Human judges are not merely worse than optimal regression equations; they are worse than almost any regression equation." (Richard Nisbett & Lee Ross, Human Inference: Strategies and Shortcomings of Social Judgment)
Deloitte Analytics Institute, 2011 Deloitte LLP
Agenda
Introduction: Why Business Analytics is Now Ubiquitous
Predictive Modeling: Levels of Discussion
  Strategy
  Methodology
  Technique

Why Business Analytics is Increasingly Ubiquitous
Is Business Analytics New?
Not really. Easy example: credit scoring is an early example of business analytics and has been around for many decades.
Predictive modeling in insurance isn't new, but its rate of adoption continues to increase.
Property-casualty insurers have been intensively using predictive models for about 15 years now.
1990s: Personal credit used to price and underwrite personal insurance.
2000s: Scoring models used to price and underwrite commercial and professional liability insurance.
Today: Life insurers are adopting or preparing to adopt predictive modeling techniques.
Why Now? (Answer #1)
Technology (Moore's Law): Cost of storage and computing power has decreased exponentially.
Data: Third-party data is becoming increasingly available. Companies are learning to do more with their internal data.
Software and algorithms: Great analytic ideas keep coming from statistics, economics, machine learning, marketing, ... Free tools like R.

Exponential Growth of Data Availability
More and more data is gathered, stored, and available to be analyzed.
[Figure: growth of data availability over time]
Exponential Growth of Statistical Computing Availability
The open-source R statistical computing environment has experienced exponential growth.
The number of contributed packages has grown exponentially.

Why Now? (Answer #2)
Or: Clinical vs. Actuarial Judgment, the Motion Picture
There's Something About Linda
To illustrate, think about this person:
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Now rank these possible scenarios in order of probability:
  Linda is active in the feminist movement.
  Linda is a bank teller.
  Linda is a bank teller and is active in the feminist movement.
There's Something About Linda
Daniel Kahneman and Amos Tversky posed precisely this question to a group of people.
They found that 87% of the people thought that "feminist bank teller" was more probable than "bank teller".
But this is logically impossible.
Mathematical fact: $\mathrm{Prob}(A \,\&\, B) \le \mathrm{Prob}(B)$
So in particular: $\mathrm{Prob}(\text{feminist} \,\&\, \text{bank teller}) \le \mathrm{Prob}(\text{bank teller})$

Analytics Everywhere
Neural net models are used to predict movie box-office returns based on features of their scripts.
Decision tree models are used to help ER doctors better triage patients complaining of chest pain.
Predictive models are used to predict the price of different wine vintages based on variables about the growing season.
Predictive models help commercial insurance underwriters better select and price risks.
Predicting which non-custodial parents are at highest risk of falling into arrears on their child support.
Predicting which job candidates will successfully make it through the interviewing / recruiting process, and which candidates will subsequently be retained and perform well on the job.
Predicting which doctors are at highest risk of being sued for malpractice.
Predicting the ultimate severity of injury claims.
(On the original slide, Deloitte applications are shown in green.)
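Returning to the Linda example, the "mathematical fact" can be spelled out in one step. This is a short derivation added for illustration, with A standing for "active in the feminist movement" and B for "bank teller"; it assumes only the product rule of probability and that P(B) > 0, so that the conditional probability is defined.

```latex
\begin{align*}
P(A \cap B) &= P(B)\,P(A \mid B) \\
            &\le P(B), \qquad \text{since } 0 \le P(A \mid B) \le 1 .
\end{align*}
```

Equality holds only if every bank teller fitting Linda's description is also active in the feminist movement, so ranking the conjunction strictly above "bank teller" can never be right.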
A Practical Conclusion
"There is no controversy in social science which shows such a large body of quantitatively diverse studies coming out so uniformly in the same direction as this one. When you are pushing over 100 investigations, predicting everything from the outcome of football games to the diagnosis of liver disease, and when you can hardly come up with half a dozen studies showing even a weak tendency in favor of the clinician, it is time to draw a practical conclusion."
(Paul Meehl, Causes and Effects of My Disturbing Little Book)

Predictive Analytics: Fundamental Concepts
What Business Analytics Isn't
One view of predictive modeling: We assemble a database, explore it, and fit models to it.
DATA → MODEL
Is this a full conception of analytics?

What Business Analytics Is
No: analytics projects do not begin with data; they begin with a strategic issue or problem.
STRATEGY → DATA → MODEL
What Business Analytics Is
No: analytics projects do not begin with data; they begin with a strategic issue or problem.
STRATEGY → DATA → MODEL → DECISIONS
And they do not end with a model; they end with a model being used to make better decisions.
Therefore, the best analytics projects begin with business strategy and end with IT implementation, training, creating analytics-aided decision rules, and change management.
Predictive Analytics: Three Levels of Discussion
Strategy
  Profitable growth
  More consistent, accurate and economical underwriting
  Retaining, cross-selling to preferred policyholders
  Marketing / prospecting
  Promoting wellness
Methodology
  Model design
  Modeling process ("actuarial science", "data science")
Technique
  Supervised and unsupervised
  GLM vs. decision trees vs. support vector machines

Methodology and Technique
How does data science need actuarial science?
  Target variable specification, preparation
  Model design
  Domain knowledge
  Variable creation, variable selection
  Model evaluation
  Knowing which ambiguities to focus on and which to ignore
How does actuarial science need data science?
  Advances in computing, modeling techniques
  Programming with data
  Exploratory data analysis, data visualization ideas
  Ideas from other fields can be applied to insurance problems
Semantics: Data Mining vs Predictive Modeling
Data Mining: exploration and discovery
  Find interesting patterns in databases
  Open-ended, cast the net wide
  Data visualization often an important part
  Can involve brute-force machine learning algorithms
  Example: discovery that personal credit correlates with risky insurance behavior.
Predictive Modeling: apply statistical techniques in the light of the EDA, knowledge discovery phase
  Quantify and synthesize relationships found during knowledge discovery
  Example: use credit information in a traditional predictive model

At the Center of It All: Data Science
Or: The Collision between Statistics and Computation
Today, it is arguably appropriate to call certain aspects of property-casualty practice "data science". This often involves more than just knowing how to run regression models.
Data science goes beyond:
  Traditional statistics
  Business intelligence [BI]
  Information technology
Image borrowed from Drew Conway's blog.
Predictive Analytics: Understanding the Strategy

A Great Book About Predictive Analytics
The Strategic Use of Analytics in Life Insurance
Like baseball, insurance is a numbers game. Both fields are awash in data, and a lot of money is at stake!
As with baseball, so in insurance: all of this data until fairly recently has not been used in strategic ways.
Life insurance example: models can be built to assign probabilities of (e.g.) being declined or (e.g.) being assigned to the best underwriting class based on readily available data.
Strategic application: data-driven underwriting. Models estimate the probabilities that a risk will fall into the best / worst underwriting classes.
In this hypothetical example, the worst 10% of risks have a >70% decline rate.
[Figure: Underwriting Model Lift Curve (Hypothetical). x-axis: Model Decile (Predicted Value); y-axis: Percent Declined, 0% to 80%]
The Strategic Use of Analytics in Life Insurance
Strategic application: data-driven underwriting. Models estimate the probabilities that a risk will fall into the best / worst underwriting classes.
In this hypothetical example:
  Worst 10% of risks have a >70% decline rate
  Best 10% of risks have a <5% decline rate
[Figure: Underwriting Model Lift Curve (Hypothetical). x-axis: Model Decile (Predicted Value); y-axis: Percent Declined, 0% to 80%]

Predictive Analytics: Understanding the Methodology
Three Central Concepts of Predictive Analytics
Scoring engines, aka "algorithmic solutions", aka "predictive models"
Lift / Gain analysis
  Lift: how much better/worse than average are the observations with the best/worst scores?
  Gain: what percent of the outcomes are captured in the 10% of observations with the highest scores?
Out-of-sample tests
  Estimate how the model is likely to perform once implemented
  Unbiased estimate of the model's predictive power

Scoring Engines
Scoring engine: a formula that classifies or separates risks (or accounts or households ...) by:
  Expected longevity
  Churn likelihood
  Profitable vs unprofitable
  Utilization
  ...
(Non-)linear equation f() of several predictive variables
Produces a continuous range of scores: $Y = f(X_1, X_2, \ldots, X_N)$
Common model form: $Y = g^{-1}(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_N X_N)$
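The common model form above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function names (`inverse_logit`, `score`), the choice of a logit link, the coefficient values, and the two predictors are all assumptions made up for the example.

```python
import math


def inverse_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def score(x, alpha, betas, inverse_link=inverse_logit):
    """Scoring engine of the form Y = g^-1(alpha + beta_1*x_1 + ... + beta_N*x_N)."""
    if len(x) != len(betas):
        raise ValueError("x and betas must have the same length")
    eta = alpha + sum(b * xi for b, xi in zip(betas, x))
    return inverse_link(eta)


# Hypothetical coefficients for two made-up predictors (illustration only).
ALPHA = -2.0
BETAS = [0.03, 0.8]

# Hypothetical applicant: age 45 and a 0/1 indicator set to 1.
applicant = [45.0, 1.0]
print(round(score(applicant, ALPHA, BETAS), 4))
```

Swapping `inverse_link` changes the model family without touching the linear part, which is why the slide treats the choice of predictors as at least as important as the choice of f().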
What Powers a Scoring Engine?
Scoring engine: $Y = f(X_1, X_2, \ldots, X_N)$
The $\{X_1, X_2, \ldots, X_N\}$ are at least as important as the f().
  Domain expertise, creativity, insight are all important
A major part of the modeling process consists of variable creation and selection.
  It is possible to generate many hundreds of variables
  Knowing what is important to create, how to create them, and what to do with them once they are created is the steepest part of the learning curve
Why predictive modeling is an art as well as a science: tacit knowledge vs textbook knowledge

Model Evaluation: Lift Curve Analysis
Sort the data by the model score.
Break into 10 equal pieces.
  Best "decile": 10% of risks with the best scores
  Worst "decile": 10% of risks with the worst scores
Lift: average measured difference in the quantity we are trying to predict or estimate.
  Lift = segmentation power
  Lift translates into the ROI of the modeling project
  Unlike such concepts as R² or AIC, lift has operational meaning
[Figure: Underwriting Model Lift Curve (Hypothetical). x-axis: Model Decile (Predicted Value); y-axis: Percent Declined, 0% to 80%]
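The sort-and-split recipe for lift described above, together with the gain measure from the "Three Central Concepts" slide, can be sketched as follows. The function names `decile_lift` and `top_gain` are hypothetical, and the data is synthetic, generated only so the example runs; lift is expressed here as each decile's average outcome relative to the overall average, which is one common convention rather than necessarily the one used in the original charts.

```python
import random


def decile_lift(scores, outcomes, n_bins=10):
    """Sort by model score, cut into n_bins equal pieces, and return each
    piece's average outcome divided by the overall average outcome."""
    if len(scores) != len(outcomes) or len(scores) < n_bins:
        raise ValueError("need equal-length inputs with at least n_bins rows")
    overall = sum(outcomes) / len(outcomes)
    if overall == 0:
        raise ValueError("overall outcome rate is zero; lift is undefined")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    lifts = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        bin_rate = sum(outcomes[i] for i in idx) / len(idx)
        lifts.append(bin_rate / overall)
    return lifts


def top_gain(scores, outcomes, fraction=0.10):
    """Share of all positive outcomes found in the highest-scoring fraction."""
    total = sum(outcomes)
    if total == 0:
        raise ValueError("no positive outcomes; gain is undefined")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, round(len(order) * fraction))
    return sum(outcomes[i] for i in order[:k]) / total


# Synthetic data: the chance of a "decline" rises with the score.
rng = random.Random(0)
scores = [rng.random() for _ in range(5000)]
outcomes = [1 if rng.random() < 0.05 + 0.6 * s * s else 0 for s in scores]

lifts = decile_lift(scores, outcomes)
print("lift by decile:", [round(v, 2) for v in lifts])
print("gain in top 10%:", round(top_gain(scores, outcomes), 3))
```

A model with no segmentation power would show lift near 1.0 in every decile; the spread between the first and last deciles is what the slides call segmentation power.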
Data Mining and the Curse of Dimensionality
"Data mining" has both positive and negative connotations.
  Done well, it can lead us to new insights that result in more powerful models
  Done naively, it can cause our models to be "fooled by randomness"
The "curse of dimensionality": the more variables we consider, the harder it becomes to search through the space of possible models.
  As the dimension increases, so does the side-length of the cube needed to cover r% of the data space.
  In 10 dimensions we need to cover 80% of the range of each coordinate to capture 10% of the data.
(From The Elements of Statistical Learning by Hastie, Tibshirani, Friedman)

The Bias-Variance Tradeoff and the Problem of Overfitting
Predictive modeling techniques cannot distinguish true patterns from random variation in the data.
Error measured on the dataset used to fit the model is overly optimistic.
Which model would you choose?
[Figure: The Bias-Variance Tradeoff. Prediction error on train data and test data plotted against complexity (# neural net cycles), running from high bias / low variance to low bias / high variance]
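The "80% of each coordinate to capture 10% of the data" figure quoted in the curse-of-dimensionality slide above can be checked with a one-line calculation. The sketch assumes, as in the cited textbook example, that the data is spread uniformly over a unit hypercube; the function name `edge_length` is made up for this illustration.

```python
def edge_length(fraction, dims):
    """Side length of a sub-cube that captures `fraction` of data spread
    uniformly over a unit hypercube in `dims` dimensions."""
    if not 0 < fraction <= 1 or dims < 1:
        raise ValueError("fraction must be in (0, 1] and dims must be >= 1")
    return fraction ** (1.0 / dims)


# Side length needed to capture 10% of the data as the dimension grows.
for p in (1, 2, 3, 10, 100):
    print(p, round(edge_length(0.10, p), 3))
```

In one dimension a 10% slice of the range is enough, but in ten dimensions the cube must span roughly 0.79 of every coordinate, which is where the "80%" on the slide comes from.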
The Bias-Variance Tradeoff and the Problem of Overfitting
Low bias: The model fit is good on the training data. The model's predicted value is close to the data's expected value.
High variance: The model might not make accurate predictions upon implementation.
Bias alone is not the name of the game.
Out-of-sample validation is a way of guarding against overfitting.
[Figure: The Bias-Variance Tradeoff. Prediction error on train data and test data plotted against complexity (# neural net cycles), running from high bias / low variance to low bias / high variance]

Out-of-Sample Model Testing / Validation
A classic methodology: divide the data into 3 pieces (Train, Test, Val), randomly and/or using time.
Training data: used to fit models.
  E.g. estimate parameters of a Generalized Linear Model or Cox Hazard Model
Test data: used to measure gain/lift of candidate models.
  Perform train/test steps iteratively until you have a model you are happy with
  During this iterative phase, validation is held in "cold storage"
Validation data: used to measure lift on the final selected model.
  The test data was implicitly used during the model selection process, therefore it is preferable to measure lift on a truly blind test sample
  Validation data is often an out-of-time sample
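The three-way division described above can be sketched with the standard library alone. The 60/20/20 proportions, the function names, and the `year_of` accessor are assumptions chosen for illustration; the slides do not prescribe split sizes.

```python
import random


def three_way_split(rows, train=0.6, test=0.2, seed=0):
    """Randomly partition rows into (train, test, validation) lists.
    Whatever is left after the train and test shares goes to validation."""
    if train <= 0 or test <= 0 or train + test >= 1:
        raise ValueError("train and test must be positive and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


def out_of_time_split(rows, year_of, cutoff_year):
    """Hold back every row dated on or after cutoff_year as a blind sample."""
    in_time = [r for r in rows if year_of(r) < cutoff_year]
    held_out = [r for r in rows if year_of(r) >= cutoff_year]
    return in_time, held_out


# Hypothetical policy records: (policy_id, issue_year).
policies = [(i, 2005 + i % 6) for i in range(1000)]

# Keep the most recent year in "cold storage", then split the rest at random.
modeling, validation = out_of_time_split(policies, lambda r: r[1], 2010)
train_rows, test_rows, _ = three_way_split(modeling, train=0.7, test=0.29)
print(len(train_rows), len(test_rows), len(validation))
```

Fixing the seed keeps the partition reproducible, so the validation rows stay untouched across modeling iterations.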
So Far...
Therefore we want to build a scoring model that:
  Results in a continuous range of scores
  Produces valuable segmentation power on a blind-test data sample
How do we get there?

Predictive Analytics: Understanding the Process
The High-Level Process
  Data Scrubbing and Variable Creation
  Data Exploration and Modeling
  Model Analysis and Implementation
The modeling phases we will discuss fit into the Cross-Industry Standard Process for Data Mining [CRISP-DM].

Data Scrubbing and Variable Creation
Research possible data sources
Extract/purchase data
Check data for quality (QA)
  Messy! (still deep in the mines)
Create predictive and target variables
  Opportunity to quantify tribal wisdom and come up with new ideas
  Can be a very big task!
  Steepest part of the learning curve
Types of Predictive Variables
Insurance data sources can be viewed in an outward-to-inward hierarchy.
  Geodemographic: population density, traffic, weather, crime, education, ...
  Policy characteristics: term, whole-life, ...
  Policyholder, household characteristics: age, gender, marital status, number of kids, income, education, ...
  Prior history / behavior: medical history, life events, credit history, education/career choices, ...

Knowing Me, Knowing You
Consumer data is rapidly increasing in:
  Volume
  Scope
  Specificity
The nature of available data is evolving rapidly.
A 3-stage evolutionary process
Data Exploration and Variable Triage
The variable creation phase was hypothesis-driven "expansion". This phase is "contraction": begin the process of analyzing and winnowing the space of candidate variables.
Data visualization / Exploratory Data Analysis is a key part of this.
Use EDA to:
  Discard weak or poorly distributed variables
  Consider how to handle missing variables
  Transform variables (categoricity, linearity, ...)
  Investigate correlations amongst the candidate predictors
  Explore dimension reduction techniques (remember the curse of dimensionality)
  Take note of variables' relative predictive power

Model Design
Model design is where actuarial science meets data science. Translate the business problem into an algorithmic solution.
Common considerations:
  Which target variable to use?
  Which data points / types of policies to include/exclude?
  Which data sources / variable types to consider?
  What is the level of the model (policy? household for cross-sell?)
  How should the model be evaluated?
    Lift curves / gains charts
    Broken down by certain segments of business
    Compare with existing practice
  How to split data into train / test / validation samples?
  Is the data sufficiently credible?
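One of the EDA steps listed above, investigating correlations amongst the candidate predictors, can be sketched as a simple pairwise screen. The function names, the 0.8 threshold, and the toy columns are assumptions for illustration only; in practice the threshold and the choice of correlation measure are judgment calls.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.
    Returns NaN when either sequence has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for every pair of candidate predictors
    whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:  # NaN compares False, so it is never flagged
            flagged.append((name_a, name_b, r))
    return flagged


# Toy candidate predictors (made-up values).
candidates = {
    "age": [25, 32, 47, 51, 62, 38],
    "years_licensed": [7, 14, 29, 33, 44, 20],
    "num_children": [0, 2, 1, 3, 0, 2],
}
for name_a, name_b, r in correlated_pairs(candidates):
    print(name_a, name_b, round(r, 3))
```

Pairs flagged this way are candidates for dropping one variable or combining the two, which feeds directly into the dimension-reduction step on the same list.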
Multivariate Modeling
Key initial step: appropriate model design
  Translate the ultimate use of the model into an appropriate model design
With the model design in place, the iterative modeling process begins.
  Picks up where the EDA process leaves off: the process of triaging, transforming, and interacting candidate predictors continues
  Multiple candidate models are built/tested
Sample techniques:
  Regression / GLM / proportional hazard models
  CART / MARS / Random Forests, Neural nets, Support Vector Machines

Multivariate Modeling: In More Detail
1) Pare down the full collection of candidate predictive variables to a more manageable set
2) Iterative modeling process
  Build candidate models on training data
  Evaluate on test data
Many things to tweak:
  Alternate modeling techniques
  Hybrid models, ensemble models
  Different target variables
  Different predictive variables
  Data filtering / sampling criteria
  Technique-specific modeling options: distributional assumptions, weights and offsets, cost-complexity tuning parameters, neural net topology, decision tree splitting method, multiple random samples, ...
Considerations
Do signs/magnitudes of parameters make sense? Statistically significant?
Sensible from a business point of view?
  Readily explainable
  Consistent with corporate philosophy?
Is the population reasonably homogeneous?
  If not, does the model behave consistently / reasonably on both groups?
  Does the predictive power hold up within sub-segments?
Continuity
  Are there small changes in input values that might produce large swings in scores?
  Make sure that end-users can't "game the system"

Model Analysis and Implementation
Perform model analytics
  Necessary for end-users to gain comfort with the model
Calibrate models
  Create user-friendly scale (client dictates)
Implement models
  Technology implementation: deep IT skills crucial
  Business implementation: plan use of business rules, reason messages, education
Monitor performance
  Distribution of scores over time, predictiveness, usage of model...
  Collect reactions, suggestions from end-users
  Collect list of desired improvements for v2.0
  Plan model maintenance schedule
Predictive Analytics: Understanding the Techniques

Supervised vs Unsupervised Learning
Supervised Learning
  Estimate the expected value of Y given values of X.
  Generalized Linear Models (GLM)
  Generalized Additive Models (GAM)
  Survival Models (Cox Proportional Hazard)
  Classification and Regression Trees (CART)
  Multivariate Adaptive Regression Splines (MARS)
  Random Forests
  Support Vector Machines
  Neural Nets
Unsupervised Learning
  Attempt to find interesting patterns amongst the X.
  No outcome ("target") variable
  Clustering, Self-organizing maps
  Correlation / Principal Components / Factor Analysis
Classification vs Regression
Classification Problems
  Goal is to segment the observations of interest into 2 or more distinct categories.
  Average, above average, below average profitability customers
  Likely fraudsters vs likely non-fraudsters
  Likely defectors vs likely non-defectors
Regression Problems
  Goal is to predict a continuous amount.
  Dollars of loss for a policy
  Ultimate size of claim
  Years of retention
  Days out of work

Parametric vs Non-Parametric
Parametric Statistics
  Involves the specification of a probabilistic model governing the process that generated the data.
  E.g. Poisson regression: assume that the process generating the number of claims for a given policy is a Poisson process. The mean (λ) of each policy's Poisson distribution is a function of a linear combination of various attributes.
Non-Parametric Statistics
  No probability model specified
  Purely algebraic
  E.g. classification trees use brute-force computing to create relatively pure segments of a dataset. No assumptions made about the target variable.
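The Poisson regression example in the parametric slide above can be made concrete with a small sketch. It assumes a log link, so that the mean is the exponential of the linear combination of attributes; the function names, coefficients, and policy rows are all hypothetical. Fitting would consist of choosing the intercept and coefficients that maximize this log-likelihood; the optimizer itself is left out.

```python
import math


def poisson_mean(x, intercept, coefs):
    """Mean claim count under a log link: lambda = exp(intercept + sum(c_i * x_i))."""
    return math.exp(intercept + sum(c * xi for c, xi in zip(coefs, x)))


def poisson_log_likelihood(rows, counts, intercept, coefs):
    """Sum over policies of log P(Y = y) for Y ~ Poisson(lambda(x))."""
    total = 0.0
    for x, y in zip(rows, counts):
        lam = poisson_mean(x, intercept, coefs)
        # log of lam**y * exp(-lam) / y!, using lgamma(y + 1) = log(y!)
        total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total


# Hypothetical policies: two made-up attributes each, with observed claim counts.
rows = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
counts = [0, 1, 2, 0]

# Compare two hypothetical parameter settings; the higher value fits better.
print(round(poisson_log_likelihood(rows, counts, -1.0, [0.5, 0.3]), 4))
print(round(poisson_log_likelihood(rows, counts, -1.5, [1.0, 0.5]), 4))
```

The probability model is explicit here, which is what separates this from the tree-based approach on the same slide: the likelihood depends entirely on the assumed Poisson form.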
Regression and its Relations
Generalized Linear Models [GLM]
  Relax assumptions of Ordinary Least Squares [OLS] regression
    Normality → exponential family
    Linearity → linearity on the scale of the link function (e.g. log scale)
  Logistic regression: binary outcomes
  Poisson regression: count outcomes
Multivariate Adaptive Regression Splines [MARS] and Neural Nets
  Clever automatic ways of transforming and interacting candidate input variables
  Automate the difficult job of selecting and transforming dozens or hundreds of variables
  Account for the fact that true relationships aren't necessarily linear
  Universal approximators: can model any functional form
  Downside: are "black-box" models (though MARS is much more interpretable)
Classification and Regression Trees [CART]
  Tree-structured regression
  Simplification of MARS; easier to interpret

This presentation contains general information only and is based on the experiences and research of Deloitte practitioners. Deloitte is not, by means of this presentation, rendering business, financial, investment, or other professional advice or services. This presentation is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor. Deloitte, its affiliates, and related entities shall not be responsible for any loss sustained by any person who relies on this presentation.
Copyright 2011 Deloitte Development LLC. All rights reserved. Member of Deloitte Touche Tohmatsu Limited.
PREDICTIVE MODELLING FOR COMMERCIAL INSURANCE
PREDICTIVE MODELLING FOR COMMERCIAL INSURANCE General Insurance Pricing Seminar 13 June 2008 London James Guszcza, FCAS, MAAA jguszcza@deloitte.com General Themes Predictive modelling: 3 Levels of Discussion
More informationPredictive Modeling Techniques in Insurance
Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics
More informationModel Validation Techniques
Model Validation Techniques Kevin Mahoney, FCAS kmahoney@ travelers.com CAS RPM Seminar March 17, 2010 Uses of Statistical Models in P/C Insurance Examples of Applications Determine expected loss cost
More informationRisk pricing for Australian Motor Insurance
Risk pricing for Australian Motor Insurance Dr Richard Brookes November 2012 Contents 1. Background Scope How many models? 2. Approach Data Variable filtering GLM Interactions Credibility overlay 3. Model
More informationSOA 2013 Life & Annuity Symposium May 6-7, 2013. Session 30 PD, Predictive Modeling Applications for Life and Annuity Pricing and Underwriting
SOA 2013 Life & Annuity Symposium May 6-7, 2013 Session 30 PD, Predictive Modeling Applications for Life and Annuity Pricing and Underwriting Moderator: Barry D. Senensky, FSA, FCIA, MAAA Presenters: Jonathan
More informationPredictive Modeling and Big Data
Predictive Modeling and Presented by Eileen Burns, FSA, MAAA Milliman Agenda Current uses of predictive modeling in the life insurance industry Potential applications of 2 1 June 16, 2014 [Enter presentation
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationSession 62 TS, Predictive Modeling for Actuaries: Predictive Modeling Techniques in Insurance Moderator: Yonasan Schwartz, FSA, MAAA
Session 62 TS, Predictive Modeling for Actuaries: Predictive Modeling Techniques in Insurance Moderator: Yonasan Schwartz, FSA, MAAA Presenters: Jean-Frederic Breton David A. Moore, FSA, MAAA Session 62:
More informationPredictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
More informationAnti-Trust Notice. Agenda. Three-Level Pricing Architect. Personal Lines Pricing. Commercial Lines Pricing. Conclusions Q&A
Achieving Optimal Insurance Pricing through Class Plan Rating and Underwriting Driven Pricing 2011 CAS Spring Annual Meeting Palm Beach, Florida by Beth Sweeney, FCAS, MAAA American Family Insurance Group
More informationSupervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
More informationWebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
More informationHow To Build A Predictive Model In Insurance
The Do s & Don ts of Building A Predictive Model in Insurance University of Minnesota November 9 th, 2012 Nathan Hubbell, FCAS Katy Micek, Ph.D. Agenda Travelers Broad Overview Actuarial & Analytics Career
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationPredictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com Louise.francis@data-mines.cm
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationHow Will Predictive Modeling Change the P&C Industry in the Next 5-10 Years?
How Will Predictive Modeling Change the P&C Industry in the Next 5-10 Years? CAS Annual Meeting Seattle November, 2008 James Guszcza, FCAS, MAAA Deloitte Consulting The Problem with Prediction A grand
More informationElegantJ BI. White Paper. The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis
ElegantJ BI White Paper The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis Integrated Business Intelligence and Reporting for Performance Management, Operational
More informationApplied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationBOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING
BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING Xavier Conort xavier.conort@gear-analytics.com Session Number: TBR14 Insurance has always been a data business The industry has successfully
More informationChapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
More informationMS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationHow To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationData Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
Why do statisticians "hate" us?
Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model
Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is
Chapter 6. The stacking ensemble approach
This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc. are also described
Gerry Hobbs, Department of Statistics, West Virginia University
Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit
TDWI Best Practice BI & DW Predictive Analytics & Data Mining
TDWI Best Practice BI & DW Predictive Analytics & Data Mining Course Length : 9am to 5pm, 2 consecutive days 2012 Dates : Sydney: July 30 & 31 Melbourne: August 2 & 3 Canberra: August 6 & 7 Venue & Cost
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview
Easily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
Addressing Analytics Challenges in the Insurance Industry. Noe Tuason California State Automobile Association
Addressing Analytics Challenges in the Insurance Industry Noe Tuason California State Automobile Association Overview Two Challenges: 1. Identifying High/Medium Profit who are High/Low Risk of Flight Prospects
In this presentation, you will be introduced to data mining and the relationship with meaningful use.
In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine
MSCA 31000 Introduction to Statistical Concepts
MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced
Better decision making under uncertain conditions using Monte Carlo Simulation
IBM Software Business Analytics IBM SPSS Statistics Better decision making under uncertain conditions using Monte Carlo Simulation Monte Carlo simulation and risk analysis techniques in IBM SPSS Statistics
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
Data Mining Methods: Applications for Institutional Research
Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014
Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition
Brochure More information from http://www.researchandmarkets.com/reports/2170926/ Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd
Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern KTI, TU Graz 2015-03-05 Outline 1 Introduction 2 Classification
69 PD Underwriting Issues for Group Life and Disability Insurance. Moderator: Peter A. Heinrichs, FSA, MAAA
69 PD Underwriting Issues for Group Life and Disability Insurance Moderator: Peter A. Heinrichs, FSA, MAAA Presenters: Susan L. Ebertz Michael F. Vassar Mark R. Yoest, FSA, MAAA Underwriting Trends in
REPORT DOCUMENTATION PAGE
REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188 Public Reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
The Last Mile Problem
The Last Mile Problem How Data Science and Behavioural Science can Work Together IPAC Toronto March 4, 2015 James Guszcza, PhD, FCAS, MAAA Deloitte Consulting jguszcza@deloitte.com Two overdue sciences
Insurance Analytics - data analysis and predictive modeling in insurance. Pavel Kříž. Actuarial Sciences Seminar, MFF, April 4, 2014
Insurance Analytics - data analysis and predictive modeling in insurance Pavel Kříž Actuarial Sciences Seminar, MFF, April 4, 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics
BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts Nandini Govindarajan (61210556) The authors believe that data modelling can be used to predict if an
Is a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
Business Analytics and Data Mining for CRM: Jumpstart workshop
Jumpstart workshop Date and Place: Bangalore, Sep 1st (Sat) and 2nd (Sun) 2012 Registration Link: http://compegence.com/open-programs.php http://compegence.com/workshop-analytics-for-crm.php Audience:
Data Analytical Framework for Customer Centric Solutions
Data Analytical Framework for Customer Centric Solutions Customer Savviness Index Low Medium High Data Management Descriptive Analytics Diagnostic Analytics Predictive Analytics Prescriptive Analytics
Data Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
Analytics in Action. What do Jeopardy, Pampers, and Major League Baseball all have in common? October 24, 2012
Analytics in Action What do Jeopardy, Pampers, and Major League Baseball all have in common? October 24, 2012 University of Cincinnati Tangeman University Center Theater Sponsored by LUCRUM, Inc. ABOUT
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
Data Mining: An Introduction
Data Mining: An Introduction Michael J. A. Berry and Gordon A. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support, 2nd Edition, 2004 Data mining What promotions should be targeted
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Combining Linear and Non-Linear Modeling Techniques: EMB America. Getting the Best of Two Worlds
Combining Linear and Non-Linear Modeling Techniques: Getting the Best of Two Worlds Outline Who is EMB? Insurance industry predictive modeling applications EMBLEM our GLM tool How we have used CART with
Data Mining with SAS. Mathias Lanner mathias.lanner@swe.sas.com. Copyright 2010 SAS Institute Inc. All rights reserved.
Data Mining with SAS Mathias Lanner mathias.lanner@swe.sas.com Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA
Clustering Techniques and STATISTICA. Case Study: Defining Clusters of Shopping Center Patrons
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.). Organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell
THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING Introduction Dan Steinberg and N. Scott Cardell Most data-mining projects involve classification problems assigning objects to classes whether
Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Examples Algorithms: Disease Clusters Algorithms:
Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basic concepts needed in probability and statistics
Actuarial Science as Data Science A vision for actuarial science in the 21st century
Actuarial Science as Data Science A vision for actuarial science in the 21st century California Actuarial Student Congress UC Santa Barbara April 4, 2015 James Guszcza, PhD, FCAS, FSA Deloitte Consulting
Data Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important Floyd Ray Martin, FSA, MAAA Thomas A. McInteer, FSA, MAAA Jonathan P. Polon, FSA Dental Insurance Fraud Detection
Business Analytics and Credit Scoring
Study Unit 5 Business Analytics and Credit Scoring ANL 309 Business Analytics Applications Introduction Process of credit scoring The role of business analytics in credit scoring Methods of logistic regression
Data Mining Approaches to Modeling Insurance Risk. Dan Steinberg, Mikhail Golovnya, Scott Cardell. Salford Systems 2009
Data Mining Approaches to Modeling Insurance Risk Dan Steinberg, Mikhail Golovnya, Scott Cardell Salford Systems 2009 Overview of Topics Covered Examples in the Insurance Industry Predicting at the outset
Cross Validation. Dr. Thomas Jensen Expedia.com
Cross Validation Dr. Thomas Jensen Expedia.com About Me PhD from ETH Used to be a statistician at Link, now Senior Business Analyst at Expedia Manage a database with 720,000 Hotels that are not on contract
Learning Objectives for Selected Programs Offering Degrees at Two Academic Levels
Learning Objectives for Selected Programs Offering Degrees at Two Academic Levels Discipline Degree Learning Objectives Accounting 1. Students graduating with a in Accounting should be able to understand
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA Bulletin 2014) Before
Machine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
Customer and Business Analytic
Customer and Business Analytic Applied Data Mining for Business Decision Making Using R Daniel S. Putler Robert E. Krider CRC Press Taylor & Francis Group Boca Raton London New York CRC Press is an imprint
Nine Common Types of Data Mining Techniques Used in Predictive Analytics
Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
MHI3000 Big Data Analytics for Health Care Final Project Report
MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators
Financial Institutions and STATISTICA. Case Study: Credit Scoring
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
A Marketer's Guide to Analytics
A Marketer's Guide to Analytics Using Analytics to Make Smarter Marketing Decisions and Maximize Results WHITE PAPER SAS White Paper Table of Contents Executive Summary The World of Marketing Is
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.
CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes
MA2823: Foundations of Machine Learning
MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 Chloé-Agathe Azencott Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr TAs: Jiaqian Yu jiaqian.yu@centralesupelec.fr
SAS Fraud Framework for Banking
SAS Fraud Framework for Banking Including Social Network Analysis John C. Brocklebank, Ph.D. Vice President, SAS Solutions OnDemand Advanced Analytics Lab SAS Fraud Framework for Banking Agenda Introduction
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,
How To Perform An Ensemble Analysis
Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598 Outlier Ensembles Keynote, Outlier Detection and Description Workshop, 2013 Based on the ACM SIGKDD Explorations Position Paper: Outlier
Better credit models benefit us all
Better credit models benefit us all Agenda Credit Scoring - Overview Random Forest - Overview Random Forest outperform logistic regression for credit scoring out of the box Interaction term hypothesis
The Process of Predictive Modeling Dorothy L. Andrews Merlinos & Associates
Dorothy L. Andrews Merlinos & Associates Conceive Develop Test Implement Analyze My Background Actuary, ASA, MAAA MA Mathematical Statistics MA Mathematics & Education BA Mathematics Everett Curtis Huntington
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics rsalakhu@utstat.toronto.edu http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyright 2013, SAS Institute Inc. All rights reserved.
EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models
Gordon S. Linoff Founder Data Miners, Inc. gordon@data-miners.com
Survival Data Mining Gordon S. Linoff Founder Data Miners, Inc. gordon@data-miners.com What to Expect from this Talk Background on survival analysis from a data miner's perspective Introduction to key
An Overview of Data Mining: Predictive Modeling for IR in the 21st Century
An Overview of Data Mining: Predictive Modeling for IR in the 21st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Business Intelligence and Decision Support Systems
Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley