Insurance Fraud Detection: MARS versus Neural Networks?

Louise A. Francis, FCAS, MAAA
Louise_francis@msn.com

Objectives
- Introduce a relatively new data mining method that can be used as an alternative to neural networks
- Compare the method to neural networks
- Apply the methods to fraud data

MARS
- Acronym for Multivariate Adaptive Regression Splines
- In many ways it is similar to regression, but it can deal with data complexities that ordinary linear regression has difficulty handling

Data Complexities
- Nonlinear functions
- Interactions
- Missing data

The Fraud Study Data
- 1993 Automobile Insurers Bureau closed Personal Injury Protection claims
- Dependent variables
  - Suspicion score
  - Expert assessment of likelihood of fraud or abuse
- Predictor variables
  - Red flag indicators
  - Claim file variables

Example: Nonlinear Function
[Figure: Neural Network Fit of SUSPICION vs Provider Bill. x-axis: Provider Bill (1000 to 7000); y-axis: netfraud1 (0.00 to 4.00)]

MARS Fit to Nonlinear Function
[Figure: MARS Fit of SUSPICION vs Provider Bill. x-axis: Provider Bill (1000 to 7000); y-axis: Fitted Suspicion Score]

How MARS Fits Nonlinear Function
- MARS fits a piecewise regression
  BF1 = max(0, 2185 - X)
  Y = 4.29 - 0.002 * BF1
- BF1 is a basis function
- MARS uses statistical optimization to find the best basis function
- A basis function is similar to a dummy variable in regression
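The piecewise fit on this slide can be evaluated directly. The following is a minimal Python sketch, not the MARS software itself. The function names `hinge` and `fitted_suspicion` are illustrative, and the basis function is read as max(0, 2185 - X), which matches BF1 on the later interaction slide.

```python
def hinge(value: float) -> float:
    """Return max(0, value), the building block of a MARS basis function."""
    return max(0.0, value)


def fitted_suspicion(provider_bill: float) -> float:
    """Evaluate Y = 4.29 - 0.002 * BF1 with BF1 = max(0, 2185 - X)."""
    bf1 = hinge(2185 - provider_bill)
    return 4.29 - 0.002 * bf1


# Evaluate below the knot, at the knot, and above the knot.
for bill in (1000, 2185, 5000):
    print(bill, round(fitted_suspicion(bill), 3))
```

Below the knot at 2185 the fitted score rises by 0.002 per unit of provider bill; above the knot it stays flat at 4.29.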

Interactions
- The effect of a predictor variable on the dependent variable depends on the values of one or more other variables
[Figure: Neural Network Predicted for Provider Bill and Injury Type. One panel per injury type (inj.type 01 to 05); x-axis: Provider Bill; y-axis: Neural Net Predicted]

Interactions: MARS Fit
[Figure: MARS Predicted for Provider Bill and Injury Type. One panel per injury type (inj.type 01 to 06); x-axis: Provider Bill; y-axis: Fitted Suspicion Score]

Interactions: The Basis Functions
- Injury type 4 (neck sprain) and type 5 (back sprain) increase faster and have higher scores than the other injury types
  BF1 = max(0, 2185 - X)
  BF2 = (INJTYPE = 4 OR INJTYPE = 5)
  BF3 = max(0, X - 159) * BF2
  Y = 2.815 - 0.001 * BF1 + 0.685 * BF2 + .360E-03 * BF3
  where X is the provider bill and INJTYPE is the injury type
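To see how the indicator BF2 switches the extra slope on and off, the model on this slide can be coded as a short function. This is only a sketch of the arithmetic; the function name `interaction_score` is an illustrative choice, not part of the original analysis.

```python
def interaction_score(provider_bill: float, injury_type: int) -> float:
    """Evaluate the three-basis-function model from the slide."""
    bf1 = max(0.0, 2185 - provider_bill)
    bf2 = 1.0 if injury_type in (4, 5) else 0.0  # neck sprain or back sprain
    bf3 = max(0.0, provider_bill - 159) * bf2    # slope applies only when BF2 = 1
    return 2.815 - 0.001 * bf1 + 0.685 * bf2 + 0.360e-03 * bf3


# Same provider bill, different injury types.
for injury_type in (1, 4):
    print(injury_type, round(interaction_score(5000, injury_type), 3))
```

For injury types 4 and 5 the score starts 0.685 higher and keeps rising with the provider bill, while for the other types it is flat once the bill exceeds 2185.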

Missing Data
- Occurs frequently in insurance data
- There are some sophisticated methods for addressing this (e.g., the EM algorithm)
- MARS uses basis functions to find surrogates for variables with missing values

Missing Data Example: Health Insurance (Claimant has Health Insurance)
Value      Frequency   Percent   Cumulative Percent
No               457      32.6                 32.6
Missing          208      14.9                 47.5

Missing Data Example
  BF1 = max(0, MP_BILL - 2885)
  BF2 = max(0, 2885 - MP_BILL)
  BF3 = (HEALTHIN NE MISSING)
  BF4 = (HEALTHIN = MISSING)
  BF5 = (HEALTHIN = N)
  BF7 = max(0, MP_BILL - 2262) * BF5
  BF8 = max(0, 2262 - MP_BILL) * BF5
  BF9 = max(0, MP_BILL - 98) * BF4
  BF10 = max(0, 98 - MP_BILL) * BF4
  BF11 = max(0, MP_BILL - 710) * BF3
  BF13 = max(0, MP_BILL - 35483)
  BF15 = BF3 * BF2
  Y = -0.754 - 0.002 * BF1 + 0.967 * BF3 + 1.389 * BF5 - .808E-04 * BF7 - .624E-03 * BF8 + 0.001 * BF9 + 0.016 * BF10 + 0.001 * BF11 + .114E-03 * BF13 + .376E-03 * BF15
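The way the missing-value indicators enter the model can be illustrated by coding the basis functions above. This is a sketch under two assumptions: BF3 is read as the complement of BF4 (HEALTHIN not missing), and a missing HEALTHIN is represented by `None`. The function name `missing_data_score` is illustrative.

```python
from typing import Optional


def missing_data_score(mp_bill: float, healthin: Optional[str]) -> float:
    """Evaluate the slide's model; healthin is 'Y', 'N', or None when missing."""
    bf1 = max(0.0, mp_bill - 2885)
    bf2 = max(0.0, 2885 - mp_bill)
    bf3 = 1.0 if healthin is not None else 0.0  # assumed reading: HEALTHIN not missing
    bf4 = 1.0 if healthin is None else 0.0      # HEALTHIN missing
    bf5 = 1.0 if healthin == "N" else 0.0
    bf7 = max(0.0, mp_bill - 2262) * bf5
    bf8 = max(0.0, 2262 - mp_bill) * bf5
    bf9 = max(0.0, mp_bill - 98) * bf4
    bf10 = max(0.0, 98 - mp_bill) * bf4
    bf11 = max(0.0, mp_bill - 710) * bf3
    bf13 = max(0.0, mp_bill - 35483)
    bf15 = bf3 * bf2
    return (-0.754 - 0.002 * bf1 + 0.967 * bf3 + 1.389 * bf5
            - 0.808e-04 * bf7 - 0.624e-03 * bf8 + 0.001 * bf9
            + 0.016 * bf10 + 0.001 * bf11 + 0.114e-03 * bf13
            + 0.376e-03 * bf15)


# Same bill, with health insurance known versus missing.
print(round(missing_data_score(2885, "Y"), 3))
print(round(missing_data_score(2885, None), 3))
```

When HEALTHIN is missing, only the terms multiplied by BF4 (BF9 and BF10) carry the provider bill effect, so the bill acts as a stand-in for the missing variable.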

More Complex Example
- Dependent variable: expert's assessment of the likelihood that the claim is legitimate
  - A classification application
- Predictor variables: a combination of
  - claim file variables (age of claimant, legal representation)
  - red flag variables (injury is strain/sprain only, claimant has history of previous claim)

More Complex Example
  BF1 = (LEGALREP = 1)
  BF2 = (LEGALREP = 2)
  BF3 = (TRTLAG = missing)
  BF4 = (TRTLAG NE missing)
  BF5 = (INJ01 = 1) * BF2
  BF7 = (ACC04 = 1) * BF4
  BF9 = (ACC14 = 1)
  BF11 = (PARTDIS = 1) * BF4
  BF15 = max(0, AGE - 36) * BF4
  BF16 = max(0, 36 - AGE) * BF4
  BF18 = max(0, 55 - AMBUL) * BF15
  BF20 = max(0, 10 - RPTLAG) * BF4
  BF21 = (CLT02 = 1)
  BF23 = POLLAG * BF21
  BF24 = (ACC15 = 1) * BF16
  Y = 0.580 - 0.174 * BF1 - 0.414 * BF3 + 0.196 * BF5 - 0.234 * BF7 + 0.455 * BF9 + 0.131 * BF11 - 0.011 * BF15 - 0.006 * BF16 + .135E-03 * BF18 - 0.013 * BF20 + .286E-03 * BF23 + 0.010 * BF24

Evaluating Predictor Variables: Generalized Cross-validation
$$\mathrm{GCV} = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{y_i - \hat{f}(x_i)}{1 - k/N} \right]^2$$
where
- N is the number of observations
- y is the dependent variable
- x is the independent variable(s)
- k is the effective number of parameters or degrees of freedom in the model
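As a rough illustration of the criterion, the sketch below reads the slide's formula as the average of squared residuals, each divided by (1 - k/N) before squaring. The function name `gcv` and the sample numbers are hypothetical and are not taken from the study.

```python
from typing import Sequence


def gcv(y: Sequence[float], fitted: Sequence[float], k: float) -> float:
    """Mean of squared residuals, each scaled by 1 / (1 - k/N)."""
    n = len(y)
    if n == 0 or len(fitted) != n:
        raise ValueError("y and fitted must be non-empty and the same length")
    if k >= n:
        raise ValueError("k must be smaller than the number of observations")
    penalty = 1.0 - k / n
    return sum(((yi - fi) / penalty) ** 2 for yi, fi in zip(y, fitted)) / n


# Hypothetical observed and fitted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(gcv(observed, predicted, k=0))  # plain mean squared error
print(gcv(observed, predicted, k=2))  # same residuals, penalized for 2 parameters
```

With the residuals held fixed, a larger k raises the GCV, so a basis function is worth keeping only if it reduces the residuals by more than the penalty it adds.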

Variable Importance Ranking
MARS Ranking of Variables
Rank  Variable   Description
1     LEGALREP   Legal representation
2     TRTMIS     Treatment lag missing
3     ACC04      Single vehicle accident
4     INJ01      Injury consisted of strain or sprain only
5     AGE        Claimant age
6     PARTDIS    Claimant partially disabled
7     ACC14      Property damage was inconsistent with accident
8     CLT02      Had a history of previous claims
9     POLLAG     Policy lag
10    RPTLAG     Report lag
11    AMBUL      Ambulance charges
12    ACC15      Very minor impact collision
Francis Analytics and Actuarial

Methods of Assessing Fit
- Cross-validation
- Confusion matrix
- Sensitivity
- Specificity
- ROC curve
- Area under the ROC curve

Cross-validation
Four-Fold Cross-validation
Technique        R^2    Percent Correct
MARS             0.35   0.77
Neural Network   0.39   0.79
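The slide does not say how the four folds were formed. As a hedged illustration of the general idea, the sketch below assigns observations to folds by position and yields the training and hold-out indices for each fold. The function name `k_fold_indices` is hypothetical, and in practice the records would normally be shuffled first.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_obs: int, n_folds: int = 4) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; each observation is held out exactly once."""
    # Observation i goes to fold i mod n_folds.
    folds = [list(range(start, n_obs, n_folds)) for start in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# Ten hypothetical records split into four folds.
for train, test in k_fold_indices(10):
    print("train:", train, "test:", test)
```

A model is fitted on each training set and scored on the matching hold-out set, and the fit statistics are then averaged over the four folds.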

Confusion Matrix
MARS Predicted * Actual
                   Actual
Predicted       No     Yes   Total
No             738     160     898
Yes            157     344     501
Total          895     505

Sensitivity/Specificity
- Sensitivity: percent of targets correctly predicted
- Specificity: percent of non-targets correctly predicted
Model            Sensitivity   Specificity
MARS                    68.3          82.5
Neural Network          74.8          83.4
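These definitions can be checked against the MARS confusion matrix. The sketch below assumes the matrix is read with rows as predicted and columns as actual; the variable names are illustrative.

```python
# Cell counts from the MARS confusion matrix, read as rows = predicted,
# columns = actual (an assumption about the table layout).
true_neg = 738   # predicted No,  actual No
false_neg = 160  # predicted No,  actual Yes
false_pos = 157  # predicted Yes, actual No
true_pos = 344   # predicted Yes, actual Yes

sensitivity = true_pos / (true_pos + false_neg)  # share of targets caught
specificity = true_neg / (true_neg + false_pos)  # share of non-targets cleared
accuracy = (true_pos + true_neg) / (true_pos + true_neg + false_pos + false_neg)

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"accuracy    = {accuracy:.1%}")
```

Under this reading the cell counts give 68.3% sensitivity and 82.5% specificity, matching the MARS row of the table, and about 77% correct overall, matching the cross-validation slide.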

ROC Curve
[Figure: ROC Curve. x-axis: 1 - Specificity; y-axis: Sensitivity; curves: Neural Net, MARS, BASE]

Area Under the ROC Curve
Statistics for Area Under the ROC Curve
Test Result Variables   Area   Std Error   Asymptotic Sig   Lower 95% Bound   Upper 95% Bound
MARS Probability        0.85   0.01        0.000            0.834             0.873
Neural Probability      0.88   0.01        0.000            0.857             0.893
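One way to interpret the area statistic is as the share of (target, non-target) pairs in which the target receives the higher predicted probability. The sketch below computes it that way, which is a standard equivalent formulation rather than necessarily the routine used to produce the table. The function name `roc_auc` and the sample data are hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of (target, non-target) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one target and one non-target")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only; not the study data.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
```

An area of 0.5 corresponds to a model that ranks claims no better than chance, and 1.0 to a model that ranks every target above every non-target.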

Which One is Better?
- Depends on the application
- MARS handles missing values better
- MARS clusters categories on nominal variables with many categories
- MARS can be explained more easily
- In applications where the analyst believes neural networks will outperform MARS, use neural networks
- Hybrid models can also be used to improve performance

Using the Model Results
- Both claim file variables and red flag variables appear to be significant in predicting fraud
- Other research supports the value of using statistical and data mining models to predict fraud
- Derrig (Journal of Risk and Insurance, 2002) advocates using analytic models to sort claims
  - Pay claims with low scores
  - Devote resources to claims with high scores