Insurance Rate Making Using Data Mining
or: Dollars from Your Data
Bernd Drewes, SAS EMEA
Is Data Mining Important?
- Postbank N.V.: 50% response on first mailing paid for DM investment
- US West: Reducing customer churn by any amount is 10 times cheaper than gaining a new customer
- ABN AMRO: Interest earned on 40% reduction in cash in ATMs
- Neckermann Versand AG: Increased number of good customers getting credit by 80 a day
- Gloucestershire Constabulary: For the public, increased crime pattern identification and prevention is priceless
Drivers for Data Mining
- Companies feel data rich, information poor
- Customer-related data offer a competitive advantage
Business Drivers: Marketing - CRM Relation
- Target Marketing
- Customer Acquisition
- Customer Retention
- Cross Selling/Up-selling
- Customer Segmentation
- Customer Profiling
- Customer Profitability Analysis
[Diagram labels: Winback; Retention, Churn; Prospects; Acquisition; Loyalty; Lifetime Profitability]
Business Drivers: Finance
- Credit Risk Management
- Fraud Detection
- Funds Management
- Optimal Pricing
- Optimal Rates
- Individualized Risk
Data Mining
Data Mining is the process of selecting, exploring and modeling large amounts of data to uncover previously unknown patterns for business advantage.
The Integrated SAS Solution
[Diagram labels: Business Question; Identify Problem; Data Warehouse; DBMS; Measure Results; Transform Data into Information; EIS, Business Reporting, OLAP; Act on Information; Data Mining Processing]
The SAS Institute Data Mining Solution
[Diagram labels: Enterprise Miner; SAS Institute Data Mining Project Methodology; Business Solutions; SAS Institute and Partners]
Enterprise Miner(TM)
Characteristics:
- Integration of data warehousing and data mining
- Co-operation of business, IT and data miners
- Covers all stages of the DM process
- Process flow orientation
- Enables companies to gain competitive advantage
Process Flow Diagrams
Data Mining for Rate Making
- The Insurance Industry in Change
- The Business Problem
- Examples of Data Mining Approaches to the Business Problem
- Rate Making Using Pure Premium
- Rate Making Using Loss Ratio
- Refining an Existing Rate Structure
The Insurance Industry in Change
- Globalization
  - no safe home markets any more
- Deregulation
  - many more players, new rules
- New channels, new modes of business
  - retailers, e-insurance
- Result: much higher competition and increased focus on customers, attrition/loyalty, prices/rates, direct marketing/profiling
The Business Problem
- Rate making: competitive rates, increased profitability
  - need to use correct pricing methodology based on correct levels of risk exposure, and
  - need to move away from rigid rates and provide a flexible price structure
- Implications for customer relations
  - attract good clients, discourage bad ones
  - client retention, cross-selling other products
Insurance: Basic Terminology
- Mean loss severity
  - (sum of all losses) / (number of claims)
- Pure premium
  - amount that covers losses from a policy
  - estimate as (loss frequency) * (mean severity)
- Loss ratio
  - percentage of revenue paid out to claimants
- Key quantities needed for modeling
  - expected number of claims per year
  - expected costs per claim
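The terminology above can be illustrated with a small worked example. This is a minimal sketch; the function names and all portfolio figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mean_severity(total_losses, n_claims):
    """Mean loss severity: (sum of all losses) / (number of claims)."""
    return total_losses / n_claims


def loss_frequency(n_claims, policy_years):
    """Expected number of claims per policy per year."""
    return n_claims / policy_years


def pure_premium(frequency, severity):
    """Amount that covers the expected losses from one policy."""
    return frequency * severity


def loss_ratio(total_losses, total_premium):
    """Share of premium revenue paid out to claimants."""
    return total_losses / total_premium


# Hypothetical portfolio: 1000 policy-years, 120 claims,
# 300000 in losses, 400000 in premiums collected.
freq = loss_frequency(120, 1000)
sev = mean_severity(300000.0, 120)
pp = pure_premium(freq, sev)
lr = loss_ratio(300000.0, 400000.0)
print(f"frequency={freq:.3f} severity={sev:.2f} "
      f"pure premium={pp:.2f} loss ratio={lr:.0%}")
```

With these assumed figures, the pure premium equals total losses divided by policy-years, which is a quick sanity check on the frequency-times-severity estimate.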
Classical Approach: Regression and Its Variants
- Step 1: Define risk variables & categories
  - typically done by actuaries
  - typically only few variables & values
- Step 2: Use regression to determine
  - if variables & values are risk related
  - estimates for parameters needed for modeling
- Step 3: Drop, combine variables & values
  - may involve marketing considerations
  - then iterate analysis until no more changes
- Step 4: Use model to compute rate
Example: Automobile Insurance
- Start with age, car_age, car_power, value, region
- Define intervals for each variable
  - based on experience and practice, not just data
- Cross product of intervals defines risks
  - now each customer falls into a risk class
- Use statistics to build a model for
  - accident likelihood
  - mean cost per claim
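The idea that the cross product of intervals defines the risk classes can be sketched as follows. The interval boundaries are hypothetical (the slides do not give them), and only three of the five variables are used to keep the example short.

```python
from bisect import bisect_right
from itertools import product

# Hypothetical interval boundaries per rating variable (assumption).
BREAKS = {
    "age": [25, 35, 50, 65],   # 5 intervals
    "car_age": [3, 8],         # 3 intervals
    "car_power": [4, 7],       # 3 intervals
}


def interval_index(value, breaks):
    """Index of the interval that `value` falls into."""
    return bisect_right(breaks, value)


def risk_class(customer):
    """Map a customer to one cell of the cross product of intervals."""
    return tuple(interval_index(customer[name], breaks)
                 for name, breaks in BREAKS.items())


# Every possible risk class is one combination of interval indices.
ALL_CLASSES = list(product(*(range(len(b) + 1) for b in BREAKS.values())))

example = {"age": 30, "car_age": 10, "car_power": 5}
print(len(ALL_CLASSES), risk_class(example))
```

Even with these few variables the number of classes grows multiplicatively, which is one reason the classical approach keeps the number of variables and values small.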
Claim frequency: Model results

Source        Deviance  NDF  DDF         F     PrF  ChiSquare   PrChi
1 INTERCEPT   357.2690    0   76         .       .          .       .
2 TARIF_ID    341.5579    1   76   11.9241  0.0009    11.9241  0.0006
3 AGE_ID      131.1380    1   76  159.7009  0.0001   159.7009  0.0001
4 VALUE_ID    128.5264    2   76    0.9911  0.3759     1.9821  0.3712
5 POWER_ID    124.4280    2   76    1.5553  0.2178     3.1106  0.2111
6 CAR_AGES    100.1366    4   76    4.6091  0.0022    18.4362  0.0010
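The F and ChiSquare columns in the table above appear consistent with a sequential analysis of deviance in which each drop in deviance is divided by a scale estimate taken as (final deviance) / DDF. That reading is an inference from the numbers, not something stated on the slide. A minimal sketch that recomputes the two columns under this assumption:

```python
# (source, deviance, numerator degrees of freedom), copied from the table.
ROWS = [
    ("INTERCEPT", 357.2690, 0),
    ("TARIF_ID", 341.5579, 1),
    ("AGE_ID", 131.1380, 1),
    ("VALUE_ID", 128.5264, 2),
    ("POWER_ID", 124.4280, 2),
    ("CAR_AGES", 100.1366, 4),
]
DDF = 76


def sequential_tests(rows, ddf):
    """Return {source: (F, chi_square)} for each term added in sequence."""
    scale = rows[-1][1] / ddf          # assumed scale: final deviance / DDF
    results = {}
    for (_, prev_dev, _), (name, dev, ndf) in zip(rows, rows[1:]):
        chi_square = (prev_dev - dev) / scale
        results[name] = (chi_square / ndf, chi_square)
    return results


TESTS = sequential_tests(ROWS, DDF)
for name, (f_stat, chi_square) in TESTS.items():
    print(f"{name:10s} F={f_stat:9.4f} ChiSquare={chi_square:9.4f}")
```

The p-values (PrF, PrChi) are not recomputed here because they need F and chi-square distribution functions that the Python standard library does not provide. On this reading, AGE_ID accounts for by far the largest drop in deviance, while VALUE_ID and POWER_ID add little once the earlier terms are in the model.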
Rate Making Using Data Mining: Overview
- Step 1: Collect ALL relevant data
  - store in single table
  - check & clean data, e.g. false & missing values
- Step 2: Explore data
  - determine relationships & trends
  - transform values and define derived variables
- Step 3: Compute several models
  - decision tree, regression, neural nets
- Step 4: Assess results
  - from technical and business perspective
Rate Making: Data Mining vs. Traditional Approach
- Data Mining
  - uses many more initial variables
  - the data determine which variables to use
  - computes intervals & category groups
  - quickly builds & assesses alternative models
  - tailors model to data
  - finds niches
  - does not use cross product as risk classes
  - can mimic much of traditional approach -->
Example: Claim Frequency Prediction with Data Mining
- When starting with same variables: region, age, value, power, car_age
- Decision Tree analysis finds these rules
  - IF Age < 32.5 THEN ACCIDs: 70.9%, No_ACCIDs: 29.1%
  - IF Car category based on power < 7.5 AND 32.5 <= Age < 35.5 THEN ACCIDs: 40.4%, No_ACCIDs: 59.6%
  - IF Age of the car < 11.5 AND 35.5 <= Age < 50.5 THEN ACCIDs: 16.9%, No_ACCIDs: 83.1%
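The three rules listed above can be written down directly as a lookup function. This is only a sketch: the function and parameter names are hypothetical, and customers who match none of the three quoted rules get `None`, because the slide does not show the rest of the tree.

```python
def accident_share(age, power_category, car_age):
    """Share of policies with accidents, per the three quoted tree rules.

    Returns None when no quoted rule applies (the full tree is not shown).
    """
    if age < 32.5:
        return 0.709
    if 32.5 <= age < 35.5 and power_category < 7.5:
        return 0.404
    if 35.5 <= age < 50.5 and car_age < 11.5:
        return 0.169
    return None


print(accident_share(age=28, power_category=9, car_age=2))
print(accident_share(age=34, power_category=5, car_age=2))
print(accident_share(age=45, power_category=5, car_age=4))
print(accident_share(age=60, power_category=5, car_age=4))
```

Note that the rules use different variables in different age bands, so the resulting risk groups are not a cross product of fixed intervals.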
Example: Claim Frequency Prediction with Data Mining
- Results: Similarities to traditional analysis
  - rule set for frequency involves region, car age
  - but not value & power alone
  - finds interaction of value and power
- Differences from traditional analysis
  - rules define very varied risk groups
  - rate structure is very different
  - includes small niche groups
  - advantage: finds high/low risk, profitable cases
Getting More Detailed: Specific Data Mining Approaches
- Approaches
  1. identify winning & losing segments
  2. generate new rating structure:
     - by focusing on loss ratio
     - by focusing on pure premiums
  3. refine existing rating structure
- Additionally
  - find most highly predicting variables
  - find best splits and optimal groupings
  - use data mining results as input for statistics
Method 1: Identify Winning & Losing Segments
- Method
  - build predictive model for loss ratio, and/or losses, and/or profitability
  - identify segments of interest, interactively or automatically
- Actions
  - expand good segments (ads, lower rates)
  - reduce bad segments (rate increases, change underwriting policies)
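Once a model has assigned each policy to a segment, picking out winning and losing segments comes down to aggregating by segment. The sketch below assumes the segment labels already exist; the sample policies and the two thresholds are hypothetical.

```python
from collections import defaultdict


def segment_loss_ratios(policies):
    """policies: iterable of (segment, premium, losses) tuples."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for segment, premium, losses in policies:
        totals[segment][0] += premium
        totals[segment][1] += losses
    return {seg: losses / premium for seg, (premium, losses) in totals.items()}


def classify(ratios, winning_below=0.6, losing_above=0.9):
    """Label segments; both thresholds are illustrative assumptions."""
    labels = {}
    for segment, ratio in ratios.items():
        if ratio < winning_below:
            labels[segment] = "winning"
        elif ratio > losing_above:
            labels[segment] = "losing"
        else:
            labels[segment] = "neutral"
    return labels


# Hypothetical policies: (segment, premium, losses).
POLICIES = [
    ("A", 1000.0, 300.0), ("A", 1000.0, 500.0),
    ("B", 1000.0, 1200.0), ("B", 1000.0, 800.0),
    ("C", 1000.0, 700.0),
]
LABELS = classify(segment_loss_ratios(POLICIES))
print(LABELS)
```

Segments labelled "winning" are candidates for expansion, and those labelled "losing" for rate increases or changed underwriting, as listed under Actions.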
Method 2: Reduce Variance of Loss Ratio
- Build predictive model for loss ratio
  - percentage of premiums paid out to claimants
- Consider resulting segments with a high variance in loss ratio
  - mixed behavior, some subgroups much better
  - good customers paying for bad ones
- Interactively partition segment
- Discard or increase rate for poor subsegments
- Results in entirely new rating structure
- Can only be done occasionally
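Finding the segments with a high variance in loss ratio can be sketched with the standard library alone. The per-policy loss ratios and the cut-off are hypothetical, and the population standard deviation stands in for whatever spread measure the modeling tool reports.

```python
from statistics import mean, pstdev


def high_variance_segments(loss_ratios_by_segment, max_sd=0.5):
    """Return segments whose per-policy loss ratios spread more than max_sd.

    max_sd is an illustrative threshold, not a value from the slides.
    """
    flagged = {}
    for segment, ratios in loss_ratios_by_segment.items():
        spread = pstdev(ratios)
        if spread > max_sd:
            flagged[segment] = (mean(ratios), spread)
    return flagged


# Hypothetical per-policy loss ratios in two segments.
SEGMENTS = {
    "X": [0.1, 0.2, 1.9, 2.0],    # mixed behavior: good customers pay for bad ones
    "Y": [0.6, 0.7, 0.65],        # homogeneous segment
}
FLAGGED = high_variance_segments(SEGMENTS)
print(FLAGGED)
```

A flagged segment such as "X" is a candidate for interactive partitioning: its average hides one subgroup with very low and one with very high loss ratios.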
Method 3: Refine Rating Structure
- Goal: subdivide existing rating classes for improved profitability
- Method:
  - label clients by unique rating class identifier
  - do supervised classification on risk identifier
  - identify segments to be refined
  - do loss ratio reduction method on segments
  - inspect resulting segments: raise rates on high loss ratio, then, if feasible, lower rates on others
- Assess, iterate if necessary
- Result: modification of existing rate structure
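Subdividing one existing rating class can be illustrated with a single variance-reducing split, the basic building block of a regression tree. This is a generic sketch, not the algorithm Enterprise Miner uses; the data and names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold on x that minimizes the within-group SSE of y.

    Returns (threshold, cost), or None if all x values are equal.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical clients of one rating class: car age and loss ratio.
CAR_AGE = [1, 2, 3, 9, 10, 12]
LOSS_RATIO = [0.9, 1.0, 1.1, 0.3, 0.4, 0.35]
SPLIT = best_split(CAR_AGE, LOSS_RATIO)
print(SPLIT)
```

Here the split separates a high loss ratio subsegment (newer cars) from a low one, which is the situation where raising rates on one side and possibly lowering them on the other becomes an option.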
Traditional Multiplicative Rate Structure
[Figure labels: OLD CAR, NEW CAR; SLOW CAR, FAST CAR]
Rate Refinement Using Data Mining
[Figure labels: Car Value: Low, Medium, High; Profession: Director, Artist]
Summary: Rate Making with Data Mining
- Data Mining allows
  - refinement of existing rate structures
  - interactive exploration of rate groups
  - almost individualized rate setting:
    - let the data define the rating categories
  - building alternative models quickly
  - complexity reduction:
    - optimal groupings for variables
  - finding profitability and loss niches
Conclusion: Beyond Rate Making
- Data Mining focuses on customers
  - identifies customer groups & niches
  - focus is on understanding rate structure, business implications, underwriting rules
- Results may extend beyond rate making
  - cross-selling (other insurance, investments)
  - improve customer retention
  - may consider lifetime value of customer
References: Zurich Insurance
- Application description: neural network (EM) application to detect significant attributes of policyholders and determine risk adequacy as well as lifetime profitability & future cash flows.
- Need to be able to do customer-based pricing to target profitable segments of the insurance market.
- Some key directions: -->
Zurich Insurance (see Lechner, SEUGI 98)
- from a product orientation to a provider of solutions for clearly defined customer groups
- uses neural networks to detect policyholders' attributes that have a significant effect on the probability of higher losses or more claims
- determine each customer's profitability. The goal is to bundle the marketing efforts and to improve customer-oriented pricing by taking into account the probability of future developments.
References: Old Mutual
- Understand customers - need to look at data across several dimensions: the customers themselves, the products they hold or do not hold, and the sales channel
- often want negative as well as positive criteria, such as "which customers who meet these characteristics do not have either product A or product B"
- more specifics -->
Old Mutual (SEUGI 98)
- need historic data on the customer [...]
- customer information by household
- consistent addresses, coherent product view
- customer attrition
  - which customers to make offers, which to let go
- patterns for repeat purchases
- lifetime value of customers
References on Rates & Prices Using Enterprise Miner
- GIE AXA (F)
  - pricing, targeting, segmentation
- Allianz Subalpina (I)
  - motor rate analysis
- Societa Reale Mutua di Assicurazioni (I)
  - correct pricing
References on Enterprise Miner in Insurance
- Allianz (AU):
  - Rate Making, Customer Segmentation, Churn
  - mainly doing Rate Making, but also Customer Segmentation, Database Marketing, Churn Prediction
- GIE MMA SI (F)
  - Application description: using EM for strategic marketing decisions: loyalty analysis, retention analysis, profiling, targeting.
References on Enterprise Miner in Insurance
- AXA Colonia (D)
- Winterthur (CH)
  - Marketing, Behavioral Analysis
- AGIS (CH)
  - Database Marketing
  - EM-based CRM & DWH
- Victoria AG (D)
  - churn & cancellation analysis
- Finax (DK)
  - credit scoring
References on Enterprise Miner in Insurance
- GIE MMA SI (F)
  - loyalty & retention analysis, profiling
- SACCEF (F)
  - credit scoring
- ARCA VITA (I)
  - database marketing
- ACHMEA HOLDING NV (NL)
  - direct mail, cross-selling, profiling, LTV
- London & Edinburgh Insurance Group (GB)
  - credit risk analysis
- Eagle Star (GB)
Competition
- Data Mining Vendors
  - IBM: consulting package, generic tool, virtual product
- Software/Consulting Companies
  - TRICAST
  - LARSEN & Partners
Opportunity
- SAS has tools
  - for traditional solutions
  - for data mining solutions
- Has good references
- Not very much competition
- Customer is under market pressure
  --> good sales opportunity
- Customer needs to develop own approach
Applying Data Mining for Rate Making: Some Examples
- Identify niches of profits & losses
- Refining rate structure
- Interactively modify risk structure, e.g. minimize loss ratio (Demo)
Identifying Niches: (1) Find Rate Structure
Identifying Niches: (2) Analyze Bottom Line

GROUP      N   Premium - Pure Premium
8        931                 -558.472
9       1065                 6505.428
10      1237                 768.6775
11       546                 3445.336
12       104                  14644.7
13        47                 6257.001
14        73                 18199.08
15        16                  27253.4
Demo: Refining an Existing Rate Structure
Refining an existing rate structure
Refining an Existing Rate Structure: Segment 3

OLD: Node 3
  IF 17.5 <= Car value
  THEN NODE: 3, AVE: 23202.4, SD: 9753.32

NEW
1. IF 17.5 <= Car value
   AND IF Tariff zone IS ONE OF: 8 9 10
   THEN NODE: 3, AVE: 30293.9, SD: 10478
2. IF 17.5 <= Car value
   AND IF Age of the car < 8.5 AND Tariff zone IS ONE OF: 2 3 4 5 6 7
   THEN NODE: 4, AVE: 22460.2, SD: 6779.28
3. IF 17.5 <= Car value
   AND IF 8.5 <= Age of the car AND Tariff zone IS ONE OF: 2 3 4 5 6 7
   THEN NODE: 5, AVE: 15637, SD: 6464
Refining an Existing Rate Structure: Segment 6

OLD: Node 6
  IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
  THEN NODE: 6, N: 213, AVE: 6352.95, SD: 3473.55

NEW
1. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF Car value < 4.5 AND 6.5 <= Age of the car
   THEN NODE: 6, N: 50, AVE: 5807.4, SD: 3243.92
Refining an Existing Rate Structure: Segment 6

2. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF Age < 36 AND Tariff zone IS ONE OF: 1 2 3 4 AND Age of the car < 6.5
   THEN NODE: 8, N: 14, AVE: 8372.93, SD: 3081.97
3. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF 36 <= Age AND Tariff zone IS ONE OF: 1 2 3 4 AND Age of the car < 6.5
   THEN NODE: 9, N: 51, AVE: 4440.06, SD: 1458.21
4. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF Age < 40.5 AND Tariff zone IS ONE OF: 5 6 7 8 9 10 AND Age of the car < 6.5
   THEN NODE: 10, N: 15, AVE: 8539.47, SD: 3664.78
Refining an Existing Rate Structure: Segment 6

5. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF 40.5 <= Age AND Tariff zone IS ONE OF: 5 6 7 8 9 10 AND Age of the car < 6.5
   THEN NODE: 11, N: 35, AVE: 6562.66, SD: 3375.42
6. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF Tariff zone IS ONE OF: 1 2 3 4 AND 4.5 <= Car value AND 6.5 <= Age of the car
   THEN NODE: 12, N: 28, AVE: 6715, SD: 4768.79
7. IF Age of the car < 8.5 AND Car category based on power < 6.5 AND Car value < 17.5
   AND IF Tariff zone IS ONE OF: 5 6 7 8 9 10 AND 4.5 <= Car value AND 6.5 <= Age of the car
   THEN NODE: 13, N: 20, AVE: 8667, SD: 2477.46
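The seven refined nodes of Segment 6 can be checked against the old node using only the N, AVE and SD values quoted on the slides. The sketch below verifies that the sub-segments add up to the original 213 policies and reproduce its average, then compares the pooled within-node spread with the old SD. It treats the quoted SDs as population standard deviations, which is an assumption.

```python
from math import sqrt

OLD_NODE = {"n": 213, "ave": 6352.95, "sd": 3473.55}

# (node, N, AVE, SD) for the seven refined sub-segments of Segment 6.
NEW_NODES = [
    (6, 50, 5807.4, 3243.92),
    (8, 14, 8372.93, 3081.97),
    (9, 51, 4440.06, 1458.21),
    (10, 15, 8539.47, 3664.78),
    (11, 35, 6562.66, 3375.42),
    (12, 28, 6715.0, 4768.79),
    (13, 20, 8667.0, 2477.46),
]

total_n = sum(n for _, n, _, _ in NEW_NODES)
weighted_ave = sum(n * ave for _, n, ave, _ in NEW_NODES) / total_n
# Pooled within-node SD, treating each quoted SD as a population SD (assumption).
pooled_sd = sqrt(sum(n * sd ** 2 for _, n, _, sd in NEW_NODES) / total_n)

print(f"N: {total_n} (old {OLD_NODE['n']})")
print(f"weighted AVE: {weighted_ave:.2f} (old {OLD_NODE['ave']})")
print(f"pooled within-node SD: {pooled_sd:.2f} (old SD {OLD_NODE['sd']})")
```

The drop in pooled SD relative to the old node is what the refinement buys: the sub-segments are more homogeneous than the original rating class, with node 9 standing out as both cheap and tightly clustered.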
Epilogue
- Many more specialized rules with data mining
  - rules almost impossible to find with classical analysis
  - creative tool in hands of creative analyst
  - validate hypotheses statistically
- More approaches than shown: interactive modifications of premiums, etc.