A Basic Guide to Modeling Techniques for All Direct Marketing Challenges


Allison Cornia, Database Marketing Manager, Microsoft Corporation
C. Olivia Rud, Executive Vice President, Data Square, LLC

Overview
- Types and Uses of Models
  - Descriptive: Segmentation, Profiling
  - Predictive: Regression, Trees, Neural Networks, Genetic Algorithms
  - Association Rules
  - Latent Class Variable
- Implementing Models
- Why Good Models Fail
  - Scoring errors
  - Backend failures

Segmentation Analysis
- Segmentation analysis groups customers with like characteristics.
- Can be market driven: the analyst determines the segments.
- Can be data driven: the data determine the segments (clustering).

Age      MALE                  FEMALE
         M    S    D    W      M    S    D    W
< 40     55   58   54   45     35   52   46   31
40-49    57   60   58   55     40   54   49   37
50-59    61   63   60   58     46   55   51   42
60+      58   60   61   55     44   46   36   27

Profiling: Credit Card Customers
Segments are placed on a grid of risk (low to high) against revenue (low to high).

Segment             Risk  Revenue  Avg Balance  Avg APR  Avg Tenure  Avg Charge-off  Avg Profits
Low Potential       High  Low      $1,089       12.3%    2.4 years   $111            $8
Good Potential      Low   Low      $549         8.4%     1.2 years   $29             $33
Cautious Potential  High  High     $5,315       15.8%    2.8 years   $584            $239
Best Customer       Low   High     $3,288       13.7%    3.7 years   $102            $440

Profiling: Credit Card Customers
- Low Potential (high risk, low revenue): charge annual fee; increase APR; low-priority service; no solicitations
- Good Potential (low risk, low revenue): good customer service; decrease APR; offer balance transfers; offer cardholder benefits
- Cautious Potential (high risk, high revenue): charge annual fee; increase APR; monitor payment behavior; offer secured loan
- Best Customer (low risk, high revenue): best customer service; no annual fee; automatic line increase; offer cardholder benefits

Association Rules
- Rules derived from past behavior, such as movement on a Website or purchase groupings.
- Used to enhance Website structure and modify Web traffic.
- Used to make real-time targeted offers.
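A rule derived from purchase groupings is usually judged by how often the items occur together and how reliably one item implies the other. The sketch below is only an illustration: the baskets are hypothetical, and the `support` and `confidence` helpers use the standard textbook definitions, which the slides do not spell out.

```python
# Hypothetical purchase baskets, invented purely for illustration.
baskets = [
    {"printer", "paper"},
    {"printer", "ink", "paper"},
    {"paper"},
    {"printer", "ink"},
    {"ink", "paper"},
]

def support(itemset, baskets):
    """Share of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Share of baskets containing the antecedent that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# Candidate rule: "customers who buy a printer also buy ink".
rule_support = support({"printer", "ink"}, baskets)
rule_confidence = confidence({"printer"}, {"ink"}, baskets)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule with high confidence and adequate support is a candidate for a real-time targeted offer.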

Linear Regression
- Uses continuous values to predict a continuous value.
- Explains variation in the data using ordinary least squares (OLS).
- Useful in predicting:
  - amount of sale ~ advertising, cost, demographics
  - charge-off dollars ~ balance, financial risk profile, demographics
  - amount of claim ~ age, health risk profile, geography
  - dollar balance ~ financial risk profile, action to account, market pressure
  - average profitability ~ financial risk profile, price sensitivity, demographics

Simple Linear Regression

Advertising  Sales
$120         $1,503
$160         $1,755
$205         $2,971
$210         $1,682
$225         $3,497
$230         $1,998
$290         $4,528
$315         $2,937
$375         $3,622
$390         $4,402
$440         $3,844
$475         $4,470
$490         $5,492
$550         $4,398

[Scatter plot: Sales ($0 to $6K) against Advertising ($0 to $600)]

Simple Linear Regression
- Goal: characterize the relationship between advertising and sales.
- Result: an equation that predicts sales dollars based on advertising dollars spent.
- Method: minimize squared error.

\text{Sales} = B_0 + B_1 \cdot \text{Advertising}

[Plot: fitted line through the scatter of Sales ($0 to $6K) against Advertising ($0 to $600)]
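As a minimal sketch of the "minimize squared error" step, the closed-form least-squares estimates of B0 and B1 can be computed directly from the fourteen advertising/sales pairs listed on the preceding slide. The function name `fit_simple_ols` is hypothetical; the slides do not say which software produced the fit.

```python
# Advertising and sales pairs as listed on the "Simple Linear Regression" slide.
advertising = [120, 160, 205, 210, 225, 230, 290, 315, 375, 390, 440, 475, 490, 550]
sales = [1503, 1755, 2971, 1682, 3497, 1998, 4528, 2937, 3622, 4402, 3844, 4470, 5492, 4398]

def fit_simple_ols(x, y):
    """Return (b0, b1) minimizing the sum of squared errors of y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # The fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = fit_simple_ols(advertising, sales)
residuals = [y - (b0 + b1 * x) for x, y in zip(advertising, sales)]
print(f"Sales = {b0:.2f} + {b1:.4f} * Advertising")
```

With an intercept in the model, the residuals sum to zero, which is a quick sanity check on any OLS fit.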

Multiple Linear Regression
- Minimizes squared error in N-dimensional space.
- Example: credit card balances ~ payment amount, years, gender (0/1)

\text{Balances} = 2.1774 + 0.0966\,\text{Payment} + 1.2494\,\text{Months} + 0.4412\,\text{Gender}

Logistic Regression
- Uses continuous values to predict the probability of a discrete outcome.
- Fit iteratively by the method of maximum likelihood.
- Useful in predicting the probability of:
  - response to loan offer ~ financial risk profile, demographics
  - response to insurance offer ~ health risk profile, demographics
  - activation ~ financial risk profile, demographics, market pressure
  - charge-off ~ balance, financial risk profile, demographics
  - claim ~ health risk profile, demographics
  - fraud ~ financial risk profile, account activity
  - account closure ~ account activity, market pressure

Logistic Regression
- Predicts the probability of an event occurring using a function of linear predictors.
- p = probability of the event occurring.
- p/(1-p) is the odds of the event occurring.
- Log of the odds: log(p/(1-p)) is a linear function of the predictors.
- Uses an s-shaped curve (between 0 and 1) instead of a linear function to fit the data.

\log\frac{p}{1-p} = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n

p = \frac{1}{1 + e^{-(B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n)}}
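The two equations are inverses of each other: the linear predictor gives the log of the odds, and the s-shaped curve maps it back to a probability. A short sketch, with hypothetical coefficients and predictor values chosen only for illustration:

```python
import math

def log_odds(intercept, coefficients, predictors):
    """Linear predictor: B0 + B1*X1 + ... + Bn*Xn."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))

def probability(intercept, coefficients, predictors):
    """Map the log-odds to a probability with the logistic (s-shaped) function."""
    z = log_odds(intercept, coefficients, predictors)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values, not taken from the slides.
b0 = -3.0
b = [0.8, -0.5]
x = [2.0, 1.0]

p = probability(b0, b, x)
odds = p / (1.0 - p)
print(f"p={p:.4f}, odds={odds:.4f}, log-odds={math.log(odds):.4f}")
```

Taking the log of the odds recovers the linear predictor, which is why the coefficients are interpreted on the log-odds scale.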

Classification Trees

Mailed: 10,000, response rate 2.6%
  Male: 4,677, response rate 3.2%
    <$30K: 1,290, response rate 1.7%
    $30K-$45K: 2,106, response rate 3.6%
    >$45K: 1,281, response rate 4.1%
  Female: 5,323, response rate 2.1%
    Age >= 40: 2,211, response rate 4.3%
    Age < 40: 3,112, response rate 0.7%
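One way to use a tree like this is to rank its terminal nodes by response rate and mail the best ones first. The sketch below assumes the three dollar-range nodes sit under Male and the two age nodes under Female, which is the only grouping whose counts add up to the parent nodes; the variable names are illustrative.

```python
# Terminal nodes of the tree: (segment label, pieces mailed, response rate in percent).
leaves = [
    ("Male, <$30K", 1290, 1.7),
    ("Male, $30K-$45K", 2106, 3.6),
    ("Male, >$45K", 1281, 4.1),
    ("Female, Age >= 40", 2211, 4.3),
    ("Female, Age < 40", 3112, 0.7),
]

# The terminal nodes should account for every piece mailed.
total_mailed = sum(count for _, count, _ in leaves)

# Rank segments from highest to lowest response rate.
ranked = sorted(leaves, key=lambda leaf: leaf[2], reverse=True)

for label, count, rate in ranked:
    expected_responders = count * rate / 100.0
    print(f"{label:<20} mailed={count:>5} rate={rate:.1f}% responders~{expected_responders:.0f}")
```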

Decision Trees

Decision node: Issue Loan?
  Yes (chance node), profit:
    97% x $728 (Interest) = $706
    3% x $4,872 (Loss) = ($146)
  No: $0

Allows you to quantify the best action.
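The arithmetic behind the chance node is an expected value: each payoff is weighted by its probability and the results are summed, then compared across the branches of the decision node. A small sketch using the slide's figures (the function and variable names are illustrative):

```python
def expected_value(outcomes):
    """Probability-weighted payoff of a chance node; outcomes is a list of (probability, payoff)."""
    return sum(p * payoff for p, payoff in outcomes)

# Figures from the slide: 97% chance of $728 interest, 3% chance of a $4,872 loss.
issue_loan = [(0.97, 728.0), (0.03, -4872.0)]
no_loan = [(1.0, 0.0)]

ev_issue = expected_value(issue_loan)   # 706.16 - 146.16 = 560.00
ev_no_loan = expected_value(no_loan)

best_action = "Issue Loan" if ev_issue > ev_no_loan else "No Loan"
print(f"Issue: ${ev_issue:,.2f}  Decline: ${ev_no_loan:,.2f}  Best action: {best_action}")
```

Under these figures, issuing the loan has an expected profit of about $560 against $0 for declining.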

Neural Networks
- Useful in predicting:
  - amount of sale ~ advertising, cost, demographics
  - charge-off dollars ~ balance, financial risk profile, demographics
  - amount of claim ~ age, health risk profile, geography
  - dollar balance ~ financial risk profile, action to account, market pressure
  - average profitability ~ financial risk profile, price sensitivity, demographics

Artificial Neural Networks
- Layers: input layer, hidden layer, output layer.
- Multiple hidden nodes.
- Each node is a transformation of the outputs from the previous layer: a weighted (linear) combination, typically passed through an activation function.
- Structure is too complex to interpret weights.
- Stopping rules: error threshold; time limit; change in error.
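To make the layer structure concrete, here is a sketch of a single forward pass through one hidden layer. It assumes a sigmoid activation applied to each node's weighted sum; the activation choice, the weights, and the inputs are all hypothetical and are not taken from the slides.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Each node takes a weighted sum of the previous layer's outputs, then applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
inputs = [0.5, -1.2, 0.3]
hidden_weights = [[0.4, -0.6, 0.1], [-0.3, 0.8, 0.5]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.2, -0.7]]
output_biases = [-0.2]

hidden = layer(inputs, hidden_weights, hidden_biases)
output = layer(hidden, output_weights, output_biases)[0]
print(f"hidden={hidden}, output={output:.4f}")
```

Training adjusts the weights repeatedly until one of the stopping rules (error threshold, time limit, or change in error) is met.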

Artificial Neural Networks
- Advantages:
  - Handles non-linearity
  - Handles interactions
  - Considered very accurate
  - Useful for complex optimization
- Disadvantages:
  - Not interpretable
  - CPU intensive
  - Poor handling of missing data
  - Sensitive to input variable selection
  - Explodes categorical data
  - Risk of over-fitting, so not robust

Genetic Algorithms
- Based on Darwin's principle of survival of the fittest.
- Genetic operators:
  - Reproduction (copying)
  - Mating (crossover)
  - Mutation (altering)
- Process starts with an initial population of random models.
- Models with poor performance (fitness) die out, that is, they are deleted.

Genetic Algorithms Methodology
Fitness of the new population improves by:
1. Copying good models.
2. Mating good models to create better offspring models with improved fitness.
3. Altering good models to create mutants with improved fitness.
4. Repeating steps 1-3 until stopping rules are met.
The best-evolved model is the solution.

Genetic Algorithms
Models are composed of:
- Functions
  - arithmetic (+, -, ×, ÷)
  - mathematical (log, exp, max, ...)
  - trigonometric (sin, cos, tan, arcsin, ...)
  - logical (and, or, not, gt, lt, eq, ...)
  - conditional (if-then-else)
- Variables
  - independent variables
  - numeric values (constants, random numbers)

GAs - Initialize Random Model
- Objective: predict response.
- Let the function set consist of +, -, ×, ÷, exp.
- Let the variable set consist of X1, X2, b.
[Figure: selection wheel with five equal 20% slices]

GAs - Initialize Random Model
- Models are displayed as trees.
- Repeat M times.
[Figure: example model tree with root "Response" built from +, -, exp, X1, X2, and b, beside a selection wheel with eight equal 12.5% slices]

GAs - Generate M Models
[Figure: two example model trees for Response]
- Y = b - exp(X1)
- Y = X1 X2

GAs - Compare Fitness

Model    Equation          Fitness (r-square)  PTF
Model 1  Y = X1(b + X2)    0.61                0.29
Model 2  Y = b - exp(X1)   0.55                0.26
Model 3  Y = X1 - X2       0.48                0.23
Model 4  Y = X1 X2         0.36                0.17
Model 5  Y = b + X1        0.12                0.06
Total                      2.12                1.00

[Pie chart of PTF: M1 29%, M2 26%, M3 23%, M4 17%, M5 6%]
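The PTF values in the table work out to each model's fitness divided by the total fitness (for example, 0.61 / 2.12 = 0.29). The sketch below recomputes them and then draws models in proportion to PTF, which is one common way to choose candidates for copying, mating, or altering. Reading PTF as "share of total fitness" and using `random.choices` for the draw are assumptions, not details given in the slides.

```python
import random

# Fitness (r-square) values from the "Compare Fitness" table.
fitness = {
    "Model 1": 0.61,
    "Model 2": 0.55,
    "Model 3": 0.48,
    "Model 4": 0.36,
    "Model 5": 0.12,
}

# PTF: each model's share of the total fitness.
total_fitness = sum(fitness.values())
ptf = {name: value / total_fitness for name, value in fitness.items()}

def select_model(ptf, rng):
    """Draw one model with probability equal to its PTF (roulette-wheel selection)."""
    names = list(ptf)
    weights = [ptf[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the draw is repeatable
selected = [select_model(ptf, rng) for _ in range(5)]

for name, share in ptf.items():
    print(f"{name}: PTF = {share:.2f}")
print("Selected for the next generation:", selected)
```

Fitter models are more likely to be drawn, but weak models still have a small chance, which keeps some diversity in the population.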

Genetic Methodology
Fitness improves by:
- Copying models based on PTF
- Mating models based on PTF
- Altering models based on PTF
- Continuing the above until stopping rules are met
The best-evolved model is the solution.

Latent Class Models
- Used more in academic circles
  - Software only allowed small data sets and a small number of variables
- LatentGOLD, developed by Statistical Innovations (Jay Magidson, developer of CHAID software)
  - Scalable software
  - Disparate sources of data

3 Kinds of Latent Class Models
- Traditional: applications in scaling and classification
- Factor: applications in exploratory and confirmatory factor analysis
- Regression: used for prediction and explanation when the population is not homogeneous

Traditional LCM vs. LC Factor
- Traditional Latent Class Models identify classes, which group together persons who share similar interests, values, characteristics, or behavior.
- Latent Class Factor Models identify factors, which group together variables sharing a common source of variation.

Implementing Models
- How do we select based on model results?
- What is the impact to the bottom line?

Gains Table

Decile  Accounts  Actives  Predicted  Actual  Cum Actual  Lift  Cum Lift
1       48,342    4,891    10.35%     10.12%  10.12%      3.57  3.57
2       48,343    3,945    8.44%      8.16%   9.14%       2.88  3.22
3       48,342    2,783    5.32%      5.76%   8.01%       2.03  2.83
4       48,342    1,151    2.16%      2.38%   6.60%       0.84  2.33
5       48,343    519      1.03%      1.07%   5.50%       0.38  1.94
6       48,342    269      0.48%      0.56%   4.67%       0.20  1.65
7       48,342    112      0.31%      0.23%   4.04%       0.08  1.43
8       48,343    25       0.06%      0.05%   3.54%       0.02  1.25
9       48,342    5        0.01%      0.01%   3.15%       0.00  1.11
10      48,342    1        0.00%      0.00%   2.83%       0.00  1.00
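The Actual, Cum Actual, Lift, and Cum Lift columns can all be derived from the account and active counts: lift is a decile's activation rate divided by the overall rate of 2.83%. The following sketch recomputes them; treating the third column as the count of active accounts is an assumption, though the counts reproduce the published rates.

```python
# Per-decile counts from the gains table (decile 1 is the best-scored group).
accounts = [48342, 48343, 48342, 48342, 48343, 48342, 48342, 48343, 48342, 48342]
actives = [4891, 3945, 2783, 1151, 519, 269, 112, 25, 5, 1]

# Overall activation rate across all deciles; this is the baseline for lift.
overall_rate = sum(actives) / sum(accounts)

rows = []
cum_accounts = 0
cum_actives = 0
for decile, (n_accounts, n_actives) in enumerate(zip(accounts, actives), start=1):
    cum_accounts += n_accounts
    cum_actives += n_actives
    actual_rate = n_actives / n_accounts
    cum_rate = cum_actives / cum_accounts
    rows.append({
        "decile": decile,
        "actual": actual_rate,
        "cum_actual": cum_rate,
        "lift": actual_rate / overall_rate,
        "cum_lift": cum_rate / overall_rate,
    })

for row in rows:
    print(f"{row['decile']:>2}  actual={row['actual']:.2%}  cum={row['cum_actual']:.2%}  "
          f"lift={row['lift']:.2f}  cum_lift={row['cum_lift']:.2f}")
```

A cumulative lift of 1.00 in the last decile is expected by construction, since mailing everyone yields exactly the overall rate.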

Gains Chart
[Chart: Percent Active (0% to 100%) against Percent Mailed (0% to 100%)]

Modeling Lifetime Value
- Predict the probability of activation for a life insurance offer using logistic regression, neural networks, or genetic algorithms.
- Use the probability to calculate Lifetime Value (LTV) for a life insurance prospect over a five-year period.

\text{LTV} = \Pr(\text{Activation}) \times \text{Risk} \times (\text{Product Profitability} + \text{Cross Sales}) \times \text{Lapse Indicator} - \text{Marketing Expense}

- Activation: probability given by a model
- Risk: indices in a matrix of gender, marital status, and age group
- Product Profitability: present value of a product-specific 5-year profit measure
- Cross Sales: additional net revenues for five years following activation
- Lapse Indicator: adjustment based on payment method
- Marketing Expense: cost of package, postage, and processing

Lifetime Value

\text{LTV} = \Pr(\text{Active}) \times \text{Risk Index} \times (\text{Cross Sell Profit} + \text{Product Profitability}) \times \text{Lapse Indicator} - \text{Marketing Expense}

Decile  Number  Active Rate  Cross Sell  Risk Index  Lapse Indicator  Product Profitability  Average LTV  Average Cum LTV  Sum Cum LTV
1       96,685  10.36%       $120        0.94        0.99             $553                   $64.76       $64.76           $6,261,266
2       96,685  8.63%        $104        0.99        1.00             $553                   $55.35       $60.06           $11,612,984
3       96,685  5.03%        $105        0.96        0.99             $553                   $30.99       $50.37           $14,609,591
4       96,685  1.94%        $107        0.93        0.97             $553                   $11.13       $40.56           $15,685,475
5       96,685  0.96%        $98         1.01        0.99             $553                   $5.53        $33.55           $16,220,346
6       96,685  0.28%        $101        1.02        1.00             $553                   $1.09        $28.14           $16,325,522
7       96,685  0.11%        $97         1.03        1.01             $553                   ($0.04)      $24.12           $16,321,311
8       96,685  0.08%        $98         0.99        1.01             $553                   ($0.26)      $21.07           $16,295,747
9       96,685  0.01%        $94         1.04        1.02             $553                   ($0.75)      $18.64           $16,223,586
10      96,685  0.00%        $95         1.09        1.02             $553                   ($0.78)      $16.70           $16,148,199

How many deciles do you mail?
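The sketch below applies the LTV formula to one decile and then looks for the mailing depth at which cumulative LTV peaks. The marketing expense of $0.78 per piece is an assumption inferred from decile 10, where the activation rate rounds to zero and the average LTV is ($0.78); the slides do not state the figure directly. Because the published inputs are rounded, recomputed values match the table only approximately.

```python
# Assumption: per-piece marketing expense inferred from decile 10 (rate ~0%, LTV of -$0.78).
MARKETING_EXPENSE = 0.78

def lifetime_value(p_active, risk_index, cross_sell, product_profit, lapse_indicator,
                   marketing_expense=MARKETING_EXPENSE):
    """LTV = Pr(Active) x Risk x (Cross Sell + Product Profitability) x Lapse - Marketing Expense."""
    return (p_active * risk_index * (cross_sell + product_profit) * lapse_indicator
            - marketing_expense)

# Decile 6 inputs from the table: 0.28% active, $101 cross sell, 1.02 risk, 1.00 lapse, $553 profit.
ltv_decile_6 = lifetime_value(0.0028, 1.02, 101.0, 553.0, 1.00)

# "Sum Cum LTV" column from the table, deciles 1 through 10.
sum_cum_ltv = [6261266, 11612984, 14609591, 15685475, 16220346,
               16325522, 16321311, 16295747, 16223586, 16148199]

# Mailing depth (number of deciles) at which total cumulative LTV is highest.
best_depth = max(range(len(sum_cum_ltv)), key=lambda i: sum_cum_ltv[i]) + 1

print(f"Decile 6 LTV ~ ${ltv_decile_6:.2f}")
print(f"Cumulative LTV peaks at decile {best_depth}: ${sum_cum_ltv[best_depth - 1]:,}")
```

On these figures, cumulative LTV stops growing after decile 6, because every deeper decile has a negative average LTV.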

Why Good Models Fail (Allison's Top Ten for Troubleshooting)
1. Check the phones; make sure the site is functioning properly
2. Track the mail
3. Listen in on the call center
4. Implementation issues
   - Programming errors
   - Inverted scoring
5. Did they pull the right group?

More of Why Good Models Fail
6. Practice crop rotation
7. External validity
8. Internal validity
9. Bad ingredients make for bad models
10. Old models, like old horses, have to be put out to pasture

All models are wrong, but some are useful. George Box

C. Olivia Rud, Executive Vice President
DataSquare, LLC
733 Summer St., Stamford, CT 06901
203 964-9733 x103
Olivia@datasquare.com
Specializing in Data Mining, Statistical Modeling and Marketing Strategy for Marketing, Risk and Customer Relationship Management

Allison Cornia, Database Marketing Manager
CRM/Home & Retail Division, Microsoft Corporation
One Microsoft Way, Redmond, WA 98052
425-882-8080
Allisonc@microsoft.com
