Targeted Marketing, KDD Cup and Customer Modeling





Outline
Direct Marketing
Review: Evaluation: Lift, Gains
KDD Cup 1997
Lift and Benefit estimation
Privacy and Data Mining

Direct Marketing Paradigm
Find most likely prospects to contact
Not everybody needs to be contacted
Number of targets is usually much smaller than number of prospects
Typical Applications
  retailers, catalogues, direct mail (and e-mail)
  customer acquisition, cross-sell, attrition, ...

Direct Marketing Evaluation
Accuracy on the entire dataset is not the right measure
Approach
  develop a target model
  score all prospects and rank them by decreasing score
  select top P% of prospects for action
Evaluate performance on top P% using Gains and Lift

CPH (Gains): Random List vs Model-ranked List
[Figure: Cumulative % Hits vs. Pct list, with curves for Random and Model]
5% of a random list contains 5% of the targets, but 5% of the model-ranked list contains 21% of the targets: CPH(5%, model) = 21%.

Lift Curve
Lift(P) = CPH(P) / P
Lift (at 5%) = 21% / 5% = 4.2, i.e. 4.2 times better than random
[Figure: Lift vs. P (percent of the list)]
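The gains and lift definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gains_and_lift` and the toy scores and labels are hypothetical, not part of the original slides.

```python
def gains_and_lift(scores, labels, fraction):
    """Return (CPH, Lift) for the top `fraction` of a model-ranked list.

    scores: model scores, higher means more likely to be a target
    labels: 1 for an actual target (hit), 0 otherwise
    """
    # Rank prospects by decreasing score, as in the evaluation approach.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    hits_in_top = sum(label for _, label in ranked[:cutoff])
    total_hits = sum(labels)
    cph = hits_in_top / total_hits          # cumulative fraction of hits
    lift = cph / (cutoff / len(ranked))     # Lift(P) = CPH(P) / P
    return cph, lift


# Toy data (hypothetical): 10 prospects, 4 of them targets.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
cph, lift = gains_and_lift(scores, labels, 0.2)
print(f"CPH(20%) = {cph:.0%}, Lift(20%) = {lift:.1f}")
```

In the toy data the top 20% of the ranked list holds two of the four targets, so the lift at 20% is well above the value of 1 that a random list would give.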

KDD-CUP 1997
Task: given data on past responders to fund-raising, predict the most likely responders for a new campaign
Population of 750K prospects
10K responded to a broad campaign mailing (1.4% response rate)
Analysis file included a stratified (non-random) sample of 10K responders and 26K non-responders (28.7% response rate)
  75% used for learning; 25% used for validation
  target variable removed from the validation data set

KDD-CUP 1997 Data Set
321 fields/variables with sanitized names and labels
  Demographic information
  Credit history
  Promotion history
Significant effort on data preprocessing
  leaker detection and removal

KDD-CUP Participant Statistics
45 companies/institutions participated
  23 research prototypes
  22 commercial tools
16 contestants turned in their results
  9 research prototypes
  7 commercial tools

KDD-CUP Algorithm Statistics
Of the 16 software/tools (Score as % of best)

Algorithm         # of Entries   Ave. Score
Rules             2              87
k-nn              1              85
Bayesian          3              83
Multiple/Hybrid   4              79
Other             2              68
Decision Tree     4              44

KDD Cup 97 Evaluation
Best Gains at 40%: Urban Science, BNB, MineSet
Best Gains at 10%: BNB, Urban Science, MineSet

KDD-CUP 1997 Awards
The GOLD MINER award is jointly shared by two contestants this year
  1) Charles Elkan, Ph.D. from University of California, San Diego with his software BNB, Boosted Naive Bayesian Classifier
  1) Urban Science Applications, Inc. with their software gain, Direct Marketing Selection System
The BRONZE MINER award went to the runner-up
  3) Silicon Graphics, Inc. with their software MineSet

KDD-CUP Results Discussion
Top finishers very close
Naïve Bayes algorithm was used by 2 of the top 3 contestants (BNB and MineSet)
BNB and MineSet did little data preprocessing
MineSet used a total of 6 variables in their final model
Urban Science implemented a tremendous amount of automated data preprocessing and exploratory data analysis and developed more than 50 models in an automated fashion to get to their results

KDD Cup 1997: Top 3 results
Top 3 finishers are very close

KDD Cup 1997 worst results
Note that the worst result (C6) was actually worse than random.
Competitor names were kept anonymous, apart from top 3 winners

Better Model Evaluation?
Comparing Gains at 10% and 40% is ad-hoc
Are there more principled methods?
  Area Under the Curve (AUC) of Gains Chart
  Lift Quality
Ultimately, financial measures: Campaign Benefits

Model Evaluation: AUC
Area Under the Curve (AUC) is defined as the difference between the Gains and Random curves
[Figure: Cum % Hits vs. Selection]

Model Evaluation: Lift Quality
$$\mathrm{LQ} = \frac{\mathrm{AUC}(\mathrm{Model}) - \mathrm{AUC}(\mathrm{Random})}{\mathrm{AUC}(\mathrm{Perfect}) - \mathrm{AUC}(\mathrm{Random})}$$
See Measuring Lift Quality in Database Marketing, Piatetsky-Shapiro and Steingold, SIGKDD Explorations, December 2000.

Lift Quality (Lquality)
For a perfect model, Lquality = 100%
For a random model, Lquality = 0
For KDD Cup 97,
  Lquality(Urban Science) = 43.3%
  Lquality(Elkan) = 42.7%
However, small differences in Lquality are not significant
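A minimal sketch of Lift Quality follows, assuming AUC is measured as the trapezoidal area under the cumulative gains curve, so that the random model has area 0.5. If AUC is instead read as the area between the gains curve and the random diagonal, the ratio comes out the same. All function names here are hypothetical, and this is not the exact procedure of the cited paper.

```python
def gains_curve(scores, labels):
    """Points (fraction of list, fraction of hits) of the cumulative gains curve."""
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    total, n = sum(ranked), len(ranked)
    xs, ys, hits = [0.0], [0.0], 0
    for i, label in enumerate(ranked, start=1):
        hits += label
        xs.append(i / n)
        ys.append(hits / total)
    return xs, ys


def area(xs, ys):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
               for i in range(1, len(xs)))


def lift_quality(scores, labels):
    """LQ = (AUC(Model) - AUC(Random)) / (AUC(Perfect) - AUC(Random))."""
    auc_model = area(*gains_curve(scores, labels))
    auc_random = 0.5  # the diagonal: P% of the list holds P% of the targets
    # A perfect model ranks every target ahead of every non-target.
    perfect_labels = sorted(labels, reverse=True)
    perfect_scores = list(range(len(labels), 0, -1))
    auc_perfect = area(*gains_curve(perfect_scores, perfect_labels))
    return (auc_model - auc_random) / (auc_perfect - auc_random)


# Toy check (hypothetical data): a perfect ranking and its exact reverse.
lq_perfect = lift_quality([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
lq_reversed = lift_quality([0.6, 0.7, 0.8, 0.9], [1, 1, 0, 0])
print(lq_perfect, lq_reversed)
```

Under these assumptions a perfect ranking scores 100%, a random ranking scores about 0 on average, and a ranking that is worse than random goes negative.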

Estimating Profit: Campaign Parameters
Direct Mail example
N: number of prospects, e.g. 750,000
T: fraction of targets, e.g. 0.014
B: benefit of hitting a target, e.g. $20
  Note: this is a simplification; actual benefit will vary
C: cost of contacting a prospect, e.g. $0.68
P: percentage selected for contact, e.g. 10%
Lift(P): model lift at P, e.g. 3

Contacting Top P of Model-Sorted List
Using previous example, let selection be P = 10% and Lift(P) = 3
Selection size = N * P, e.g. 75,000
Random has N * P * T targets in first P of the list, e.g. 1,050
Q: How many targets are in the model P-selection?
Model has more by a factor Lift(P), or N * P * T * Lift(P) targets in the selection, e.g. 3,150
Benefit of contacting the selection is N * P * T * Lift(P) * B, e.g. $63,000
Cost of contacting N * P prospects is N * P * C, e.g. $51,000

Profit of Contacting Top P
$$\mathrm{Profit}(P) = \mathrm{Benefit}(P) - \mathrm{Cost}(P) = N P T \,\mathrm{Lift}(P)\, B - N P C = N P \,\bigl(T \,\mathrm{Lift}(P)\, B - C\bigr)$$
e.g. $12,000
Q: When is Profit positive?
When T * Lift(P) * B > C, or
$$\mathrm{Lift}(P) > \frac{C}{T B}$$
e.g. 2.4
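The profit arithmetic can be checked with a short Python sketch. The function names are hypothetical; the parameter values are the ones from the direct mail example (N = 750,000, T = 0.014, B = $20, C = $0.68, P = 10%, Lift(P) = 3).

```python
def campaign_profit(n, t, b, c, p, lift):
    """Profit(P) = N*P*T*Lift(P)*B - N*P*C."""
    selected = n * p                    # size of the selection
    targets_hit = selected * t * lift   # targets expected in the selection
    return targets_hit * b - selected * c


def break_even_lift(t, b, c):
    """Smallest lift for which contacting the selection pays off: C / (T*B)."""
    return c / (t * b)


profit = campaign_profit(n=750_000, t=0.014, b=20.0, c=0.68, p=0.10, lift=3.0)
min_lift = break_even_lift(t=0.014, b=20.0, c=0.68)
print(f"profit = ${profit:,.0f}, break-even lift = {min_lift:.2f}")
```

With these inputs the profit comes to about $12,000 and the break-even lift to about 2.4, matching the figures on the slide.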

Finding Optimal Cutoff
Use the formula to estimate benefit for each P
Find optimal P
[Figure: Est Payoff vs. P (percent of the list)]

*Feasibility Assessment
Expected Profit(P) depends on
  known Cost C, Benefit B, Target Rate T
  and unknown Lift(P)
To compute Lift(P) we need to get all the data, load it, clean it, ask for correct data, build models, ...

*Can Expected Lift be Estimated only from N and T?
In theory, no; but in many practical applications, surprisingly, yes

*Empirical Observations about Lift
For good models, Lift(P) is usually monotonically decreasing with P
Lift at fixed P (e.g. 0.05) is usually higher for lower T
Special point P = T
  for a perfect predictor, all targets are in the first T of the list, for a maximum lift of 1/T
What can we expect compared to 1/T?

*Meta Analysis of Lift
26 attrition & cross-sell problems from finance and telecom domains
N ranges from 1,000 to 150,000
T ranges from 1% to 22%
No clear relation to N, but there is dependence on T

*Results: Lift(T) vs 1/T
Tried several linear and log-linear fits
Best model (R^2 = 0.86):
$$\log_{10}(\mathrm{Lift}(T)) = -0.05 + 0.52 \log_{10}(1/T)$$
Approximately:
$$\mathrm{Lift}(T) \sim T^{-0.5} = \sqrt{1/T}$$

*Actual Lift(T) vs sqrt(1/T) for All Problems
[Figure: Actual Lift(T) and Est. Lift(T) vs. 100*T%]
Error = Actual Lift - sqrt(1/T)
Avg(Error) = -0.08
St. Dev(Error) = 1.0

*GPS Lift(T) Rule of Thumb
For targeted marketing campaigns, where 0.01 < T < 0.25,
$$\mathrm{Lift}(T) = \sqrt{1/T} \pm 1$$
Exceptions for
  truly predictable or random behaviors
  poor models
  information leakers
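As a rough illustration, the rule of thumb and the log-linear fit reported earlier (log10(Lift(T)) = -0.05 + 0.52 log10(1/T)) can be compared for a given target rate. The sketch reads the rule as sqrt(1/T) plus or minus 1, an interpretation based on the reported error standard deviation of 1.0; the function names are hypothetical.

```python
import math


def lift_at_t_rule_of_thumb(t):
    """Rule of thumb: Lift(T) is roughly sqrt(1/T)."""
    return math.sqrt(1.0 / t)


def lift_at_t_regression(t):
    """Log-linear fit from the slides: log10(Lift(T)) = -0.05 + 0.52*log10(1/T)."""
    return 10 ** (-0.05 + 0.52 * math.log10(1.0 / t))


t = 0.014  # target rate from the direct mail example
estimate = lift_at_t_rule_of_thumb(t)
low, high = estimate - 1.0, estimate + 1.0   # assumed +/- 1 band
fitted = lift_at_t_regression(t)
print(f"rule of thumb: {estimate:.2f} (band {low:.2f} to {high:.2f}), fit: {fitted:.2f}")
```

For T = 0.014 the two estimates differ by well under 1, so the simpler square-root form is a reasonable stand-in for the fitted model at this target rate. Note that T = 0.014 sits near the lower end of the range the rule is stated for.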

*Estimating Entire Curve
Cumulative Percent Hits: CPH(P) = Lift(P) * P
CPH is easier to model than Lift
Several regressions for all CPH curves
Best results with regression
$$\log_{10}(\mathrm{CPH}(P)) = a + b \log_{10}(P)$$
Average R^2 = 0.97

*CPH Curve Estimate
Approximately CPH(P) ~ sqrt(P)
bounds:
$$P^{0.6} < \mathrm{CPH}(P) < P^{0.4}$$

*Lift Curve Estimate
Since Lift(P) = CPH(P) / P,
Lift(P) ~ 1/sqrt(P)
bounds:
$$(1/P)^{0.4} < \mathrm{Lift}(P) < (1/P)^{0.6}$$
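One way to use the estimated lift curve is to plug it into the profit formula Profit(P) = N P (T Lift(P) B - C) and scan for the most profitable cutoff before any model is built. The sketch below assumes Lift(P) = (1/P)^0.5 exactly, which is only the central estimate between the stated bounds, and uses the direct mail example parameters; the function names and the grid of 1,000 candidate cutoffs are hypothetical choices.

```python
def estimated_lift(p, exponent=0.5):
    """Estimated lift curve: Lift(P) ~ (1/P)**exponent, exponent between 0.4 and 0.6."""
    return (1.0 / p) ** exponent


def estimated_profit(p, n, t, b, c, exponent=0.5):
    """Profit(P) = N*P*(T*Lift(P)*B - C), using the estimated lift."""
    return n * p * (t * estimated_lift(p, exponent) * b - c)


def best_cutoff(n, t, b, c, exponent=0.5, steps=1000):
    """Scan P over a grid in (0, 1] and return the P with the highest estimated profit."""
    candidates = [i / steps for i in range(1, steps + 1)]
    return max(candidates, key=lambda p: estimated_profit(p, n, t, b, c, exponent))


p_star = best_cutoff(n=750_000, t=0.014, b=20.0, c=0.68)
profit_star = estimated_profit(p_star, n=750_000, t=0.014, b=20.0, c=0.68)
print(f"best cutoff ~ {p_star:.1%}, estimated profit ~ ${profit_star:,.0f}")
```

Rerunning with `exponent=0.4` and `exponent=0.6` gives a pessimistic and an optimistic scenario, which is a cheap feasibility check rather than a substitute for measuring lift on real data.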

*More on Estimating Lift and Profitability
G. Piatetsky-Shapiro, B. Masand, Estimating Campaign Benefits and Modeling Lift, Proc. KDD-99, ACM.
www.kdnuggets.com/gpspubs/

KDD Cup 1998
Data from Paralyzed Veterans of America (charity)
Goal: select mailing with the highest profit
Winners: Urban Science, SAS, Quadstone
  see full results and winners' presentations at www.kdnuggets.com/meetings/kdd98

KDD-CUP-98 Analysis Universe
Paralyzed Veterans of America (PVA), a not-for-profit organization that provides programs and services for US veterans with spinal cord injuries or disease, generously provided the data set
PVA's June 97 fund raising mailing, sent to 3.5 million donors, was selected as the competition data
Within this universe, a group of 200K "Lapsed" donors was of particular interest to PVA. "Lapsed" donors are individuals who made their last donation to PVA 13 to 24 months prior to the mailing

KDD Cup-98 Example
Evaluation: expected profit maximization with a mailing cost of $0.68
  Sum of (actual donation - $0.68) for all records with predicted/expected donation > $0.68
Participant with the highest actual sum wins

KDD Cup Cost Matrix

                        Predicted Donation: Yes   Predicted Donation: No
Actual Donation: Yes    DonationAmt - 0.68        0
Actual Donation: No     -0.68                     0
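The evaluation rule (mail every record whose predicted donation exceeds the $0.68 mailing cost, then sum actual donation minus cost over the mailed records) can be sketched as follows. The function name and the four toy records are hypothetical, not taken from the competition data.

```python
MAILING_COST = 0.68  # cost per mailed record, from the evaluation rule


def campaign_result(predicted, actual, cost=MAILING_COST):
    """Return (number selected, total profit) under the cost matrix.

    A record is mailed only if its predicted donation exceeds the cost.
    Mailed records contribute (actual donation - cost); others contribute 0.
    """
    mailed = [a for p, a in zip(predicted, actual) if p > cost]
    return len(mailed), sum(a - cost for a in mailed)


# Toy records (hypothetical): predicted vs. actual donation in dollars.
predicted = [1.20, 0.30, 0.90, 0.10]
actual = [10.0, 5.0, 0.0, 0.0]
n_selected, profit98 = campaign_result(predicted, actual)
print(n_selected, round(profit98, 2))
```

In the toy data the second record would have donated $5 but is not mailed, so it contributes nothing, while the third is mailed and costs $0.68 without a donation. These are the two off-diagonal cells of the cost matrix.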

KDD Cup 1998 Results

Model        Selected   Result     Rank
GainSmarts   56,330     $14,712    1
SAS          55,838     $14,662    2
Quadstone    57,836     $13,954    3
*ALL*        96,367     $10,560    13
#20          42,270     $1,706     20
#21          1,551      $-54       21

Selected: how many were selected by the model
Result: the total profit (donations - cost) of the model
*ALL*: selecting all

Summary
KDD Cup 1997 case study
Model Evaluation: AUC and Lift Quality
Estimating Campaign Profit
*Feasibility Assessment
  GPS Rule of Thumb for Typical Lift Curve
KDD Cup 1998