Targeted Marketing, KDD Cup and Customer Modeling

1 Targeted Marketing, KDD Cup and Customer Modeling

2 Outline
- Direct Marketing
- Review: Evaluation: Lift, Gains
- KDD Cup 1997
- Lift and Benefit estimation
- Privacy and Data Mining

3 Direct Marketing Paradigm
- Find most likely prospects to contact
- Not everybody needs to be contacted
- Number of targets is usually much smaller than number of prospects
- Typical applications:
  - retailers, catalogues, direct mail (and )
  - customer acquisition, cross-sell, attrition, ...

4 Direct Marketing Evaluation
- Accuracy on the entire dataset is not the right measure
- Approach:
  - develop a target model
  - score all prospects and rank them by decreasing score
  - select top P% of prospects for action
- Evaluate performance on top P% using Gains and Lift

5 CPH (Gains): Random List vs Model-ranked List
- CPH = Cumulative % Hits
- 5% of a random list has 5% of the targets, but 5% of the model-ranked list has 21% of the targets
- CPH(5%, model) = 21%
[Figure: gains chart, Cumulative % Hits vs. Pct list, with Random and Model curves]

6 Lift Curve
- Lift(P) = CPH(P) / P
- Lift (at 5%) = 21% / 5% = 4.2, i.e. 4.2 times better than random
[Figure: lift curve, Lift vs. P (percent of the list)]
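
The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cph` and `lift` and the toy scores and labels are hypothetical, not taken from the slides, and the selection size is simply rounded to the nearest whole prospect.

```python
def cph(scores, labels, p):
    """Cumulative percent of hits: share of all targets found in the
    top fraction p of the list ranked by decreasing model score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_selected = round(p * len(ranked))  # assumption: round to whole prospects
    hits = sum(label for _, label in ranked[:n_selected])
    return hits / sum(labels)


def lift(scores, labels, p):
    """Lift(P) = CPH(P) / P, as defined on the slide."""
    return cph(scores, labels, p) / p


# Hypothetical toy data: 20 prospects, 4 targets (label 1), already scored.
toy_scores = list(range(20, 0, -1))
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1] + [0] * 10

top_quarter_cph = cph(toy_scores, toy_labels, 0.25)    # 3 of 4 targets in top 5
top_quarter_lift = lift(toy_scores, toy_labels, 0.25)
```

On the toy data the top 25% of the ranked list holds three of the four targets, so the lift there is three times the random baseline.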

7 KDD-CUP 1997
- Task: given data on past responders to fund-raising, predict the most likely responders for a new campaign
- Population of 750K prospects
- 10K responded to a broad campaign mailing (1.4% response rate)
- Analysis file included a stratified (non-random) sample of 10K responders and 26K non-responders (28.7% response rate)
- 75% used for learning; 25% used for validation
- Target variable removed from the validation data set

8 KDD-CUP 1997 Data Set
- 321 fields/variables with sanitized names and labels
  - Demographic information
  - Credit history
  - Promotion history
- Significant effort on data preprocessing
  - leaker detection and removal

9 KDD-CUP Participant Statistics
- 45 companies/institutions participated
  - 23 research prototypes
  - 22 commercial tools
- 16 contestants turned in their results
  - 9 research prototypes
  - 7 commercial tools

10 KDD-CUP Algorithm Statistics
Of the 16 software/tools (score as % of best):

Algorithm        # of Entries   Ave. Score
Rules            2              87
k-nn             1              85
Bayesian         3              83
Multiple/Hybrid  4              79
Other            2              68
Decision Tree    4              44

11 KDD Cup 97 Evaluation
- Best Gains at 40%: Urban Science, BNB, MineSet
- Best Gains at 10%: BNB, Urban Science, MineSet

12 KDD-CUP 1997 Awards
The GOLD MINER award is jointly shared by two contestants this year:
1) Charles Elkan, Ph.D., from University of California, San Diego, with his software BNB, Boosted Naive Bayesian Classifier
1) Urban Science Applications, Inc., with their software gain, Direct Marketing Selection System
The BRONZE MINER award went to the runner-up:
3) Silicon Graphics, Inc., with their software MineSet

13 KDD-CUP Results Discussion
- Top finishers very close
- Naïve Bayes algorithm was used by 2 of the top 3 contestants (BNB and MineSet)
- BNB and MineSet did little data preprocessing
- MineSet used a total of 6 variables in their final model
- Urban Science implemented a tremendous amount of automated data preprocessing and exploratory data analysis, and developed more than 50 models in an automated fashion to get to their results

14 KDD Cup 1997: Top 3 Results
- Top 3 finishers are very close

15 KDD Cup 1997: Worst Results
- Note that the worst result (C6) was actually worse than random
- Competitor names were kept anonymous, apart from the top 3 winners

16 Better Model Evaluation?
- Comparing Gains at 10% and 40% is ad hoc
- Are there more principled methods?
  - Area Under the Curve (AUC) of Gains Chart
  - Lift Quality
- Ultimately, financial measures: Campaign Benefits

17 Model Evaluation: AUC
- Area Under the Curve (AUC) is defined as the difference between the Gains and Random curves
[Figure: gains chart, Cum % Hits vs. Selection]

18 Model Evaluation: Lift Quality
- LQ = (AUC(Model) - AUC(Random)) / (AUC(Perfect) - AUC(Random))
- See "Measuring Lift Quality in Database Marketing", Piatetsky-Shapiro and Steingold, SIGKDD Explorations, December
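
As a rough illustration of the Lift Quality ratio, the sketch below computes the area under the gains curve with the trapezoid rule and plugs it into the formula. The trapezoid rule, the fixed random-model area of 0.5, and the names `gains_area` and `lift_quality` are assumptions made for this example; the cited paper may define the areas differently.

```python
def gains_area(ranked_labels):
    """Area under the gains curve (trapezoid rule) for 0/1 labels
    listed in order of decreasing model score."""
    n = len(ranked_labels)
    total = sum(ranked_labels)
    area, hits_prev = 0.0, 0
    for label in ranked_labels:
        hits = hits_prev + label
        # one trapezoid of width 1/n between consecutive CPH values
        area += (hits_prev + hits) / (2 * total * n)
        hits_prev = hits
    return area


def lift_quality(ranked_labels):
    """LQ = (AUC(Model) - AUC(Random)) / (AUC(Perfect) - AUC(Random))."""
    auc_model = gains_area(ranked_labels)
    # a perfect model puts every target at the top of the list
    auc_perfect = gains_area(sorted(ranked_labels, reverse=True))
    auc_random = 0.5  # assumption: the random gains curve is the diagonal
    return (auc_model - auc_random) / (auc_perfect - auc_random)


# Hypothetical rankings of 6 prospects with 2 targets.
lq_perfect = lift_quality([1, 1, 0, 0, 0, 0])
lq_partial = lift_quality([1, 0, 1, 0, 0, 0])
lq_worst = lift_quality([0, 0, 0, 0, 1, 1])
```

With this normalization a perfect ranking scores 1 (100%), a ranking no better than random scores about 0, and a ranking worse than random comes out negative.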

19 Lift Quality (Lquality)
- For a perfect model, Lquality = 100%
- For a random model, Lquality = 0
- For KDD Cup 97:
  - Lquality(Urban Science) = 43.3%
  - Lquality(Elkan) = 42.7%
- However, small differences in Lquality are not significant

20 Estimating Profit: Campaign Parameters
Direct mail example:
- N -- number of prospects, e.g. 750,000
- T -- fraction of targets, e.g. 0.014
- B -- benefit of hitting a target, e.g. $20
  - Note: this is a simplification; the actual benefit will vary
- C -- cost of contacting a prospect, e.g. $0.68
- P -- percentage selected for contact, e.g. 10%
- Lift(P) -- model lift at P, e.g. 3

21 Contacting Top P of Model-Sorted List
- Using the previous example, let the selection be P = 10% and Lift(P) = 3
- Selection size = N * P, e.g. 75,000
- Random has N * P * T targets in the first P of the list, e.g. 1,050
- Q: How many targets are in the model P-selection?
- Model has more by a factor of Lift(P), or N * P * T * Lift(P) targets in the selection, e.g. 3,150
- Benefit of contacting the selection is N * P * T * Lift(P) * B, e.g. $63,000
- Cost of contacting N * P is N * P * C, e.g. $51,000

22 Profit of Contacting Top P
- Profit(P) = Benefit(P) - Cost(P) = N * P * T * Lift(P) * B - N * P * C = N * P * (T * Lift(P) * B - C), e.g. $12,000
- Q: When is profit positive?
- When T * Lift(P) * B > C, or Lift(P) > C / (T * B), e.g. 2.4
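
The arithmetic of the direct mail example can be checked with a short script. It is a sketch only: the function names are hypothetical, and the target fraction T = 0.014 is an assumption inferred from the example figures (1,050 targets in a random selection of 75,000).

```python
def campaign_profit(n, t, b, c, p, lift_p):
    """Profit(P) = N*P*T*Lift(P)*B - N*P*C for a model-sorted selection."""
    selection_size = n * p                   # prospects contacted
    targets_hit = selection_size * t * lift_p
    benefit = targets_hit * b
    cost = selection_size * c
    return benefit - cost


def breakeven_lift(t, b, c):
    """Smallest lift at which contacting the selection pays off: C / (T*B)."""
    return c / (t * b)


# Example parameters from the slides (T = 0.014 is inferred, see above).
example_profit = campaign_profit(n=750_000, t=0.014, b=20.0, c=0.68, p=0.10, lift_p=3)
example_breakeven = breakeven_lift(t=0.014, b=20.0, c=0.68)
```

With these inputs the profit comes to about $12,000 and the break-even lift to roughly 2.4, matching the figures quoted above.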

23 Finding Optimal Cutoff
- Use the formula to estimate benefit for each P
- Find optimal P
[Figure: estimated payoff (Est Payoff) plotted against P]
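
One way to carry out this search is to evaluate the profit formula on a grid of cutoffs and keep the best one. The sketch below assumes a hypothetical lift curve Lift(P) = 1/sqrt(P), the rough shape these slides quote later as an estimate; in practice the curve would come from a validated model. The function names, the 1% grid, and T = 0.014 are likewise assumptions for illustration.

```python
def profit(n, t, b, c, p, lift_p):
    """Profit(P) = N * P * (T * Lift(P) * B - C)."""
    return n * p * (t * lift_p * b - c)


def best_cutoff(n, t, b, c, lift_curve, grid):
    """Return the cutoff P in grid with the highest estimated profit."""
    return max(grid, key=lambda p: profit(n, t, b, c, p, lift_curve(p)))


def example_lift(p):
    """Hypothetical lift curve, Lift(P) = 1/sqrt(P); not a fitted model."""
    return p ** -0.5


grid = [k / 100 for k in range(1, 101)]  # cutoffs 1%, 2%, ..., 100%
best_p = best_cutoff(750_000, 0.014, 20.0, 0.68, example_lift, grid)
best_profit = profit(750_000, 0.014, 20.0, 0.68, best_p, example_lift(best_p))
```

Under these assumptions the estimated profit peaks at a small selection and turns negative once the lift drops below the break-even value C / (T * B).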

24 *Feasibility Assessment
- Expected Profit(P) depends on:
  - known Cost C, Benefit B, Target Rate T
  - unknown Lift(P)
- To compute Lift(P) we need to get all the data, load it, clean it, ask for correct data, build models, ...

25 *Can Expected Lift Be Estimated?
- Can expected lift be estimated from N and T only?
- In theory, no; but in many practical applications, surprisingly, yes

26 *Empirical Observations about Lift
- For good models, Lift(P) is usually monotonically decreasing with P
- Lift at fixed P (e.g. 0.05) is usually higher for lower T
- Special point P = T: for a perfect predictor, all targets are in the first T of the list, for a maximum lift of 1/T
- What can we expect compared to 1/T?

27 *Meta Analysis of Lift
- 26 attrition & cross-sell problems from finance and telecom domains
- N ranges from 1,000 to 150,000
- T ranges from 1% to 22%
- No clear relation to N, but there is dependence on T

28 *Results: Lift(T) vs 1/T
- Tried several linear and log-linear fits
- Best model (R^2 = 0.86): log10(Lift(T)) fitted as a linear function of log10(1/T)
- Approximately Lift(T) ~ T^(-0.5) = sqrt(1/T)

29 *Actual Lift(T) vs sqrt(1/T) for All Problems
[Figure: actual Lift(T) and estimated Lift(T) plotted against T%]
- Error = Actual Lift - sqrt(1/T)
- Avg(Error) = [value missing]
- St. Dev(Error) = [value missing]

30 *GPS Lift(T) Rule of Thumb
- For targeted marketing campaigns, where 0.01 < T < 0.25, Lift(T) = sqrt (1/T) 1
- Exceptions for:
  - truly predictable or random behaviors
  - poor models
  - information leakers
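
A minimal sketch of how the rule of thumb might be applied for a quick feasibility check follows. It uses only the sqrt(1/T) term as the point estimate; the function names and the example value T = 0.014 are assumptions for illustration.

```python
import math


def rule_of_thumb_lift(t):
    """Rough estimate of Lift at P = T as sqrt(1/T), for 0.01 < T < 0.25."""
    if not 0.01 < t < 0.25:
        raise ValueError("rule of thumb is stated only for 0.01 < T < 0.25")
    return math.sqrt(1 / t)


def max_lift(t):
    """Lift of a perfect predictor at P = T, which is 1/T."""
    return 1 / t


# Hypothetical campaign with a 1.4% target rate.
estimated = rule_of_thumb_lift(0.014)  # typical lift one might expect
ceiling = max_lift(0.014)              # upper limit for a perfect model
```

Comparing the estimate against the break-even lift C / (T * B) gives a first indication of whether a campaign is worth modeling, before any data is loaded.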

31 *Estimating Entire Curve
- Cumulative Percent Hits: CPH(P) = Lift(P) * P
- CPH is easier to model than Lift
- Several regressions for all CPH curves
- Best results with regression log10(CPH(P)) = a + b * log10(P)
- Average R^2 = [value missing]

32 *CPH Curve Estimate
- Approximately CPH(P) ~ sqrt(P)
- Bounds: P^0.6 < CPH(P) < P^[exponent missing]

33 *Lift Curve Estimate
- Since Lift(P) = CPH(P) / P, Lift(P) ~ 1/sqrt(P)
- Bounds: (1/P)^0.4 < Lift(P) < (1/P)^[exponent missing]

34 *More on Estimating Lift and Profitability
- G. Piatetsky-Shapiro, B. Masand, Estimating Campaign Benefits and Modeling Lift, Proc. KDD-99, ACM.

35 KDD Cup 1998
- Data from Paralyzed Veterans of America (charity)
- Goal: select mailing with the highest profit
- Winners: Urban Science, SAS, Quadstone
- See full results and winners' presentations at [URL missing]

36 KDD-CUP-98 Analysis Universe
- Paralyzed Veterans of America (PVA), a not-for-profit organization that provides programs and services for US veterans with spinal cord injuries or disease, generously provided the data set
- PVA's June '97 fund-raising mailing, sent to 3.5 million donors, was selected as the competition data
- Within this universe, a group of 200K "Lapsed" donors was of particular interest to PVA. Lapsed donors are individuals who made their last donation to PVA 13 to 24 months prior to the mailing

37 KDD Cup-98 Example
- Evaluation: expected profit maximization with a mailing cost of $0.68
- Sum of (actual donation - $0.68) for all records with predicted/expected donation > $0.68
- Participant with the highest actual sum wins
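
The evaluation rule above translates directly into code. In this sketch the function name `campaign_score` and the sample donation amounts are hypothetical; only the $0.68 mailing cost and the selection rule come from the slide.

```python
MAILING_COST = 0.68  # cost of one mailing, from the slide


def campaign_score(predicted, actual, cost=MAILING_COST):
    """Sum of (actual donation - cost) over records whose predicted
    donation exceeds the mailing cost, i.e. the records that get mailed."""
    return sum(a - cost for p, a in zip(predicted, actual) if p > cost)


# Hypothetical predictions and actual donations for four donors.
predicted_donations = [1.00, 0.50, 2.00, 0.70]
actual_donations = [0.00, 10.00, 5.00, 0.00]

score = campaign_score(predicted_donations, actual_donations)
```

Donors predicted at or below the mailing cost are not mailed and contribute nothing to the sum, even if they would have donated, as the second donor in the toy data shows.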

38 KDD Cup Cost Matrix

                    Predicted Donation
Actual Donation     Yes                    No
Yes                 DonationAmt - $0.68    0
No                  -$0.68                 0

39 KDD Cup 1998 Results

Model        Selected   Result     Rank
GainSmarts   56,330     $14,712    1
SAS          55,838     $14,662    2
Quadstone    57,836     $13,954    3
*ALL*        96,367     $10,
#20          42,270     $1,
#21          1,551      $

- Selected: how many were selected by the model
- Result: the total profit (donations - cost) of the model
- *ALL*: selecting all

40 Summary
- KDD Cup 1997 case study
- Model Evaluation: AUC and Lift Quality
- Estimating Campaign Profit
- *Feasibility Assessment
- GPS Rule of Thumb for Typical Lift Curve
- KDD Cup

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago [email protected] Keywords:

More information

Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario [email protected]

Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario jdu43@uwo.ca Data Mining in CRM & Direct Marketing Jun Du The University of Western Ontario [email protected] Outline Why CRM & Marketing Goals in CRM & Marketing Models and Methodologies Case Study: Response Model Case

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

Predictive Data modeling for health care: Comparative performance study of different prediction models

Predictive Data modeling for health care: Comparative performance study of different prediction models Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath [email protected] National Institute of Industrial Engineering (NITIE) Vihar

More information

Mining Customer Value: From Association Rules to Direct Marketing

Mining Customer Value: From Association Rules to Direct Marketing Mining Customer Value: From Association Rules to Direct Marketing Ke Wang Simon Fraser University [email protected] Senqiang Zhou Simon Fraser University [email protected] Qiang Yang Hong Kong University

More information

Modeling Lifetime Value in the Insurance Industry

Modeling Lifetime Value in the Insurance Industry Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting

More information

Data Mining for Direct Marketing: Problems and

Data Mining for Direct Marketing: Problems and Data Mining for Direct Marketing: Problems and Solutions Charles X. Ling and Chenghui Li Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 Tel: 519-661-3341;

More information

How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK

How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK Agenda Analytics why now? The process around data and text mining Case Studies The Value of Information

More information

Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry

Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry Paper 12028 Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry Junxiang Lu, Ph.D. Overland Park, Kansas ABSTRACT Increasingly, companies are viewing

More information

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through

More information

IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

More information

Predictive Modeling and Big Data

Predictive Modeling and Big Data Predictive Modeling and Presented by Eileen Burns, FSA, MAAA Milliman Agenda Current uses of predictive modeling in the life insurance industry Potential applications of 2 1 June 16, 2014 [Enter presentation

More information

Getting Even More Out of Ensemble Selection

Getting Even More Out of Ensemble Selection Getting Even More Out of Ensemble Selection Quan Sun Department of Computer Science The University of Waikato Hamilton, New Zealand [email protected] ABSTRACT Ensemble Selection uses forward stepwise

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Data Mining. Knowledge Discovery, Data Warehousing and Machine Learning Final remarks. Lecturer: JERZY STEFANOWSKI

Data Mining. Knowledge Discovery, Data Warehousing and Machine Learning Final remarks. Lecturer: JERZY STEFANOWSKI Data Mining Knowledge Discovery, Data Warehousing and Machine Learning Final remarks Lecturer: JERZY STEFANOWSKI Email: [email protected] Data Mining a step in A KDD Process Data mining:

More information

How To Solve The Kd Cup 2010 Challenge

How To Solve The Kd Cup 2010 Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]

More information

Mining Life Insurance Data for Customer Attrition Analysis

Mining Life Insurance Data for Customer Attrition Analysis Mining Life Insurance Data for Customer Attrition Analysis T. L. Oshini Goonetilleke Informatics Institute of Technology/Department of Computing, Colombo, Sri Lanka Email: [email protected] H. A. Caldera

More information

Data Mining Applications in Fund Raising

Data Mining Applications in Fund Raising Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr.Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct

More information

Data Mining Techniques Chapter 4: Data Mining Applications in Marketing and Customer Relationship Management

Data Mining Techniques Chapter 4: Data Mining Applications in Marketing and Customer Relationship Management Data Mining Techniques Chapter 4: Data Mining Applications in Marketing and Customer Relationship Management Prospecting........................................................... 2 DM to choose the right

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over

More information

Direct Marketing Profit Model. Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware

Direct Marketing Profit Model. Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware Paper CI-04 Direct Marketing Profit Model Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware ABSTRACT A net lift model gives the expected incrementality (the incremental rate)

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk [email protected] Tom Kelsey ID5059-17-AUC

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within

More information

Maximizing Return and Minimizing Cost with the Decision Management Systems

Maximizing Return and Minimizing Cost with the Decision Management Systems KDD 2012: Beijing 18 th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Rich Holada, Vice President, IBM SPSS Predictive Analytics Maximizing Return and Minimizing Cost with the Decision Management

More information

Better credit models benefit us all

Better credit models benefit us all Better credit models benefit us all Agenda Credit Scoring - Overview Random Forest - Overview Random Forest outperform logistic regression for credit scoring out of the box Interaction term hypothesis

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19 PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations

More information

Using News Articles to Predict Stock Price Movements

Using News Articles to Predict Stock Price Movements Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 [email protected] 21, June 15,

More information

Introduction Predictive Analytics Tools: Weka

Introduction Predictive Analytics Tools: Weka Introduction Predictive Analytics Tools: Weka Predictive Analytics Center of Excellence San Diego Supercomputer Center University of California, San Diego Tools Landscape Considerations Scale User Interface

More information

Why do statisticians "hate" us?

Why do statisticians hate us? Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data

More information

Data Mining. Nonlinear Classification

Data Mining. Nonlinear Classification Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

More information

Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM

Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Paper AA-08-2015 Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Delali Agbenyegah, Alliance Data Systems, Columbus, Ohio 0.0 ABSTRACT Traditional

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

Predictive Analytics Applied: Marketing and Web

Predictive Analytics Applied: Marketing and Web Predictive Analytics Applied: Marketing and Web Brought to you by Prediction Impact and World Organization of Webmasters (WOW) Eric Siegel, Ph.D. [email protected] (415) 683-1146 Training Program

More information

Using Control Groups to Target on Predicted Lift:

Using Control Groups to Target on Predicted Lift: Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models Nicholas J. Radcliffe Portrait Software The Smith Centre The Fairmile Henley-on-Thames Oxfordshire RG9 6AB UK Department

More information

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation

More information

Benchmarking of different classes of models used for credit scoring

Benchmarking of different classes of models used for credit scoring Benchmarking of different classes of models used for credit scoring We use this competition as an opportunity to compare the performance of different classes of predictive models. In particular we want

More information

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview

More information

Successfully Implementing Predictive Analytics in Direct Marketing

Successfully Implementing Predictive Analytics in Direct Marketing Successfully Implementing Predictive Analytics in Direct Marketing John Blackwell and Tracy DeCanio, The Nature Conservancy, Arlington, VA ABSTRACT Successfully Implementing Predictive Analytics in Direct

More information

ElegantJ BI. White Paper. The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis

ElegantJ BI. White Paper. The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis ElegantJ BI White Paper The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis Integrated Business Intelligence and Reporting for Performance Management, Operational

More information

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

How To Identify A Churner

How To Identify A Churner 2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management

More information

The Predictive Data Mining Revolution in Scorecards:

The Predictive Data Mining Revolution in Scorecards: January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Survey Analysis: Data Mining versus Standard Statistical Analysis for Better Analysis of Survey Responses

Survey Analysis: Data Mining versus Standard Statistical Analysis for Better Analysis of Survey Responses Survey Analysis: Data Mining versus Standard Statistical Analysis for Better Analysis of Survey Responses Salford Systems Data Mining 2006 March 27-31 2006 San Diego, CA By Dean Abbott Abbott Analytics

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Improve Marketing Campaign ROI using Uplift Modeling. Ryan Zhao http://www.analyticsresourcing.com

Improve Marketing Campaign ROI using Uplift Modeling. Ryan Zhao http://www.analyticsresourcing.com Improve Marketing Campaign ROI using Uplift Modeling Ryan Zhao http://www.analyticsresourcing.com Objective To introduce how uplift model improve ROI To explore advanced modeling techniques for uplift

More information

Why Ensembles Win Data Mining Competitions

Why Ensembles Win Data Mining Competitions Why Ensembles Win Data Mining Competitions A Predictive Analytics Center of Excellence (PACE) Tech Talk November 14, 2012 Dean Abbott Abbott Analytics, Inc. Blog: http://abbottanalytics.blogspot.com URL:

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

More information

Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

More information

Data Mining: STATISTICA

Data Mining: STATISTICA Data Mining: STATISTICA Outline Prepare the data Classification and regression 1 Prepare the Data Statistica can read from Excel,.txt and many other types of files Compared with WEKA, Statistica is much

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4. Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

More information

Target Analytics Nonprofit Cooperative Database

Target Analytics Nonprofit Cooperative Database Target Analytics Nonprofit Cooperative Database Solution Overview Target Analytics Nonprofit Cooperative Database Lists and Predictive Models for Effective Fundraising You know how challenging it is to

More information

Alex Vidras, David Tysinger. Merkle Inc.

Alex Vidras, David Tysinger. Merkle Inc. Using PROC LOGISTIC, SAS MACROS and ODS Output to evaluate the consistency of independent variables during the development of logistic regression models. An example from the retail banking industry ABSTRACT

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information
