Targeted Marketing, KDD Cup and Customer Modeling
- Sandra Burns
Transcription
1 Targeted Marketing, KDD Cup and Customer Modeling
2 Outline
- Direct Marketing
- Review: Evaluation: Lift, Gains
- KDD Cup 1997
- Lift and Benefit estimation
- Privacy and Data Mining
3 Direct Marketing Paradigm
- Find the most likely prospects to contact
- Not everybody needs to be contacted
- The number of targets is usually much smaller than the number of prospects
- Typical applications:
  - retailers, catalogues, direct mail
  - customer acquisition, cross-sell, attrition, ...
4 Direct Marketing Evaluation
- Accuracy on the entire dataset is not the right measure
- Approach:
  - develop a target model
  - score all prospects and rank them by decreasing score
  - select the top P% of prospects for action
- Evaluate performance on the top P% using Gains and Lift
5 CPH (Gains): Random List vs Model-Ranked List
- 5% of a random list has 5% of the targets, but the top 5% of the model-ranked list has 21% of the targets: CPH(5%, model) = 21%
[Figure: Cumulative % Hits vs. Pct list, for the Random and Model curves]
6 Lift Curve
- Lift(P) = CPH(P) / P
- Lift at 5% = 21% / 5% = 4.2, i.e. 4.2 times better than random
[Figure: Lift vs. P, the percent of the list]
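The Gains and Lift definitions above can be checked on a small list. The following is a minimal Python sketch, not taken from the slides: the function names `cph` and `lift` and the ten-record toy data are assumptions made for illustration.

```python
def cph(scores, labels, p):
    """Cumulative percent hits: share of all targets found in the top
    fraction p of the list when it is ranked by decreasing score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = round(p * len(ranked))              # size of the top-p selection
    hits_in_top = sum(label for _, label in ranked[:k])
    return hits_in_top / sum(labels)


def lift(scores, labels, p):
    """Lift(P) = CPH(P) / P, as defined on the Lift Curve slide."""
    return cph(scores, labels, p) / p


# Hypothetical toy data: 10 prospects, 3 of them targets (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(round(cph(scores, labels, 0.2), 3))   # 0.667: 2 of the 3 targets are in the top 20%
print(round(lift(scores, labels, 0.2), 3))  # 3.333
```

With the numbers from the slide, the same ratio gives 0.21 / 0.05 = 4.2.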
7 KDD-CUP 1997
- Task: given data on past responders to fund-raising, predict the most likely responders for a new campaign
- Population of 750K prospects
- 10K responded to a broad campaign mailing (1.4% response rate)
- The analysis file included a stratified (non-random) sample of 10K responders and 26K non-responders (28.7% response rate)
- 75% used for learning; 25% used for validation
- The target variable was removed from the validation data set
8 KDD-CUP 1997 Data Set
- 321 fields/variables with sanitized names and labels:
  - demographic information
  - credit history
  - promotion history
- Significant effort on data preprocessing
  - leaker detection and removal
9 KDD-CUP Participant Statistics
- 45 companies/institutions participated
  - 23 research prototypes
  - 22 commercial tools
- 16 contestants turned in their results
  - 9 research prototypes
  - 7 commercial tools
10 KDD-CUP Algorithm Statistics
Of the 16 software tools entered (score as % of best):

Algorithm         # of Entries   Ave. Score
Rules             2              87
k-NN              1              85
Bayesian          3              83
Multiple/Hybrid   4              79
Other             2              68
Decision Tree     4              44
11 KDD Cup 97 Evaluation
- Best Gains at 40%: Urban Science, BNB, MineSet
- Best Gains at 10%: BNB, Urban Science, MineSet
12 KDD-CUP 1997 Awards
- The GOLD MINER award is shared jointly by two contestants this year:
  1) Charles Elkan, Ph.D., University of California, San Diego, with his software BNB (Boosted Naive Bayesian Classifier)
  1) Urban Science Applications, Inc., with their software gain (Direct Marketing Selection System)
- The BRONZE MINER award went to the runner-up:
  3) Silicon Graphics, Inc., with their software MineSet
13 KDD-CUP Results Discussion
- Top finishers were very close
- The Naïve Bayes algorithm was used by 2 of the top 3 contestants (BNB and MineSet)
- BNB and MineSet did little data preprocessing
- MineSet used a total of 6 variables in its final model
- Urban Science implemented a tremendous amount of automated data preprocessing and exploratory data analysis, and developed more than 50 models in an automated fashion to get to its results
14 KDD Cup 1997: Top 3 Results
- Top 3 finishers are very close
15 KDD Cup 1997 Worst Results
- Note that the worst result (C6) was actually worse than random
- Competitor names were kept anonymous, apart from the top 3 winners
16 Better Model Evaluation?
- Comparing Gains at 10% and 40% is ad hoc
- Are there more principled methods?
  - Area Under the Curve (AUC) of the Gains chart
  - Lift Quality
- Ultimately, financial measures: Campaign Benefits
17 Model Evaluation: AUC
- Area Under the Curve (AUC) is defined as the difference between the Gains and Random curves
[Figure: Cum % Hits vs. Selection]
18 Model Evaluation: Lift Quality
$$LQ = \frac{AUC(\text{Model}) - AUC(\text{Random})}{AUC(\text{Perfect}) - AUC(\text{Random})}$$
- See Measuring Lift Quality in Database Marketing, Piatetsky-Shapiro and Steingold, SIGKDD Explorations, December
19 Lift Quality (Lquality)
- For a perfect model, Lquality = 100%
- For a random model, Lquality = 0
- For KDD Cup 97:
  - Lquality(Urban Science) = 43.3%
  - Lquality(Elkan) = 42.7%
- However, small differences in Lquality are not significant
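As a rough illustration of Lift Quality, the sketch below computes LQ = (AUC(Model) - AUC(Random)) / (AUC(Perfect) - AUC(Random)) for a scored list. It rests on assumptions that are not in the slides: AUC is taken as the area under the cumulative gains curve (trapezoid rule), so the random curve has area 0.5 and the perfect curve has area 1 - T/2, where T is the target rate. The names `gains_area` and `lift_quality` are hypothetical.

```python
def gains_area(scores, labels):
    """Area under the cumulative gains curve (trapezoid rule),
    with the list ranked by decreasing score."""
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    n, total = len(ranked), sum(ranked)
    area, hits = 0.0, 0
    for label in ranked:
        previous = hits / total
        hits += label
        area += (previous + hits / total) / 2 / n   # one trapezoid of width 1/n
    return area


def lift_quality(scores, labels):
    """LQ = (AUC(Model) - AUC(Random)) / (AUC(Perfect) - AUC(Random))."""
    target_rate = sum(labels) / len(labels)
    area_random = 0.5                    # diagonal gains curve
    area_perfect = 1 - target_rate / 2   # all targets ranked first
    return (gains_area(scores, labels) - area_random) / (area_perfect - area_random)


labels = [1, 1, 0, 0]
print(lift_quality([4, 3, 2, 1], labels))   # perfect ranking
print(lift_quality([1, 2, 3, 4], labels))   # targets ranked last
```

Under these assumptions a perfect ranking scores 1 (100%), a random one scores about 0, and a ranking worse than random goes negative.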
20 Estimating Profit: Campaign Parameters
Direct mail example:
- N -- number of prospects, e.g. 750,000
- T -- fraction of targets, e.g. 0.014
- B -- benefit of hitting a target, e.g. $20
  - Note: this is a simplification; the actual benefit will vary
- C -- cost of contacting a prospect, e.g. $0.68
- P -- percentage selected for contact, e.g. 10%
- Lift(P) -- model lift at P, e.g. 3
21 Contacting Top P of Model-Sorted List
- Using the previous example, let the selection be P = 10% and Lift(P) = 3
- Selection size = N P, e.g. 75,000
- A random list has N P T targets in the first P of the list, e.g. 1,050
- Q: How many targets are in the model P-selection?
- The model has more by a factor of Lift(P), i.e. N P T Lift(P) targets in the selection, e.g. 3,150
- Benefit of contacting the selection is N P T Lift(P) B, e.g. $63,000
- Cost of contacting N P prospects is N P C, e.g. $51,000
22 Profit of Contacting Top P
- $\mathrm{Profit}(P) = \mathrm{Benefit}(P) - \mathrm{Cost}(P) = N P T \,\mathrm{Lift}(P)\, B - N P C = N P \,(T \,\mathrm{Lift}(P)\, B - C)$, e.g. $12,000
- Q: When is profit positive?
- When $T \,\mathrm{Lift}(P)\, B > C$, or $\mathrm{Lift}(P) > \frac{C}{T B}$, e.g. 2.4
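The profit arithmetic can be reproduced in a few lines. This is a minimal sketch using the example values from the slides (N = 750,000, P = 10%, Lift(P) = 3, B = $20, C = $0.68) and a target fraction T = 0.014, the value implied by the 1,050 targets in a random 10% selection; the function names are assumptions made for illustration.

```python
def campaign_profit(n, p, t, lift, benefit, cost):
    """Profit(P) = N*P*T*Lift(P)*B - N*P*C."""
    contacted = n * p                     # selection size N*P
    targets_hit = contacted * t * lift    # targets in the model selection
    return targets_hit * benefit - contacted * cost


def break_even_lift(t, benefit, cost):
    """Smallest lift for which contacting is profitable: C / (T*B)."""
    return cost / (t * benefit)


profit = campaign_profit(n=750_000, p=0.10, t=0.014, lift=3, benefit=20, cost=0.68)
print(round(profit))                                             # 12000
print(round(break_even_lift(t=0.014, benefit=20, cost=0.68), 2)) # 2.43
```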
23 Finding Optimal Cutoff
- Use the formula to estimate the benefit for each P
- Find the optimal P
[Figure: Est. Payoff curve]
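One way to carry out this search is to evaluate Profit(P) = N P (T Lift(P) B - C) at each candidate cutoff and keep the best. The sketch below is illustrative only: the lift table is hypothetical (the values 4.2 at 5% and 3 at 10% echo earlier slide examples, the other two points are made up), and T = 0.014 is the value implied by the earlier example.

```python
def profit(n, p, t, lift, benefit, cost):
    """Profit(P) = N*P*(T*Lift(P)*B - C)."""
    return n * p * (t * lift * benefit - cost)


# Hypothetical lift curve: cutoff P -> Lift(P). Not measured data.
lift_at = {0.05: 4.2, 0.10: 3.0, 0.20: 2.2, 0.40: 1.6}

# Estimated profit at every candidate cutoff.
profits = {p: profit(750_000, p, 0.014, lift, 20, 0.68)
           for p, lift in lift_at.items()}

best_p = max(profits, key=profits.get)
for p, value in profits.items():
    print(f"P = {p:.0%}: estimated profit ${value:,.0f}")
print("best cutoff:", best_p)
```

With this made-up curve the 5% cutoff wins, and the 20% and 40% cutoffs lose money because the lift there falls below the break-even value C / (T B).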
24 *Feasibility Assessment
- Expected Profit(P) depends on the known Cost C, Benefit B, and Target Rate T, and on the unknown Lift(P)
- To compute Lift(P) we need to get all the data, load it, clean it, ask for correct data, build models, ...
25 *Can Expected Lift Be Estimated Only from N and T?
- In theory, no; but in many practical applications, surprisingly, yes
26 *Empirical Observations about Lift
- For good models, Lift(P) is usually monotonically decreasing with P
- Lift at a fixed P (e.g. 0.05) is usually higher for lower T
- Special point P = T: for a perfect predictor, all targets are in the first T of the list, for a maximum lift of 1/T
- What can we expect compared to 1/T?
27 *Meta Analysis of Lift
- 26 attrition and cross-sell problems from the finance and telecom domains
- N ranges from 1,000 to 150,000
- T ranges from 1% to 22%
- No clear relation to N, but there is dependence on T
28 *Results: Lift(T) vs 1/T
- Tried several linear and log-linear fits
- Best model ($R^2 = 0.86$): a log-linear fit of $\log_{10}(\mathrm{Lift}(T))$ against $\log_{10}(1/T)$
- Approximately $\mathrm{Lift}(T) \sim T^{-0.5} = \sqrt{1/T}$
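A log-linear fit of this kind is ordinary least squares on log10(Lift(T)) against log10(1/T). The sketch below shows the mechanics only: the four data points are synthetic, chosen to lie exactly on sqrt(1/T), and are not the 26 problems from the meta analysis; `fit_line` is a hypothetical helper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data: lifts set to exactly sqrt(1/T) for illustration.
target_rates = [0.01, 0.04, 0.16, 0.25]
lifts = [10.0, 5.0, 2.5, 2.0]

xs = [math.log10(1 / t) for t in target_rates]
ys = [math.log10(value) for value in lifts]
intercept, slope = fit_line(xs, ys)
print(round(intercept, 6), round(slope, 6))
```

A slope near 0.5 with an intercept near 0 corresponds to Lift(T) ~ sqrt(1/T); real data would scatter around the line.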
29 *Actual Lift(T) vs sqrt(1/T) for All Problems
[Figure: actual Lift(T) and estimated Lift(T) plotted against T%, where Error = Actual Lift - sqrt(1/T); the Avg(Error) and St. Dev(Error) values are missing]
30 *GPS Lift(T) Rule of Thumb
- For targeted marketing campaigns, where 0.01 < T < 0.25, Lift(T) = sqrt (1/T) 1
- Exceptions for:
  - truly predictable or random behaviors
  - poor models
  - information leakers
31 *Estimating Entire Curve
- Cumulative Percent Hits: CPH(P) = Lift(P) * P
- CPH is easier to model than Lift
- Several regressions were tried for all CPH curves
- Best results with the regression log10(CPH(P)) = a + b log10(P)
- Average R^2 =
32 *CPH Curve Estimate
- Approximately $\mathrm{CPH}(P) \sim \sqrt{P}$
- Bounds: $P^{0.6} < \mathrm{CPH}(P) < P^{x}$ [upper exponent $x$ missing]
33 *Lift Curve Estimate
- Since Lift(P) = CPH(P) / P, $\mathrm{Lift}(P) \sim 1/\sqrt{P}$
- Bounds: $(1/P)^{0.4} < \mathrm{Lift}(P) < (1/P)^{y}$ [upper exponent $y$ missing]
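The curve estimate makes a quick feasibility check possible before any model is built. The sketch below assumes the approximations CPH(P) ~ sqrt(P) and Lift(P) ~ 1/sqrt(P) hold, and combines them with the break-even condition Lift(P) > C / (T B) from the profit slide; the function names and the use of T = 0.014 are assumptions for illustration.

```python
import math


def estimated_cph(p):
    """Approximate cumulative percent hits: CPH(P) ~ sqrt(P)."""
    return math.sqrt(p)


def estimated_lift(p):
    """Lift(P) = CPH(P) / P, so approximately 1 / sqrt(P)."""
    return estimated_cph(p) / p


def max_profitable_fraction(t, benefit, cost):
    """Largest P with estimated lift above break-even:
    1/sqrt(P) > C/(T*B)  <=>  P < (T*B/C)**2, capped at the whole list."""
    return min(1.0, (t * benefit / cost) ** 2)


print(estimated_cph(0.25), estimated_lift(0.25))   # 0.5 2.0
print(round(max_profitable_fraction(t=0.014, benefit=20, cost=0.68), 3))
```

At P = T the estimate reduces to sqrt(1/T), which matches the rule of thumb. For the direct mail example it suggests that contacting more than roughly 17% of the list would push the estimated lift below break-even.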
34 *More on Estimating Lift and Profitability
- G. Piatetsky-Shapiro, B. Masand, Estimating Campaign Benefits and Modeling Lift, Proc. KDD-99, ACM.
35 KDD Cup 1998
- Data from Paralyzed Veterans of America (charity)
- Goal: select the mailing with the highest profit
- Winners: Urban Science, SAS, Quadstone
- See full results and winners' presentations at [URL missing]
36 KDD-CUP-98 Analysis Universe
- Paralyzed Veterans of America (PVA), a not-for-profit organization that provides programs and services for US veterans with spinal cord injuries or disease, generously provided the data set
- PVA's June '97 fund-raising mailing, sent to 3.5 million donors, was selected as the competition data
- Within this universe, a group of 200K "Lapsed" donors was of particular interest to PVA. Lapsed donors are individuals who made their last donation to PVA 13 to 24 months prior to the mailing
37 KDD Cup-98 Example
- Evaluation: expected profit maximization with a mailing cost of $0.68
- Sum of (actual donation - $0.68) for all records with predicted/expected donation > $0.68
- The participant with the highest actual sum wins
38 KDD Cup Cost Matrix

                        Predicted Donation: Yes   Predicted Donation: No
Actual Donation: Yes    DonationAmt - $0.68       0
Actual Donation: No     -$0.68                    0
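The KDD Cup 98 evaluation rule, the sum of (actual donation - $0.68) over all records whose predicted donation exceeds $0.68, can be sketched as below. The function name `campaign_result` and the four records are hypothetical.

```python
MAILING_COST = 0.68  # cost per mailed record, from the evaluation rule


def campaign_result(predicted, actual, cost=MAILING_COST):
    """Total profit: mail every record whose predicted donation exceeds
    the mailing cost, and sum (actual donation - cost) over those records."""
    return sum(a - cost for p, a in zip(predicted, actual) if p > cost)


# Hypothetical records: predicted donation vs. actual donation.
predicted = [1.50, 0.40, 0.90, 0.70]
actual = [10.00, 5.00, 0.00, 0.00]

# The second record is not mailed, so its $5 donation is never collected.
print(round(campaign_result(predicted, actual), 2))   # 7.96
```

Records that are not mailed contribute nothing, so a model is penalized both for mailing non-donors and for skipping donors.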
39 KDD Cup 1998 Results

Model        Selected   Result    Rank
GainSmarts   56,330     $14,712   1
SAS          55,838     $14,662   2
Quadstone    57,836     $13,954   3
*ALL*        96,367     $10,
#20          42,270     $1,
#21          1,551      $

- Selected: how many were selected by the model
- Result: the total profit (donations - cost) of the model
- *ALL*: selecting all
40 Summary
- KDD Cup 1997 case study
- Model Evaluation: AUC and Lift Quality
- Estimating Campaign Profit
- *Feasibility Assessment
  - GPS Rule of Thumb for Typical Lift Curve
- KDD Cup
