Data Mining Approaches to Collections and Case Closure. Background


Bill Haffey, Technical Director, SPSS Public Sector

Background

- Florida DOR has 500,000 sales accounts, of which ~35,000 are likely to be in the collections process in a given month
- Payment frequencies range from monthly to annually, based on expected tax amount
- Current collections process generally entails:
  - Notice mailed after 30 days
  - Phone call after another 15 days
  - Visit after 54 days, or collection agency for low $
  - Garnishment/lien after 120 days
- All accounts treated identically, and no costs have been associated with any steps in the process

Background (cont)

- Idea: Identify paths/maps composed of minimal/optimal sequences of actions that tend to result in delinquent case closure (for monthly payment accounts), perhaps unique to particular account types
- Deploy these paths into an automated recommendation engine designed to improve timeliness and efficiency of collections process

Figure: Sequence Detection for Account Type B, showing the action sequences (notice, phone, close) recorded for accounts Acct1, Acct2, Acct3.

Recommendation Engine

If              Then
Account Type A  Visit/Collections -> Close
Account Type B  Notice -> Phone call -> Close
Account Type C  Notice -> Close

But, in reality...

Best Contact Resolution not yet feasible:

- Actions made to the account are not separable:
  - 1st notice sent on establishment of liability
  - Phone call after another 15 days
  - Sent to service center 54 days after 1st notice
  - Garnishment/lien may be made
  - What if notice rec'd and payment sent day 40, but not rec'd by Fla until day 45, after phone call placed?
  - A phone call placed to account == a phone call rec'd from account w/promise to pay
- Not all actions made on the account are recorded:
  - Virtual agent campaigns (e.g., Mosaix recorded msg) not recorded

Instead...

Model time to account closure (X days), broken into the following groups:

- X < 30
- 30 < X < 60
- X > 60

Assumptions:

- X < 30: Case will have entailed minimal contact
- 30 < X < 60: Notice and/or phone call or automated message
- X > 60 (and bill exceeds $250 threshold): handled by field service ctr

Why?

- These time-to-closure groupings provide a reasonable proxy for the type of contact that resulted in closure
- The modeling and prediction of an account's time-to-closure could provide such business rules as:
  - If account is predicted as X < 30, consider not adding case to call queue for an additional period
  - If account is predicted as X > 60, refer case directly to field service center
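The grouping and the two example business rules above can be sketched in a few lines. This is a minimal illustration, not the deployed logic: the function names and action labels are hypothetical, the numbering of the groups as categories 1, 2, 3 is inferred from later slides, and the handling of the boundary days (exactly 30 or 60) and the use of the $250 threshold inside the referral rule are assumptions the slides leave open.

```python
def closure_category(days_to_close: int) -> int:
    """Map time to closure (X days) onto the three groups.

    Assumption: day 30 and day 60 fall into the middle group; the slides
    do not say where the boundary days belong.
    """
    if days_to_close < 30:
        return 1  # X < 30: minimal contact
    if days_to_close <= 60:
        return 2  # 30 < X < 60: notice and/or phone call
    return 3      # X > 60: field service center


def recommend_action(predicted_category: int, bill_amount: float,
                     field_threshold: float = 250.0) -> str:
    """Turn a predicted category into one of the example business rules."""
    if predicted_category == 1:
        # Likely to close quickly: hold the case out of the call queue.
        return "defer_call_queue"
    if predicted_category == 3 and bill_amount > field_threshold:
        # Likely to run long and large enough for field handling.
        return "refer_to_field_service_center"
    return "standard_process"


print(closure_category(45), recommend_action(3, 900.0))
```

Keeping the category mapping separate from the action rule means the thresholds can be revised without retraining whatever model produces the predicted category.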

Why Data Mining?

- Needed to model/predict the time-to-closure category
  - As opposed to query/OLAP/report snapshots
- Lots of legacy data to train the model (account characteristics and outcomes)
- Ability to scale procedures against large volumes of data
- Needed flexibility in types of data that could be modeled
  - As opposed to traditional statistical procedures

Why Data Mining (cont)?

- In training the model, needed to minimize the probability of especially bad predictions
  - Predicting 30 < X < 60 for a case that would actually close in X < 30 isn't as bad as predicting X > 60 for that same case
- Needed to understand the model: why certain types of cases were predicted to close at X > 60
  - As opposed to an opaque black-box modeling methodology
- Chose the Rule-Induction data mining procedure

Project Approach Methodology

The Data Mining Project followed the CRISP-DM Methodology.

Figure: Predictive Evolution (Business Value vs. Time). Stages shown: Query and Report; OLAP; Data Mining and Forecasting; Real-Time Information Distribution. Example questions shown: Historical: How many customers do we lose each month? Historical: Which cities did they live in? Predictive: Which ones are at risk of leaving? Predictive/Proactive: What should we offer this customer today?

CRISP-DM Approach: Benefits Provided

- Standard, proven process to guide data mining efforts
- Maximizes return on investment in data mining tools and processes
- Iterative process that incorporates business expertise and understanding as a key guide to analyses

Cross Industry Standard Process for Data Mining: CRISP Data Mining Methodology

- Developed by SPSS, NCR, Daimler-Chrysler, and OHRA in 1996
- Time tested and used worldwide
- Flexible and adaptable methodology
- Six Cyclic Stages:
  - Business Understanding
  - Data Understanding
  - Data Preparation
  - Modeling
  - Evaluation
  - Deployment

CRISP-DM: Project Approach

Project Goal: Develop a data model that will predict the time required for an account to close, for both bills and delinquencies.

Steps:

- Step 1: Business Understanding
- Step 2: Data Understanding
- Step 3: Data Preparation
- Step 4: Modeling
- Step 5: Evaluation
- Step 6: Deployment

Objectives:

- Goals definition
- Project objectives
- Gain buy-in
- Determine status of data
- Conduct data collection process
- Prepare data for detailed analysis
- Determine missing data
- Model data to yield cross-sell insights
- Validate process and results with business goals
- Implement models and processes

Activities:

- Define project goals
- Conduct interviews with key staff to define analytic and reporting processes
- Assess current processes
- Define success criteria
- Determine deployment method
- Collect data
- Data quality check
- Upload data into Clementine
- Select fields to be used in analyses
- Clean data
- Transform and derive calculated fields as required
- Conduct various modeling procedures on data
- Identify and implement highest-value modeling method
- Model data
- Revisit original business objectives
- Validate process and results with business goals
- Review results with Client and make any necessary modifications prior to delivery
- Plan and structure processes for deployment of model

Deliverables:

- Interviews
- Definition of project goals
- Success criteria
- Data audit report
- Finished dataset to be used for analysis
- Documented analytical process as performed
- Analytical results tied to business objectives
- Additional input needed for Go/No Go decision
- Data quality improvement recommendations
- Demonstration of models

First Round of Models

Data Preparation Steps

- Take time to group SIC (first 2 digits) into meaningful categories
- Create time history for AGE and CASE_AGE
- Do not yet include time histories for other fields, such as contacts, bankrupt, etc.

Modeling Steps

- Create decision trees and neural networks using available fields
- Used balanced samples for training the neural networks
- Select models that do the best job:
  - Predicting outcomes
  - Minimizing confusion between categories 1 and 3

Data Sources

Sample table columns: COUNTY, ACCOUNT, APP_PERIOD, CREA_DATE1, STAT_DATE1, AGE, CREA_DATE2, STAT_DATE2, CONTACTS, RECNO, CASE_AGE

Types of Features

- Create Time-Based Features
  - AGE features
    - Last AGE value
    - Maximum AGE
    - Average AGE for all modules, last 3 modules, last 5 modules, etc.
  - CASE_AGE features
    - Same kinds of features as AGE: last, max, average
  - Contacts
- Reduce large numbers of categories down to a smaller (more manageable) number
  - Ex: County, ORG_CODE, SIC, KIND_CODE, STAT_CODE
  - Reason: reduce redundant information, speed up modeling
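The time-based features listed above (last, maximum, and windowed averages) could be derived along these lines. This is a sketch under assumptions: the function name and output key names are hypothetical, and the input is taken to be one account's values in chronological order, oldest first.

```python
def time_features(values, prefix):
    """Summarize one account's history of a field such as AGE or CASE_AGE.

    `values` is assumed to be ordered oldest to newest, one entry per module.
    """
    if not values:
        raise ValueError("at least one historical value is required")

    def mean(xs):
        return sum(xs) / len(xs)

    return {
        f"last_{prefix}": values[-1],
        f"max_{prefix}": max(values),
        f"avg_{prefix}_all": mean(values),
        # Slicing handles histories shorter than the window gracefully.
        f"avg_{prefix}_last3": mean(values[-3:]),
        f"avg_{prefix}_last5": mean(values[-5:]),
    }


print(time_features([10, 20, 30, 40], "age"))
```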

Data Preprocessing Stream

SIC 2-Digit Features

- Group SIC 2-digit values:
  - Functionally (SIC 1-digit)
  - By SICs with similar distributions of AGE categories
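A small sketch of the SIC grouping step follows. The helper names are hypothetical, and zero-padding short codes to four digits is an assumption about how SIC values are stored. The group labels follow the naming that appears in the rules later in the deck (for example '00_41_82_86'), built by joining the member 2-digit codes with underscores; which codes belong together would come from comparing their AGE category distributions, which is not shown here.

```python
def sic_2digit(sic_code) -> str:
    """Return the first two digits of a SIC code (assumed 4 digits, zero-padded)."""
    return str(sic_code).strip().zfill(4)[:2]


def sic_1digit(sic_code) -> str:
    """Return the first digit, a coarse functional grouping."""
    return sic_2digit(sic_code)[:1]


def build_group_lookup(groups):
    """Map each 2-digit code to a label naming every code in its group."""
    lookup = {}
    for codes in groups:
        label = "_".join(sorted(codes))
        for code in codes:
            lookup[code] = label
    return lookup


lookup = build_group_lookup([["41", "00", "86", "82"], ["13"]])
print(sic_2digit("5812"), sic_1digit("5812"), lookup["82"])
```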

Age Category Distributions

- Split sample data into training and testing subsets
  - Training for creating model
  - Testing for assessing model performance

Balancing Proportions of AGE Categories
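The split and balancing steps might look like the following. This is an illustrative sketch only: the slides do not say how balancing was done, so downsampling every category to the size of the smallest one is an assumption, as are the function names, the 30% test fraction, and the `Age_cat` key used in the example.

```python
import random
from collections import defaultdict


def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle records and split them into (training, testing) subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def balance_by_category(records, key, seed=0):
    """Downsample each category to the size of the smallest category."""
    if not records:
        return []
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for record in records:
        by_category[record[key]].append(record)
    smallest = min(len(group) for group in by_category.values())
    balanced = []
    for category in sorted(by_category):
        balanced.extend(rng.sample(by_category[category], smallest))
    return balanced


records = [{"Age_cat": c} for c in [1] * 6 + [2] * 3 + [3] * 1]
train, test = split_train_test(records)
print(len(train), len(test), len(balance_by_category(records, "Age_cat")))
```

Balancing would normally be applied to the training subset only, so that the testing subset keeps the natural category proportions.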

Template Modeling Stream

- Standard modeling stream:
  - Load data
  - Create models
  - Assess results for training subset and testing subset

AGE Neural Network Model Parameters and Results

- Sometimes the direct path to a model doesn't work well.
- Create a model that predicts AGE, and use this model as input to the AGE_cat model (actually, created a model that predicted LOG10(AGE))
- Make sure no fields are allowed in the AGE model that cannot be included in the AGE_cat model
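The two-stage idea can be outlined without committing to a particular modeling tool. In this sketch the helper names are hypothetical, and storing the first-stage prediction back in days under the name AGE_pred is an assumption (the slides name the field but not its units). The field check mirrors the last bullet: any input used by the AGE model must also be permitted in the AGE_cat model.

```python
import math


def log10_age(age_days: float) -> float:
    """Target transform for the first-stage model: LOG10(AGE)."""
    if age_days <= 0:
        raise ValueError("AGE must be positive to take a logarithm")
    return math.log10(age_days)


def disallowed_fields(age_model_fields, agecat_model_fields):
    """Fields used by the AGE model that the AGE_cat model may not use."""
    return sorted(set(age_model_fields) - set(agecat_model_fields))


def add_age_pred(record: dict, predicted_log10_age: float) -> dict:
    """Attach the first-stage prediction as an input for the AGE_cat model."""
    enriched = dict(record)
    enriched["AGE_pred"] = 10 ** predicted_log10_age  # back to days (assumed)
    return enriched


print(disallowed_fields(["COUNTY", "AGE"], ["COUNTY", "SIC"]))
print(add_age_pred({"COUNTY": 5}, 2.0))
```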

Neural Network Accuracy Predicting Age

- AGE model predicts AGE values with 69% correlation.
- A scatter plot shows predictions vs. actual AGE values.
- This doesn't have to be perfect to provide good information for the AGE_cat models

Rule Induction Key Features

- Model output is intuitive, in the form of either decision trees or rulesets
- Flexibility in types of data
- Can ransack a dataset to identify key data features
- The resultant model will utilize relevant fields, and ignore others
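A correlation figure like the 69% quoted above is typically a Pearson correlation between predicted and actual values; the slides do not say which statistic was used, so treat that as an assumption. A minimal pure-Python version, with a hypothetical function name:

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation between predicted and actual values."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_a)


print(pearson_r([12, 30, 45, 70], [10, 35, 40, 90]))
```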

Build the Model

Training Data

Cust  Risk  Income  Job  Debt
...   Good  50k     6    low
...   Bad   60k     3    high
...   Good  41k     7    low

Decision trees:

- income < $40K
  - job > 5 yrs then good risk
  - job < 5 yrs then bad risk
- income > $40K
  - high debt then bad risk
  - low debt then good risk

or Rule Sets:

- Rule #1 for good risk: if income > $40K, if low debt
- Rule #2 for good risk: if income < $40K, if job > 5 years

Test the Model

Testing Data

Cust  Risk  Income  Job  Debt
...   Good  50k     ...  low
...   Good  60k     ...  high
...   Bad   41k     ...  low

The same two rules for good risk are applied to the testing data.
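The toy decision tree and the equivalent rule set from this slide translate directly into code. The function names are hypothetical. The slide does not say what happens at exactly $40K income or exactly 5 years on the job, so the tie handling here is an assumption, as is treating "no good-risk rule fired" as bad risk.

```python
def tree_risk(income: float, job_years: float, debt: str) -> str:
    """Decision-tree form of the example."""
    if income < 40_000:
        # Assumption: exactly 5 years falls on the "bad" side.
        return "good" if job_years > 5 else "bad"
    # Assumption: exactly $40K is treated like income > $40K.
    return "bad" if debt == "high" else "good"


def ruleset_risk(income: float, job_years: float, debt: str) -> str:
    """Rule-set form: two rules for good risk, with bad risk as the default."""
    if income >= 40_000 and debt == "low":   # Rule #1 for good risk
        return "good"
    if income < 40_000 and job_years > 5:    # Rule #2 for good risk
        return "good"
    return "bad"


# Rows from the training table: (risk, income, job years, debt).
training = [("good", 50_000, 6, "low"),
            ("bad", 60_000, 3, "high"),
            ("good", 41_000, 7, "low")]
for risk, income, job, debt in training:
    print(risk, tree_risk(income, job, debt), ruleset_risk(income, job, debt))
```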

Some Model Misses More Critical than Others...

Figure: matrix of Modeled Outcomes (Good, Amb, Bad) against Actual Outcomes (Good, Amb, Bad).

Changing Where the Errors Occur

- Change misclassification costs to change where errors occur.
- To ensure that category 3 records are classified correctly, change how the decision tree views errors on records with category 3.
- In this example, the classifier has 84.8% accuracy on testing data for category 3.
- However, many category 1 and 2 records are also incorrectly called category 3 (false alarms).

Figure: results with no misclassification costs.
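One way to see how misclassification costs move errors around is a minimum-expected-cost decision applied to class probabilities. Note that this is a post-hoc decision rule used here for illustration, not the way the decision tree in the slides applies costs during training; the function name, probabilities, and cost values are all made up.

```python
def min_cost_class(probabilities, costs):
    """Pick the prediction with the lowest expected misclassification cost.

    probabilities: {category: probability that the record is that category}
    costs: costs[actual][predicted], with 0 on the diagonal
    """
    best_class, best_cost = None, float("inf")
    for predicted in sorted(probabilities):
        expected = sum(probabilities[actual] * costs[actual][predicted]
                       for actual in probabilities)
        if expected < best_cost:
            best_class, best_cost = predicted, expected
    return best_class


probs = {1: 0.5, 2: 0.3, 3: 0.2}
uniform = {a: {p: 0 if a == p else 1 for p in (1, 2, 3)} for a in (1, 2, 3)}
# Make it five times as costly to miss a true category 3 record.
weighted = {a: dict(row) for a, row in uniform.items()}
weighted[3][1] = weighted[3][2] = 5

print(min_cost_class(probs, uniform), min_cost_class(probs, weighted))
```

With uniform costs the most probable category wins; raising the cost of missing category 3 shifts the same record to category 3, which is also how the extra false alarms arise.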

Decision Tree Accuracy on Testing Data

Results for output field Age_cat, comparing $C-Age_cat with Age_cat:

- Correct: 60.15%
- Wrong: 39.85%

Figure: Coincidence Matrix for $C-Age_cat (Actual vs. Predicted).

Key Variables in AGE_cat Decision Tree Model

- Decision tree rules for best tree.
- This is actually the third boost from a series of decision trees
- AGE_pred is first split
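The quantities on this slide (overall accuracy, per-category accuracy, and false alarms) all come from a coincidence matrix of actual against predicted categories. A small sketch with hypothetical function names and made-up data:

```python
from collections import Counter


def coincidence_matrix(actual, predicted):
    """Count records for every (actual, predicted) pair."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of records whose predicted category equals the actual one."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())


def category_accuracy(matrix, category):
    """Share of records of one actual category that were predicted correctly."""
    total = sum(n for (a, _), n in matrix.items() if a == category)
    return matrix[(category, category)] / total


def false_alarms(matrix, category):
    """Records predicted as `category` that actually belong elsewhere."""
    return sum(n for (a, p), n in matrix.items() if p == category and a != category)


m = coincidence_matrix([1, 1, 2, 3, 3], [1, 3, 2, 3, 3])
print(accuracy(m), category_accuracy(m, 3), false_alarms(m, 3))
```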

Some Interesting Rules

Rule #1 for 3:
  if WAR_FLAG == Y
  then -> 3 (1019.0, 0.777)

Rule #6 for 1:
  if WAR_FLAG == N and field50 =< 1 and last_caseage_know =< 31 and TAX_STATUS == 1
  then -> 1 ( , 0.605)

Rule #7 for 1:
  if WAR_FLAG == N and field50 > 0 and field50 =< 1 and last_caseage_know =< 31 and TAX_STATUS == 1 and last_age_know =< 27 and ORG_CODE == [ ]
  then -> 1 (49.0, 0.694)

Rule #8 for 1:
  if WAR_FLAG == N and field50 > 0 and field50 =< 1 and last_caseage_know =< 31 and TAX_STATUS == 1 and last_age_know =< 27 and ORG_CODE == 11 and SIC_2_groups == ['00_41_82_86' '01_15_42_53_84_91' '02_32_67' '07_25_30_48_56_75' '09_38_63_64_93' '10_29_31_34_45' '13' '14_23_78' '16_24_37_49_60' '17_52_54' '20_33_89' '22_43_61' '27_39' '28_50' '35_72_81_99' '36_47_51_55_58_79' '57_59' '65_70' '73_76_80' 100]
  then -> 1 (332.0, 0.651)

Rule #26 for 3:
  if WAR_FLAG == N and field50 > 3 and field50 =< 6 and last_caseage_know > 28 and last_contacts_know =< and module_count > 11 and COUNTY =< 54
  then -> 3 (83.0, 0.687)

Rule #27 for 3:
  if WAR_FLAG == N and field50 > 6 and last_caseage_know > 28 and last_contacts_know =< and max_known_age =< 44 and ORG_CODE == [ ]
  then -> 3 (228.0, 0.684)

Next Steps

- Monitor performance of current models: test model output on actual cases
- Address data issues
- Build sufficient cases with reengineered data
- Re-attempt Best Contact mapping
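Rules like these can be applied to an account record as plain predicates. The sketch below encodes only Rule #1 for 3 and Rule #6 for 1, whose conditions are complete in the listing. Several things are assumptions: the second number in each rule's parentheses is read as a confidence, conflicts are resolved by taking the matching rule with the highest confidence, TAX_STATUS is stored as an integer, and the function and variable names are hypothetical.

```python
RULES = [
    {"name": "Rule #1 for 3", "category": 3, "confidence": 0.777,
     "test": lambda r: r["WAR_FLAG"] == "Y"},
    {"name": "Rule #6 for 1", "category": 1, "confidence": 0.605,
     "test": lambda r: (r["WAR_FLAG"] == "N"
                        and r["field50"] <= 1
                        and r["last_caseage_know"] <= 31
                        and r["TAX_STATUS"] == 1)},
]


def predict_category(record, rules=RULES, default=None):
    """Return the category of the highest-confidence rule that matches."""
    matches = [rule for rule in rules if rule["test"](record)]
    if not matches:
        return default
    return max(matches, key=lambda rule: rule["confidence"])["category"]


account = {"WAR_FLAG": "N", "field50": 1, "last_caseage_know": 20, "TAX_STATUS": 1}
print(predict_category(account))
```

Keeping each rule as data (name, category, confidence, predicate) makes it straightforward to log which rule fired when monitoring model output on actual cases.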

CRISP - DM. Data Mining Process. Process Standardization. Why Should There be a Standard Process? Cross-Industry Standard Process for Data Mining

CRISP - DM. Data Mining Process. Process Standardization. Why Should There be a Standard Process? Cross-Industry Standard Process for Data Mining Mining Process CRISP - DM Cross-Industry Standard Process for Mining (CRISP-DM) European Community funded effort to develop framework for data mining tasks Goals: Cross-Industry Standard Process for Mining

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

CS590D: Data Mining Chris Clifton

CS590D: Data Mining Chris Clifton CS590D: Data Mining Chris Clifton March 10, 2004 Data Mining Process Reminder: Midterm tonight, 19:00-20:30, CS G066. Open book/notes. Thanks to Laura Squier, SPSS for some of the material used How to

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product

More information

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Using Data Mining to Detect Insurance Fraud

Using Data Mining to Detect Insurance Fraud IBM SPSS Modeler Using Data Mining to Detect Insurance Fraud Improve accuracy and minimize loss Highlights: combines powerful analytical techniques with existing fraud detection and prevention efforts

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

Using Data Mining to Detect Insurance Fraud

Using Data Mining to Detect Insurance Fraud IBM SPSS Modeler Using Data Mining to Detect Insurance Fraud Improve accuracy and minimize loss Highlights: Combine powerful analytical techniques with existing fraud detection and prevention efforts Build

More information

Data Analysis. Management Information Systems 13

Data Analysis. Management Information Systems 13 Data Analysis Management Information Systems 13 166137-01+02 Management Information Systems Spring 2014 Sync Sangwon Lee, Ph. D D. of Information & Electronic Commerce WONKWANG University Prof. Dr. SSL

More information

Predicting earning potential on Adult Dataset

Predicting earning potential on Adult Dataset MSc in Computing, Business Intelligence and Data Mining stream. Business Intelligence and Data Mining Applications Project Report. Predicting earning potential on Adult Dataset Submitted by: xxxxxxx Supervisor:

More information

SPSS Data Mining Tips

SPSS Data Mining Tips SPSS Data Mining Tips A handy guide to help you save time and money as you plan and execute your data mining projects www.spss.com Table of contents Introduction...........................2 What is data

More information

CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts.

CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts. CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts. As a methodology, it includes descriptions of the typical phases

More information

Chapter 7: Data Mining

Chapter 7: Data Mining Chapter 7: Data Mining Overview Topics discussed: The Need for Data Mining and Business Value The Data Mining Process: Define Business Objectives Get Raw Data Identify Relevant Predictive Variables Gain

More information

An Introduction to Advanced Analytics and Data Mining

An Introduction to Advanced Analytics and Data Mining An Introduction to Advanced Analytics and Data Mining Dr Barry Leventhal Henry Stewart Briefing on Marketing Analytics 19 th November 2010 Agenda What are Advanced Analytics and Data Mining? The toolkit

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK

How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK Agenda Analytics why now? The process around data and text mining Case Studies The Value of Information

More information

Improving tax administration with data mining

Improving tax administration with data mining Executive report Improving tax administration with data mining Daniele Micci-Barreca, PhD, and Satheesh Ramachandran, PhD Elite Analytics, LLC Table of contents Introduction..............................................................2

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

Sun Bear Marketing Automation Software

Sun Bear Marketing Automation Software Sun Bear Marketing Automation Software Provide your marketing and sales groups with a single, integrated, web based platform that allows them to easily automate and manage marketing database, campaign,

More information

Prerequisites. Course Outline

Prerequisites. Course Outline MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,

More information

Behavioral Segmentation

Behavioral Segmentation Behavioral Segmentation TM Contents 1. The Importance of Segmentation in Contemporary Marketing... 2 2. Traditional Methods of Segmentation and their Limitations... 2 2.1 Lack of Homogeneity... 3 2.2 Determining

More information

Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Last Modified on January 22, 2007 Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

More information

Easily Identify the Right Customers

Easily Identify the Right Customers PASW Direct Marketing 18 Specifications Easily Identify the Right Customers You want your marketing programs to be as profitable as possible, and gaining insight into the information contained in your

More information

Environmental Remote Sensing GEOG 2021

Environmental Remote Sensing GEOG 2021 Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Five predictive imperatives for maximizing customer value

Five predictive imperatives for maximizing customer value Five predictive imperatives for maximizing customer value Applying predictive analytics to enhance customer relationship management Contents: 1 Introduction 4 The five predictive imperatives 13 Products

More information

Data Mining Techniques

Data Mining Techniques 15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

More information

Afni deploys predictive analytics to drive milliondollar financial benefits

Afni deploys predictive analytics to drive milliondollar financial benefits Afni deploys predictive analytics to drive milliondollar financial benefits Using a smarter approach to debt recovery to identify the best payers and focus collection efforts Overview The need Afni wanted

More information

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

More information

The Keys to Successful Service Level Agreements Effectively Meeting Enterprise Demands

The Keys to Successful Service Level Agreements Effectively Meeting Enterprise Demands A P P L I C A T I O N S A WHITE PAPER SERIES SYNTEL, A U.S.-BASED IT SERVICE PROVIDER WITH AN EXTENSIVE GLOBAL DELIVERY SERVICE, SUGGESTS SPECIFIC BEST PRACTICES FOR REDUCING COSTS AND IMPROVING BUSINESS

More information

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots? Class 1 Data Mining Data Mining and Artificial Intelligence We are in the 21 st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

Planning successful data mining projects

Planning successful data mining projects IBM SPSS Modeler Planning successful data mining projects A practical, three-step guide to planning your first data mining project and selling it internally Contents: 1 Executive summary 2 One: Start with

More information

WHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk

WHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk WHITEPAPER Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk Overview Angoss is helping its clients achieve significant revenue growth and measurable return

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA

Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA ABSTRACT Current trends in data mining allow the business community to take advantage of

More information

What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM

What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM Relationship Management Analytics What is Relationship Management? CRM is a strategy which utilises a combination of Week 13: Summary information technology policies processes, employees to develop profitable

More information

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

More information

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

More information

IBM SPSS Direct Marketing

IBM SPSS Direct Marketing IBM Software IBM SPSS Statistics 19 IBM SPSS Direct Marketing Understand your customers and improve marketing campaigns Highlights With IBM SPSS Direct Marketing, you can: Understand your customers in

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

BUSINESSOBJECTS PREDICTIVE WORKBENCH XI 3.0

BUSINESSOBJECTS PREDICTIVE WORKBENCH XI 3.0 PRODUCTS BUSINESSOBJECTS PREDICTIVE WORKBENCH XI 3.0 Transform Your Future with Insight Today Key Features As part of the BusinessObjects XI platform, BusinessObjects Predictive Workbench: Provides robust

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

Numerical Algorithms Group

Numerical Algorithms Group Title: Summary: Using the Component Approach to Craft Customized Data Mining Solutions One definition of data mining is the non-trivial extraction of implicit, previously unknown and potentially useful

More information

Using predictive analytics to maximise the value of charity donors

Using predictive analytics to maximise the value of charity donors Using predictive analytics to maximise the value of charity donors Jarlath Quinn Analytics Consultant Rachel Clinton Business Development www.sv-europe.com FAQs Is this session being recorded? Yes Can

More information

Mitel Professional Services Catalog for Contact Center JULY 2015 SWEDEN, DENMARK, FINLAND AND BALTICS RELEASE 1.0
