Customer and Business Analytics




Customer and Business Analytics
Applied Data Mining for Business Decision Making Using R

Daniel S. Putler
Robert E. Krider

CRC Press, Taylor & Francis Group
Boca Raton London New York
CRC Press is an imprint of the Taylor & Francis Group, an informa business
A CHAPMAN & HALL BOOK

Contents

List of Figures
List of Tables
Preface

I Purpose and Process

1 Database Marketing and Data Mining
  1.1 Database Marketing
    1.1.1 Common Database Marketing Applications
    1.1.2 Obstacles to Implementing a Database Marketing Program
    1.1.3 Who Stands to Benefit the Most from the Use of Database Marketing?
  1.2 Data Mining
    1.2.1 Two Definitions of Data Mining
    1.2.2 Classes of Data Mining Methods
      1.2.2.1 Grouping Methods
      1.2.2.2 Predictive Modeling Methods
  1.3 Linking Methods to Marketing Applications

2 A Process Model for Data Mining: CRISP-DM
  2.1 History and Background
  2.2 The Basic Structure of CRISP-DM
    2.2.1 CRISP-DM Phases
    2.2.2 The Process Model within a Phase
    2.2.3 The CRISP-DM Phases in More Detail
      2.2.3.1 Business Understanding
      2.2.3.2 Data Understanding
      2.2.3.3 Data Preparation
      2.2.3.4 Modeling
      2.2.3.5 Evaluation
      2.2.3.6 Deployment
    2.2.4 The Typical Allocation of Effort across Project Phases

II Predictive Modeling Tools

3 Basic Tools for Understanding Data
  3.1 Measurement Scales
  3.2 Software Tools
    3.2.1 Getting R
    3.2.2 Installing R on Windows
    3.2.3 Installing R on OS X
    3.2.4 Installing the RcmdrPlugin.BCA Package and Its Dependencies
  3.3 Reading Data into R Tutorial
  3.4 Creating Simple Summary Statistics Tutorial
  3.5 Frequency Distributions and Histograms Tutorial
  3.6 Contingency Tables Tutorial

4 Multiple Linear Regression
  4.1 Jargon Clarification
  4.2 Graphical and Algebraic Representation of the Single Predictor Problem
    4.2.1 The Probability of a Relationship between the Variables
    4.2.2 Outliers
  4.3 Multiple Regression
    4.3.1 Categorical Predictors
    4.3.2 Nonlinear Relationships and Variable Transformations
    4.3.3 Too Many Predictor Variables: Overfitting and Adjusted R²
  4.4 Summary
  4.5 Data Visualization and Linear Regression Tutorial

5 Logistic Regression
  5.1 A Graphical Illustration of the Problem
  5.2 The Generalized Linear Model
  5.3 Logistic Regression Details
  5.4 Logistic Regression Tutorial
    5.4.1 Highly Targeted Database Marketing
    5.4.2 Oversampling
    5.4.3 Overfitting and Model Validation

6 Lift Charts
  6.1 Constructing Lift Charts
    6.1.1 Predict, Sort, and Compare to Actual Behavior
    6.1.2 Correcting Lift Charts for Oversampling
  6.2 Using Lift Charts
  6.3 Lift Chart Tutorial

7 Tree Models
  7.1 The Tree Algorithm
    7.1.1 Calibrating the Tree on an Estimation Sample
    7.1.2 Stopping Rules and Controlling Overfitting
  7.2 Tree Models Tutorial

8 Neural Network Models
  8.1 The Biological Inspiration for Artificial Neural Networks
  8.2 Artificial Neural Networks as Predictive Models
  8.3 Neural Network Models Tutorial

9 Putting It All Together
  9.1 Stepwise Variable Selection
  9.2 The Rapid Model Development Framework
    9.2.1 Up-Selling Using the Wesbrook Database
    9.2.2 Think about the Behavior That You Are Trying to Predict
    9.2.3 Carefully Examine the Variables Contained in the Data Set
    9.2.4 Use Decision Trees and Regression to Find the Important Predictor Variables
    9.2.5 Use a Neural Network to Examine Whether Nonlinear Relationships Are Present
    9.2.6 If There Are Nonlinear Relationships, Use Visualization to Find and Understand Them
  9.3 Applying the Rapid Development Framework Tutorial

III Grouping Methods

10 Ward's Method of Cluster Analysis and Principal Components
  10.1 Summarizing Data Sets
  10.2 Ward's Method of Cluster Analysis
    10.2.1 A Single Variable Example
    10.2.2 Extension to Two or More Variables
  10.3 Principal Components
  10.4 Ward's Method Tutorial

11 K-Centroids Partitioning Cluster Analysis
  11.1 How K-Centroid Clustering Works
    11.1.1 The Basic Algorithm to Find K-Centroids Clusters
    11.1.2 Specific K-Centroid Clustering Algorithms
  11.2 Cluster Types and the Nature of Customer Segments
  11.3 Methods to Assess Cluster Structure
    11.3.1 The Adjusted Rand Index to Assess Cluster Structure Reproducibility
    11.3.2 The Calinski-Harabasz Index to Assess within Cluster Homogeneity and between Cluster Separation
  11.4 K-Centroids Clustering Tutorial

Bibliography
Index