Predictive Analytics Workshop using IBM SPSS Modeler IBM Corporation

Predictive Analytics Workshop using IBM SPSS Modeler 2012 IBM Corporation

Objectives Smarter Software for Smarter Cities

Agenda
8:30-8:40 Welcome and Introductions
8:40-8:55 Introduction to Predictive Analytics
8:55-9:05 Exercise: Navigating IBM SPSS Modeler
9:05-9:25 Exercise: Predictive in 20 Minutes
9:25-9:40 Data Mining Methodology and Application
9:40-9:55 Break
9:55-10:55 Exercise: Data Mining Techniques
10:55-11:25 Exercise: Text Analytics
11:25-11:50 Deployment
11:50-12:00 Wrap-up

Purpose of the Workshop
Introduce predictive analytics and data mining
Stimulate thinking about how data mining would benefit your organization
Demonstrate the ease of use of powerful technology
Get experience in doing data mining
See examples of how other organizations are benefiting from deploying predictive analytics

Smarter Planet Instrumented Interconnected Intelligent

What is Predictive Analytics?
"Predictive Analytics helps connect data to effective action by drawing reliable conclusions about current conditions and future events."
Gareth Herschel, Research Director, Gartner Group

Our Ultimate Goal is to Ensure Your Success
Florida Juvenile Justice reduced delinquency in schools by 34%
Madrid reduced emergency response time by 25%
Analytics helped reduce Hamilton County's dropout rate by 25%
Predictive analytics helped slash Memphis's crime rate by 40% in one year
Lancaster uses predictive policing models to reduce crime, saving the city $1.3M a year
Analytics decreased fraud in the Federal government by 50%

Predictive Analytics in Public Sector
Crime Analyses
  Force Deployment
  Lead Generation
  Hot Spot Analyses
Fraud detection and prevention
  Money laundering
  Network intrusion
  Tax audits & collection
Entity Resolution
Text Analytics
Social Network Analysis
Education
  Which among our students are at risk?
  Who are the most promising applicants?
  Which alumni will donate and how much?
  How can my institution plan for future development?
Corrections
  Which inmates are at risk of recidivism?
  Why do some inmates return?
  Which programs are successful?

IBM SPSS Modeler A Quick Overview

IBM SPSS Modeler
High-performance data mining and text analytics workbench
Used for the proactive:
  Identification of fraud, waste and abuse
  Reduction of costs
  Identification of risk
    Students at risk of failing
    Inmates at risk of returning
    Patients at risk of relapse
  Forecasting demographic shifts and migration
Allows analytics to be repeated and integrated

IBM SPSS Modeler

Predictive in 20 Minutes A Quick Exercise

Exercise: Predictive in 20 Minutes
Goal: Create a model to identify who is at risk of a heart attack
Approach: Use patient data containing various health and behavioral information
  Define which fields to use
  Choose the modeling technique
  Automatically generate a model to identify who is at risk
  Review results
Why? For public health policy implications, by proactively identifying and quantifying high-risk behaviors and practices

Break - Please Return in 15 Minutes

IBM SPSS Modeler One Analytical Workbench Endless Techniques

Data Mining Methodology
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Describes the components of a complete data mining project cycle
Shows the iterative nature of data mining
Vendor and industry neutral

Data Mining Techniques
Technique: Classification (or prediction)
  Usage: Used to predict group membership (e.g., will this employee leave?) or a number (e.g., how many widgets will I sell?)
  Algorithms: Auto Classifiers, Decision Trees, Logistic, SVM, Time Series, etc.
Technique: Segmentation
  Usage: Used to classify data points into groups that are internally homogeneous and externally heterogeneous. Identify cases that are unusual
  Algorithms: Auto Clustering, K-means, etc. Anomaly detection
Technique: Association
  Usage: Used to find events that occur together or in a sequence (e.g., market basket)
  Algorithms: APRIORI, Carma, Sequence
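
To make the association row concrete, here is a minimal sketch of market-basket analysis in plain Python. It only counts item pairs and reports their support and confidence; it is a simplification for illustration, not the full Apriori or Carma algorithm and not how SPSS Modeler implements them. The basket contents, the function name pair_rules and the threshold values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with the antecedent that also hold the consequent
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; names break ties so the order is deterministic.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Hypothetical transactions, invented for this example.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data the strongest rule is "butter -> bread": every basket with butter also contains bread. Production algorithms extend the same counting idea to larger item sets and prune candidates that cannot reach the support threshold.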

Additional Data Mining Techniques
Technique: Text Analytics
  Usage: Used to discover patterns resident in text or other unstructured data (e.g., sentiment analysis)
  Algorithms: Natural Language Processing, Parts of Speech Analysis
Technique: Entity Analytics
  Usage: Used to determine which cases are likely the same actor, and which seemingly identical cases are actually independent
  Algorithms: Context Accumulation
Technique: Social Network Analysis
  Usage: Used to uncover associations which may exist between cases, and identify central or influential actors

IBM SPSS Modeler Segmentation Modeling

Segmentation Modeling
Goal: Discover natural groupings or clusters of alumni donors
Approach: Alumni data from a university
  Define which fields to use
  Use K-Means clustering to generate a model to group alumni
  Appendix: Use these clusters to predict donation
Why?
  Better alumni understanding (demographics, socio-economic, etc.)
  Tailored messages for each group/segment
  Personal and more relevant for alumni
  Institutional planning
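
The K-Means step in this exercise can be sketched outside the workbench in a few lines of Python. This is an illustration of the general algorithm (assign each point to its nearest centroid, then move each centroid to the mean of its members), not the SPSS Modeler node. The alumni records, their two fields and the choice to seed the centroids with the first k rows are assumptions made for the example; real tools typically rescale the fields and use random or smarter seeding.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-Means on tuples of numbers; returns (centroids, assignments)."""
    # Assumption for reproducibility: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical alumni: (years since graduation, lifetime giving in $100s).
alumni = [(2, 1), (30, 40), (3, 2), (28, 35), (4, 1), (32, 45)]
centroids, assignments = kmeans(alumni, k=2)
print(assignments)  # cluster index for each alumnus
print(centroids)    # one centroid per cluster
```

Here the two clusters separate recent graduates with small gifts from long-standing donors, which is the kind of grouping the exercise then uses to tailor messages per segment.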

IBM SPSS Modeler Entity Analytics

Entity Analytics
Suppose that you have the following records from two different sources, and are not sure whether they refer to the same person or different people.
Source 1
  Record no.: 70001
  Name: Jon Smith
  Address: 123 Main Street
  Driv. License: 0001133107
Source 2
  Record no.: 9103
  Name: JOHNATHAN Smith
  Date of Birth: 06/17/1934
  Telephone: 555-1212
  Email: jls@mail.com
  IP address: 9.50.18.77
There are no exact matches between the two records. However, if we introduce a third source, we find some common attributes.
Source 3
  Record no.: 6251
  Name: Jon Smith
  Telephone: 555-1212
  Driv. License: 0001133107
Source 3 shares its driver's license number (DL) with Source 1 and its telephone number with Source 2.
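
The linking logic in this example can be sketched with a small union-find structure: records that share an exact value on an identifying field are merged into one entity, and merges chain transitively. This is a deliberately simplified illustration, not IBM's entity analytics engine, which also handles fuzzy name matching and scoring. The dictionary keys ("license", "phone" and so on) and the function name resolve_entities are hypothetical.

```python
def resolve_entities(records, keys):
    """Group record ids that share an exact value on any of the given keys."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    seen = {}  # (key, value) -> id of the first record carrying that value
    for r in records:
        for key in keys:
            value = r.get(key)
            if value is None:
                continue  # a missing field cannot link records
            if (key, value) in seen:
                union(seen[(key, value)], r["id"])
            else:
                seen[(key, value)] = r["id"]

    groups = {}
    for r in records:
        groups.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(group) for group in groups.values())


# The three source records from the slide, with hypothetical field names.
records = [
    {"id": 70001, "name": "Jon Smith", "address": "123 Main Street",
     "license": "0001133107"},
    {"id": 9103, "name": "JOHNATHAN Smith", "dob": "06/17/1934",
     "phone": "555-1212", "email": "jls@mail.com", "ip": "9.50.18.77"},
    {"id": 6251, "name": "Jon Smith", "phone": "555-1212",
     "license": "0001133107"},
]
keys = ("license", "phone")

print(resolve_entities(records[:2], keys))  # two sources only: no link
print(resolve_entities(records, keys))      # third source bridges the other two
```

With only the first two sources the records stay separate; adding the third source places all three in a single group, because it matches one record on license and the other on telephone.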

Entity Analytics
3634 Suspects
Results
Fields Used in EA Resolution    % Missing in PD Database
LAST                            0.7
FIRST                           0.4
MIDDLE                          63.9
RACE                            0.1
SEX                             0.1
DOB                             2.4
ADDR                            0.9
DRLIC                           63.8
PHONE                           59.1

IBM SPSS Modeler Classification Modeling

Classification Model
Goal: Identify students likely to persist
Approach: Use student performance scores and other demographics
  Define which fields to use
  Use the Auto Classifier to choose the appropriate modeling technique
  Review results
Why?
  Identify students likely to persist into their second year
  Conversely, the same methods can be used to identify students at risk of attrition (or prisoners at risk of recidivism, or patients likely to respond to treatment)
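
The idea behind automatic model selection can be sketched as follows: fit several candidate models on training data, score each on held-out data, and rank them. This is only an illustration of that general idea under stated assumptions; it does not reproduce the Auto Classifier node, its candidate algorithms or its ranking criteria. The candidates here (a majority-class baseline and one-field threshold rules), the student records and every name in the code are hypothetical.

```python
def majority_model(train):
    """Baseline: always predict the most common label in the training data."""
    labels = [y for _, y in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


def stump_model(feature):
    """Build a fitter for a one-field threshold rule on the given field index."""
    def fit(train):
        best = None
        for threshold in sorted({x[feature] for x, _ in train}):
            for above in (0, 1):  # label predicted at or above the threshold
                correct = sum(
                    (above if x[feature] >= threshold else 1 - above) == y
                    for x, y in train
                )
                if best is None or correct > best[0]:
                    best = (correct, threshold, above)
        _, threshold, above = best
        return lambda x: above if x[feature] >= threshold else 1 - above
    return fit


def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)


def auto_select(candidates, train, holdout):
    """Fit every candidate and rank by holdout accuracy, best first."""
    scored = [
        (accuracy(fit(train), holdout), name) for name, fit in candidates.items()
    ]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Hypothetical students: ((GPA, credits completed), persisted into year two).
train = [((3.6, 30), 1), ((2.1, 12), 0), ((3.2, 28), 1), ((1.9, 27), 0),
         ((2.8, 15), 1), ((2.0, 29), 0), ((3.4, 14), 1), ((2.3, 26), 0)]
holdout = [((3.0, 13), 1), ((2.2, 30), 0), ((3.5, 29), 1), ((1.8, 11), 0)]

candidates = {
    "majority": majority_model,
    "gpa_stump": stump_model(0),
    "credits_stump": stump_model(1),
}
ranked = auto_select(candidates, train, holdout)
for score, name in ranked:
    print(f"{name}: holdout accuracy {score:.2f}")
```

Scoring on held-out rows rather than the training rows is the important design choice: it rewards the model that generalizes, not the one that memorizes.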

IBM SPSS Modeler Text Analytics

The Importance of Text
"Because people communicate with words, not numbers, it has become critical to be able to mine text for its meaning and to sort, analyse, and understand it in the same way that data has been tamed. In fact, the two basic types of information complement each other, with data supplying the what and text supplying the why."
Source: IDC, Text Analytics: Software's Missing Piece?

Text Analytics Turn unstructured officer notes and narratives into useable and searchable context-rich content with Text Analytics.

Data Mining and Text Analytics
Data Mining
  Use advanced analytical techniques on data
  Discover key relationships between variables
  Model effect of variables on outcomes
  Determine influence on outcomes
  Predict outcomes
  Apply models to new data
Text Analytics
  Extract, analyze and create structure for unstructured data
  Integrate analysis results into operational systems
  Integrate analysis results into Business Intelligence applications
  Integrate analysis results with structured data and use as input for Data Mining
    Improves model accuracy
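
As a rough illustration of "create structure for unstructured data" and "use as input for Data Mining", the sketch below turns a free-text note into 0/1 concept flags and merges them with the structured fields of the same record. A keyword dictionary is far cruder than the linguistic extraction a text analytics product performs, so treat this as a toy under stated assumptions: the concept lists, field names and sample incident are all invented.

```python
import re

# Hypothetical concept dictionary: concept name -> trigger words.
CONCEPTS = {
    "weapon": {"knife", "gun", "firearm"},
    "vehicle": {"car", "truck", "sedan"},
    "burglary": {"burglary", "break-in", "forced"},
}


def extract_concepts(note, concepts=CONCEPTS):
    """Return a 0/1 flag per concept, set when any trigger word appears."""
    # Lowercase words, keeping hyphenated terms such as "break-in" whole.
    tokens = set(re.findall(r"[a-z]+(?:-[a-z]+)*", note.lower()))
    return {name: int(bool(tokens & words)) for name, words in concepts.items()}


def to_model_row(record, note_field="notes"):
    """Replace the free-text field with structured has_<concept> columns."""
    row = {k: v for k, v in record.items() if k != note_field}
    flags = extract_concepts(record[note_field])
    row.update({f"has_{name}": flag for name, flag in flags.items()})
    return row


# Hypothetical incident record with an officer narrative.
incident = {
    "incident_id": 1,
    "hour": 23,
    "notes": "Suspect fled in a white sedan after a forced entry; no weapon seen.",
}
row = to_model_row(incident)
print(row)
```

The resulting row contains only numeric fields, so it can sit alongside other structured inputs to a classification or clustering model.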

Deployment Many Options

Why IBM SPSS?

Workshop Takeaways
Easy to use, visual interface
  Short timeframe to be productive with actionable results
  Does not require knowledge of a programming language
Business results focused
  Cost-effective solution that delivers powerful results across the organization
  Flexible licensing and deployment options
  Full range of algorithms for your business problems
End-to-end solution
  Data preparation through real-time interactions
  Use structured, unstructured and survey data
  Full suite of products, from data collection through deployment
Flexible architecture
  Leverages the investments already made in technology
  Does not require data in a proprietary format or DB
  Structured and unstructured data
  Open architecture (both inputs and outputs)
  SQL Pushback

Nucleus Research: The Real ROI from IBM
94% of clients achieved a positive ROI, with an average payback period of 10.7 months
Key benefits achieved include reduced costs, increased productivity, improved citizen & employee satisfaction and safety
81% of projects deployed on time, 75% on or under budget
"This is one of the highest ROI scores Nucleus has ever seen in its Real ROI series of research reports."
Rebecca Wettemann, Vice President of Research, Nucleus Research

Appendix

Data Mining Overview
From Amazon.com
Paperback: 512 pages
Publisher: Wiley; 1 edition (December 28, 1999)
Language: English
ISBN-10: 0471331236
ISBN-13: 978-0471331230
Good introductory text on data mining for marketing from two top communicators in the field.

Statistical Analysis and Data Mining
Handbook of Statistical Analysis and Data Mining Applications
Robert Nisbet, John Elder IV, and Gary Miner
Academic Press (2009)
ISBN-10: 0123747651
An excellent guide to many aspects of data mining, including text mining.

Data Mining Algorithms
From Amazon.com
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, by Eibe Frank and Ian H. Witten
Paperback, 416 pages (October 13, 1999)
Morgan Kaufmann Publishers; ISBN: 1558605525
Best book I've found in between highly technical and introductory books. Good coverage of topics, especially trees and rules, but no neural networks.