Predictive Analytics: Extracts from Red Olive foundational course




For more details, or to speak about a tailored course for your organisation, please contact:
Jefferson Lynch: jefferson.lynch@red-olive.co.uk, +44 1256 831100
December 2014
Analytics and Data Management

Contents
- What makes a great analysis?
- Measuring relationships between variables
- Profiling
- What is data mining?
- The data mining process
- Data mining techniques
- Discussion: next steps for data mining
- Back-up slides
- Introduction to descriptive statistics
Copyright 2014 Red Olive Ltd, All Rights Reserved.

Some examples

Monitoring Trends
[Figure: Traffic Disruption in London, annotated with the following events]
- Christmas 2011
- Road works ban starts (1st July 2012)
- Winter-time road works / end FY
- Queen's Diamond Jubilee
- London 2012 Olympic and Paralympic Games
Source: Information from Transport for London, Oracle Day presentation, 6 Nov 2012

Geographical Mash-ups
- Visualising: connections between businesses in East London
- Based on: streams of Twitter data, tracking relationships, mentions and retweets
Source: http://www.techcitymap.com/index.html#/

Census Analysis
Census 2011: Explore population changes in your area
- Interactive tool for comparing areas on their 2001 and 2011 demographic profiles
Source: The Telegraph online, http://www.telegraph.co.uk/earth/greenpolitics/population/9403239/Census-2011-Explore-the-population-changes-in-your-area.html
Original source: ONS data visualisation centre, http://www.ons.gov.uk/ons/interactive/index.html

Measuring relationships between variables
- To start making connections, we need to investigate relationships between variables.
- Starting point: relationships between two variables at a time.
- Multivariate techniques allow us to investigate relationships between many variables.
- The appropriate measure of relationship depends on the type of data that you're analysing, primarily whether it is scale (numeric) or nominal (categorical).
Copyright 2012 Red Olive Ltd, All Rights Reserved.

Measures of relationship: scale (numeric) data
Correlation quantifies the linear relationship between variables in scatter plots:
- +1 = exact positive relationship
- 0 = no relationship
- -1 = exact negative relationship
[Figure: example scatter plot for each of the three cases]

Correlation coefficient
- Takes values between -1 and +1.
- The correlation will rarely be exactly 1 or -1, as this would suggest that the variables were exactly dependent on each other.
- Likewise, the correlation is rarely exactly 0, because a slight relationship can occur by chance.
- Correlation measures the extent of a linear relationship, so it needs to be handled with care.
[Figure: four sets of data with the same correlation of 0.816]
For correlation: Excel function CORREL
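
The point about handling correlation with care can be illustrated with a minimal Python sketch. The figure of 0.816 matches the well-known Anscombe's quartet, so the sketch assumes that is what the slide shows and uses the first two of its four data sets: one roughly linear, one clearly curved. The function name `pearson` is illustrative and is not taken from the course material.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# First two data sets of Anscombe's quartet (assumed to be the slide's example)
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_linear = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_curved = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r_linear = pearson(x, y_linear)
r_curved = pearson(x, y_curved)
print(round(r_linear, 3), round(r_curved, 3))
```

Both values round to the same figure even though the second relationship is a curve rather than a line, which is why a scatter plot should be inspected alongside the coefficient.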

What is data mining?

Two main types of data mining model
Type 1: Models driven by a Target Variable
e.g. Which site visitors are likely to subscribe?
- Implies building a Predictive Model
- Directed Data Mining Techniques
Type 2: Models with no Target Variable
e.g. How does the subscriber base segment?
- Implies a Descriptive Model
- Undirected Data Mining Techniques

Gains Chart: Churn Model
Based on a representative evaluation sample.
[Figure: gains chart plotting cumulative % of respondents (y-axis, 0% to 100%) against cumulative % of base (x-axis, 0% to 100%), with three curves: prediction, random and optimal]
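
As a rough illustration of how the three curves on a gains chart could be derived, the sketch below ranks an evaluation sample by model score and accumulates the outcomes band by band. The function name, the band count and the toy scores are all hypothetical; the course material does not say how its chart was produced.

```python
def cumulative_gains(scores, outcomes, bands=10):
    """Return (pct_base, prediction, random, optimal) for each band of the base.

    scores   -- model scores, higher meaning more likely to churn
    outcomes -- 1 if the customer actually churned, else 0
    """
    # Rank customers from highest to lowest score and keep only the outcomes
    ranked = [o for _, o in sorted(zip(scores, outcomes),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    total = sum(ranked)
    rows = []
    for band in range(1, bands + 1):
        cutoff = round(n * band / bands)
        pct_base = 100.0 * cutoff / n
        prediction = 100.0 * sum(ranked[:cutoff]) / total  # model curve
        random_pct = pct_base                              # diagonal baseline
        optimal = min(100.0, 100.0 * cutoff / total)       # perfect ranking
        rows.append((pct_base, prediction, random_pct, optimal))
    return rows

# Hypothetical evaluation sample of ten customers, three of whom churned
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
churned = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

rows = cumulative_gains(scores, churned, bands=5)
for pct_base, prediction, random_pct, optimal in rows:
    print(f"{pct_base:5.0f}% of base: model {prediction:5.1f}%, "
          f"random {random_pct:5.1f}%, optimal {optimal:5.1f}%")
```

The further the prediction curve sits above the random diagonal, and the closer it sits to the optimal curve, the better the model ranks the base.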

Data mining techniques and where they can be applied

Techniques to be discussed
Predictive:
- Forecasting
- Decision trees
- Regression models
Descriptive:
- Factor analysis
- Cluster analysis
- Affinity analysis

Techniques on individual-level data
Data mining methods

Example Decision Tree
- Target variable: Good/Bad credit rating
- Highly significant (figure annotation)
- Best predictor: Income level
- 2nd best predictor: Number of credit cards
- Final predictor: Age
- End nodes: no further splits
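
One way a tree-building procedure might decide which predictor is "best" at a node is to test each candidate against the target and pick the most significant. The sketch below assumes a CHAID-style chi-square test, which the slide does not confirm, and uses a small hypothetical set of records; the field names and values are invented for illustration only.

```python
from collections import Counter

def chi_square(records, predictor, target):
    """Pearson chi-square statistic for the predictor-by-target contingency table."""
    n = len(records)
    cell = Counter((r[predictor], r[target]) for r in records)
    row = Counter(r[predictor] for r in records)
    col = Counter(r[target] for r in records)
    stat = 0.0
    for p in row:
        for t in col:
            expected = row[p] * col[t] / n
            stat += (cell[(p, t)] - expected) ** 2 / expected
    return stat

# Hypothetical records: (income band, number-of-cards band, credit rating)
raw = (
    [("high", "few", "good")] * 3
    + [("high", "many", "good")]
    + [("low", "few", "bad")]
    + [("low", "many", "bad")] * 3
)
records = [{"income": i, "cards": c, "rating": r} for i, c, r in raw]

stats = {p: chi_square(records, p, "rating") for p in ("income", "cards")}
best = max(stats, key=stats.get)
print(stats, "-> split first on", best)
```

In this toy sample income separates good from bad ratings more sharply than the number of cards, so it would be chosen for the first split; the procedure would then repeat within each branch until no further split is significant.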

Regression
[Figure: example regression model]
Source: The Times, 24/11/2012

The affinity tile map
- Strengths of affinities are displayed using a hot-cold colour palette.
- By clicking on a tile, details of the pair of products and their affinity are revealed.
Source: Teradata
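
The slide does not say how the affinity behind each tile is measured. As a minimal sketch, assuming lift is the measure (a common choice in affinity analysis, but an assumption here), pairwise affinities could be computed from basket data as follows. The baskets and product names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_lift(baskets):
    """Lift for every product pair that appears together in at least one basket.

    lift = P(A and B) / (P(A) * P(B)); values above 1 suggest the products
    are bought together more often than chance alone would predict.
    """
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in set(basket))
    pair_count = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(set(basket)), 2)
    )
    return {
        (a, b): (count / n) / ((item_count[a] / n) * (item_count[b] / n))
        for (a, b), count in pair_count.items()
    }

# Hypothetical shopping baskets
baskets = [
    ["bread", "butter"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk"],
]

lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(lift, 2))
```

Each value could then be mapped to a colour on a hot-cold scale, with one tile per product pair.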