GETTING AHEAD OF THE COMPETITION WITH DATA MINING

Ultimately, data mining boils down to continually finding new ways to be more profitable, which in today's competitive world means making better and more accurate decisions faster than your competition. Data mining has a great affinity with the Six Sigma mindset, and the two disciplines work well together. Both start out with an unknown outcome, both have improvement as the end goal, and both use well-established statistical methods. Whilst the end result does not guarantee you will double your sales or halve your production defects, you are very likely to find improvements and gain new insights into your business.

How much these insights are worth varies from industry to industry, but there is always the chance of finding highly valuable nuggets of gold. For example, if you are drilling for oil, knowing where to drill is hugely valuable given the cost of setting up. By contrast, knowing why a product fails quality control, so that the causes can be mitigated, has a benefit that depends on the cost of rectification and the frequency of recurrence, which will vary greatly from business to business.

The fact is that whilst we often use our in-depth business and industry knowledge to generate many great improvement ideas, you would have even more great ideas when presented with trends in your data that you were not aware of. For example, suppose that, as a delivery company, you discovered that accident rates and delivery times were greatly affected by the driver's age and by whether the route was inner city or mostly motorway. Would you not adjust which routes each driver takes? If you discovered that managers and team leaders were off sick a third as often as everybody else, would you change your sickness policy or find new ways to give people more responsibility and empowerment? Chances are you would take some form of action, but you would probably only have taken that action in the light of this new information.
The other side to data mining is accuracy and measurability, the great example being forecasting. This typically means going beyond simple averages, linear regression and moving averages to more complex weighted moving averages, exponential smoothing and time series analysis to forecast sales, staffing, stock levels and so on. These latter methods will often take account of other attributes, clusters and seasonality, and have further inputs, such as the alpha smoothing factor, that can lead to significantly higher accuracy. As the best method often varies from case to case, each can be implemented separately and then statistically compared. This is done using measures like mean square error, mean absolute deviation and tracking signal, so that the most pertinent forecast is used. If you're not already using these methods, how much better would some of your decisions be if your estimates were that bit more accurate, and how much money would that save or make for your business?

What is also useful is that the activity of forecasting often identifies hidden factors affecting your estimates that you were not aware of. For example, you might log competitor sales campaign dates and plot them against your own data. Competitors often stick to a schedule which, once known, can be very valuable to your forecasting.

In customer relationship management it's critical to know your customer. Knowing your customer makes your marketing more effective, reduces churn, increases spend levels and ensures you are giving customers the service they want and expect. It also helps you exceed their expectations and evolve your offering. This is an area where data mining plays a major role and where nearly all the methods and tools play a part.

Usually the first step in understanding your customers is to classify and cluster them. For a retail company, a simple example would be classifying your customer base by age group, disposable income band and primary interest, and then clustering the combinations that naturally stick together. The resulting clusters are often given memorable names like "the silver surfer". Using these clusters makes marketing to each group far easier, helps you focus on the right delivery channel strategy and can greatly increase conversion rates by being more targeted. Whilst this activity is often done manually, it becomes progressively harder as the number of influencing attributes grows. This is also the point at which it becomes a good idea to augment your internal data with external data. Data mining tools, however, will scan all your data and automatically recommend groupings based on reviewing every field.
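As a rough illustration of what "automatically recommending groupings" can look like, here is a minimal sketch using plain k-means. This is an assumption on my part: the tools discussed in this paper may use different clustering algorithms. The customer figures, the choice of two attributes (age and disposable income) and the hand-picked starting centroids are all hypothetical.

```python
import math

def min_max_scale(rows):
    """Rescale every column to the 0..1 range so no attribute dominates."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # nothing moved, so the clusters are stable
        centroids = new_centroids
    return assignment, centroids

# Hypothetical customers: (age, disposable income in thousands per year).
customers = [
    (22, 8), (25, 10), (27, 9),
    (41, 30), (45, 34), (43, 28),
    (66, 18), (70, 20), (68, 16),
]

scaled = min_max_scale(customers)
# Starting centroids picked by hand for repeatability (an assumption;
# real tools usually choose them automatically).
assignment, centroids = kmeans(scaled, [scaled[0], scaled[3], scaled[6]])

for customer, cluster in zip(customers, assignment):
    print(cluster, customer)
```

With more attributes the same loop applies unchanged, which is exactly where doing the grouping by hand stops being practical.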
The nice thing about clusters is that whilst you can't change a customer's gender or address, you can often influence other attributes and help move a customer to a new cluster of higher value. A great example is Amazon Prime. Once a customer joins Prime there is usually a big shift in lifetime value, shopping frequency, basket size, churn, ability to upsell and so on. Knowing the value difference between the two clusters helps you understand the maximum you can afford to spend per customer to move them up to a higher value cluster. However, bear in mind that it is also effective to move customers down to a less profitable cluster if they look likely to leave, especially if you know before they consciously decide to do so.

But just how do you know they are likely to leave? The answer lies in understanding the data of every customer who has left in the past and then using data mining tools to automatically detect all the influencing factors. There are several methods, but the simplest tool will give you a bar graph showing the relative importance of each factor. Once identified, these factors let you filter and identify the customers most likely to leave, so you're able to take proactive action such as getting in touch and offering them a better deal, or a complimentary bottle of wine when things go wrong.

Just as important as knowing your customers is knowing your competitors. Today, with most businesses publishing their prices on the internet and public data available on web traffic, it's easier than ever to profile your competitors and track how they are doing in comparison to your own efforts.

Some answers, however, you will never find because you just don't have the data or it's in a state that can't be used. The unfortunate truth is that, from a Gartner perspective, most businesses have poor data quality, with most floating around levels 1 and 2 of Gartner's 5-level quality framework.
This means your business is most likely to be missing data and to have a significant amount of incorrect data. Whilst poor data quality dilutes your data analysis, it can be addressed with automated data cleansing via ETL, or by excluding the affected data entirely from the models you build. There are also data mining tools for filling in missing data and identifying outliers statistically, but the best solution is to have the right level of governance in place and a solid data strategy.
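To give a feel for what filling in missing data and flagging outliers statistically can mean in practice, here is a minimal sketch. It assumes two simple, common techniques (median imputation and the 1.5 x interquartile range rule) rather than whatever a particular tool does internally, and the order values are made up.

```python
import statistics

def fill_missing(values):
    """Replace missing entries (None) with the median of the known values.
    The median is used because it is barely affected by extreme values."""
    known = [v for v in values if v is not None]
    median = statistics.median(known)
    return [median if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical order values with two gaps and one suspicious entry.
order_values = [52, 48, None, 50, 47, 55, 51, 49, 980, None, 53]

filled = fill_missing(order_values)
suspects = iqr_outliers(filled)

print("filled:  ", filled)
print("suspects:", suspects)
```

Whether a flagged value is then fixed or excluded is a business decision; the code only points at candidates.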

However, the biggest factor is when you don't capture the data or it's just not available. This is where a little creativity comes into play as you start to look at what data you can get, and how likely it is to influence what you are investigating. For example, a business wants to open its next store but needs to pick the best possible location. It's unlikely to already have good sales data on the area unless it sells online. From a data mining perspective, this is where it is useful to know as much as possible from as many angles as possible. Additional data for the location might include population demographics, disposable income, education levels, number of household vehicles, density of competing businesses, university student populations, average house prices and so on. By linking all these factors to your own data and then letting the data mining tools discover the trends, the decision on location becomes more informed and the risk of a poor investment is reduced.

Another technique is to create derived metrics from the data you have in order to predict key supporting factors. Take the example of betting on a greyhound race. If you know that only 5% of dogs that are bumped finish first, or that 38% of the dogs that reach the first corner in the lead go on to win, it soon becomes clear that it's worth knowing the likelihood of these supporting factors. This is where you would create new metrics specifically to predict their likelihood: for example, the average trap the dog starts in, to see if collisions are more likely when in the wrong trap, or the ranked weight of each dog, to see if the lighter dog has the advantage in the initial sprint.

The other methods that support understanding your customer include association analysis, which is commonly used for basket analysis and is great at supporting a higher basket margin. This uses your customers' shopping behaviours to learn what they buy in groups.
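The basket analysis just described can be sketched with three standard measures: support (how often a pair is bought together), confidence (how often the second item appears given the first) and lift (how much more often than chance). This is a minimal illustration limited to pairs of products; the baskets and the 0.3 support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, one set of products per transaction.
baskets = [
    {"printer", "ink", "paper"},
    {"printer", "ink"},
    {"ink", "paper"},
    {"printer", "paper", "cable"},
    {"ink", "paper"},
    {"printer", "ink", "cable"},
]

def pair_rules(baskets, min_support=0.3):
    """Score every product pair that appears in at least min_support of
    baskets, in both directions (lhs -> rhs)."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    # Highest lift first: these are the strongest "bought together" signals.
    return sorted(rules, key=lambda r: r[4], reverse=True)

rules = pair_rules(baskets)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two products are bought together more often than their individual popularity would explain, which is what makes them candidates for a bundle.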
This learning can then be shared with other customers by grouping products together on the shop floor or web page, making it easier for them to buy as a group or bundle deal; it also plants the idea of the possible combination in their minds. Another method is sequence analysis. This is often used to understand the click paths customers take when using a website, or to look at customer purchases over time. Knowing that a customer buys ink 3 months after buying a printer is useful and a great reason to contact the customer.

All these methods are easily accessible from the free Excel data mining add-in and table analysis bars shown below. All you need is to be able to connect to an Analysis Services server.

Whilst the list of applications for data mining is extensive, hopefully you get the idea. Data mining brings to the table some very sophisticated mathematical techniques as well as some quite simple but clever methods. Consider the differences between the Naïve Bayes method, decision trees and the neural net methods. The Bayes method dates back to the work of the Reverend Thomas Bayes, published in 1763, and was based on him using marbles to count events; hence it is quite simple but effective, as shown below in this bike buyer sample dataset.
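To show how little machinery the Naïve Bayes approach needs, here is a minimal sketch of a categorical Naïve Bayes classifier that works purely by counting. The rows are invented for illustration and are not the bike buyer sample dataset; the field names and the add-one (Laplace) smoothing are my own assumptions.

```python
from collections import Counter, defaultdict

def train(rows, target):
    """Count how often each class occurs and how often each attribute
    value occurs within each class."""
    class_counts = Counter(r[target] for r in rows)
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    values = defaultdict(set)              # feature -> distinct values seen
    for r in rows:
        for feature, value in r.items():
            if feature == target:
                continue
            feature_counts[(feature, r[target])][value] += 1
            values[feature].add(value)
    return class_counts, feature_counts, values

def predict(model, record):
    """Return the probability of each class for one record, treating the
    attributes as independent (the 'naive' assumption)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / total  # prior probability of the class
        for feature, value in record.items():
            # Add-one smoothing so unseen combinations never score zero.
            seen = feature_counts[(feature, cls)][value]
            score *= (seen + 1) / (n_cls + len(values[feature]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}

# Hypothetical training rows (not the real sample dataset).
rows = [
    {"age_band": "30-39", "owns_car": "no",  "buys_bike": "yes"},
    {"age_band": "30-39", "owns_car": "no",  "buys_bike": "yes"},
    {"age_band": "30-39", "owns_car": "yes", "buys_bike": "no"},
    {"age_band": "40-49", "owns_car": "yes", "buys_bike": "no"},
    {"age_band": "40-49", "owns_car": "no",  "buys_bike": "yes"},
    {"age_band": "50+",   "owns_car": "yes", "buys_bike": "no"},
    {"age_band": "50+",   "owns_car": "yes", "buys_bike": "no"},
    {"age_band": "50+",   "owns_car": "no",  "buys_bike": "no"},
]

model = train(rows, target="buys_bike")
probs = predict(model, {"age_band": "30-39", "owns_car": "no"})
print(probs)
```

Everything the model knows is held in those counts, which is why the method is fast to build and easy to explain.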

Decision trees, which work by discovering the key pathways through your data, are more sophisticated and visual. They give you the probability of the desired outcomes and colour-grade the pathway through the data, with darker shades indicating higher probability. This is my personal favourite as it can produce some great visual output that's easy to understand. The example below shows the factors affecting a shopper's likelihood of buying a bike. The best pathway shows that customers aged between 36 and 39 who don't own a car have a 40.66% likelihood of buying a bike.

However, the most interesting is probably the Neural Network method, which was discovered as a by-product of mapping out how the human brain works. This method analyses all the possible relationships in your data and then, after combining all the attributes of the first pass, takes a second look at your data in a similar way. This is the equivalent of a chess player thinking two moves ahead; it is best suited to highly complex problems and is not something any report or cube can easily do. Below is the same bike buyer example but predicted using the neural net method, which, as you can see, is much more detailed.

At slicedbread, we use the Microsoft tools embedded in SQL Server and Office to support our clients with their data mining projects. Whilst many statisticians will prefer the more advanced and expensive SAS and SPSS solutions, Microsoft has the advantage of being free as long as you own SQL Server and Excel; it's user friendly, and Excel can handle more complexity than most typical businesses use. However, if you need more advanced capabilities you can always use the Data Mining Extensions (DMX) language in Analysis Services, which has extra support for things like nested tables. Also, if you have SQL Server Enterprise then there are some more advanced algorithms available.

When we work on a data mining project, we follow the life cycle below, which I would recommend if you want to have a go at this yourself.

Define the problem. Whilst you can point the tools at hundreds of fields and ask them to tell you something you don't know, you will get better quality results by focusing on a single goal.

Collect your data. The more attributes you investigate the more data you will need, but a simple snapshot will usually do. The tools will usually use 70% of your data to design the model and the remaining 30% to test it. The model you build will often last some time before needing to be re-evaluated.

Transform and clean your data. Expect this to take the majority of the effort.
The cleaner and more discrete your data the better. As some models work on continuous data such as sales, and others utilise discrete data like sales channel, it's best to take continuous data and create additional discretized groupings in advance so you can have the best of both worlds. A good example would be age bands. Where you have bad data, either exclude it or fix it, but don't leave it in. Because the process is cyclic as you close in on your goal, your first pass of the data is to identify trends, with subsequent passes going deeper into the data as you go for greater accuracy. If you have millions of records, I would recommend pre-aggregating the data into a more summarised form.

Build your model. Using a clean and well-prepared data source makes data mining almost a matter of pointing the tools at the data, selecting your field settings and clicking go. However, there is plenty of opportunity to refine the models as you discover what is correlating with and supporting your goal, and what has no effect and needs removing. In most cases, you will find yourself going back to your data and either creating new derived fields or pulling in brand new fields related to an attribute that you have just discovered is highly influential.

In addition to this approach, there are also some very powerful techniques you can use if you put your data into a pivot table, ideally using PowerPivot. This approach can reproduce some of the same outcomes but with greater control and understanding of how you got to the best solution. To take advantage of this, switch to a percent of row total view and keep pulling in new dimensions to see correlations quickly against your target attributes, which sit on a separate axis. This is sometimes exceptionally useful as it's much quicker than continuously rebuilding your model, and it can really help you focus your efforts.

Apply the model to deliver live data. Sometimes just knowing the answers to questions is enough to go and change your business processes, but there will be times you will want to record the answers in your data warehouse.
This will support automated decision systems, which are particularly useful in call centres, as well as supporting CRM applications; it's also a great way of sharing the knowledge. Whilst you can query the data mining models for predictions on the fly and export the bulk results using DMX, my preferred method is to duplicate the data mining logic in your traditional ETL processes. This is because it's faster, and when you're updating your data every hour, as we do, this is an important factor.

Whilst not every trend or pattern you discover will be useful, the return on your investment should always be positive (Microsoft Press states an average of 150%), especially when using Microsoft tools, and in some cases can be like winning the lottery. If you're not already data mining but can see the benefit, you're halfway there. The biggest misconception in the data mining world is that data mining is the same as data analysis, which every business is doing. This is your opportunity to get one step ahead. Give us a call if you want to chat about your data mining needs or if you would like to challenge us to find the gold in your data for free.

Who are slicedbread?

Consider slicedbread a blend of ideas people, creatives, information architects and technical wizards, who believe that by devising the best strategies and manipulating the best technology, they can deliver unparalleled competitive advantage to clients (and also have a pretty good time whilst doing it). slicedbread build better business apps.

Get in touch 01565 757 832 mail@ @slicedbread_it