How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage
John Spooner, SAS UK
Agenda: Analytics, why now?; The process around data and text mining; Case studies
The Value of Information: Feb 12, 2007? Feb 19, 2007? Feb 26, 2007?
Why are companies now using Analytics?
Data: ERP, POS, Web, etc.
Skills: a generation of computer-savvy, spreadsheet-trained managers
Business need: differentiated capabilities, optimized processes
Why is analytics important? So organizations can make more money, save more money, and allocate what they have more effectively by getting better answers faster.
Business Intelligence: how many organizations continue to define and deploy BI
Figure: stages plotted against business value: Data Access, Data Management, Reporting / OLAP
Questions answered: How much? How many? What happened?
Progression: Data, Information, Knowledge
Simple yet useful questions
How much did we sell (by month, channel, region)? Which stores sold the most? What is our best-selling product? What is our most profitable product?
Figure: reporting dimensions: Merchandise (SKU, Brand, Product, Subcategory, Category, All Merchandise); Time (Week, Month, Year); Location (Store, Zone, Region, Country, Company)
lead to more compelling questions How much will we sell next month/quarter/year (for each product/store location!)? Why did we sell so much more this month? How do we optimally replenish inventory? Which customers are likely to respond to a mailing and of those who respond, how much are they likely to spend? How do we attract and retain profitable customers? How do we optimally allocate marketing dollars to maximize profits? Where should we locate new stores to maximize revenue?
Beyond Traditional BI to Business Analytics
Figure: stages plotted against business value: Data Access, Data Management, Reporting / OLAP, Forecasting, Predictive Modeling, Optimization
Questions answered: How much? How many? What happened? What will happen next? What's the best that can happen?
Progression: Data, Information, Knowledge, Intelligence
Data Mining Definition: the process of selecting, exploring, and modeling large amounts of data to uncover previously unknown information for a business benefit.
Business Drivers for Data Mining
Customer-focused: lifetime value, profiling/segmentation, retention, acquisition/winback, cross-/up-selling, campaign analysis, channel analysis, channel development, loyalty program analysis
Operations-focused: profitability analysis, pricing, fraud detection, risk assessment, portfolio management, employee turnover, cash management, capacity planning, distribution analysis
All impact the bottom line
Customer Focused Business Questions
Profiling & Segmentation: Who are my customers? Why are customers leaving? Who is going to leave next? Which customers are most profitable? How should my segments be defined?
Cross Selling Opportunities: Which customers are good candidates for cross or up selling activities? Which product combinations and features do customers want?
Target Analysis: Who should I target next? Which customers are more likely to respond? What's the expected response rate? Which communication channel should I use?
Behavioural Modelling: What is the customer potential / lifetime value? Can I customise offerings based on needs, preferences and profitability?
The Analytical Intelligence Cycle: Integration of People, Processes, and Technology
Data Manager: data preparation, deployment services, report administration
Data Miner: exploratory analysis, descriptive segmentation, predictive modeling
Business Manager: manages campaigns, domain expert, evaluates processes & ROI
Cycle steps shown in the figure: Start, Formulate Problem, Accumulate Data, Data Quality Analysis, Transform and Select, Predictive Modeling, Evaluate Model, Deploy Model, Monitor Results
Data Mining Process: Finding the best model
SAMPLE: sampling yes/no
EXPLORE: data visualisation, summary statistics
MODIFY: transformation, outlier elimination
MODEL: tree-based, regression, neural networks, other statistical methods
ASSESS: model evaluation
Data Mining Process: Sample
Purposes: data reduction, validation
Methods: simple random sampling, stratified random sampling, cluster sampling, first N
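As an illustration of the stratified random sampling option named above, here is a minimal sketch using only the Python standard library. The customer records, the `churn` field, and the helper name `stratified_sample` are hypothetical and are not taken from the presentation; the point is only that each stratum is sampled at the same rate, so a rare target such as churn keeps its share in the sample.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every stratum defined by key(record)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for members in strata.values():
        # Keep at least one record per stratum so small groups are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 1,000 customers, of whom 10% are churners.
customers = [{"id": i, "churn": i % 10 == 0} for i in range(1000)]
sample = stratified_sample(customers, key=lambda c: c["churn"], fraction=0.1)
churners_in_sample = sum(c["churn"] for c in sample)
```

With these inputs the sample keeps the same 10% churn rate as the full population, which simple random sampling only achieves on average.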
Data Mining Process: Explore
Simple analyses (e.g. mean, range for churn vs. non-churn)
Visual exploration: histograms, scatter plots, 3D rotating plots
Interactive exploration: colours and shapes
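The "simple analyses" above can be sketched as a grouped summary. The usage figures and the record layout below are invented for illustration only; the sketch just computes the mean and range of one variable separately for churners and non-churners.

```python
from statistics import mean

# Hypothetical records: (monthly usage in minutes, churned?)
records = [(120, False), (300, False), (250, False),
           (40, True), (90, True), (60, True)]

def summarise(values):
    """Return count, mean, and range endpoints for a list of numbers."""
    return {"n": len(values), "mean": mean(values),
            "min": min(values), "max": max(values)}

# Group usage values by churn flag, then summarise each group.
by_group = {}
for usage, churned in records:
    by_group.setdefault(churned, []).append(usage)

summary = {("churn" if flag else "non-churn"): summarise(vals)
           for flag, vals in by_group.items()}
```

A large gap between the two group means is a first hint that the variable may be useful as a model input.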
Data Mining Process: Modify
Create new variables, variable grouping, data transformation, outlier elimination, missing values
Example transformations (figure): f(x), Σ, (x - µ) / σ, e^x, log p
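A minimal sketch of three of the Modify steps listed above: filling missing values, a log transformation, and the standardisation (x - µ) / σ. The spend values are hypothetical, and median imputation is one common choice assumed here, not something the slide prescribes.

```python
import math
from statistics import mean, median, pstdev

def impute_missing(values):
    """Replace None with the median of the observed values (an assumed policy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def standardise(values):
    """Apply z = (x - mu) / sigma using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical customer spend with one missing value and one extreme value.
spend = [10.0, 20.0, None, 30.0, 1000.0]
filled = impute_missing(spend)
logged = [math.log(v) for v in filled]   # log transform compresses the extreme value
z = standardise(logged)
```

Taking the log before standardising reduces the influence of the extreme value, which is often preferable to deleting it outright.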
Data Mining Process: Model
Decision tree: if TIME REMAINING < 5 and TARIFF = FREQUENT then churn score = 0.6 else ...
Regression: logit(churn score) = 0.2*TIME REMAINING + 0.5*AGE + 0.3*(USAGE*GENDER) + ...
Neural network
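The two scoring rules on this slide can be written out as a sketch. The coefficients and the tree rule are the illustrative ones from the slide; the slide leaves the tree's else-branch and the remaining regression terms unspecified, so the code returns `None` for the former and exposes a placeholder argument `other_terms` (an assumption, defaulting to 0) for the latter. The input scaling is also assumed.

```python
import math

def tree_churn_score(time_remaining, tariff):
    """Single rule from the slide; the else-branch is not given there."""
    if time_remaining < 5 and tariff == "FREQUENT":
        return 0.6
    return None  # unspecified on the slide

def inverse_logit(z):
    """Map a logit value back to a score between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def regression_churn_score(time_remaining, age, usage, gender, other_terms=0.0):
    """logit(score) = 0.2*TIME REMAINING + 0.5*AGE + 0.3*(USAGE*GENDER) + other_terms."""
    z = 0.2 * time_remaining + 0.5 * age + 0.3 * (usage * gender) + other_terms
    return inverse_logit(z)

# Hypothetical, already-scaled inputs for one customer.
rule_score = tree_churn_score(time_remaining=3, tariff="FREQUENT")
logit_score = regression_churn_score(time_remaining=2, age=0.5, usage=1.0, gender=1)
```

The tree yields a readable rule with a fixed score per leaf, while the regression produces a continuous score that varies with every input.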
Data Mining Process: Assess
Models should be assessed and compared in terms of:
Accuracy of classification
Ability to identify small groups of customers with a high proportion of target behaviour ("lift"). Cost savings can be derived from this lift value.
[Figure: chart by percentile (0 to 100) comparing Baseline, Tree, and Neural models, with positive and negative ROI regions marked]
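Lift, as described above, can be computed by ranking customers by model score and comparing the rate of target behaviour in the top slice with the overall rate. The scores and outcomes below are made up for illustration, and `lift_at` is a hypothetical helper name.

```python
def lift_at(scores, outcomes, fraction):
    """Lift = target rate in the top `fraction` of scored customers / overall target rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    base_rate = sum(outcomes) / len(outcomes)
    return top_rate / base_rate

# Hypothetical model scores and observed outcomes (1 = churned) for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_top_20 = lift_at(scores, outcomes, 0.2)   # both top-scored customers churned
lift_all = lift_at(scores, outcomes, 1.0)      # whole population, same as baseline
```

Here the top 20% contains churners at roughly 3.3 times the overall rate, while targeting everyone gives a lift of 1, which is the baseline in a lift chart.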
Text Mining: the process of discovering and extracting meaningful patterns and relationships from text collections.
Text Mining = Data Mining + Natural Language Processing
Text Mining Process
S, E (Sample, Explore): reading the text files
M (Modify): text preprocessing (term weighting/rollup), dimension reduction (singular value decomposition)
M, A (Model, Assess): document analysis
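The term weighting step can be sketched with a plain TF-IDF scheme (term count times log of inverse document frequency). This is one common weighting, assumed here for illustration; the slide does not say which weighting is used. The documents are invented, and the singular value decomposition step, which would normally follow using a linear algebra library, is not shown.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf(documents):
    """Return one {term: weight} dict per document, weight = count * log(N / doc_freq)."""
    counts_per_doc = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())   # each term counted once per document
    return [{term: tf * math.log(n_docs / doc_freq[term])
             for term, tf in counts.items()}
            for counts in counts_per_doc]

# Hypothetical free-form comments.
docs = ["brake noise at low speed",
        "brake pedal feels soft",
        "radio display flickers"]
weights = tf_idf(docs)
```

Terms that appear in many documents, such as "brake" here, receive lower weights than terms confined to a single document, such as "radio".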
Improve Performance: The Model Management Challenge
Close the gap between model development and model deployment (ROI!)
Proliferation of data & models; largely manual processes; automate model deployment; integrating with operational systems; increased regulation
Integrated with the Model Development Environment
Model development environment: SAS Enterprise Miner, SAS Credit Scoring, SAS/STAT, Base SAS
Steps shown in the figure: model registration, map to task, champion model selection, score code, model testing, model deployment (interactive, batch, real time), model tracking, model retirement
Environments: development, production
Challenge: To meet many global business challenges and the operational complexity of the industry, the airline needed to change by introducing analytical excellence: business modelling and complex data analysis.
Solution: SAS Data Mining has been applied to measure customer value, to segment customer data, and to predict customer attrition. Recent data mining projects have included Executive Club travel pattern segmentation, in-flight retail, and on-board customer survey monitoring.
Results: BA can now understand its customers' needs better and deliver a superior service, resulting in customers staying more loyal to the brand.
Challenge: The company runs the biggest loyalty programme in the UK, so it was essential that it carry out customer insight, campaign management, opportunity identification, and performance measurement for its sponsors.
Solution: The company uses SAS Data Mining capabilities to segment customers. Sponsors can then select from a wide range of segments at low cost and target them accordingly.
Results: The company can now provide insight to sponsors on loyalty card users, with response rates rising from 2% to 20%.
Challenge: Detect and contain warranty and call center issues before they become widespread.
Solution: Automatic monitoring of free-form text to identify quality and safety issues; automatic alert generation.
Results: Surface previously unknown vehicle issues; fix issues faster; hundreds of millions of dollars saved.