Beginner's Guide: Harnessing Big Data and Internet of Things for Real-World Use Cases
Ashish C. Morzaria, SAP
Legal disclaimer
The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP's strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP's willful misconduct or gross negligence.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
© 2015 SAP AG. All rights reserved.
Agenda
- The WHAT and WHY: Big Data and IoT
  - Current landscape
  - Real life examples
  - How to make sense of it all
- The HOW: Advanced Analytics
  - Introduction to predictive analytics
  - Big Data(set) walkthrough
  - Operationalizing insights in existing tools
- The REST: Hadoop and HANA
  - SAP Predictive Analytics
  - Wrap-up and Q&A
© 2015 SAP SE or an SAP affiliate company. All rights reserved.
The Promise of Big Data
Greater insights, more profit, better efficiency...
- Which customers should we target more effectively?
- Which customers are more likely to leave?
- What equipment is likely to fail if I don't perform maintenance?
- How many people should I staff based on demand?
BUT
- It's difficult to determine which data is valuable
- Data cannot be analyzed raw using existing tools
- Individual pieces of data are less valuable
- Aggregated data is useless and non-aggregated data is huge
What is Big Data?
It's data that is... big, right?
Big Data Is More Than Data That Is Big!
- Large datasets = Big Data? Possibly just Large Data
- Big Data is about handling data very quickly? It is actually about cost and scalability
- Big Data sources will replace traditional databases? Each has its own strengths; don't rip and replace for fun
Common Misconceptions of Big Data
The Evolution of Big Data
Traditional Big Data Process | How It Needs To Be (And Can Be)
Code | Clicks
Developers | Business Analysts
Ad-Hoc | Governed
Manually Deployed & Monitored | Automated
The Internet of Things (IoT)
Internet of Things Definition
Internet of Things (IoT) Trends
- 12-50 bn: devices connected by 2020*
- 40-50%: CAGR for M2M market until 2020**
- 1/5: price of communication module today vs. four years ago
- ++: maturity and reliability of technology
*Source: EIU, "The rise of the machines"
**Source: Gartner
Why Is IoT Data Different?
Why IoT Now?
- Sensor costs
- Bandwidth costs
- Processing costs
- Big Data
- IPv6 addressing
Some Examples
Other than BIG?
Smart Traffic Monitoring: Seattle I-5
Smart Farming
- Animal wearables
- Monitor pH of cow's stomach
- Recommend dietary changes to farmers
- Improve milk production
Smart Airplane
- Jet engine sensors: 20 TB/hour
Making Sense Of All That Data
Why Traditional Tools Are Not Up To The Task Anymore
The Big Data Problem
- Web, mobile, social & machine generated data explosion
- Faster Decision Cycles
- Advanced Analytics Skills Gap
"Demand for deep analytical talent in the United States could be 50 to 60% greater than its projected supply by 2018."
The Solution: Automated Algorithmic Analysis
- Descriptive BI: Describe
- Diagnostic BI: Understand
- Predictive BI: Predict
- Prescriptive BI: Recommend
From "a human using a computer" to "a computer providing info to a human"
What Is Predictive Analysis?
High level: Using algorithmic analysis to recognize data relationships that influence likely outcomes, and to identify potential risks and opportunities before they occur, in order to make better decisions in the future.
- Approximate the relationship between variables and their outputs and represent it as an algorithm (set of rules)
- Use the algorithm against future data to predict the response with the least amount of error
In the BI context:
- PA is another toolset that helps us uncover patterns and relationships algorithmically that can be used against future data sets, instead of relying solely on visual representations that require the analyst to infer the future
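As a minimal sketch of "approximate the relationship, then apply it to future data", the snippet below fits a straight line by ordinary least squares and uses it on a new value. The staffing numbers and the names `fit_line` and `predict` are invented for illustration; a real model would use many variables and a richer algorithm.

```python
# Hypothetical history: daily demand (orders) vs. staff needed that day.
# All numbers are made up for illustration.
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply the fitted rule to a value the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


demand = [100, 200, 300, 400, 500]
staff = [3, 5, 7, 9, 11]           # follows staff = 0.02 * demand + 1
model = fit_line(demand, staff)    # the "algorithm" learned from history
forecast = predict(model, 600)     # staff needed for a future demand level
print(f"staff needed for 600 orders: {forecast:.1f}")
```

The fitted pair (slope, intercept) plays the role of the "set of rules"; the error it minimizes is the squared difference between predicted and historical responses.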
Predictive Analytics Is the Next Step in Business Intelligence
Where Is Your Organization on the Spectrum?
The HOW
Gaining Insight
Predictive Analytics Goes Beyond Visualization
[Diagram: Database, Datasets, and Reports, with Predictive Analytics adding Models and Direct Insights]
What's a Predictive Model?
A predictive model algorithmically represents the desired target within the dataset. This model can then be applied to other (or future) datasets to identify elements that should be targeted.
- Predictive Models: This last transaction could be fraudulent
- Descriptive Models: Male purchasers of baby formula are 56% likely to also buy a case of beer, but female purchasers are likely to have a higher number of items and a larger total bill
- Decision Models: Your credit score is pretty low and you have few assets, so giving you a mortgage would be pretty risky
How Are Models Created?
A model can be created by a Data Scientist:
- Requires strong understanding of the data, as every observation/assumption defines the model
- Resulting model can be validated by applying it to historical data to determine Predictive Power
SAP Predictive Analytics Automated:
- Algorithms are adaptive and self-training
- Model can be validated by applying it against the testing set of historical data to determine Predictive Power
- System iterates continuously until predictive power is high enough
- Constant validation to determine when the model needs to be modified (Model Manager)
Data Scientists can typically create better models:
- A trained Data Scientist who has spent days or weeks creating and validating models can have a more logical model because it is based on semantic understanding of the data
- A programmatically created model relies on algorithmic pattern recognition, which typically cannot have semantics added afterwards
- BUT: It takes them *much* longer
Automated Modeling Process: High Level
1. Cut the Analytical Data Set (ADS) with a cutting strategy into an estimation sub-set, a validation sub-set, and a test sub-set
2. Produce candidate models (Model 1, Model 2, ... Model N), including variable reduction
3. Evaluate models to choose the best one
4. Test the performance of the selected best model
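The cut / produce / evaluate / test flow can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic, the cutting strategy is a simple systematic 60/20/20 split, each candidate model is a one-variable threshold rule, and accuracy stands in for predictive power. None of this is the product's actual algorithm, and variable reduction is not shown.

```python
def cut(rows):
    """Cutting strategy (assumed): systematic 60/20/20 split by row position."""
    estimation = [r for i, r in enumerate(rows) if i % 5 in (0, 1, 2)]
    validation = [r for i, r in enumerate(rows) if i % 5 == 3]
    test = [r for i, r in enumerate(rows) if i % 5 == 4]
    return estimation, validation, test


def accuracy(model, rows):
    """Share of rows where the threshold rule matches the known target."""
    feature, threshold = model
    hits = sum((r[feature] > threshold) == (r["target"] == 1) for r in rows)
    return hits / len(rows)


def train_threshold(rows, feature):
    """Candidate model: predict 1 when row[feature] > threshold."""
    thresholds = sorted({r[feature] for r in rows})
    best = max(thresholds, key=lambda t: accuracy((feature, t), rows))
    return feature, best


# Synthetic ADS: "usage" drives the target, "noise" is unrelated to it.
rows = [
    {"usage": i, "noise": (i * 37) % 100, "target": 1 if i >= 50 else 0}
    for i in range(100)
]

estimation, validation, test = cut(rows)                                  # 1. cut ADS
candidates = [train_threshold(estimation, f) for f in ("usage", "noise")]  # 2. candidates
best_model = max(candidates, key=lambda m: accuracy(m, validation))        # 3. evaluate
test_accuracy = accuracy(best_model, test)                                 # 4. test
print(best_model, test_accuracy)
```

Keeping the test sub-set out of steps 2 and 3 is what makes the final number an honest estimate: the selected model has never seen those rows.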
Predictive Analytics Requires Metadata
Information about who they are:
- Profile information
- Location
- Friends / Associations
- Comments / Feedback
- Interaction history
- Comments from calls
Derived information about their behavior:
- What they searched for
- What they put in their cart
- What they actually bought
- When they bought
- How they bought
- How much they paid
- What else they bought
- What do they buy regularly?
What Type of Derived Attributes?
- Time Window Aggregates/Sequences: using relative time filtering
- Text Analytics: selection of root words through a corpus; synonyms, concepts
- Social Derived Attributes: community, roles in communities, pressure, influence
- Geoloc Aware: geographical position as modeling input, colocation, path discovery
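To make "time window aggregates using relative time filtering" concrete, here is a small sketch. The purchase records, the reference date, and the function name `window_aggregates` are hypothetical; the point is that each window is defined relative to a reference date rather than by fixed calendar periods.

```python
from datetime import date, timedelta


def window_aggregates(events, reference, windows=(7, 30)):
    """Count and sum event amounts in the N days up to a reference date.

    events: list of (event_date, amount) pairs for one customer.
    """
    features = {}
    for days in windows:
        start = reference - timedelta(days=days)
        # Relative filter: keep events in the half-open window (start, reference].
        amounts = [amount for when, amount in events if start < when <= reference]
        features[f"count_last_{days}d"] = len(amounts)
        features[f"sum_last_{days}d"] = sum(amounts)
    return features


# Hypothetical purchases for one customer.
purchases = [
    (date(2015, 3, 1), 20.0),
    (date(2015, 3, 20), 35.0),
    (date(2015, 3, 28), 10.0),
]
features = window_aggregates(purchases, reference=date(2015, 3, 30))
print(features)
```

Adding more window lengths, or more aggregate functions per window, is how a handful of raw columns can grow into hundreds of derived attributes.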
Let's Take An Example
Turning Big Data Into Business Intelligence
Teasing Out Info From Your Website
Deriving Massive Insights: Session
- Duration
- Page Visit
- Time of Visit: day, week, etc.; biz hours, night, etc.
Deriving Massive Insights: Page
- Pages visited
- Page by category
- Time per page / category
- Clickpath sequence
Deriving Massive Insights: Event
- Login
- Offer
- Conversion
- Shopping Cart
- Purchase
Deriving Massive Insights: Device
- Computer vs. Mobile
- Apple (Mac, iPad, iOS)
Deriving Massive Insights: Browser
- Type (IE, Firefox, Safari)
- Version
Deriving Massive Insights: Geo-Location
- City, State, Country
- Latitude / Longitude
- Venue (Restaurant, etc.)
Deriving Massive Insights
- Browser / OS
- Session Duration
- Device
- Event / Action
- Page Views
- Traffic Sources
- Referring Keyword
- Geo-Location
- Time of Visit
- Product Preference
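A rough sketch of how a few of these attributes might be derived from raw web hits for one session. The record layout `(timestamp, page, user_agent)`, the business-hours definition (weekdays, 9:00 to 17:00), and the substring-based device check are all assumptions made for illustration.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def derive_session_attributes(hits):
    """hits: time-ordered list of (timestamp, page, user_agent) for one session."""
    start, end = hits[0][0], hits[-1][0]
    user_agent = hits[0][2]
    return {
        "duration_s": int((end - start).total_seconds()),
        "day_of_week": DAY_NAMES[start.weekday()],
        # Assumed definition of business hours: Mon-Fri, 09:00-17:00.
        "biz_hours": start.weekday() < 5 and 9 <= start.hour < 17,
        "pages_visited": len(hits),
        "clickpath": " > ".join(page for _, page, _ in hits),
        # Crude substring check; real user-agent parsing is more involved.
        "device": "mobile" if ("iPhone" in user_agent or "Android" in user_agent)
                  else "computer",
    }


# Hypothetical session of three page views.
session = [
    (datetime(2015, 3, 2, 10, 0, 0), "/home", "Mozilla/5.0 (iPhone)"),
    (datetime(2015, 3, 2, 10, 1, 30), "/offers", "Mozilla/5.0 (iPhone)"),
    (datetime(2015, 3, 2, 10, 4, 0), "/cart", "Mozilla/5.0 (iPhone)"),
]
attributes = derive_session_attributes(session)
print(attributes)
```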
Wider Data Sets Build Stronger Models
- Complex Aggregates
- Millions of Rows? Thousands of Variables
Lift with Simple Aggregates
20 Variables
- Demographics / Account Information
- Simple Aggregates (e.g. Account Balance, Total Usage)
Lift with Complex Aggregates
100 Variables
- Pivoting Transactions (e.g. Calls by Type)
- Time-Sensitive Aggregates (e.g. Calls by Week)
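Pivoting transactions turns one row per call into one wide row per customer. The sketch below assumes a hypothetical call log of `(customer, call_date, call_type)` tuples and uses ISO week numbers for the time-sensitive columns; column naming is illustrative only.

```python
from collections import Counter
from datetime import date


def pivot_calls(calls):
    """Turn a long call log into one wide row of counts per customer."""
    wide = {}
    for customer, when, call_type in calls:
        row = wide.setdefault(customer, Counter())
        row[f"calls_type_{call_type}"] += 1      # calls by type
        iso_week = when.isocalendar()[1]
        row[f"calls_week_{iso_week}"] += 1       # calls by week
    return {customer: dict(row) for customer, row in wide.items()}


# Hypothetical call records.
calls = [
    ("A", date(2015, 3, 2), "intl"),
    ("A", date(2015, 3, 3), "local"),
    ("A", date(2015, 3, 10), "local"),
    ("B", date(2015, 3, 4), "local"),
]
pivot = pivot_calls(calls)
print(pivot)
```

Every distinct call type and every week adds a column, which is how the variable count climbs from tens to hundreds.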
Lift with Social Network Analysis
200 Variables
- Social Network Analysis (e.g. Calls in First Circle)
- Community Detection (e.g. Community Churn Rate)
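As a simplified illustration of social derived attributes, the sketch below builds a call graph and computes the churn rate within each customer's first circle (the people they called directly). The names and call pairs are invented, and real community detection is more involved than this first-circle lookup.

```python
from collections import defaultdict


def build_call_graph(calls):
    """Undirected graph: each customer maps to the set of people they exchanged calls with."""
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def first_circle_churn_rate(graph, churned, customer):
    """Share of a customer's direct contacts who have already churned."""
    circle = graph.get(customer, set())
    if not circle:
        return 0.0
    return sum(1 for contact in circle if contact in churned) / len(circle)


# Hypothetical call pairs and known churners.
calls = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "eve")]
churned = {"bob", "eve"}

graph = build_call_graph(calls)
ann_rate = first_circle_churn_rate(graph, churned, "ann")
dan_rate = first_circle_churn_rate(graph, churned, "dan")
print(ann_rate, dan_rate)
```

The resulting rate becomes one more column in the customer's row, alongside the demographic and transactional variables.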
A Real Life Example
A Customer Story
Case: Maximizing Mall Rents
Tangible Results
The Value of the Lift Curve
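For readers new to the term, lift compares the response rate among the customers a model ranks highest with the response rate of the whole population. The sketch below uses ten invented customers with made-up scores; `lift_at` is a hypothetical helper name.

```python
def lift_at(scored, fraction):
    """Lift when contacting only the top `fraction` of customers by score.

    scored: list of (score, responded) pairs, responded being 0 or 1.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(responded for _, responded in ranked[:k]) / k
    base_rate = sum(responded for _, responded in ranked) / len(ranked)
    return top_rate / base_rate


# Hypothetical model scores and actual outcomes for ten customers.
scored = [
    (0.95, 1), (0.90, 1), (0.80, 0), (0.70, 1), (0.60, 0),
    (0.50, 0), (0.40, 0), (0.30, 0), (0.20, 0), (0.10, 0),
]
lift_top_20 = lift_at(scored, 0.2)   # both top-2 customers responded
lift_all = lift_at(scored, 1.0)      # contacting everyone gives no lift
print(round(lift_top_20, 2), lift_all)
```

Plotting this value for every fraction from 0 to 1 gives the lift curve: the further it sits above 1.0 for small fractions, the more the model is worth.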
Very Visible Results
Make more money:
- Walmart discovered that prior to hurricanes, customers bought flashlights, batteries, and Pop-Tarts [1]
- Best Buy discovered 7% of its customers account for 43% of its sales [1]
Reduce costs:
- A major Canadian bank increased campaign response rates by 600%, cut acquisition cost by 50%, and boosted ROI by 100% [2]
- A European telecom reduced customer churn from 20% to 5% using predictive analysis [1]
- Airlines better estimate the number of passengers who won't show up for a flight [2]
Save lives:
- Health care: finding emerging symptoms for Ebola before the pattern is obvious to the naked eye
- Route optimization to reduce response times: how much is 5 minutes worth to a dying patient?
[1] The Economist, The Data Deluge, "Data, data everywhere", February 27, 2010, pages 3-5
[2] Wayne W. Eckerson, Predictive Analytics: Extending the Value of Your Data Warehousing Investment, TDWI Best Practices Report, 2007, page 6
Bringing Insights To The Enterprise
Operationalizing Predictive Analytics
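Batch Scoring
NEW Data (Current Customers):
Name | Gender | Age | Marital | Recent Activity | C-Sat | Renewed Before | Predicted Churn
Hancock, John | M | 38 | D | Y | 4.2 | N | ?
Doe, Jane | F | 45 | M | Y | 9.4 | N | ?
Red, Simply | F | 18 | S | N | 2.1 | N | ?
After applying the Model:
Name | Gender | Age | Marital | Recent Activity | C-Sat | Renewed Before | Predicted Churn
Hancock, John | M | 38 | D | Y | 4.2 | N | Y
Doe, Jane | F | 45 | M | Y | 9.4 | N | N
Red, Simply | F | 18 | S | N | 2.1 | N | Y
Customer not expected to churn (Doe, Jane), so don't bother them!
Targeted List:
Name | Gender | Age | Marital | Recent Activity | C-Sat | Renewed Before | Predicted Churn
Hancock, John | M | 38 | D | Y | 4.2 | N | Y
Red, Simply | F | 18 | S | N | 2.1 | N | Y
Significantly increase ROI through dataset reduction:
- Lower campaign costs by targeting those most likely to leave
- Increase response rate by targeting even more specifically on other attributes
- Increase C-Sat by not hassling loyal customers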
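The batch scoring step can be sketched as "apply the model to every current customer, then keep only the predicted churners". The three customers come from the slide; the scoring rule (C-Sat below 5.0 means likely churn) is a hypothetical stand-in that happens to reproduce the slide's predictions, not the actual trained model.

```python
customers = [
    {"name": "Hancock, John", "gender": "M", "age": 38, "marital": "D",
     "recent_activity": True, "c_sat": 4.2, "renewed_before": False},
    {"name": "Doe, Jane", "gender": "F", "age": 45, "marital": "M",
     "recent_activity": True, "c_sat": 9.4, "renewed_before": False},
    {"name": "Red, Simply", "gender": "F", "age": 18, "marital": "S",
     "recent_activity": False, "c_sat": 2.1, "renewed_before": False},
]


def predict_churn(customer):
    """Hypothetical stand-in for a trained model: low satisfaction means churn risk."""
    return customer["c_sat"] < 5.0


# Batch scoring: add the prediction to every row of the new data.
scored = [dict(c, predicted_churn=predict_churn(c)) for c in customers]

# Dataset reduction: only customers expected to churn go on the campaign list.
targeted = [c["name"] for c in scored if c["predicted_churn"]]
print(targeted)
```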
Embedding Predictive Analytics Into BI Workflows
- Model embedded as a stored procedure in the customer database, queried via SQL
Name | Gender | Age | Marital | Recent Activity | C-Sat | Renewed Before | Predicted Churn
Hancock, John | M | 38 | D | Y | 4.2 | N | Y
Doe, Jane | F | 45 | M | Y | 9.4 | N | N
Red, Simply | F | 18 | S | N | 2.1 | N | Y
- Business users can get on-the-fly scoring without even knowing they are using predictive algorithms
- Lumira Dataset w/ Scoring feeds:
  - Lumira Server for Teams
  - Lumira Storyboard
  - Lumira Server for BI Platform
  - SAP Lumira Cloud
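The idea of scoring inside the database can be imitated with SQLite from Python's standard library. This is an analogy only: SQLite's `create_function` registers a scalar function rather than a stored procedure, the table layout is simplified, and the C-Sat rule is the same hypothetical stand-in for a real model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, c_sat REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Hancock, John", 4.2), ("Doe, Jane", 9.4), ("Red, Simply", 2.1)],
)

# Register the (hypothetical) model so that plain SQL can call it.
conn.create_function(
    "predicted_churn", 1, lambda c_sat: "Y" if c_sat < 5.0 else "N"
)

# A BI tool only issues SQL; the scoring happens on the fly inside the query.
rows = conn.execute(
    "SELECT name, predicted_churn(c_sat) FROM customers ORDER BY name"
).fetchall()
print(rows)
conn.close()
```

The reporting layer sees `predicted_churn` as just another column, which is the sense in which business users score data without knowing a predictive algorithm is involved.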
Hadoop and HANA
How Are They Related?
Hadoop and HANA are complementary
HANA is for the known-knowns:
- Operationalization reporting source
- Used to deliver known facts effectively to a wide range of users
Hadoop is for the unknown-unknowns:
- Investigation for deeper insight
- Cost-effectiveness and processing power
- Use analytical tools to discover
SAP Partnerships - Open Hadoop / NoSQL Strategy
Predictive Offerings From SAP
Predictive Analytics Solutions from SAP
- Any Data Source: Relational Databases, Big Data Sources, CRM/ERP Stores, Applications, Cloud Services
- SAP Predictive Analytics 2.x: Data Preparation, Visualization, Automated Analysis, SDK/API, Model Management, Scoring, Social, Recommendation, Expert Analysis, Connectors
- SAP HANA In-Memory Processing Engine: R, Transactions, Predictive Analysis Library, Automated Predictive Library, R-Scripts
- Predictive Applications (25+ Industries, 11+ LoBs): Financial & Insurance Services, Retail & Consumer Products, Telecommunications, Public Sector & Healthcare, O&G, Manufacturing & Utilities
- Cloud / On-Premise
SAP Solutions for the Entire Spectrum of Users
- Level of analytics skill set: No, Low, High
- Users: Business User, Data Analyst, Data Scientist, HANA Application Developer
- Solution areas: Embedded Analytics, Industry & Business Process Analytics, SAP Predictive Analytics, Custom Analytics
- Tools: Lumira, Automated Analytics, Expert Analytics, Application Embedded PA, Application Function Modeler, PAL, APL, R Integration
- SAP ANALYTICS
SAP Predictive Analytics: Automated Analytics
Data Scientist in a Box
Provide Business Analysts and Data Scientists with a fully automated process
- Data preparation
  - Create 1000s of derived attributes
  - Define metadata once
  - Builds analytic dataset automatically
- Predictive modeling/data mining
  - Regression/Classification
  - Segmentation
  - Forecasting
  - Association rules
  - Social Network Analysis
- Advanced model deployment and management
SAP Predictive Analytics: Expert Analytics
Self-Service for Business Analysts and Data Scientists
Provide Data Scientists and Business Analysts with sophisticated algorithms to take the next step in understanding their business and modeling outcomes
- Perform statistical analysis on your data to understand trends and detect outliers in your business
- Build models and apply them to scenarios to forecast potential future outcomes
- Breadth of connectivity to access almost any data
- Optimized for SAP HANA to support huge data volumes and in-memory processing
Wrap-Up
Closing Points
2020 predictions
- Gartner projects IoT to generate $1.9 trillion in value add
  - Every product costing more than $100 will be smart
- IDC projects an $8.9 trillion market for IoT and 212 billion connected things
  - Including 30.1 billion autonomous things
Where to Start?
SAP Predictive Analytics is available as a 30-day trial: www.sap.com/trypredictive
- What data do you have (or can generate)?
  - Don't be picky: any/all data could be potentially useful
  - Go for unaggregated data whenever possible: summarized data hides relationships
- What could it tell you?
  - Consider common use cases: classification, scoring, association, recommendation, forecasting
  - Do you have the tools/skills to extract the insights?
- How are you going to operationalize the insights?
  - Insights that cannot be embedded into your processes/applications are essentially useless to the majority
  - Consider how you can include predictive results in your BI tools today
- Define your goals before spending too much time analyzing: how else would you know where to look?
Key Messages
- Increasingly, Big Data = Predictive Analytics
  - The volume, velocity, and veracity of data demand automated analysis
  - Algorithms are the key to finding needles in the Big Data (Hay)stack
  - BI tools are not always equipped to deal with Big Data natively
- Value of Big Data/IoT per byte is MUCH lower than that of traditional data
  - A single insight could take gigabytes of data and hours/days of analysis
  - Value is incremental to traditional data and therefore is not a replacement for RDBMSs
- Automated Analytics empowers non-data scientists to analyze Big Data
  - Much flatter learning curve than learning R or a proprietary predictive technology
  - Use cases for Big Data expand outside just Data Science use cases
Additional Predictive Sessions at ASUG BOUC
- 2706: Roadmap: Predictive Analytics: What's New and What's Next
- 2765: Beginner's Guide to Harnessing Big Data and Internet of Things For Real-World Use Cases
- 3526: ASUG Predictive Analytics Influence Council
- 3627: SAP Predictive Analytics Solution Hands-On Session
- 2703: Top 10 Predictive Use Cases and Customer Case Studies
- 2513: Predictive Maintenance & Service: Customer Lessons Learned
Where to Find More Information
- Predictive on SCN: http://scn.sap.com/community/predictive-analysis
- Predictive Official Product Tutorials: http://scn.sap.com/docs/doc-32651
- Predictive Blog: http://scn.sap.com/community/predictive-analysis/blog
- Predictive 30-day Trial: http://www.sap.com/trypredictive
- Predictive Product Roadmap*: http://service.sap.com/roadmaps (then Analytics > Predictive)
* Requires login credentials to the SAP Service Marketplace
Thank you!
Please Fill Out Your Surveys! Your feedback is important!
Ashish C. Morzaria, SAP
Director, Advanced Analytics
a.morzaria@sap.com
© 2015 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE's or its affiliated companies' strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.