September 9-11, 2013, Anaheim, California
An In-Depth Look at In-Memory Predictive Analytics for Developers
Philip Mugglestone, SAP

Learning Points
- Understand the SAP HANA Predictive Analysis Library (PAL) and R Integration for SAP HANA
- Understand how SAP Predictive Analysis interacts with SAP HANA
- Understand which approach is best leveraged when

Extend Your Analytics Capabilities
[Figure: analytics maturity curve. Axis labels: Competitive Advantage, Analytics Maturity. Stages: Raw Data, Cleaned Data, Standard Reports, Ad Hoc Reports & OLAP, Generic Predictive Analytics, Predictive Modeling, Optimization. Questions: What happened? Why did it happen? What will happen? What is the best that could happen? Regions: Sense & Respond, Predict & Act. Callout: Where You Want to Be.]
The key is unlocking data to move decision making from sense & respond to predict & act
© 2013 SAP AG. All rights reserved.

Why Predictive Now?
Increased Business Interest
- Now that BI users know what happened, they are asking why and what's likely to happen next
- Explosive demand from sales, marketing, and call center analyses, fraud, and government intelligence/security agencies
Increased Data Value (Big Data)
- Exploding data volume
- Expanding data varieties
Increasing Technology Performance
- Parallel processing, faster CPUs, and in-memory technologies reduce time and cost of data processing
Changing landscapes and new opportunities

Predictive Analytics with SAP HANA
Transforming the Future with Insight Today
Unleash the value of Big Data through the power of SAP HANA
- Employ in-database predictive algorithms
- Access 3,500+ open-source algorithms via R integration for SAP HANA
Intuitively design and visualize complex predictive models
- SAP Predictive Analysis software
Bring predictive insight to everyone in the business
- Embed within business applications
- Extend into BI and reports
- Insight into events instantly delivered to dashboards, alerts, and mobile devices

SAP HANA Predictive Ecosystem
[Diagram labels]
- Consumers: SAP Predictive Analysis, SAP and Custom Applications, Business Intelligence Clients
- SAP HANA Platform: SAP HANA Studio, Predictive Analysis Library (PAL), R Integration for SAP HANA
- R
- Data Pre-Processing and Loading: SAP Data Services, Information Composer, SLT, DXC, Hadoop

SAP HANA In-Memory Predictive Analytics
Combine the depth and power of in-memory analytics within SAP HANA with the breadth of R to support a variety of advanced analytic and predictive scenarios
Predictive Analysis Library (PAL)
- Native predictive algorithms
- In-database processing for powerful and fast results
- Quicker implementations
- Support for clustering, classification, association, time series, etc.
R Integration for SAP HANA
- Enables the use of the R open source environment (> 3,500 packages) in the context of the HANA in-memory database
- R integration enabled via a high-performing, parallelized connection
- R script is embedded within SAP HANA SQLScript

SAP HANA Application Function Library (AFL)
AFL technology includes:
Application Functions
- Written in C++ and delivered as AFL content by SAP
- The Predictive Analysis Library and the Business Function Library have been released in SPS05 as AFL content
AFL Framework
- On-demand library loading framework for registered and supported libraries
- AFLs are consumed from SQLScript via so-called wrapper procedures
- Consumption can be controlled via permissions
- Beyond the initial script-based approach, the Application Function Modeler provides a graphical editor that facilitates the design-time process of creating the wrapper procedures; the resulting models can easily be re-used as part of the development workflow
[Diagram labels: SAP HANA Clients, SQLScript, Parameter Table, AFL Framework, Application Functions (C++), Predictive Analysis Library, Business Function Library]

SAP HANA In-Memory Predictive Analytics
Predictive Analysis Library (PAL): Algorithms Supported
Association Analysis
- Apriori & Apriori Lite
Cluster Analysis
- K-Means
- Kohonen Self-Organized Maps
- DBSCAN *
- ABC Classification
Classification Analysis
- C4.5 Decision Tree Analysis
- CHAID Decision Tree Analysis
- K-Nearest Neighbor
- Multiple Linear Regression
- Polynomial Regression
- Exponential Regression
- Bi-Variate Geometric Regression
- Bi-Variate Logarithmic Regression
- Logistic Regression
- Naïve Bayes *
Time Series Analysis
- Single, Double, & Triple Exponential Smoothing
Outlier Detection
- Inter-Quartile Range Test (Tukey's Test)
- Variance Test
- Anomaly Detection
Data Preparation
- Sampling, Binning, & Scaling
- Convert Categorical to Binary
Link Prediction *
- Common Neighbours; Jaccard's Coefficient; Adamic/Adar; Katz β
Other
- Weighted Scores Table
(* New in SPS06)

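To make one of the listed outlier tests concrete, here is a minimal Python sketch of the inter-quartile range (Tukey) test. It is not PAL code: the function name `tukey_outliers`, the default multiplier of 1.5, the quartile interpolation method, and the sample readings are all assumptions chosen for illustration, and PAL's own quartile convention and parameters may differ.

```python
from statistics import quantiles


def tukey_outliers(values, multiplier=1.5):
    """Return the values that fall outside Tukey's inter-quartile fences."""
    # Quartiles by linear interpolation over the sorted data. This choice is
    # an assumption for the sketch; PAL may compute quartiles differently.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - multiplier * iqr
    upper_fence = q3 + multiplier * iqr
    # Keep the original order of the input when reporting outliers.
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical sample data: six similar readings and one extreme value.
readings = [10, 12, 11, 13, 12, 11, 95]
print(tukey_outliers(readings))
```

With these readings the quartiles are 11 and 12.5, so the fences sit at 8.75 and 14.75 and only the value 95 is reported. Inside SAP HANA the same test would run in-database through a PAL wrapper procedure rather than in client code.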
SAP HANA Predictive Analysis Library (PAL) Reference
http://help.sap.com/hana_appliance
http://help.sap.com/hana/sap_hana_predictive_analysis_library_pal_en.pdf

Detailed Example
SAP HANA Application Function Modeler
New for SAP HANA SPS06
- Graphical editor to facilitate a faster and easier design-time process of creating the wrapper procedures
- AFL models stored as repository objects and easily re-used as part of the development workflow
[Screenshot callouts]
- Model Editor: drag and drop of functions; template for table types; data source selection and automatic mappings to table types
- Function List and Search
- Sample SQL for procedure consumption
- Library Selection
- Parameters and specifications for table types

Detailed Example
R Integration for SAP HANA: What Is R?
R is a software environment for statistical computing and graphics
- Open source statistical programming language
- Over 3,500 add-on packages; ability to write your own functions
- Widely used for a variety of statistical methods
- More algorithms and packages than SAS + SPSS + Statistica
Who's using it?
- Growing number of data analysts in industry, government, consulting, and academia
- Cross-industry use: high-tech, retail, manufacturing, CPG, financial services, banking, telecom, etc.
Why do they use it?
- Free, comprehensive, and many learn it at college/university
- Offers a rich library of statistical and graphical packages

R Integration for SAP HANA: Functionality Overview
Embedding R scripts within the SAP HANA database execution
- Enhancements are made to the SAP HANA database to allow R code (RLANG) to be processed as part of the overall query execution plan
- This scenario is suitable when the modeling and consumption environment sits on SAP HANA and the R environment is used for specific analytic functions
Processing steps
1. Send data and R script
2. Run the R scripts
3. Get back the result from R to SAP HANA
Sample Code in SAP HANA SQLScript

    -- Result table: same columns as the evaluation data plus the prediction
    DROP TABLE "spamclassified";
    CREATE COLUMN TABLE "spamclassified" LIKE "spameval" WITH NO DATA;
    ALTER TABLE "spamclassified" ADD ("classified" VARCHAR(5000));

    -- Procedure whose body is R code (LANGUAGE RLANG)
    DROP PROCEDURE USE_SVM;
    CREATE PROCEDURE USE_SVM(
        IN train "spamtraining",
        IN eval "spameval",
        OUT result "spamclassified")
    LANGUAGE RLANG AS
    BEGIN
        library(kernlab)
        # Train a support vector machine on the training table
        model <- ksvm(type~., data=train, kernel=rbfdot(sigma=0.1))
        # Predict on the evaluation table without its "type" column
        classified <- predict(model, eval[, -(which(names(eval) %in% "type"))])
        # Return the evaluation rows together with the predicted class
        result <- as.data.frame(cbind(eval, classified))
    END;

    -- Quoted identifiers are case-sensitive, so the table names match those above
    CALL USE_SVM("spamtraining", "spameval", "spamclassified") WITH OVERVIEW;

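The data contract of the USE_SVM procedure above (a training table and an evaluation table go in, the evaluation rows plus a "classified" column come out) can be sketched outside SAP HANA in plain Python. This is only an illustration under stated assumptions: it swaps the kernlab support vector machine for a much simpler nearest-centroid classifier, and the function name `classify`, the feature columns `f1` and `f2`, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from math import dist


def classify(train, eval_rows, label="type"):
    """Train on `train`, then return each eval row plus a 'classified' column."""
    # Feature columns are every column except the label, as in the R script.
    features = [name for name in train[0] if name != label]

    # Group the training vectors by label and average each group into a centroid.
    grouped = defaultdict(list)
    for row in train:
        grouped[row[label]].append([row[name] for name in features])
    centroids = {
        lab: [sum(col) / len(col) for col in zip(*vectors)]
        for lab, vectors in grouped.items()
    }

    # Predict from the feature columns only, then append the prediction,
    # mirroring cbind(eval, classified) in the R code.
    result = []
    for row in eval_rows:
        vector = [row[name] for name in features]
        nearest = min(centroids, key=lambda lab: dist(centroids[lab], vector))
        result.append({**row, "classified": nearest})
    return result


# Hypothetical stand-ins for the training and evaluation tables.
training = [
    {"f1": 0.0, "f2": 0.1, "type": "nonspam"},
    {"f1": 0.2, "f2": 0.0, "type": "nonspam"},
    {"f1": 5.0, "f2": 4.8, "type": "spam"},
    {"f1": 5.2, "f2": 5.0, "type": "spam"},
]
evaluation = [
    {"f1": 0.1, "f2": 0.2, "type": "nonspam"},
    {"f1": 4.9, "f2": 5.1, "type": "spam"},
]
for row in classify(training, evaluation):
    print(row)
```

As in the SQLScript sample, the known "type" column stays in the output next to the prediction, so the two can be compared row by row.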
R Integration for SAP HANA: System Landscape and Requirements
Software requirements
- SAP HANA 1.0 SPS4 or later
- R version 2.13 or later (http://www.r-project.org/)
R integration requirements
- SAP does not ship, maintain, or provide training and/or support for the R environment
- R is an open source project under the GPL license and needs to be downloaded, installed, and configured by the customer
- The R integration is currently only supported on the SLES 11 SP1 operating system
- R (and Rserve 0.6-5) must be installed on a different host machine from the SAP HANA software
- A dedicated 10 GB link between the R server and the HANA server is recommended

SAP HANA R Integration Guide
http://help.sap.com/hana_appliance
http://help.sap.com/hana/sap_hana_r_integration_guide_en.pdf

Detailed Example
SAP Predictive Analysis
- Pushes predictive processing into SAP HANA
- Intuitively design complex predictive models
- Visualize, discover, and share hidden insights

3 Types of Users for SAP Predictive Analytics
Data Scientists (0.001% of the representative user base)
- Create complex predictive models and simulations
- Validate predictive business requirements
- Publish results back to source
Data Analysts (3%)
- Transform and enrich data source(s)
- Create simple predictive models and simulations
- Visualize results and publish to BI platform
Business Users/Execs (97%)
- Interact with published predictive analysis
- Visualize results in context of use case
- Collaborate with colleagues toward closure/action

SAP Predictive Analysis Approaches
- SAP HANA Predictive Analysis Library algorithms: analysis performed by SAP HANA (no movement of data), controlled by SAP PA
- SAP Predictive Analysis (PA) native algorithms: data is brought to SAP PA and analysis is performed in the client
- Open source R integration algorithms: data is brought to SAP PA and analysis is performed in the client
- R Integration for SAP HANA algorithms: analysis is done on an R server attached to SAP HANA and controlled by SAP PA

Detailed Example
Which One When?
[Figure: positioning of SAP HANA Predictive Analysis Library, R Integration for SAP HANA, and SAP Predictive Analysis. Axis labels: Developer to Business User; In-Memory Performance and Flexibility.]

Which One When?
[Figure: tools by number of users and user personas. Additional labels: Large number, Bi-directional.]
500/5,000 users
- Tools: Embedded Predictive Analysis (industry applications, LOB applications, BI client tools)
- Personas: Application End User, Information Consumer, Interactive Consumer
50/100 users
- Tools: SAP PA visualization, SAP PA wizard
- Personas: All Personas, Business Analyst/Interactive Consumer
5/20 users
- Tools: SAP PA designer, SAP HANA PAL & R
- Personas: Business Analyst, Professional Data Analyst, Applications Developer

http://academy.saphana.com
http://www.saphana.com/community/implement/hanaacademy#predictive-analytics-library
http://www.saphana.com/community/implement/hana-academy#rintegration
http://www.saphana.com/community/implement/hanaacademy#predictive-analysis

Key Learnings
The time for Predictive Analytics has come!
SAP HANA Predictive Analysis Library (PAL)
- Depth for developers: native, in-memory optimized, amazing performance
R Integration for SAP HANA
- Breadth for developers: open, extensible, and flexible
SAP Predictive Analysis
- Business focus: exploit SAP HANA's predictive capabilities without coding
- Amazing visualizations
Extensive learning resources from the SAP HANA Academy

Thank You!
Contact information:
Philip Mugglestone
philip.mugglestone@sap.com
@pmugglestone
http://academy.saphana.com

Thank you for participating. Please provide feedback on this session by completing a short survey via the event mobile application. SESSION CODE: 1104 Learn more year-round at www.asug.com