Enhance your Analytics using Logical Data Warehouse and Data Virtualization through SAP HANA smart data access
Balaji Krishna, Product Management, SAP HANA Platform, SAP Labs
@balajivkrishna
SESSION CODE: 0210
LEGAL DISCLAIMER
The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP's strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP's willful misconduct or gross negligence. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
LEARNING POINTS
- Understand the smart data access (SDA) features offered as part of the SAP HANA Platform
- Learn how to deploy Logical Data Warehouses using data virtualization techniques like SDA
- Extend the in-memory Data Fabric capabilities using SAP HANA
AGENDA
- Overview of smart data access (Data Virtualization)
- Technical Architecture
- Common Workflows using SDA
- Customer Use Cases
- Roadmap
- Q & A
SAP HANA Platform: More than just a database
[Diagram: SAP HANA Platform layers]
- Consumers: any apps on any app server; SAP Business Suite and BW on the ABAP app server
- Open connectivity: SQL, MDX, R, JSON
- Extended Application Services: app server, UI integration services, web server; supports any device
- Processing engine: OLTP, OLAP, search, text analysis, predictive, events, spatial, rules, planning, calculators
- Application function libraries & data models: predictive analysis libraries, business function libraries, data models & stored procedures
- Integration services: data virtualization, replication, ETL/ELT, mobile synch, streaming
- Database services, development, administration
- Deployment: on-premise, hybrid, on-demand
The SAP HANA platform converges database, data processing, and application platform capabilities and provides libraries for predictive, planning, text, spatial, and business analytics so businesses can operate in real time.
2011 SAP AG. All rights reserved. Public
SAP HANA Smart Data Access
Data virtualization for on-premise and hybrid cloud environments
Benefits:
- Enables access to remote data just like a local table
- Smart query processing, including query decomposition with predicate push-down and functional compensation
- Supports data-location-agnostic development
- No special syntax to access heterogeneous data sources
Heterogeneous data sources:
- Oracle, MS SQL, Teradata, DB2, Netezza
- Hadoop Hive (Hortonworks, Cloudera, MapR, etc.)
- Spark
- SAP HANA (BWoH, SoH)
- SAP Sybase ASE and IQ
- SAP Sybase ESP, SQLA
2014 SAP AG or an SAP affiliate company. All rights reserved.
Components
[Diagram: data consumers (transactions + analytics applications) on top of the data virtualization components, which reach remote data sources through adapters]
Data virtualization components:
- Data-type transformation
- Query monitoring
- Query optimization
- In-flight data cleansing & transformation
- Data statistics
- Recommendation engine
- Security
- Data caching
Capabilities called out on the diagram:
- Performance and query optimization
- Leverage remote compute engines
- Relational views
- Query monitoring and statistics
- Powerful transformations and information validation
- Smart placement and caching recommendations
Adapters: built-in adapters, adapter SDK
Remote data sources (current and future): Teradata, Hadoop, Oracle/DB2, ASE, etc.; other data sources (SDK, databases, files, web services, etc.)
SAP HANA smart data access: High-Level Component Diagram
SAP HANA Studio: schema management, security management, modeler, configuration, query monitoring, administration, federation perspectives
SAP HANA:
- HANA views, analytical model
- Persistence layer (repository & catalog): base tables, virtual tables, federation model
- Query processing: query optimization, remote query execution, federation support
- Federation adapter framework: access methods, data type conversions
- Virtual access layer: driver manager with ASE ODBC driver, Teradata ODBC driver
Remote sources shown: ASE, Teradata (TD)
Legend: Added for HANA Federation
Notes on the diagram:
- Query optimization and execution identifies the query fragments that can be sent to the target for remote processing, and performs functional compensation
- The adapter framework enables data type conversions
- The driver manager loads database-specific drivers
- Access methods enable integration with other HANA query processing components
SUMMARY OF KEY FEATURES
FEATURE | DESCRIPTION
Data Sources | Oracle, MSSQL, DB2, IQ, ASE, Teradata, Hadoop, Netezza, HANA (BWoH, SoH)
DML/DDL Support | Select, Insert, Update, Delete; Create/Drop Source (creates and removes data source targets); Create/Drop Virtual Tables (creates and removes virtual tables, which point to remote tables)
Optimization | Push down filters, push down aggregates, semi-join reduction, join relocation, etc.
Execution | Functional compensation; functional translation; parallel execution
Security | Access remote sources via secondary credentials/technical user
Adapter Framework | Provides uniform access to remote data via ODBC adapters; transforms remote server data types to HANA data types
Studio Support | Create/drop data sources and virtual tables; execute and monitor queries; analyze query plans
Calc View | Support for virtual tables
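The effect of pushing down filters and aggregates can be illustrated with a small sketch. This is not SDA code: an in-memory SQLite database stands in for the remote source, and the table `fact_01`, its rows, and both helper functions are hypothetical. It only shows that sending the filter and the aggregate to the remote side transfers far fewer rows than fetching the whole table and aggregating locally.

```python
import sqlite3

# Hypothetical stand-in for a remote source (assumption: SDA itself would
# reach a real database through an ODBC adapter, not SQLite).
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE fact_01 (customer_id INTEGER, quantity INTEGER)")
remote.executemany(
    "INSERT INTO fact_01 VALUES (?, ?)",
    [(cid, qty) for cid in range(1, 101) for qty in (1, 2, 3)],
)

def without_pushdown(min_customer_id):
    """Fetch every remote row, then filter and aggregate locally."""
    rows = remote.execute("SELECT customer_id, quantity FROM fact_01").fetchall()
    totals = {}
    for cid, qty in rows:
        if cid >= min_customer_id:
            totals[cid] = totals.get(cid, 0) + qty
    return totals, len(rows)  # len(rows) = rows transferred

def with_pushdown(min_customer_id):
    """Send the filter and the aggregate to the remote side."""
    rows = remote.execute(
        "SELECT customer_id, SUM(quantity) FROM fact_01 "
        "WHERE customer_id >= ? GROUP BY customer_id",
        (min_customer_id,),
    ).fetchall()
    return dict(rows), len(rows)  # len(rows) = rows transferred

local_totals, local_transferred = without_pushdown(98)
pushed_totals, pushed_transferred = with_pushdown(98)
```

Both calls produce the same totals; the difference is the number of rows that cross the connection, which is the quantity these optimizations aim to minimize.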
FUNCTIONAL COMPENSATION EXAMPLES
A few examples of functional compensation happening in HANA:
- Full outer join: A FULL OUTER JOIN B ON (A.a = B.b). Even if both tables are on the same ASE server, HANA still performs the outer join itself because ASE doesn't support it.
- STDDEV/VAR for Teradata: HANA compensates if STDDEV and VAR are applied to integer columns, because Teradata doesn't support that.
- There may be other built-in functions that the remote server doesn't support; HANA compensates for those as well.
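The full outer join case can be sketched in a few lines. This is an illustration of the idea only, not how HANA implements it: the function name `full_outer_join` and the row format (lists of dicts) are assumptions, and SQL NULL semantics for join keys are deliberately left out. The remote source is asked only for plain row sets, and the join it cannot perform is done on the local side.

```python
def full_outer_join(rows_a, rows_b, key_a, key_b):
    """Join two lists of dict rows locally, keeping unmatched rows from both sides.

    Hypothetical helper: rows_a and rows_b represent result sets already
    fetched from a remote source that cannot run FULL OUTER JOIN itself.
    NULL keys are not treated specially in this sketch.
    """
    # Index the right-hand side by its join key.
    index_b = {}
    for row in rows_b:
        index_b.setdefault(row[key_b], []).append(row)

    matched_b_keys = set()
    result = []
    for row_a in rows_a:
        matches = index_b.get(row_a[key_a], [])
        if matches:
            matched_b_keys.add(row_a[key_a])
            result.extend((row_a, row_b) for row_b in matches)
        else:
            result.append((row_a, None))  # left row without a partner

    # Right rows that never found a partner.
    for row_b in rows_b:
        if row_b[key_b] not in matched_b_keys:
            result.append((None, row_b))
    return result

table_a = [{"a": 1}, {"a": 2}]
table_b = [{"b": 2}, {"b": 3}]
joined = full_outer_join(table_a, table_b, "a", "b")
```

The trade-off is visible in the sketch: compensation keeps the query answerable, but both inputs have to be transferred before the join can run.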
Join Relocation using Temporary Table on Remote Source
HANA SQL:
    select TOP 100 "CUST"."CUSTOMER_ID", COUNT(*), PAYMENT_METHOD, SUM("FACT_01"."QUANTITY")
    from WORKSHOP.CUST
    join "WORKSHOP"."FACT_01" on "CUST"."CUSTOMER_ID" = "FACT_01"."CUSTOMER_ID"
    where "CUST"."CUSTOMER_SCORE" > 97
    group by "CUST"."CUSTOMER_ID", PAYMENT_METHOD
    order by SUM("FACT_01"."QUANTITY") DESC
HANA SDA inserts 2,073 rows into a temporary table on the remote source:
    insert into #CUSTOMER_ID_4 values( 99935 )
It then forms a SQL query that joins to the temporary table on the remote database. HANA sends this SQL to the remote database:
    SELECT SQ.*
    FROM (
        SELECT "FACT_01"."CUSTOMER_ID" AS "CUSTOMER_ID",
               "FACT_01"."PAYMENT_METHOD" AS "PAYMENT_METHOD",
               COUNT(*) AS COL0,
               SUM("FACT_01"."QUANTITY") AS COL1
        FROM NLSUSER."FACT_01" "FACT_01"
        GROUP BY "FACT_01"."CUSTOMER_ID", "FACT_01"."PAYMENT_METHOD"
    ) SQ, #CUSTOMER_ID_4
    WHERE SQ."CUSTOMER_ID" = #CUSTOMER_ID_4.CUSTOMER_ID
[Screenshots: rows fetched from remote server; remote query tree]
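The same three steps can be replayed on a toy data set. In this sketch two in-memory SQLite databases stand in for HANA (`local`) and the remote source (`remote`); the table contents, the temporary table name `customer_id_tmp`, and the variable names are all hypothetical. The remote query mirrors the shape of the one on the slide, and the final ordering and TOP 100 are applied on the local side.

```python
import sqlite3

local = sqlite3.connect(":memory:")   # stands in for HANA (assumption)
remote = sqlite3.connect(":memory:")  # stands in for the remote source (assumption)

local.execute("CREATE TABLE cust (customer_id INTEGER, customer_score INTEGER)")
local.executemany("INSERT INTO cust VALUES (?, ?)", [(1, 99), (2, 50), (3, 98)])

remote.execute(
    "CREATE TABLE fact_01 (customer_id INTEGER, payment_method TEXT, quantity INTEGER)"
)
remote.executemany(
    "INSERT INTO fact_01 VALUES (?, ?, ?)",
    [(1, "CARD", 5), (1, "CARD", 7), (2, "CASH", 100), (3, "CASH", 4)],
)

# Step 1: evaluate the selective predicate on the local table.
ids = local.execute(
    "SELECT customer_id FROM cust WHERE customer_score > 97"
).fetchall()

# Step 2: ship only the qualifying keys to a temporary table on the remote side.
remote.execute("CREATE TEMP TABLE customer_id_tmp (customer_id INTEGER)")
remote.executemany("INSERT INTO customer_id_tmp VALUES (?)", ids)

# Step 3: let the remote side aggregate the fact table and join to the keys.
rows = remote.execute(
    "SELECT sq.customer_id, sq.payment_method, sq.col0, sq.col1 "
    "FROM (SELECT customer_id, payment_method, "
    "             COUNT(*) AS col0, SUM(quantity) AS col1 "
    "      FROM fact_01 GROUP BY customer_id, payment_method) sq, "
    "     customer_id_tmp t "
    "WHERE sq.customer_id = t.customer_id"
).fetchall()

# Step 4: apply the final ordering and TOP 100 locally.
top_rows = sorted(rows, key=lambda r: r[3], reverse=True)[:100]
```

Only the small key list travels to the remote side and only the aggregated rows travel back, so the large fact table never has to be copied.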
SAP HANA Studio
- Enables users to develop applications on SAP HANA
- Create remote sources, create virtual tables, set security policies
- Access remote sources and build virtual tables, using the remote table schema and data types
- Create calc views using virtual tables
- Execute queries and analyze query plans
- Monitor queries
[Diagram: transactions + analytics applications on SAP HANA (tables and virtual tables), connected through the adapter framework (built-in adapters, third-party adapters) to data sources]
CREATING A REMOTE SOURCE
1. Right-click to add remote sources
2. Add remote sources
Information to fill in:
- Adapter name
- Server name
- Port #
- Credentials for remote source
CREATING A VIRTUAL TABLE
1. Right-click on Tables and add a New Virtual Table
2. Browse the tables on the remote source and select a table
3. Based on the remote table schema, HANA creates a virtual table with the same columns and the closest matching HANA data types
4. The virtual table is now visible under Tables
Note: the icon is different from that of any other HANA table
QUERY MONITORING
1. Click on Smart Data Access to open the monitoring window
2. The query monitoring window provides information on:
   - Query executed
   - Start time
   - Execution time
   - Rows retrieved
   - Server accessed
   - SQL text sent to the remote source
3. Using the information provided, the user can analyze queries and decide which remote tables to move into HANA to enhance performance
Calc View support for Virtual Tables
- Virtual tables can be added as data sources when creating a calculation view.
- Virtual tables can also be referenced by calculation scenarios.
- Virtual tables can be added to a calculation view from SAP HANA Studio.
- Optimizations such as push-down of filters are also supported in these calc scenarios.
- As of SPS08, virtual tables are supported in Analytic and Attribute Views as part of the migration to Calc Views.
USE CASES
USE CASE | DESCRIPTION
Developing apps using dispersed enterprise-wide data | Build applications on HANA, but access data from other sources using HANA smart data access, without moving data into HANA. Supported databases: Oracle, MS SQL, DB2, HANA, ASE, IQ, Teradata, Hive/Hadoop, Netezza
Federation of data in Hadoop/Hive | Access Hadoop/Hive data from HANA virtual tables via ODBC
IQ as near-line store for HANA archived data | Using the smart data access feature, HANA customers can access IQ as a near-line store. Store archived data in IQ and real-time data in HANA. Access IQ data as a hot archive
SAP HANA and Hadoop
[Diagram: SAP HANA connected to Hive over ODBC]
- Clients: SAP HANA Studio, BI clients
- SAP HANA: SQL query layer; SAP HANA tables; virtual tables for Hadoop tables; metadata for data sources
- ODBC connectivity between SAP HANA and Hive; Hive triggers a map-reduce job
- Hive: JDBC, ODBC, command line interface, web interface, Thrift server, driver (compiler, optimizer, executor), metastore
- Hadoop (MapReduce + HDFS): job tracker, name node, data nodes + task trackers
Remote caching for Hadoop sources
When SAP HANA dispatches a federated query to Hive, Hive executes a series of map and reduce jobs. Depending on the data size in Hadoop and the current cluster capacity, a query can take from a few minutes to hours to complete. In most cases the data in the Hadoop cluster is not updated frequently, so successive executions of the same map/reduce jobs return the same tuples. As of SP07, HANA allows the result to be materialized in the remote system, which avoids repeated execution of the same query. This behavior is controlled by hinting the optimizer to use remote caching.
Syntax:
    select * from hive_activity_log where incident_type = 'ERROR' and plant = '001' with hint (USE_REMOTE_CACHE)
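The idea behind remote caching, materializing a result once on the remote side and reusing it, can be sketched as follows. This does not show how HANA or Hive implement the feature: SQLite stands in for Hive, and the function `query_with_remote_cache`, the cache table naming scheme, the sample rows, and the `stats` counter are all hypothetical.

```python
import hashlib
import sqlite3

remote = sqlite3.connect(":memory:")  # hypothetical stand-in for Hive
remote.execute(
    "CREATE TABLE hive_activity_log (incident_type TEXT, plant TEXT, message TEXT)"
)
remote.executemany(
    "INSERT INTO hive_activity_log VALUES (?, ?, ?)",
    [
        ("ERROR", "001", "pump failure"),
        ("INFO", "001", "started"),
        ("ERROR", "002", "overheat"),
    ],
)

# Counts how often the expensive query (the map/reduce work) really runs.
stats = {"expensive_runs": 0}

def query_with_remote_cache(sql):
    """Materialize the result of `sql` on the remote side once, then reuse it."""
    # Derive a cache table name from the query text (assumed naming scheme).
    cache_table = "cache_" + hashlib.sha256(sql.encode()).hexdigest()[:16]
    exists = remote.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (cache_table,),
    ).fetchone()
    if exists is None:
        stats["expensive_runs"] += 1
        remote.execute(f"CREATE TABLE {cache_table} AS {sql}")
    return remote.execute(f"SELECT * FROM {cache_table}").fetchall()

sql = "SELECT * FROM hive_activity_log WHERE incident_type = 'ERROR' AND plant = '001'"
first = query_with_remote_cache(sql)
second = query_with_remote_cache(sql)
```

The sketch never invalidates its cache, so it only suits data that changes rarely, which is the situation the slide describes for Hadoop clusters.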
smart data access with BW on HANA
- Generate HANA models from BW InfoCubes
- Expose these models as virtual tables in a second HANA instance
[Diagram: applications/analytics on two SAP HANA instances; the virtual table in one instance points to the physical tables in the other]
The Virtual BW: Open ODS View
[Diagram: two BW on HANA scenarios.
 Virtual access: BW Query -> Open ODS Layer (Open ODS View) -> Virtual Table/View -> Smart Data Access -> External Sources.
 BW-managed persistence*: BW Query -> Open ODS Layer (Open ODS View) -> DSO w/ fields* (persistent) or Virtual Table/View (virtual access).]
* Pilot only (Note 1922533)
Open ODS View offers:
- Metadata object as an abstraction layer for the underlying source object
  - HANA virtual tables as supported source objects via SDA
  - Querying on field level
  - Supported for Teradata, Sybase ASE/IQ, Hadoop
  - Optimized query execution by pushing down to HANA
- Supported scenarios: Virtual Access, Persistent Access*, Switch from Virtual to Persistent*
  - Based on field-based DSO, including DTP and Transformation
  - Direct staging into DSO, bypassing PSA
  - No need to adjust existing queries
- Easy assignment of semantics
  - The underlying object (Table, DB View, DataSource) can be tagged as Text, Master data, or Facts
  - Single fields of the object can be linked to already existing Open ODS Views or InfoObjects
MII SDA Adapter for Manufacturing
The SAP Smart Data Access (SAP SDA) adapter for SAP Manufacturing Integration and Intelligence (SAP MII) can be used to connect an SAP MII back-end system through SAP SDA to an SAP HANA system. Queries on the SAP MII side are exposed as virtual tables to SAP HANA. The adapter transforms queries against these virtual tables into the appropriate SAP MII service queries. The results from SAP MII are then transformed into rows of the virtual tables. For detailed information on setting up the SAP SDA adapter for SAP MII, refer to SAP Note 1984859.
SAP HANA smart data access
Product road map overview: key themes and capabilities

TODAY (HANA SP08): High Performance Distributed Data Access
Capabilities:
- Support for SAP HANA, Teradata, Hadoop/Hive, SAP Sybase IQ, SAP Sybase ASE, Oracle, MS SQL
- Define remote data sources and virtual tables to access data in remote servers from HANA
- Query optimization with query push-down to ensure minimal data transfer from remote sources to HANA
- Compute and utilize data statistics for virtual tables
- Secure access to remote servers via technical and secondary credentials
- Query monitoring and collection of key query execution metrics for optimization
- HANA Studio support for virtual table development and monitoring
Use cases:
- Extend BW/HANA to consume data from Teradata
- Big-data use cases: Hadoop + HANA
- NLS: IQ as NLS for HANA
- HANA landscape integration: integrate new apps on HANA with BW on HANA deployments
- Consume enterprise-wide data in HANA for analytical and transactional apps

Planned Innovations: Dynamic, Enterprise Data Virtualization With Recommendation Engine
Capabilities:
- Support other databases such as MaxDB, etc.
- Provide SDK for adding support for additional data sources, on demand
- Tighter Hadoop integration through direct support for HDFS (as HANA UDF)
- Enhanced performance with caching and more query optimizations
- Increased DML support with insert/update/delete on virtual tables
- Enhanced HANA Studio support for ease of application development
- Tighter integration with SAP data sources, including HANA to HANA (SoH, BWoH)
- Support for importing/exporting virtual tables
- CDS support for virtual tables
Use cases:
- Enable optimal data placement via the recommendation engine: decide what data to move into HANA for better performance
- Expand use cases by supporting transactions against other data sources via HANA
- Enable fast reporting/drill-down use cases via caching
- Deeper NLS support: ability to insert data from HANA into IQ

Future Direction: Application Transparency with Synchronized Cache, and Enhanced Recommendation Engine
Capabilities:
- In-flight data cleansing and transformation
- Application transparency
- Support for Hadoop Pig
- Advanced SDA for Hadoop: windowing, table partitioned functions
- Provide recommendation engine to enable customers to optimize performance
- Support for SSO to remote sources
- Expanded DDL and DML support, with the ability to create tables on remote sources, and pass-through mode
- Multi-node transaction support (insert, update, delete)
- SAP Sybase Control Center integration
Use cases:
- Minimal application disruption and support for packaged apps
- Expand use cases which consume web data
- Expanded transactional use cases for heterogeneous data sources through HANA
- Deploy use cases in environments with data quality / data representation challenges

This is the current state of planning and may be changed by SAP at any time.
SDA Roadmap: IM in HANA (Data Provisioning)
- Single design time
- In-memory transformations
- Real-time and scheduled data replication with transactional consistency and guaranteed delivery
- Built-in adapters for common sources; open SDK for extensibility
[Diagram: transactional & analytical apps; SAP HANA with HANA design time, HANA tables, virtual tables, metadata, transformations, and the data provisioning server; adapter framework with built-in and custom adapters, connected over TCP/IP or HTTPS and using the source protocol]
SUMMARY
- Enables application developers to write new data-intensive applications without regard to where the data resides, its size, or the functionality the source supports
- The platform leverages data processing engines (databases, Hadoop, etc.) to synthesize the information needed for analysis, resulting in high performance
- However, HANA smart data access does not eliminate the need for an EDW, which addresses the data consistency, harmonization, and governance issues required for compliance
Customer benefits:
- Rapid deployment of high-performance, data-intensive transactional and analytical applications
- Cost savings
- Benefit from HANA functionality without moving all data into HANA
How to find SAP HANA documentation on this topic?
- SAP HANA smart data access: Central Note 1868209
- HANA Academy: videos on SDA (YouTube)
- SAP BW and SDA: http://scn.sap.com/docs/doc-52945
- SAP HANA Platform on SAP Help:
  - Security
  - Administration: SAP HANA Administration Guide
  - Development: SAP HANA Developer Guide
  - References: SAP HANA SQL Reference
THANK YOU FOR PARTICIPATING Please provide feedback on this session by completing a short survey via the event mobile application. SESSION CODE: 0210 For ongoing education on this area of focus, visit www.asug.com