Data Provisioning from SAP and Non-SAP Data Sources to SAP HANA
Prasad Illapani
SAP HANA & Big Data - Product Management & Strategy
SAP Labs LLC, Bellevue, WA
Legal Disclaimer
The information in this document is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation and SAP's strategy and possible future developments, products and/or platforms directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise or legal obligation to deliver any material, code or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
Agenda
- SAP HANA Platform - Data Management Portfolio
- Data Provisioning with SAP HANA - Options
  - SAP LT Replication Server (Real Time)
  - SAP Replication Server (Real Time)
  - SAP Data Services 4.2 (Batch)
  - SAP Event Stream Processor 5.1 (Real Time)
  - SAP SQL Anywhere (Sync)
- 3rd Party ETL Tools Integration with SAP HANA via Certification
- Key Takeaways
SAP HANA - Data Management Portfolio
End-to-End Data Management & App Platform for Real-Time Business

Applications served: Custom Apps, Mobile Apps, Analytics / BI Apps, ERP Apps, Big Data Apps

- SAP HANA Platform: In-memory processing platform for real-time transactions + end-to-end analytics that offers massive simplification. Platform services: Application Development, Extended Application Services, Unified Administration, Processing Engine, Database Services (OLTP + OLAP), Application Function Lib. & Data Models, Integration Services.
- SAP ESP: Complex event processing server to filter and analyze high-volume streaming data in real time for new real-time applications.
- SAP SQLA: Mobile and embedded database to provide delivery to the point of decision and enable new IoT applications.
- SAP ASE (OLTP): Relational database to support extreme transactional applications with best TCO and price performance.
- SAP IQ (OLAP): Column database to power extremely large EDW and Big Data analytics with best TCO.
- SAP Replication Server / SAP SLT: Replication solution to move and synchronize data for HA/DR and real-time data distribution applications. Real-time replication to SAP/non-SAP data sources.
- SAP Data Services: World-class integration package to help cost-effectively move, improve and govern your data for critical applications.
Data Provisioning - Options
Real-time, high-volume data integration from any source

Sources (any data): SAP Business Suite; non-SAP data sources; cloud deployments; event data sources; network devices (wired / wireless); data sources for virtualization (SP6: HANA, IQ, ASE, Hadoop, Teradata)

Data acquisition components/capabilities for SAP HANA:
- Trigger-based: SAP LT Replication Server (ODBC)
- Log-based: SAP Replication Server (ECH)
- ETL, batch: SAP Data Services (ODBC)
- Event streams: SAP Event Stream Processor (ODBC)
- Data synchronization: SAP SQL Anywhere (ODBC)
- Data virtualization: SAP HANA Smart Data Access* (virtual tables; Hive via ODBC)

* Smart Data Access is an SAP HANA capability.
LT Replication from SAP Source Systems

[Figure labels: SAP Business Suite source (application table, DB trigger, logging table, read engine); RFC connection; SAP LT Replication Server (read engine, mapping & transformation engine, write engine); DB connection; SAP HANA (application tables); SAP HANA Studio]

- SAP source system: Efficient implementation of data replication via DB triggers, based on a change-capture concept.
- SAP LT Replication Server: Highly scalable and reliable replication process, including comprehensive on-the-fly data transformation capabilities.
- Target systems: Fast data replication via DB connection, integrated into SAP HANA Studio.
LT Replication from Non-SAP Source Systems

[Figure labels: non-ABAP source system (application table, DB trigger, logging table); DB connection; SAP LT Replication Server (read engine, mapping & transformation engine, write engine); DB connection; SAP HANA (application tables); SAP HANA Studio]

- SAP LT Replication Server transfers all metadata table definitions from the non-ABAP source system to the HANA system.
- From the HANA Studio perspective, non-SAP source replication works the same as for SAP sources.
- When a table replication is started, SAP LT Replication Server creates logging tables in the source system.
- The read engine is created in the SAP LT Replication Server.
- The connection to the non-SAP source system is established as a database connection.
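The trigger-based change capture described for SAP LT Replication Server can be sketched in miniature. This is only an illustration of the concept, not SLT itself: SQLite from the Python standard library stands in for both the source and the target, and every name (`app_table`, `logging_table`, `setup_source`, `setup_target`, `replicate`) is hypothetical.

```python
import sqlite3

def setup_source(conn):
    """Create an application table, a logging table, and DB triggers (all names hypothetical)."""
    conn.executescript("""
        CREATE TABLE app_table (id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE logging_table (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op  TEXT NOT NULL,
            id  INTEGER NOT NULL
        );
        CREATE TRIGGER trg_ins AFTER INSERT ON app_table
        BEGIN INSERT INTO logging_table (op, id) VALUES ('I', NEW.id); END;
        CREATE TRIGGER trg_upd AFTER UPDATE ON app_table
        BEGIN INSERT INTO logging_table (op, id) VALUES ('U', NEW.id); END;
        CREATE TRIGGER trg_del AFTER DELETE ON app_table
        BEGIN INSERT INTO logging_table (op, id) VALUES ('D', OLD.id); END;
    """)

def setup_target(conn):
    """Create the replicated application table on the target side."""
    conn.execute("CREATE TABLE app_table (id INTEGER PRIMARY KEY, amount REAL)")

def replicate(source, target):
    """Read logged changes in order, apply them to the target, and clear the log.

    Returns the number of log entries processed.
    """
    entries = source.execute(
        "SELECT seq, op, id FROM logging_table ORDER BY seq").fetchall()
    for seq, op, key in entries:
        if op == "D":
            target.execute("DELETE FROM app_table WHERE id = ?", (key,))
        else:
            # The log only records the key; fetch the current row from the source.
            row = source.execute(
                "SELECT id, amount FROM app_table WHERE id = ?", (key,)).fetchone()
            if row is not None:  # the row may have been deleted again since
                target.execute(
                    "INSERT OR REPLACE INTO app_table (id, amount) VALUES (?, ?)", row)
        source.execute("DELETE FROM logging_table WHERE seq = ?", (seq,))
    source.commit()
    target.commit()
    return len(entries)
```

Logging only the key and operation keeps the triggers cheap on the source; the reader picks up the current row image when it processes the log, which is one common design for trigger-based capture and is assumed here rather than taken from the slides.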
SAP Landscape Transformation Replication Server
Product Road Map Overview - Key Themes and Capabilities

Today (Release DMIS 2011 SP6)
- Strategic developments
  - Replication from ABAP to ABAP systems (covering the complete SAP Business Suite)
  - Data provisioning for SAP BW 7.3 or higher and SAP Data Services 4.2
  - Evolved and integrated solution as part of SAP's RTDP strategy
- New features
  - 1:N replication for non-ABAP source systems
  - Replication logging feature for backup and recovery
  - Support of views as source objects
  - Filtering option for records in source system
- Continuous improvements
  - Enhanced monitoring capabilities
  - Simplified administration
  - Support of replication to non-ABAP systems (today already available as project solution)

Planned Innovations
- Strategic developments
  - Transactional consistency for complex objects
  - Open interface to feed analytical non-ABAP target systems from ABAP source systems
- New features
  - Templates to manage and reuse settings across tables, configurations, and systems
  - Automated parallelization for replication
  - Integrated consistency check with automated repair mode
  - SAP BW scenario: alternative for extracting data for certain complex objects
  - Preview mode for test runs
- Continuous improvements
  - Automated adaptation of replication after operational events like NZDT, OS/DB migration or system refresh

Future Direction
- Strategic developments
  - Object-based replication
  - Enhanced troubleshooting framework with self-repair functionality
  - Alternative for extracting data for almost all complex objects to enable real-time replication and to reduce the transfer volume for SAP BW
  - Simulation and debugging engine for transformation rules
  - Manage execution, monitoring, or troubleshooting on mobile devices
  - Heterogeneous fallback and data synchronization solution for Suite on SAP HANA
  - Optimized delta recording for SAP HANA as a source database
SAP Replication Server
Real-time data, delivered reliably, enterprise-wide

SAP Replication Server is a real-time, low-impact database replication solution with rich enterprise-level features for both SAP and non-SAP databases and applications. It securely, reliably, and rapidly moves and synchronizes data across multiple data sources in real time to deliver:

- High Availability / Disaster Recovery: SAP Replication Server keeps full up-to-the-minute copies of data available and standing by. If primary systems fail, the synchronized replicates take up the workload without interruption.
- Real-Time Reporting and Consolidation: With SAP Replication Server delivering consolidated, real-time data to analytics systems, every part of the business can make decisions based on the most current data.
- Data Distribution: Data sharing and synchronization enable organizations to leverage all their data across heterogeneous systems, with confidence that it is current and consistent.
SAP Replication Server for SAP HANA 15.7 SP200

[Figure labels: Source DB (SAP HANA, SAP ASE, Oracle, MS SQL, IBM DB2); SAP Replication Server for HANA; LAN/WAN; ECH]

SAP Replication Server provides real-time or scheduled transactional replication for SAP HANA.
1. Real-time replication solution enables both in-bound and out-bound HANA data movement
   a) Replication Agent for Oracle, ASE, DB2 and MS SQL (RAX) provides non-intrusive, low-latency change data capture for both DDL and DML
   b) Replication Agent for HANA (RAH) enables HANA-to-HANA replication for both DDL and DML, optimized for real-time HANA data distribution and reporting
   c) Express Connect for HANA (ECH) leverages the HANA native driver's bulk capability for better performance
2. Support for both SAP Business Suite and non-Business Suite source applications
3. Heterogeneous direct load materialization (a.k.a. initial load) without downtime
4. Preserves transactional consistency between source and target databases
5. Flexible deployment over LAN/WAN, with multiple-sources-to-multiple-targets topology
6. Data Assurance support to ensure distributed data consistency
Heterogeneous Direct Load Materialization (DLM)

[Figure labels: Source DB (SAP ASE, Oracle, MS SQL, IBM DB2); SAP Replication Server for HANA; ECH]

1. Direct load materialization optimized for large data volumes with zero downtime
   a) Replication to other tables is not suspended during direct load materialization
2. Seamless integration with Replication Server
   a) Integrated with the create subscription command for table-level replication definitions
   b) Completely eliminates the need for manual / bulk materialization
3. Multiple parallel threads can be configured to load data from one primary table to its corresponding replicate table
   a) The default number of threads is 5
   b) Multiple tables can be configured for materialization in parallel
4. Built-in monitoring of materialization progress for ease of management
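The idea of loading one primary table with several parallel threads can be sketched as follows. This is a toy model, not Replication Server behavior: rows are plain Python tuples, "loading" a chunk just copies it, and the names `split_into_chunks` and `materialize` are hypothetical. Only the default of 5 threads comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 5  # default thread count stated on the slide

def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty slices when there are fewer rows than chunks
            chunks.append(rows[start:end])
        start = end
    return chunks

def materialize(primary_rows, threads=DEFAULT_THREADS):
    """Copy primary_rows into a new replicate list using several loader threads."""
    chunks = split_into_chunks(primary_rows, threads)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each worker "loads" one chunk; pool.map returns results in chunk order.
        loaded = list(pool.map(list, chunks))
    return [row for chunk in loaded for row in chunk]
```

Splitting by contiguous ranges keeps each loader independent of the others, which is what makes the per-table thread count a simple tuning knob in this sketch.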
SAP Replication Server Data Assurance for SAP HANA

1. Ensures data consistency between sources and SAP HANA targets
2. Highly scalable and can be deployed flexibly to meet high-performance and complex topology requirements
3. Supported database types:
   a) HANA
   b) IQ
   c) ASE
   d) Oracle
   e) MS SQL
   f) IBM DB2

[Screenshots: Create CompareSet wizard; SAP Replication Server - DA monitoring of a comparison job]
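A consistency check between a source and a target can be pictured as a keyed row comparison. The sketch below is an assumption about the general technique, not the Data Assurance implementation: tables are modeled as `{key: row_tuple}` dictionaries, rows are compared by SHA-256 digest, and `row_digest` and `compare` are hypothetical names.

```python
import hashlib

def row_digest(row):
    """Return a SHA-256 digest over the values of one row."""
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def compare(source, target):
    """Compare two {key: row_tuple} mappings and report inconsistencies.

    missing    - keys present in the source but not in the target
    orphaned   - keys present in the target but not in the source
    mismatched - keys present in both whose row digests differ
    """
    missing = sorted(k for k in source if k not in target)
    orphaned = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {"missing": missing, "orphaned": orphaned, "mismatched": mismatched}
```

Comparing digests instead of full rows means only a key and a fixed-size hash per row would need to cross the network in a distributed setup, which is the motivation for hashing in this sketch.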
Real-Time HANA-to-HANA Replication

1. Replicates data from HANA to HANA, optimized for replication and data distribution
2. Supports multiple replication modes
   a) Transactional Consistency
   b) Eventual Consistency
   c) Change Data Capture
   d) Transactional Change Data Capture
3. Highly scalable and parallel task-based execution for both initial load and replication
4. Captures and replicates both DDL and DML
SAP Replication Server
Product road map overview - key themes and capabilities

Key themes: Real-time replication for SAP HANA; Real-time replication to SAP HANA from SAP Business Suite; Cloud data replication

Today: SAP Business Suite support & SAP HANA replication
- ASE 15.7 SP100 feature support
- Integrated ASE/SRS for Business Suite
- Planned and unplanned downtime support for SAP Business Suite/ASE
- HANA replication from ASE, Oracle, DB2 and MS SQL for non-SAP applications
- Heterogeneous materialization for HANA
- RSME enhancement (JBoss certification)
- Data assurance: Business Suite ASE, HANA, Oracle & IQ
- SRS and Data Services integration phase I

Planned Innovations: HANA replication for Business Suite & Zero Data Loss (ZDL) for ASE
- HANA replication for Business Suite on ASE, Oracle, DB2 and MS SQL
- DDL replication support for HANA
- HANA out-bound replication for real-time reporting
- Data assurance for HANA (DB2, MS SQL)
- Oracle XStream API integration
- ZDL support for HA/DR using synchronous replication
- Report off-loading (use standby for read-only reporting)
- SRS and Data Services integration phase II

Future Direction: Unified RTDP replication platform
- Integration of SRS capabilities into HANA engine
- HANA Studio integration
- Extend integrated HA/DR, synchronous replication feature to all ASE customers
- Advanced admin & monitoring support for replication via SCC
- Cloud data replication
- Hadoop integration
- Replicate out of non-database sources and databases that don't expose a log API

This is the current state of planning and may be changed by SAP at any time.
SAP Data Services 4.2

SAP Data Services (DS) is suited for batch data integration, with HANA-optimized capabilities for transforming, cleansing*, and integrating (bulk or delta) structured and unstructured* data from many different sources (SAP and non-SAP) to the target (SAP HANA).

Native support for 40+ sources and interfaces:
- SAP Business Suite, SuccessFactors, RDBMS, 3rd-party apps
- Hadoop/Hive
- Text and binary files, XML, Excel, JMS, web sources

SAP Data Services capabilities: connectivity, transformations, quality

[Figure labels: Data Services; HANA Studio; SAP in-memory computing; SAP HANA]

* Data Integrator (for ETL only) is included with most HANA packages. A full Data Services license is required to use Data Quality and Text Data Processing.
Dataflow Optimization Using HANA Calculation Views
Push down data transformation to HANA for faster and more efficient processes

- Data flows are optimized with SQL and with SAP HANA SQLScript/L-based calculation views
- More operations are pushed down into HANA with calculation view optimization than with SQL optimization
- Table READ, FUNCTIONS, ORDER BY, GROUP BY, JOIN, and MERGE operations are pushed down to HANA using calculation views
- Loading of target HANA tables is done within the calculation view
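The general effect of pushdown can be illustrated with a small, database-agnostic sketch. It does not use Data Services or HANA: SQLite from the Python standard library stands in for the database, and the tables `emp` and `pers`, their contents, and the functions `build_db`, `client_side`, and `pushed_down` are all hypothetical. Both functions produce the same rows; the difference is where the filter, join, function call, and sort are executed.

```python
import sqlite3

def build_db():
    """Create a small in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp  (empno INTEGER PRIMARY KEY, deptno INTEGER);
        CREATE TABLE pers (empno INTEGER PRIMARY KEY, fname TEXT);
        INSERT INTO emp  VALUES (1, 30), (2, 10), (3, 20);
        INSERT INTO pers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    """)
    return conn

def client_side(conn):
    """Pull both tables to the client, then filter, join, transform and sort there."""
    emp = conn.execute("SELECT empno, deptno FROM emp").fetchall()
    pers = dict(conn.execute("SELECT empno, fname FROM pers").fetchall())
    joined = [
        (empno, pers[empno].upper(), deptno)
        for empno, deptno in emp
        if empno > 1 and empno in pers
    ]
    return sorted(joined, key=lambda row: row[2])

def pushed_down(conn):
    """Let the database perform the filter, join, function call and sort."""
    return conn.execute("""
        SELECT e.empno, UPPER(p.fname), e.deptno
        FROM emp e JOIN pers p ON p.empno = e.empno
        WHERE e.empno > 1
        ORDER BY e.deptno
    """).fetchall()
```

In the client-side variant every row of both tables crosses the connection before any work is done; in the pushed-down variant only the final result does, which is the motivation behind pushing more of the data flow into the database.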
Dataflow Optimization Comparison - SQL and HANA Calculation Views
[Figure: data flow diagrams with panels labeled SQL and Calc View]
Optimized SQL & HANA Calculation Views

Optimized SQL (four separate statements):

    SELECT "EMP"."EMPNO", "EMP"."DEPTNO"
    FROM "HARI"."EMP" "EMP"
    WHERE ( "EMP"."EMPNO" > 500000)
    ORDER BY "EMP"."DEPTNO" ASC

    SELECT { fn ucase( "PERS_1"."FNAME" ) }, "PERS_1"."ADDRESS", "PERS_1"."EMPNO"
    FROM "HARI"."PERS" "PERS_1"

    SELECT "EMP_2"."EMPNO", "EMP_2"."DEPTNO"
    FROM "HARI"."EMP" "EMP_2"
    WHERE ( "EMP_2"."EMPNO" < 500000)
    ORDER BY "EMP_2"."DEPTNO" ASC

    SELECT { fn ucase( "PERS"."FNAME" ) }, "PERS"."ADDRESS", "PERS"."EMPNO"
    FROM "HARI"."PERS" "PERS"

Optimized CalcView (one generated procedure):

    CREATE PROCEDURE "HARI"."DS_76F91B_CV_LDR" (OUT VAR_DS_76F91B_TT "HARI"."TT_EMP_DTL")
    LANGUAGE SQLSCRIPT READS SQL DATA AS
    BEGIN
        -- Branch 1: employees with EMPNO < 500000, joined to PERS
        PERS = SELECT "EMPNO" "EMPNO", "FNAME" "FNAME", "ADDRESS" "ADDRESS"
               FROM "HARI"."PERS";
        ORDER_BY1 = SELECT "EMP_2"."EMPNO" "EMPNO", "EMP_2"."DEPTNO" "DEPTNO"
                    FROM "HARI"."EMP" "EMP_2"
                    WHERE ( "EMP_2"."EMPNO" < 500000)
                    ORDER BY "EMP_2"."DEPTNO" ASC;
        JOIN_1 = SELECT "ORDER_BY1"."EMPNO" "EMPNO", { fn ucase( "PERS"."FNAME" ) } "FNAME", "PERS"."ADDRESS" "ADDRESS"
                 FROM :PERS "PERS"
                 INNER JOIN :ORDER_BY1 "ORDER_BY1" ON "PERS"."EMPNO" = "ORDER_BY1"."EMPNO";

        -- Branch 2: employees with EMPNO > 500000, joined to PERS
        PERS_1 = SELECT "EMPNO" "EMPNO", "FNAME" "FNAME", "ADDRESS" "ADDRESS"
                 FROM "HARI"."PERS";
        ORDER_BY2 = SELECT "EMP"."EMPNO" "EMPNO", "EMP"."DEPTNO" "DEPTNO"
                    FROM "HARI"."EMP" "EMP"
                    WHERE ( "EMP"."EMPNO" > 500000)
                    ORDER BY "EMP"."DEPTNO" ASC;
        JOIN_2 = SELECT "ORDER_BY2"."EMPNO" "EMPNO", { fn ucase( "PERS_1"."FNAME" ) } "FNAME", "PERS_1"."ADDRESS" "ADDRESS"
                 FROM :PERS_1 "PERS_1"
                 INNER JOIN :ORDER_BY2 "ORDER_BY2" ON "PERS_1"."EMPNO" = "ORDER_BY2"."EMPNO";

        -- Merge both branches and return the result
        MergeTx = SELECT "EMPNO", "FNAME", "ADDRESS" FROM :JOIN_1
                  UNION ALL
                  SELECT "EMPNO", "FNAME", "ADDRESS" FROM :JOIN_2;
        VAR_DS_76F91B_TT = SELECT * FROM :MergeTx;
    END;
SAP Data Services
Product road map overview - key themes and capabilities

Key themes: Real-time data capture; Data preview of Hadoop data; Enhanced unstructured support

Today (Release 4.2 SP1)
- Simple
  - Data quality advisor for cleansing & matching rules
  - Workbench support for batch data processing
- Big data
  - Real-time changed data capture with Replication Server and LT Replication Server
  - RESTful web services support
  - Enhanced DS Adapter SDK framework
  - Hive/HDFS support
  - Pushdown transformation to Hadoop
- Governance
  - Lineage into HANA analytical views
  - Data profiling natively in HANA

Planned Innovations
- Simple
  - Enhanced user operations in Designer and Management Console
- Big data
  - Tight integration with Replication Server
  - JSON file format support
  - Performance optimization for HANA and other data sources
  - Data preview for HDFS files
  - Data preview for Hive tables
  - Support HiveServer2
  - Support Apache Hadoop 2.2
- Governance
  - Stored procedure support for metadata integrator
  - Data Quality Management global expansion
  - Job statistical dashboards

Future Direction
- Simple
  - Enhanced user experiences in Data Services UI tools
- Big data
  - HBase system support
  - Integrated job monitoring
  - Enhanced unstructured data support for archived files, email and mail attachments
  - Expanded data sources support
  - Leverage in-memory technology to provide high performance and powerful new functionalities to enable smart information management systems
- Enterprise support
  - Data masking for classified data
  - Data Quality Management global and domain expansion

This is the current state of planning and may be changed by SAP at any time.
SAP Event Stream Processor
[Figure labels: Input Streams (sensors, messages, transactions, market data, clicks); adapters; ESP Engine; alerts; dashboards; applications; SAP HANA; SAP HANA Studio]
What Sets SAP ESP Apart

- Performance
  - Scalable for extreme throughput
  - Consistent low latency
  - Only CEP vendor to submit to STAC benchmarking
- User productivity
  - CCL is familiar and simple, with a minimal learning curve
  - Studio with visual editor as well as text editing and testing tools
  - SPLASH scripting: overcomes limitations of SQL; flexibility and productivity
- Private cloud architecture
  - Scalable, dynamic
- State management
  - Unique ability to automatically apply incoming events as inserts/updates/deletes to a table
  - Simplifies modeling
  - Extremely efficient
- Dynamic
  - Add continuous queries to a live system
- High availability
  - Hot-hot with auto-failover
- Advanced subscriptions
  - Initial state followed by updates
  - Subscriptions with predicates
- SCC Operations Console
  - Dedicated browser-based monitoring and control
- Part of the SAP Real-time Data Platform
  - Comprehensive integrated suite of capabilities
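The state management point, applying incoming events as inserts, updates, and deletes to a table, can be sketched generically. This is not CCL or the ESP engine: the table is a plain Python dictionary keyed by a primary key, the event shape `(opcode, key, values)` is an assumption made for this example, and `apply_event` and `apply_stream` are hypothetical names.

```python
def apply_event(table, event):
    """Apply one (opcode, key, values) event to a keyed in-memory table."""
    opcode, key, values = event
    if opcode == "insert":
        if key in table:
            raise KeyError(f"insert for existing key {key!r}")
        table[key] = dict(values)
    elif opcode == "update":
        if key not in table:
            raise KeyError(f"update for unknown key {key!r}")
        table[key].update(values)  # only the supplied columns change
    elif opcode == "delete":
        table.pop(key)  # raises KeyError if the key is unknown
    else:
        raise ValueError(f"unknown opcode {opcode!r}")
    return table

def apply_stream(events):
    """Fold a sequence of events into the current state of the table."""
    table = {}
    for event in events:
        apply_event(table, event)
    return table
```

For example, an insert for key "IBM" followed by an update of its price leaves a single row holding the latest price, so a subscriber sees current state rather than the raw event history.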
SAP Event Stream Processor
Product road map overview - key themes and capabilities

Today (Release 5.1 SP04)
- SAP HANA integration
  - High-speed data capture using SAP HANA
  - Join with HANA tables in ESP projects
  - ESP plug-in for HANA Studio
  - Data type alignment
- Expanded integration options
  - Web services & SAP RFC
  - Adapter framework for rapid integration customization
- Elastic ESP
  - Run-time parallelization of stream operators for increased scalability
- Zero data loss options
  - Project-wide checkpoints and guaranteed delivery subscriptions

Planned Innovations
- Streaming option for SAP HANA
  - ESP installation via HANA installer
  - Integrated monitoring via HANA cockpit and SAP DB control center
  - ESP project deployment units
- Expanded integration options
  - Web socket support
- Usability/simplification
  - Database services management
  - Cluster configuration

Future Direction
- Streaming engine within the SAP HANA landscape
  - Streaming leverages HANA platform services
  - Common HANA adapter framework
  - Events on HANA tables
- ESP cloud deployment
  - Dynamic provisioning
  - Multi-tenant
- Elastic ESP
  - Distributed projects
  - System optimization tools

See Appendix for abbreviations.
This is the current state of planning and may be changed by SAP at any time.
SAP SQL Anywhere

SQL Anywhere is a comprehensive suite that provides data management, cloud infrastructure, synchronization and data exchange technologies that enable the rapid development and deployment of database-powered applications in embedded, small and medium business (SMB), SaaS, and remote and mobile environments.

- Data Management (SQL Anywhere, UltraLite): Store data in servers, desktops, laptops, tablets and smartphones
- Cloud Infrastructure (SQL Anywhere, on-demand): Manage many databases in a cloud environment
- Data Exchange (MobiLink, SQL Remote): Synchronize information to and from enterprise back-ends
Moving Data Through Many Environments
[Figure labels: Enterprise Systems; Data Center; Office; Individual; Departmental; Workgroup; Embedded; Mobile & Remote]
SAP SQL Anywhere
Product road map overview - key themes and capabilities

Key themes: Zero-admin and cloud ready; Autonomic data management; Deeper ecosystem integration

Today
- SQL Anywhere 16
  - Zero-admin, full-featured RDBMS
  - Fully embeddable in ISV systems
  - Bi-directional synchronization with enterprise DBMS systems
  - Areas of enhancement: performance and scalability, data distribution, security, developer productivity
- SQL Anywhere, on-demand 1.0
  - Cloud data management infrastructure featuring:
    - Elastic data provisioning
    - Tenant data isolation
    - Multi-tenant security
  - Scalability for ISVs
    - Full relational power
    - ISV-focused tooling
    - Support for multiple database versions

Planned Innovations
- Autonomic data management
  - Recognize and adapt to modern hardware configurations
  - Proactively monitor server health
  - Continuous improvements to self-tuning capabilities
  - Support JavaScript external environment
- SAP ecosystem integration
  - Synchronization to SAP landscapes
  - Integration with SAP Mobile Platform

Future Direction
- Autonomic data management
  - Continued focus on self-management for on-premise, cloud, and virtualization scenarios
  - Keep up with changing OS landscape, language drivers/APIs, and developer tool support
- SAP ecosystem integration
  - SAP Real-time Data Platform integration
    - Embedded and satellite DB server
    - Data federation
    - Tooling
  - SAP Mobile Platform integration
- Extended cloud functionality
  - Cloud architecture maintenance enhancements
  - SQL Anywhere multi-version database support

See Appendix for abbreviations.
This is the current state of planning and may be changed by SAP at any time.
3rd Party Integration - SAP HANA Certification Program

SAP partners can enroll in the SAP HANA Certification Program to integrate their ETL tools with SAP HANA for data loading use case scenarios.
- Certification is available for JDBC and ODBC interfaces
- Partners can initiate the discussions through their SAP partner manager
- SAP HANA PM & ICC teams can provide additional details about the certification requirements

Certified partner ETL tools:
- Informatica Inc.
- SAS Inc.
- IBM (Q2-2014)
Key Takeaways
- SAP HANA Platform - Overview
- Data Provisioning Options with SAP HANA & Roadmaps
- 3rd Party Integration Tools Certified against SAP HANA
Further Information

SAP Data Services
- http://scn.sap.com/community/data-services

Experience SAP HANA
- http://www.saphana.com/

SAP Event Stream Processor
- http://www.sap.com/solutions/technology/database/complex-eventprocessing/index.epx
- http://scn.sap.com/community/event-processing
- http://scn.sap.com/community/developer-center/sybase-esp

SAP LT Replication Server
- http://service.sap.com/hana
- http://scn.sap.com/community/replication-server
Thank you Prasad Illapani Product Management & Strategy (Big Data & SAP HANA) SAP Labs LLC, Bellevue, WA Email: prasad.illapani@sap.com
© 2014 SAP AG. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices.