SAP HANA From Relational OLAP Database to Big Data Infrastructure



Similar documents
SAP HANA From Relational OLAP Database to Big Data Infrastructure

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Providing real-time, built-in analytics with S/4HANA. Jürgen Thielemans, SAP Enterprise Architect SAP Belgium&Luxembourg

Exploring the Synergistic Relationships Between BPC, BW and HANA

Oracle Big Data SQL Technical Update

SAP HANA SPS 09 - What s New? HANA IM Services: SDI and SDQ

SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here

In-memory computing with SAP HANA

BW-EML SAP Standard Application Benchmark

SAP Database Strategy Overview. Uwe Grigoleit September 2013

Extend your analytic capabilities with SAP Predictive Analysis

SAP and Hortonworks Reference Architecture

Enhance your Analytics using Logical Data Warehouse and Data Virtualization thru SAP HANA smart data access SESSION CODE: 0210

Luncheon Webinar Series May 13, 2013

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Overview of How SAP IQ Augments the SAP Technology Landscape with Temperature Sensitive Data Management

SAP HANA SPS 09 - What s New? SAP HANA Scalability

SAP BW on HANA : Complete reference guide

SAP Business Warehouse Powered by SAP HANA for the Utilities Industry

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

Big Data Analytics Using SAP HANA Dynamic Tiering Balaji Krishna SAP Labs SESSION CODE: BI474

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

A Few Cool Features in BW 7.4 on HANA that Make a Difference

Key Attributes for Analytics in an IBM i environment

Hadoop: Embracing future hardware

SAP BW 7.40 Near-Line Storage for SAP IQ What's New?

I/O Considerations in Big Data Analytics

SAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise

Oracle Database 11g Comparison Chart

Gain Contextual Awareness for a Smarter Digital Enterprise with SAP HANA Vora

Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC,

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Dell s SAP HANA Appliance

RDP300 - Real-Time Data Warehousing with SAP NetWeaver Business Warehouse October 2013

The Arts & Science of Tuning HANA models for Performance. Abani Pattanayak, SAP HANA CoE Nov 12, 2015

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software

SAP HANA SPS 09 - What s New? SAP HANA Multitenant Database Containers

Big Data Analytics - Accelerated. stream-horizon.com

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

Interactive data analytics drive insights

SAP Real-time Data Platform. April 2013

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

BIG DATA-AS-A-SERVICE

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Big Data Analytics - Accelerated. stream-horizon.com

The Future of Data Management

Big Data Technologies Compared June 2014

Agenda. ! Strengths of PostgreSQL. ! Strengths of Hadoop. ! Hadoop Community. ! Use Cases

The IBM Cognos Platform for Enterprise Business Intelligence

Accelerating Hadoop MapReduce Using an In-Memory Data Grid

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance

SAP BW: The Real-time Data Application Platform How SAP BW uses the SAP database options

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

Next-Gen Big Data Analytics using the Spark stack

Moving From Hadoop to Spark

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Real-Time Analytics for Big Market Data with XAP In-Memory Computing

Microsoft Analytics Platform System. Solution Brief

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Introducing Oracle Exalytics In-Memory Machine

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)

Leveraging SAP HANA & Hortonworks Data Platform to analyze Wikipedia Page Hit Data

Oracle Database 12c Plug In. Switch On. Get SMART.

Hadoop in the Enterprise

Apache Hadoop: Past, Present, and Future

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Lofan Abrams Data Services for Big Data Session # 2987

Testing Big data is one of the biggest

Accelerating the path to SAP BW powered by SAP HANA

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER

Postgres Plus Advanced Server

Reimagining Business with SAP HANA Cloud Platform for the Internet of Things

REAL-TIME BIG DATA ANALYTICS

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

SAP HANA implementation on SLT with a Non SAP source. Poornima Ramachandra

SQL Server 2014 New Features/In- Memory Store. Juergen Thomas Microsoft Corporation

Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework

Roadmap Talend : découvrez les futures fonctionnalités de Talend

SAP HANA In-Memory Database Sizing Guideline

EMC DATA PROTECTION FOR SAP HANA

SAP HANA Cloud Platform Frequently Asked Questions - Business

Baader Investment Conference. Dr. Werner Brandt, CFO, SAP AG Munich, September 24, 2013

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

OLTP Meets Bigdata, Challenges, Options, and Future Saibabu Devabhaktuni

Architectures for Big Data Analytics A database perspective

Implement Hadoop jobs to extract business value from large and varied data sets

SAP Big Data and Cloud Application Development. Mark Mumy Director, Enterprise Architecture and Big Data

Getting Real Real Time Data Integration Patterns and Architectures

SAP HANA In-Memory in Virtualized Data Centers. Arne Arnold, SAP HANA Product Management January 2013

Information Architecture

SQL Server Consolidation Using Cisco Unified Computing System and Microsoft Hyper-V

Transcription:

SAP HANA From Relational OLAP Database to Big Data Infrastructure Anil K Goel VP & Chief Architect, SAP HANA Data Platform WBDB 2015, June 16, 2015 Toronto

SAP Big Data Story Data Lifecycle Management Benchmarking SAP HANA Big Data Platform Complex Event Processing Enterprise Data Consolidation Data Federation 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 2

Big Data Dimensions in SAP Products / Technologies The Native and Open Strategy SAP HANA Big Data Platform Variety In-memory/On-disk Column/Disk stores Full-text capabilities Support for GIS, time series, graph data, Velocity SQL-Anywhere for distributed data capturing, local processing, efficient propagation Loading from external data sources via Data Services or Replication Server Event Stream Processor for CEP / low-latency data processing Volume Multi-TB databases with SAP HANA in-memory DBMS 12PB data warehouse with SAP IQ storage Hadoop for even larger stores 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 3

SAP HANA Big Data Platform Holistic End-to-end Ecosystem SAP HANA Big Data Platform HANA Data Management Platform Information Management Text Search Graph Geospatial Predictive SAP HANA In-Memory 0.0sec Instant Results HANA Dynamic Tiering Warm Data HADOOP HANA SOE Infinite Storage Raw / Archive Data Complex Event Processing / Smart Data Streaming Bi-directional Replication / Remote Data Sync Administration Monitoring Operations User Management Security 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 4

SAP HANA Platform for the Complete Solution Space The Value Dimension SAP HANA Big Data Platform 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 5

Dynamic Tiering (DT) Data Lifecycle Management DDL, DML, AUTH, DDL, DML, AUTH, Dynamically partitioned data sets Data Life Cycle / Aging Storage hierarchies Unified system landscape simplification Seamless growth and aging of data Integrated query processing Independent management of storage hierarchies Common Studio & Installer DT Engines HANA Engines DT Storage HANA Storage Consistent Backup & Recovery HANA 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 6

Dynamic Tiering (DT) Data Lifecycle Management DDL: CREATE TABLE table_name table_definition USING EXTENDED STORAGE Transactions DT Storage participates in distributed transactions [ICDE2013] Internal optimized 2PC like coordination Recovery integrated with SAP HANA including point-in-time recovery Query Processing Tightly integrated DQP across HANA and DT Multiple distributed query processing strategies Remote Scan Semijoin Table Relocation Union Plan 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 7

Smart Data Access (SDA) Data Federation Access layer for various remote data sources, e.g. Hadoop, relational and non-relational systems Extensible federation layer based Expose tables of remote data sources as virtual tables Uses capability description for remote data source Realizes HANA Open Strategy for Big Data 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 8

Smart Data Access (SDA) Data Federation Register Remote Date Sources CREATE REMOTE SOURCE HIVE1 ADAPTER "hiveodbc" CONFIGURATION DSN=hive1 WITH CREDENTIAL TYPE PASSWORD USING user=dfuser;password=dfpass ; CREATE VIRTUAL TABLE "VIRTUAL_PRODUCT" AT "HIVE1"."dflo"."dflo"."product"; SELECT product_name, brand_name FROM "VIRTUAL_PRODUCT"; Federated Query Processing Capability-based framework, e.g. CAP_JOINS : true or CAP_JOINS_OUTER : true HANA leverages statistics available via remote sources (e.g. Hive MetaStore) for cost-based optimization 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 9

HANA & Hadoop Integration Enterprise Data Consolidation Hadoop Integration SQL on Hadoop via SDA (virtual tables) Hive or Spark Execution of MR-Jobs via HANA (Virtual Functions) Access to HDFS (via virtual function) Integration on storage & processing Use Hadoop storage (HDFS) Push processing to Hadoop (code to data) 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 10

Remote Materialization for Hive Enterprise Data Consolidation Batch processing in Hive or MapReduce implies high latencies and response times Use remote materialization with configurable data freshness requirements WITH HINT (USE_REMOTE_CACHE) for SQL query Hive extension checks for hint and checks for cached result with configured freshness guarantee Non-transactional access for Hadoop-side data Cached data is stored in HDFS Many enhancements possible On-the-fly materialized view creation Remote and / or local 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 11

WBDB 2014: SAP Standard Application Benchmarks Benchmarking Goals Analyze and optimize performance of SAP components and business scenarios Compare the performance of computer systems from different vendors Create an open competition between different vendors by providing the possibility to publish results Provide input for initial sizing Throughput numbers are defined in business application terms, e.g. fully processed order line items per hour Business throughput is mapped onto the resource consumption of the most prominent hardware components, incl. CPU (SAPS) and memory Method is monitored and approved by the SAP Benchmark Council Business-relevant results Free from artifacts, customers can rely on the results Hardware and software combination must be available for customers Only configurations that can be used in production environments are permitted 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 12

WBDB 2014: BW-EML Standard Application Benchmark Facts Benchmarking BW-EML Benchmark Workload: Mixture of multiuser query load accessing all 10 available InfoProviders Simultaneous Delta Loads: During high load reporting activity, static operational data is extended with delta data every 5 minutes with a total of 1/1000 the original data set into all InfoProviders High Load Phase: High load phase is minimum 1 hour. Benchmark Key Performance Indicator: The key figure of this benchmark is the number of ad-hoc navigation steps/hour. Multiuser query load: Each benchmark user runs all queries / navigation steps sequentially in a configurable number of loops Navigation Steps per Loop: Total number of navigation steps per loop: 40 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 13

SAP Big Data Benchmarking Considerations Benchmarking Data types and life cycle across the platform Multi-store: transactional (in-memory/on-disk), MPP, Hadoop, Dynamic Tiering / Aging / Data Life Cycle Impendence mismatch Streaming Multi-data Multi-workload Analytical: predictive, machine learning, typical Big Data, Transactional: business data, systems of record, low latency, Hybrid: Distributed data center (Industrial) IoT as a special use case Machine data Series data Edge data processing 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 14

SAP Big Data Benchmarking Considerations Benchmarking Data Generation Customer specific Realistic Scalable Dexter @SAP Myriad-Oligos @TU Berlin Diversity Applications Systems Languages 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 15

SAP HANA Big Data Platform Summary SAP HANA Native Big Data Capabilities Optimal Integration of Non- SAP Big Data Technologies into SAP Business Landscapes Application & Analytics Capabilities Hadoop / Others as Strategic Big Data Technologies for SAP Extend With Our Own Engines and Analytics (HANA SOE) 2015 SAP SE or an SAP affiliate company. All rights reserved. Public 16

Anil K Goel anil.goel@sap.com 2015 SAP SE or an SAP affiliate company. All rights reserved.