TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Naveen Narra, Senior Manager, TE Connectivity
Santha Kumar Rajendran, Enterprise Data Architect, TE
Balaji Krishna, Director, SAP HANA Product Mgmt., SAP
Session ID# A 4963

TE CONNECTIVITY (FORMERLY TYCO ELECTRONICS): COMPANY PROFILE
Regions: Americas, China, EMEA, Asia* (excluding China)
Sales: $4.4B, $4.8B, $2.3B, $2.4B
Design Centers: 10, 5, 3, 3
Manufacturing Sites: 38, 29, 8, 15
Engineers: 2,570; 2,020; 810; 2,100
$13.9B FY14 sales worldwide
*Including India

TE DATA ANALYTICS VISION
Critical Capabilities
Speed: from source to data to published insights
Self-Service BI: different solutions for different skills
Enterprise Data Platform: complete, secured, understood, trusted
Data Governance: data definitions, data lineage, data visibility
BU Data Labs: for data discovery and investigative analysis

TE ANALYTICS CONCEPTUAL ARCHITECTURE
Optimized to service the right data workload and analytical use.
Data Sources: ERP sources, non-ERP sources, emerging sources, external sources
Data Platforms (loaded via ETL): Enterprise Data Warehouse (data to run the business), Enterprise Data Hub (data to change the business)
Data Presentation, analytics to run the business: Business Objects, guided analytics, standard reporting
Data Presentation, analytics to change the business: data discovery / visualization, predictive analytics, machine learning
Data Governance / Data Security: manage data as an asset

TE ANALYTICS LOGICAL ARCHITECTURE
Structured data: SAP sources; other ERP / TED / Salesforce / Eloqua etc.
Volumes: 4 to 5 billion per month; 3 to 4 billion per month
EDW - Enterprise Data Warehouse: HANA (hot), Sybase IQ (warm)
EDW consumption: guided analytics, standard reporting, data discovery / visualization
Semistructured & unstructured data: machine data, social media, geospatial etc.
SAP Data Services (ETL)
EDH - Enterprise Data Hub (Hadoop)
EDH consumption: ODBC reporting, predictive analytics, machine learning

TE ANALYTICS LOGICAL ARCHITECTURE WITH VORA
Structured data: SAP sources; other ERP / TED / Salesforce / Eloqua etc.
Volumes: 4 to 5 billion per month; 3 to 4 billion per month
EDW - Enterprise Data Warehouse: HANA (hot), Sybase IQ (warm)
EDW consumption: guided analytics, standard reporting, data discovery / visualization
Vora
Semistructured & unstructured data: machine data, social media, geospatial etc.
EDH - Enterprise Data Hub (Hadoop)
EDH consumption: ODBC reporting, predictive analytics, machine learning

DATA TIERING USING HANA DYNAMIC TIERING
HOT STORE (in-memory): active tables / data marts, reporting layer
WARM STORE (on-disk): PSA, staging, historic data, snapshots
Dynamic Tiering is a warm storage option for HANA and an integral part of the HANA architecture.
Dynamic Tiering helps reduce the in-memory footprint by moving infrequently used data from memory to disk.
With BW, Dynamic Tiering helps push staging and write-optimized DSOs and change log tables to disk.
With Dynamic Tiering we can push all snapshots and historic data to disk.

DYNAMIC TIERING USE CASES
Use Case 1: PSA. The persistent staging area is the temporary landing zone for any data loaded into BW. We currently retain 3 to 15 days of data in PSA, which constitutes 8% of our HANA usage.
Use Case 2: Staging tables. We use staging tables in HANA to store data for processing, lookups, etc. They constitute 10% of our HANA usage.
Use Case 3: BW change log tables. These are used to determine delta records in BW. We retain 3 to 15 days of data in them, which constitutes 8% of our HANA usage.
Use Case 4: BW L1 staging DSOs. We use these as staging tables; they hold all the data loaded into BW at the most granular level, retain a massive volume of data, and constitute about 25% of our HANA usage.
Use Cases 5 & 6: Historic data and snapshots. We are planning to store all our historic data (data older than 3 years) and snapshots in DT for slow reporting.
SAP HANA (in-memory): reporting data marts
WARM STORAGE (disk-based storage): PSA (no reporting), staging (no reporting), change log (no reporting), DSO (no reporting), historic data, snapshots
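As a rough worked example, the shares quoted in use cases 1 to 4 can be added up to estimate how much of the HANA footprint is a candidate for warm storage. This is only a sketch: it assumes the four percentages refer to the same base and do not overlap, which the slide does not state, and the names in the code are made up for illustration.

```python
# Shares of HANA usage quoted in use cases 1 to 4, in percent.
# Assumption: the categories do not overlap and share the same base.
warm_candidates = {
    "PSA": 8,
    "Staging tables": 10,
    "BW change log tables": 8,
    "BW L1 staging DSOs": 25,  # the slide says "about 25 %"
}


def movable_share(shares):
    """Return the total percentage covered by the listed categories."""
    return sum(shares.values())


total = movable_share(warm_candidates)
print(f"Candidates for warm storage: about {total}% of HANA usage")
print(f"Remaining in the hot store: about {100 - total}%")
```

Under those assumptions, roughly half of the current HANA usage belongs to categories that are not used for reporting.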

HADOOP OVERVIEW

APACHE SPARK
Apache Spark is a fast and general engine for large-scale data processing.
Spark has an advanced DAG execution engine that supports cyclic data flow and in-memory computing.
Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
The Data Sources API provides a pluggable mechanism for accessing structured data through Spark SQL.
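To make the idea of a DAG execution engine concrete, here is a toy illustration in plain Python. It is not Spark code and does not reflect how Spark's scheduler is implemented: the stage names are hypothetical, and the sketch only shows that a job described as a graph of dependent stages can be run in an order where every stage follows the stages it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical job: each stage maps to the set of stages it depends on.
stages = {
    "read_logs": set(),
    "read_customers": set(),
    "parse_logs": {"read_logs"},
    "join": {"parse_logs", "read_customers"},
    "aggregate": {"join"},
}


def execution_order(dag):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(stages)
print(order)
```

Stages without a dependency between them, such as the two reads, could run in parallel; the sketch only fixes the ordering constraints.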

HADOOP/SPARK INTEGRATION WITH HANA
2012 - SP06: Hive added as a remote source; ODBC-based communication
SP07: Query optimization such as remote caching and join relocation
SP09: Reading HDFS directly; MapReduce job execution
SP10: Spark SQL added as a new remote source; Ambari launcher tile in HANA Cockpit
Hadoop components shown: Ambari, Hive, Mahout, Pig, YARN/MR, Spark, HBase, HDFS

THE JOURNEY SO FAR: HANA & HADOOP INTEGRATION
HANA & Hadoop Integration
SQL on Hadoop via SDA (virtual tables): Hive (SPS06)
Remote caching with Hive (SPS07)
Connectivity to Apache Spark using ODBC
Execution of MR jobs via HANA (virtual functions) and direct access to HDFS (SPS09)
Spark SQL adapter via SDA (SPS10)
Join relocation to Hadoop through Spark RDD
Unified admin through Ambari integration for Hortonworks
Key Benefits
Deep integration for storage & processing
Optimized data access between HANA & Hadoop
Data tiering to Hadoop for cold storage

SAP HANA VORA: THE POWER OF CONTEXT, IN-MEMORY
A massively distributed in-memory computing system that scales to 1000s of nodes, both on-premises and in the cloud, and simplifies big data processing for the business.
Capabilities: data hierarchies, semantic analysis & optimization, query acceleration, metadata catalog
Any Hadoop. For Cloud.
Apache Spark; Other

MOTIVATION - (SOME) EXISTING SOLUTIONS
Hadoop: MLlib, GraphX, HiveSQL, SparkSQL
Distributed SQL Databases: SAP HANA, Google F1, Facebook Presto, Amazon Redshift
No-SQL Databases: MongoDB (document store), Neo4j (graph store), Berkeley DB (key-value store), IBM Informix (time series store), Apache Lucene (text search)
Holistic, enterprise-ready, massive scale-out solution?

IN-MEMORY DATA FABRIC FOR ENTERPRISE + DISTRIBUTED COMPUTE
ALL IN-MEMORY: Enterprise Compute, Distributed Compute
Layers: CONSUME, COMPUTE, STORE
HANA (enterprise compute): OLTP + OLAP; scale up + scale out + tiering; appliance, TDI
Federated queries & programming model
Vora (distributed compute): massive scale-out; distributed file system, network storage, cloud persistence; any hardware

SAP HANA VORA: STRATEGIC POINT OF VIEW
Add functionality for enterprise applications: hierarchies, OLAP modeling
Boost SQL performance
Federate access across HANA and Hadoop
Integrate tooling
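As a hedged illustration of what hierarchy support means for OLAP-style analysis, the sketch below rolls leaf-level values up a parent-child hierarchy in plain Python. It is not Vora's API or syntax: the hierarchy, the node names, and the values are all made up, and the code only shows the kind of aggregation a hierarchy-aware engine performs.

```python
from collections import defaultdict

# Hypothetical parent-child hierarchy: node -> parent (None marks the root).
parent_of = {
    "World": None,
    "EMEA": "World",
    "Americas": "World",
    "Germany": "EMEA",
    "France": "EMEA",
    "USA": "Americas",
}

# Made-up measure values, recorded at leaf level only.
leaf_values = {"Germany": 5, "France": 3, "USA": 7}


def roll_up(parent_of, leaf_values):
    """Add each leaf value to the leaf itself and to all of its ancestors."""
    totals = defaultdict(int)
    for leaf, value in leaf_values.items():
        node = leaf
        while node is not None:
            totals[node] += value
            node = parent_of[node]
    return dict(totals)


totals = roll_up(parent_of, leaf_values)
print(totals["EMEA"], totals["Americas"], totals["World"])  # prints: 8 7 15
```

A query on any level of the hierarchy can then be answered from the rolled-up totals without rescanning the leaf records.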

SAP HANA VORA: EXTENDING THE SAP HANA PLATFORM
Applications: S/4HANA / HANA Live, SAP Business Warehouse, industry applications, partner applications
SAP HANA Platform: application services, database services, integration services
Vora: data hierarchies, query acceleration, semantic analysis & optimization, metadata catalog
Spark; Other; unstructured data

SAP HANA VORA ROADMAP
Today: Deliver enterprise analytics and HANA-Spark integration
Enable OLAP-style analytics on Hadoop data
Support for HDFS, Parquet, ORC and S3 data formats
Hierarchies on Hadoop data
Integration to SAP HANA through the Apache Spark Data Source API
LLVM technology to translate SQL code to C programs for faster performance
Planned Innovations: Deeper integration with SAP HANA
Modeler for Vora: simple web interface to model, build and query cubes
More OLAP features like UoM conversion and currency conversion
Extend HANA Data Lifecycle Management to Hadoop through Vora integration
Extend engine support for time series, graph, document store and disk-based processing
Enhanced Kerberos support
Future Direction: SAP ILM integration and beyond
Extend HANA integration to support ERP ILM scenarios for archived data in Hadoop
Distributed query processing using the native Vora processing engine
Cluster integration for different distributions (monitoring and admin)
HANA integration: HANA shell as an integrated part of Vora delivery
Security for Vora tables
SAP HANA Vora 1.1
This is the current state of planning and may be changed by SAP at any time.

FOLLOW US
Thank you for your time.
Follow us at @ASUG365