Hortonworks Data Platform for Hadoop and SAP HANA

Presenters:
Prasad Illapani, Big Data & SAP HANA Product Management & Strategy, SAP Labs LLC, Bellevue, WA
Bob Page, VP Partner Products, Hortonworks Inc., Palo Alto, CA

Legal Disclaimer
The information in this document is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP's strategy and possible future developments, products and/or platform directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise or legal obligation to deliver any material, code or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or through gross negligence. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

Agenda
- Hortonworks Data Platform: HDP 2.1 Enterprise Hadoop
- SAP HANA Platform for Big Data
- SAP HANA Platform and Hortonworks Data Platform: solution use cases and patterns
- Key takeaways

A Modern Data Architecture
- Applications: Business Analytics, Custom Applications, Packaged Applications
- Dev & data tools: Build & Test
- Data system: repositories (RDBMS, EDW, MPP) alongside Enterprise Hadoop (Governance & Integration, Data Access, Data Management, Security, Operations)
- Operations tools: Provision, Manage & Monitor
- Sources: OLTP, ERP, and CRM systems; documents and emails; web logs and click streams; social networks; machine-generated data; sensor data; geolocation data

HDP 2.1: Enterprise Hadoop (Hortonworks Data Platform)
- Governance & Integration: data workflow, lifecycle & governance (Falcon, Sqoop, Flume, NFS, WebHDFS)
- Data Access (engines 1 to N on YARN): Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (in-memory analytics, ISV engines)
- Data Management: YARN (Data Operating System) on HDFS (Hadoop Distributed File System)
- Security: Authentication, Authorization, Accounting, Data Protection. Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Knox
- Operations: Provision, Manage & Monitor (Ambari, ZooKeeper); Scheduling (Oozie)
- Deployment choice: Linux, Windows, on-premise, cloud

HDP: Open, Reliable, & Current
HDP certifies the most recent and stable community innovation.
[Figure: component version matrix for HDP 1.3 (May 2013), HDP 2.0 (October 2013), and HDP 2.1 (April 2014). Components shown: Hadoop & YARN, Tez, Pig, Hive & HCatalog, HBase, Storm, Mahout, Solr, Sqoop, Flume, Falcon, Ambari, Oozie, ZooKeeper, Knox. Layers shown: Data Management, Data Access, Governance & Integration, Operations, Security.]

HDP: Interactive SQL-in-Hadoop
Stinger Initiative (delivered): next-generation SQL-based interactive query in Hadoop
- Speed: improve Hive query performance by 100X to allow interactive query times (seconds)
- Scale: the only SQL interface to Hadoop designed for queries that scale from TB to PB
- SQL: support the broadest range of SQL semantics for analytic applications running against Hadoop
Stack shown: Business Analytics and Custom Apps; Apache Hive (SQL); Apache MapReduce and Apache Tez; Apache YARN; HDFS (Hadoop Distributed File System)
Stinger Project
- Stinger Phase 1: base optimizations, SQL types, SQL analytic functions, ORCFile modern file format
- Stinger Phase 2: SQL types, SQL analytic functions, advanced optimizations, performance boosts via YARN
- Stinger Phase 3: Hive on Apache Tez, query service (always on), buffer cache, cost-based optimizer (Optiq)
An open community at its finest: Apache Hive contribution
- 1,672 Jira tickets closed
- 145 developers
- 44 companies
- ~330,000 lines of code added (2.5x)
- 13 months
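The "SQL analytic functions" named in the Stinger phases include windowing functions such as RANK() OVER (PARTITION BY ...). As a rough illustration of what such a query computes, the sketch below pairs a HiveQL string with a pure-Python equivalent. The `sales` table, its columns, and the helper `rank_within_partition` are hypothetical names introduced only for this example; nothing here comes from the slides themselves.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical HiveQL for illustration; table and column names are assumptions.
HIVEQL = """
SELECT store, product, revenue,
       RANK() OVER (PARTITION BY store ORDER BY revenue DESC) AS revenue_rank
FROM sales
"""


def rank_within_partition(rows, partition_key, order_key):
    """Mimic RANK() OVER (PARTITION BY partition_key ORDER BY order_key DESC)."""
    ranked = []
    ordered = sorted(rows, key=itemgetter(partition_key))
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        members = sorted(group, key=itemgetter(order_key), reverse=True)
        previous_value, previous_rank = None, 0
        for position, row in enumerate(members, start=1):
            # Ties share a rank; the next distinct value skips ahead (RANK, not DENSE_RANK).
            rank = previous_rank if row[order_key] == previous_value else position
            ranked.append({**row, "rank": rank})
            previous_value, previous_rank = row[order_key], rank
    return ranked
```

The Python function only shows the semantics on a small in-memory list; the point of running the query in Hive is that the same ranking is computed in parallel over data held in HDFS.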

HDP 2.1: Governance and Integration
Apache Falcon: simplified data governance for Enterprise Hadoop
- Included in HDP for the first time
- Provides a key governance framework for:
  - Acquisition and processing of datasets
  - Replication and retention of datasets
  - Redirecting datasets to non-Hadoop extensions
- Provides audit trail and lineage
Another great example of open community innovation
- Originally built and contributed to Apache by InMobi
- The fastest path to innovation is the open community
- 14 months in the making
- Tested in production
- Vibrant community of developers building
Investment phases
- Phase 1: incubate Apache Falcon; dataset replication and retention; Falcon tech preview
- Phase 2: basic dashboard for pipeline viewing; Kerberos security support; Ambari integration for management; Hive/HCatalog integration
- Phase 3: advanced dashboard for pipeline definition and management; audit; lineage; data tagging; file import via SSH and SCP
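To make the retention idea concrete, here is a minimal sketch of the decision a retention policy has to make: which dataset instances have aged out of the window. This is not Falcon's API (Falcon expresses such policies declaratively); the function name `instances_to_evict` and the 90-day window in the usage note are assumptions for illustration.

```python
from datetime import datetime, timedelta


def instances_to_evict(instance_times, retention, now):
    """Return the dataset instances older than the retention window, oldest first."""
    cutoff = now - retention
    return sorted(t for t in instance_times if t < cutoff)
```

For example, with a hypothetical 90-day retention and `now` set to 1 June 2014, instances dated 1 February and 1 March 2014 would be returned for eviction, while one dated 31 May 2014 would be kept.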

HDP 2.1: Apache Knox
Important note: security for Hadoop must be addressed within every layer of the stack and integrated into existing frameworks. For a full description of what is available in Enterprise Hadoop today across authentication, authorization, accountability, and encryption, please visit our security labs page.
Apache Knox: perimeter security for Hadoop
- A common place to perform authentication across Hadoop and all related projects
- Integrated with LDAP and AD
- Currently supports WebHDFS, WebHCat, Oozie, Hive, and HBase
- Broad community effort: incubated with Microsoft, with a broad set of developers involved
Security investments
- Phase 1: strong AuthN with Kerberos; basic AuthZ for HBase, Hive, and HDFS; encryption with SSL for NN, JT, etc.; wire encryption for Shuffle, HDFS, and JDBC
- Phase 2: ACLs for HDFS; Knox (Hadoop REST API security); SQL-style Hive AuthZ (GRANT, REVOKE); SSL support for HiveServer2; SSL for DN/NN UI and WebHDFS; PAM support for Hive
- Phase 3: audit event correlation and audit viewer; support for token-based AuthN beyond Kerberos; data encryption in HDFS, Hive, and HBase; Knox for HDFS HA, Ambari, and Falcon
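Because Knox fronts the Hadoop REST APIs, a client addresses WebHDFS through the gateway rather than through the NameNode directly. The sketch below only builds such a URL and makes no network call. The host and topology name in the usage note are hypothetical, and port 8443 and the `gateway` path are assumed defaults that a given deployment may override.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, topology, hdfs_path, op,
                     gateway_port=8443, gateway_path="gateway"):
    """Build a WebHDFS URL routed through a Knox gateway (assumed URL layout)."""
    path = quote(hdfs_path.lstrip("/"))  # keep "/" separators, escape the rest
    query = urlencode({"op": op})        # WebHDFS operation, e.g. LISTSTATUS
    return (f"https://{gateway_host}:{gateway_port}/{gateway_path}/"
            f"{topology}/webhdfs/v1/{path}?{query}")
```

A call such as `knox_webhdfs_url("knox.example.com", "sandbox", "/tmp/data", "LISTSTATUS")` yields a single HTTPS endpoint where authentication against LDAP or AD can be enforced before the request reaches the cluster.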

HDP 2.1 New: Stream Processing
Apache Storm: real-time event processing for sensor and business activity monitoring
- Unlocks new business cases for Hadoop
- Key component of a data lake architecture
- Scale: ingest millions of events per second; fast query on petabytes of data
- Integrated with Ambari for management
Investment phases
- Phase 1: install, start, and stop via Ambari; Kafka, HBase, and HDFS connectors; Ganglia- and Nagios-based monitoring
- Phase 2: Storm-on-YARN; ingest and notification for JMS; data persistence to EDWs, RDBMS, and Cassandra
- Phase 3: high-availability management with Ambari; AD/LDAP plugin for authentication; declarative wiring; Hive update support; advanced scheduler
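The event-at-a-time model behind this kind of monitoring can be sketched with plain Python generators: a source emits tuples, and downstream stages filter and aggregate them as they arrive. This is not Storm's API; the names `sensor_spout`, `threshold_bolt`, and `count_bolt` merely borrow Storm's spout/bolt vocabulary, and the threshold value is an assumption.

```python
from collections import Counter


def sensor_spout(readings):
    """Emit one tuple at a time, as a stream source would."""
    for sensor_id, value in readings:
        yield {"sensor": sensor_id, "value": value}


def threshold_bolt(stream, limit):
    """Pass through only the tuples whose value exceeds the limit."""
    for event in stream:
        if event["value"] > limit:
            yield event


def count_bolt(stream):
    """Keep a running count of alerts per sensor and return it when the stream ends."""
    counts = Counter()
    for event in stream:
        counts[event["sensor"]] += 1
    return counts
```

Chaining the stages, `count_bolt(threshold_bolt(sensor_spout(readings), 90))`, processes each reading as it is produced without materializing the whole stream, which is the property that matters at millions of events per second.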

HDP 2.1: Search
Apache Solr: open source enterprise search for Hadoop and HDP
- Open architecture: in the community, for the community
- Simple, powerful UI for advanced search applications
- High-performance indexing and sub-second search times over billions of documents
Deep integration roadmap with HDP
- Partnership with LucidWorks
- LucidWorks provides tier 3 and tier 4 support
- Aligned with the strategy of working within the community and with the core committers: 9 committers total (7 PMC)

HDP 2.1: Operating Enterprise Hadoop
Apache Ambari is the only 100% open source framework for provisioning, managing, and monitoring Apache Hadoop clusters.
- Architecture shown: Ambari Web and existing operations tools (Viewpoint, others) call REST APIs on the Ambari Server, which provisions, manages, and monitors the compute and storage nodes
- Integration with existing operations tools
New in HDP 2.1
- Support for new data access engines
- Stack extensibility and cluster blueprints
- Rolling restarts
- Maintenance mode
- More...
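Since Ambari exposes its functions through REST APIs, an external operations tool can query cluster state over HTTP. The following sketch prepares, but does not send, a request that lists the services of a cluster. The host, cluster name, and credentials in the usage note are hypothetical, and port 8080 is an assumed default.

```python
import base64
import urllib.request


def ambari_services_request(host, cluster, user, password, port=8080):
    """Prepare a GET request for the services of one cluster (nothing is sent)."""
    url = f"http://{host}:{port}/api/v1/clusters/{cluster}/services"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on modifying calls; it is harmless on a GET.
    request.add_header("X-Requested-By", "ambari")
    return request
```

Passing the result of `ambari_services_request("ambari.example.com", "demo", "admin", "admin")` to `urllib.request.urlopen` would perform the call against a reachable server; the response is JSON describing each service.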

SAP HANA Platform
- Supports any device, any apps, any app server
- Open connectivity: SQL, MDX, R, JSON
- Consumers: Analytics (Visualize, Predict, Engage); SAP Business Suite and BW on the ABAP app server; other apps
- Offerings: SAP HANA Platform, HANA One, HEC
- Languages: SQL, SQLScript, JavaScript
- Processing: Spatial, Search/Graph, Text Mining, Stored Procedures & Data Models
- Services: Application & UI Services, Business Function Library, Predictive Analysis Library, Database Services, Planning Engine, Rules Engine
- Integration Services / Security / Governance / LCM / Landscape Management
- Data: transaction, unstructured, machine, Hadoop, real-time, locations
SAP HANA Platform converges database, data processing, and application platform capabilities and provides libraries for predictive, planning, text, spatial, graph, and business analytics to enable businesses to operate in real time. SAP HANA Platform is available as an on-premise appliance or via cloud offerings: SAP HANA One on AWS and SAP HANA Enterprise Cloud (HEC).

SAP HANA Platform is expanding frontiers
- Frontiers: Cloud, Big Data
- Cloud offerings: Hosting (HEC), Platform-aaS (HCP), HANA-aaS
- Areas shown around the SAP HANA Platform: Elastic Storage, Deep Exploration and Analysis, Data-as-a-Service, Internet of Everything, Customer / Audience Behavior, Corporate Functions, Logical Data Warehouse, Data Aging / Archiving

SAP HANA + Hortonworks Data Platform (HDP)
- Applications: ERP apps, mobile apps, custom apps, SAP Analytics
- Sources: sensor, geo, logs, text, structured, weather, social, other
- Data acquisition, ingestion & provisioning
- SAP HANA Platform: in-memory processing platform for real-time transactions plus end-to-end analytics
  - Application Development, Extended Application Services
  - Processing Engine, Database Services (OLTP + OLAP)
  - Application Function Libraries & Data Models
  - Integration Services, Unified Administration
- Hortonworks Data Platform (HDP)
  - Processing modes on YARN: batch, interactive, online, streaming, HANA in-memory, ISV apps, others
  - YARN: cluster resource management
  - HDFS: redundant, reliable storage

Generic pattern 1: Machine Data Insight
Prototypical machine data case
- Data sources: SAP enterprise data, non-SAP enterprise data, mobile data, machine data (sensors, SCADA, logs, etc.)
- Data movement: real-time replication, stream processing, synchronization
- SAP HANA Data Platform (real-time operations, analysis, and actions)
  - HANA in-memory: transactional, analytical
  - Extended storage (IQ)
  - Tiered storage (hot, warm, cold)
  - Smart Data Access
  - Predictive analysis, planning & simulation, graph, spatial
- Large low-cost data platform (Hadoop / IQ): historical data, offline batch processes, model training, etc.
- Analytics & applications: dashboards and reporting in real time
- Volumes: real-time data stream (billions of events/day); millions of events/day correlated with enterprise data
- Use cases: energy optimization, predictive maintenance, remote asset management, supply/demand forecasting, inventory management, route optimization
Transform high-volume, high-velocity data into high-value data. Enable real-time analytics.
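The tiered storage idea (hot, warm, cold) can be illustrated with a small placement rule based on data age. The thresholds below and the mapping of tiers to HANA in-memory, extended storage (IQ), and Hadoop are assumptions made for this sketch; the slides do not state how tiers are assigned.

```python
from datetime import date

# Hypothetical age thresholds in days; real policies depend on workload and cost.
HOT_MAX_AGE_DAYS = 30
WARM_MAX_AGE_DAYS = 365

# Assumed mapping of tiers to the stores named in the pattern.
TIER_STORE = {
    "hot": "HANA in-memory",
    "warm": "extended storage (IQ)",
    "cold": "Hadoop",
}


def storage_tier(record_date, today):
    """Classify a record as hot, warm, or cold by its age in days."""
    age_days = (today - record_date).days
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"
```

With these assumed thresholds, a sensor reading from twelve days ago stays in memory for real-time dashboards, while readings older than a year land in the low-cost platform, where they remain available for batch processing and model training.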

Generic pattern 2: Customer Insight
Prototypical customer behavior analysis case
- Data sources: SAP enterprise data, mobile data, clickstream data, social data, historical data
- Data movement: real-time replication, stream processing
- SAP HANA Data Platform (enables actionable insight for targeted applications)
  - HANA in-memory: transactional, analytical
  - Extended storage (IQ)
  - Tiered storage (hot, warm, cold)
  - Smart Data Access
  - Predictive analysis, planning & simulation, graph, spatial
- Large low-cost data platform (Hadoop / IQ): historical data, offline batch processes, model training, etc.
- Analytics & applications: real-time offers; dashboards and reporting in real time
- Volumes: terabytes of data/month; millions of events/day correlated with enterprise data
- Use cases: customer behavior, customer segmentation, customer loyalty, customer churn, online consumer habits, campaign performance, predictive maintenance
Enable real-time analytics and actionable insight.

SAP/Hortonworks Retail Big Data Architecture
- Real-time data acquisition: streaming data events and data tables replicated from transactional applications (SAP IS-Retail retail ERP system) via Sybase Event Stream Processor, SAP Replication Server, and SAP SLT (real-time and near real-time)
- Real-time multichannel and application platform: SAP Customer Activity Repository, with predictive, spatial, and OLAP engines
- Consuming applications: multichannel sales and 360-degree customer view; customer mobile applications (Sybase Unwired Platform); SAP BusinessObjects BI Suite (exploration, reporting, dashboarding, predictive, mobile)
- Hortonworks Data Platform data lake: large-scale data capture, generation of analytical datasets, training and validation of predictive models
- Links between the platform and the data lake: federated Smart Data Access; dataset transfer
- Batch data acquisition: SAP Data Services, fed by transactional systems, databases, flat files, and batch data feeds

SAP HANA / HDP: Retail Big Data Solution
- Real-time data acquisition: SAP IS-Retail retail ERP system via SAP Event Stream Processor, SAP Replication Server, and SAP SLT (real-time)
- Consuming applications: multichannel sales and 360-degree customer view; custom mobile applications; SAP BI Suite (exploration, reporting, dashboarding, predictive, mobile)
- SAP HANA Platform: in-memory processing platform for real-time transactions plus end-to-end analytics
  - Application Development, Extended Application Services
  - Processing Engine, Database Services (OLTP + OLAP)
  - Application Function Libraries & Data Models
  - SAP Customer Activity Repository
  - Integration Services, Unified Administration
- Links between the platform and the data lake: federated Smart Data Access; dataset transfer
- Hortonworks Data Platform data lake: large-scale data capture, generation of analytical datasets, training and validation of predictive models
- Batch / near real-time acquisition: SAP Data Services

SAP HANA Platform for Big Data
Themes: Real-time, Simplicity, Trusted
- SAP HANA remains the only truly in-memory business technology platform in the market today.
- SAP HANA is designed for real-time performance to process streaming, transactional, and analytical data.
- SAP HANA Platform extends these boundaries with special-purpose, best-of-breed engines.
- A high-performance single store radically simplifies the landscape by eliminating the need for multi-staged persisted data processing.
- Deep integration between engines offers the broadest coverage of data processing with minimal data movement and shared information architectures.
- Enterprise-class data and service-level guarantees offer the highest levels of trusted data access.
- Single platform for modeling and processing a wide range of data forms.
- Full spectrum of transactional consistency options over very large data sets across the platform.

Key Takeaways
- Learn Hortonworks Data Platform
- Understand SAP HANA Platform
- SAP HANA and Hortonworks Data Platform solution

Further Information
- Experience SAP Big Data: http://www.sapbigdata.com/
- SAP HANA and Hadoop: http://www.sapbigdata.com/platform/hadoop/#sthash.bs6twlb9.dpbs
- Hortonworks Data Platform: www.hortonworks.com
- Hortonworks Sandbox: www.hortonworks.com/sandbox
- Hortonworks & SAP: www.hortonworks.com/partners/sap

Thank you
Prasad Illapani, Big Data Product Management & Strategy, SAP Labs LLC, Bellevue, WA. Email: prasad.illapani@sap.com
Bob Page, VP Partner Products, Hortonworks Inc. Email: bpage@hortonworks.com

© 2014 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices.