SAP HANA, HADOOP and other Big Data Tools




Big Data: Why now?
x2: digital data globally doubles every two years [1]
90% of all data is unstructured and cannot be handled with traditional analytics tools [1]
85% of Top 500 enterprises will fail to exploit Big Data [2]
70% of all IT investment in 2015 will be Big Data driven [2]
>30% of enterprises have no formal concept for data management [5]
10-50% cost reduction in production through Big Data exploitation [4]
Sources: [1] IDC Predictions 2012; [2] Gartner, Predicts 2012; [4] McKinsey Global Institute 2011, "Big data: The next frontier for innovation, competition, and productivity"; [5] Economist Intelligence Unit 2011, "Big data: Harnessing a game-changing asset"

The BI Ecosystem according to Forrester
Database types: mobile database; in-memory database; enterprise data warehouse (traditional EDW, column-store EDW, MPP EDW); database appliances; relational OLTP; NoSQL (nonrelational); cloud database; relational; scale-out relational; object database; document database; graph database; key-value
Traditional data sources: CRM, ERP, legacy apps
New data sources: public data, sensors, marketplace, social media, geo-location
Source: Forrester Research, Inc.

The facts behind in-memory
Cost of a terabyte of enterprise disk storage: in the region of USD 9 million in 1990; in the region of USD 100 in 2013
Cost of a terabyte of RAM: in the region of USD 106 million in 1990; in the region of USD 500 in 2013
That is, over roughly the last 20 years the price ratio of memory to storage has dropped from about 12:1 to 5:1, but in real terms the price of a terabyte of RAM has dropped by a factor of about 200,000.
Performance comparison of memory to disk reads:
Enterprise disk: between 4 and 13 million nanoseconds
Memory: between 0.4 and 40 nanoseconds
That is, a read is between 150,000 and 1 million times faster when the data is already in memory.
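The arithmetic behind these figures can be checked with a short Python sketch. It only restates the rough numbers quoted on the slide; the function names are illustrative and not part of the original deck. Taking the extreme ends of the quoted latency ranges gives very wide bounds on the speed-up (about 100,000x to over 30 million times), and the slide's quoted range of 150,000 to 1 million sits inside those bounds.

```python
# Rough figures quoted on the slide (USD per terabyte); treat them as estimates.
DISK_USD_PER_TB = {1990: 9_000_000, 2013: 100}
RAM_USD_PER_TB = {1990: 106_000_000, 2013: 500}

# Read latencies quoted on the slide, in nanoseconds (low end, high end).
DISK_READ_NS = (4_000_000, 13_000_000)
RAM_READ_NS = (0.4, 40)


def price_ratio(year):
    """How many times more expensive a terabyte of RAM is than a terabyte of disk."""
    return RAM_USD_PER_TB[year] / DISK_USD_PER_TB[year]


def price_drop(prices):
    """Factor by which the price per terabyte fell between 1990 and 2013."""
    return prices[1990] / prices[2013]


def speedup_bounds():
    """Smallest and largest memory-vs-disk speed-up implied by the quoted ranges."""
    smallest = DISK_READ_NS[0] / RAM_READ_NS[1]  # fastest disk vs slowest memory
    largest = DISK_READ_NS[1] / RAM_READ_NS[0]   # slowest disk vs fastest memory
    return smallest, largest


print(f"RAM:disk price ratio 1990: {price_ratio(1990):.1f}:1")
print(f"RAM:disk price ratio 2013: {price_ratio(2013):.1f}:1")
print(f"RAM price drop factor:  {price_drop(RAM_USD_PER_TB):,.0f}")
print(f"Disk price drop factor: {price_drop(DISK_USD_PER_TB):,.0f}")
low, high = speedup_bounds()
print(f"Speed-up bounds: {low:,.0f}x to {high:,.0f}x")
```

Because the inputs are "in the region of" estimates, the results are best read as orders of magnitude rather than precise values.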

Positioning Big Data Technologies (November 2013)
Approaching and beyond mainstream adoption: Hadoop SQL interfaces, Hadoop distributions, in-memory analytics

Big Data tools complement existing BI investment. They do not replace them - yet.
Business Intelligence tools and analytical applications: reporting, dashboard, OLAP, data & text mining
Data warehouse: appliance, data mart, cube
Data integration: ETL
Transactional: OLTP DBMS
Existing data sources: business applications (ERP, CRM, etc.)

Big Data tools complement existing BI investment. They do not replace them - yet.
Business Intelligence tools and analytical applications: reporting, dashboard, OLAP, data & text mining, predictive analytics, operational intelligence
Structured and unstructured data; complex event processing
Data warehouse: appliance, data mart, cube; real-time data processing and analysis
Data integration: ETL (static data, flowing data)
Transactional: OLTP DBMS, in-memory database
Existing data sources: business applications (ERP, CRM, etc.)
New data sources: Hadoop, NoSQL, log data

The 3 V's of Big Data

Legacy BI
Business problem: backward-looking analysis, using data out of business applications
Technology solution (selected vendors): SAP Business Objects, IBM Cognos, MicroStrategy
Data type/scalability: structured; limited (2-3 TB in RAM)

High performance BI
Business problem: quasi-real-time, in-memory analysis, using data out of business applications; complex event processing
Technology solution (selected vendors): SAP HANA
Data type/scalability: structured; limited (1 PB in RAM)

Hadoop ecosystem
Business problem: batch, forward-looking predictive analysis; questions defined in the moment, using data from many sources
Technology solution (selected vendors): Cloudera Hadoop, Hortonworks Hadoop
Data type/scalability: structured or unstructured; quasi unlimited (20-30 PB)

HADOOP vs In-Memory analytics
How fast do you want your delivery made?
What is being delivered?
How much do you want to spend?
Do you have specialist drivers?

HADOOP vs In-Memory analytics
In-memory analytics (IMA): a Ferrari. Sexy, very fast, limited luggage space.
Hadoop (with Impala): an MPV. Good performance, capacity, easy to drive, affordable.
Hadoop (without Impala): long-haul trucks. Excellent capacity, drives overnight, moderate performance, needs a specialist driver's license.

HADOOP vs In-Memory analytics: some Hadoop improvements
Hadoop is becoming easier and easier to use thanks to the ecosystem of contributors and distributions, e.g. Cloudera's Impala, Microsoft's HDInsight, MapR's Drill and Hortonworks' Stinger Initiative.
Cloudera's Hadoop offerings: when you buy the trucks, they throw in the MPVs for free.
Hadoop 2.0 brings YARN, graph analysis and stream processing.
Given the speed of improvements in HDFS/HBase/Hive/YARN, the gap between batch and real-time/low-latency is going to be cut fairly soon. For example, from Hive 0.10 to 0.11, the new ORCFile data format (an optimized successor to RCFile) gives a performance boost of more than 10x.

Use case segmentation drives solution design and technology selection
1. Use case: real-time reporting of SAP OLTP data, including joins and data transformations. Potential tool: SAP HANA
2. Use case: summarise unstructured data logs (scheduled). Potential tool: Hadoop MapReduce
3. Use case: real-time reporting of summarised data logs, with joins to other non-OLTP data. Potential tool: Impala
4. Use case: near-real-time reporting of social media data. Potential tool: Impala + Hadoop MapReduce (scheduled to collect recent social media data)
5. Use case: real-time reporting of recent OLTP data joined with recent social media data. Potential tool: HANA + Hadoop MapReduce (scheduled to collect recent social media data and load it into HANA)
6. Use case: image analysis processing (scheduled). Potential tool: Hadoop MapReduce (scheduled job runs sophisticated analysis of video files and stores results in a structured file)
7. Use case: image analysis reporting. Potential tool: Impala (to report on the results file)
8. Use case: predictive analysis reporting (comparing OLTP and non-OLTP data). Potential tool: HANA + Hadoop MapReduce (scheduled to collect and transfer applicable historic or relevant non-OLTP data to HANA)
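As a rough illustration of how this segmentation could be turned into a selection rule, the sketch below reduces the use cases to four yes/no questions. The function `suggest_tools` and its parameters are hypothetical simplifications made for this example; they are not part of the original deck or of any vendor API, and a real selection would weigh cost, skills and data volumes as well.

```python
def suggest_tools(reads_oltp, reads_non_oltp, scheduled_batch=False,
                  needs_collection=False):
    """Suggest tools for a use case, following the pattern in the list above.

    reads_oltp:       the report uses transactional (SAP OLTP) data
    reads_non_oltp:   the report uses logs, social media, images, etc.
    scheduled_batch:  the work is a scheduled processing job, not a report
    needs_collection: non-OLTP data must first be collected by a scheduled job
    """
    if not (reads_oltp or reads_non_oltp):
        raise ValueError("a use case must read some data")
    if scheduled_batch:
        # Scheduled summarising or image processing runs as a batch job.
        return ["Hadoop MapReduce"]
    if reads_oltp and reads_non_oltp:
        # A batch job collects the non-OLTP data and loads it into HANA.
        return ["SAP HANA", "Hadoop MapReduce"]
    if reads_oltp:
        return ["SAP HANA"]
    # Reporting on non-OLTP data only.
    tools = ["Impala"]
    if needs_collection:
        tools.append("Hadoop MapReduce")
    return tools


# Use case 1: real-time reporting of SAP OLTP data
print(suggest_tools(reads_oltp=True, reads_non_oltp=False))
# Use case 4: near-real-time reporting of social media data
print(suggest_tools(reads_oltp=False, reads_non_oltp=True, needs_collection=True))
# Use case 8: predictive analysis comparing OLTP and non-OLTP data
print(suggest_tools(reads_oltp=True, reads_non_oltp=True))
```

The ordering of the checks mirrors the slide: scheduled work goes to batch processing first, and HANA only enters once transactional data is involved.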

The NEW real-time analytics with SAP HANA & Hadoop
Integrate and federate non-SAP sources: SAP DS, Sybase ESP
Sources: SAP, 3rd-party DBMS, Hadoop
Integration paths: SLT, DXC, ETL, Smart Access (Sybase ASE & IQ), Smart Access (Hadoop MapReduce/batch)
Computing engine: SAP HANA in-memory
UI/front-end analytics: SAP ERP/DW, SAP LIVE & UI analytics, mobile & embedded applications, non-SAP BI

Learning some of the language of Big Data
ZooKeeper, Talend, Pentaho, Kafka, Nutch, Matlab, Ruby, Neo4j, Aster, Tableau, Greenplum, MongoDB, Hadoop, Java, NoSQL, Cassandra, Shep, InfoChimps, Platfora, C++, Avro, YARN, Hive, Pig, Karmasphere Studio, HBase, MapReduce, Continuity, R, HDFS, Redis, GoPivotal, Riak, Skytree, Chukwa, Python, Jaspersoft, Splunk, JRuby, CouchDB

The other Big Data tools
Once you have a data store and a means of accessing the data, further tools become relevant:
Operational intelligence platform
Video search, audio search and content analytics
Text search
Graph databases
Complex event processing
In-memory data grid
Speech recognition
Pattern recognition

Some new roles in data/analytics
The coming of age of data in the enterprise
Roles: the Data Scientist; the Chief Data Officer; Data Explorer; Campaign Expert; Data Security Officer; Business Solution Architect/Domain Expert; Data Hygienist/Data Steward
50% Big Data talent gap expected until 2018

Predictive analytics for transport, logistics & retail
The information-driven transport, logistics & retail provider
Business functions: Marketing and Sales, Product Management, Operations, New Business
External online sources: Facebook, Twitter, LinkedIn, Google+, YouTube
Also shown: TomTom, MarketWatch, Financial Times, Bloomberg
Customer base (new and existing): High-Tech / Pharma, Manufacturing / FMCG, Commerce Sector, Households / SME, Financial Industry, Public Authorities, Market Research
Data flows: order volume and received service quality; market and customer intelligence; customer sentiment and feedback; real-time incidents; network flow data; continuous sensor data; location, traffic density, directions, delivery sequence; location, destination, availability
SME commercial data services: Retail, Address Verification, Market Intelligence, Supply Chain Monitoring, Environmental Statistics
Use cases (the diagram numbers them 1 to 11):
Strategic network planning: long-term demand forecasts for transport capacity are generated in order to support strategic investments into the network.
Customer loyalty management: public customer information is mapped against business parameters in order to predict churn and initiate countermeasures.
Service improvement and product innovation: a comprehensive view on customer requirements and service quality is used to enhance the product portfolio.
Risk evaluation and resilience planning: by tracking and predicting events that lead to supply chain disruptions, the resilience level of transport services is increased.
Market intelligence for SME: supply chain monitoring data is used to create market intelligence reports for small and medium-sized companies.
Financial demand and supply chain analytics: a micro-economic view is created on global supply chain data that helps financial institutions improve their rating and investment decisions.
Environmental intelligence: sensors attached to delivery vehicles produce fine-meshed statistics on pollution, traffic density, noise, parking spot utilization etc.
Address verification: fleet personnel verifies recipient addresses, which are transmitted to a central address verification service provided to retailers and marketing agencies.
Real-time route optimization: delivery routes are dynamically calculated based on delivery sequence, traffic conditions and recipient status.
Operational capacity planning: short- and mid-term capacity planning allows optimal utilization and scaling of manpower and resources.
Consolidated pickup and delivery: carriers of multiple existing fleets are leveraged to pick up or deliver shipments along routes they would take anyway.

Greater efficiency for truck and container movements
The right information, in the right place, in time, predictable
Cloud solution collects all relevant real-time information in one place
smartport logistics: developed by T-Systems, Deutsche Telekom Innovation Laboratories, SAP Research and Hamburg Port Authority
Portal provides transparency for all stakeholders, with role-based access
Stakeholder integration, incl. port authority, forwarding agents, terminal and parking lot operators, plus others as required (sea shipping companies etc.)
Precise communications thanks to real-time data and smart devices
Only location-based information sent to driver, thanks to geo-fencing
5-10 minutes saved per tour means one more pick-up per day

Health care & Pharma grids got smart
Transparency enhanced with predictive analytics
Insurance
Physicians, specialists, family doctors
Immediate availability of patient and POC data
Pinpointing guzzlers
Management of devices
Optimization and automation of processes
Hospitals & Pharma
Intelligent management of medical care
Patient-controlled data distribution
Integration, consolidation, optimization
Up to 20% lower costs 1)
Factor of 5.8: potential growth by 2015 2)
Secured connection for error-free data transfer
Up to 20% reduction in HR costs thanks to automation
Seamless data flow
VOLUME, VELOCITY, VARIETY, VALUE
Full transparency
Processing & integrating smart data management
Rapid reactions
100% compliance with legal requirements

Summary
Data volumes are here to stay
In-memory computing is becoming increasingly affordable
Hadoop is not your Big Data answer; it is part of your BI and Big Data ecosystem
The BI and Big Data ecosystem will likely benefit from other tools as well
An enterprise data strategy and data governance are critical to success

Summary
Make sure you have two conversations in your enterprise:
1. A business conversation about the business value from your BI ecosystem
2. An IT conversation to ensure your IT organisation understands the new world of BI: the shortcomings, the strengths and the roles of the component technologies

Summary
What matters is how and why vastly more data leads to vastly greater value creation. Designing and determining those links is typically the province of top management, but it needs to be facilitated by the IT organisation in business terms.

A parting thought: Big Data's 4 V's
ANALYTICS creates VALUE
Value comes from knowing more than the rest

QUESTIONS?

BACKUP

HADOOP Innovation #1: Much cheaper storage
SAN storage: $2 - $10 per gigabyte. $1 million gets you 0.5 petabytes, 200,000 IOPS, 8 Gbyte/sec. Software: HDS, bundled with hardware by HDS.
NAS file servers: $1 - $5 per gigabyte. $1 million gets you 1 petabyte, 200,000 IOPS, 10 Gbyte/sec. Software: NetApp, bundled with hardware by NetApp.
Local storage: <$0.50 per gigabyte. $1 million gets you 10 petabytes, 400,000 IOPS, 250 Gbyte/sec. Software: open source Hadoop ecosystem, hardware self-assembled.
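The relationship between the capacities and the per-gigabyte prices on this slide can be sketched in a few lines of Python. The sketch assumes decimal units (1 petabyte = 1,000,000 gigabytes), which the slide does not state, and the function name is illustrative. Under that assumption, the implied prices land at the bottom of each quoted range: about $2, $1 and $0.10 per gigabyte.

```python
BUDGET_USD = 1_000_000
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB

# Capacity that USD 1 million buys, as quoted on the slide (petabytes).
CAPACITY_PB = {
    "SAN storage": 0.5,
    "NAS file servers": 1,
    "Local storage": 10,
}


def implied_usd_per_gb(capacity_pb, budget_usd=BUDGET_USD):
    """Price per gigabyte implied by buying capacity_pb petabytes for budget_usd."""
    return budget_usd / (capacity_pb * GB_PER_PB)


for tier, capacity_pb in CAPACITY_PB.items():
    print(f"{tier}: about ${implied_usd_per_gb(capacity_pb):.2f} per GB")
```

This suggests the "$1 million gets you" capacities were derived from the lowest price in each range, so real-world capacity per dollar may be lower for any given tier.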

Learning the language of Big Data: colour coding key
Core Hadoop kernel/modules; Hadoop DW modules; NoSQL DB platforms; MPP analytics platforms; programming languages; IDEs; data hubs; BI suite; analysis and visualisation; data analysis tool; data integration tool; startup - undefined

How use case segmentation drives solution design and technology selection

Gartner Hype Cycle for analytic applications
A great starting point for BI and Big Data use cases