Big Data Realities Hadoop in the Enterprise Architecture

Transcription:

Big Data Realities: Hadoop in the Enterprise Architecture
Paul Phillips, Director, EMEA, Hortonworks
pphillips@hortonworks.com
+44 (0)777 444 3857
Hortonworks Inc. 2012

Agenda
The Growth of Enterprise Data
Hadoop: New Data Architecture
Hortonworks: an Overview
The Future of Hadoop and Big Data

The Growth of Data in the Enterprise
Data explosion: 1 zettabyte (ZB) = 1 billion TB
"By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent." (Gartner, Mark Beyer, "Information Management in the 21st Century")
15x growth rate of machine-generated data by 2020 (Source: IDC)
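As a quick check of the zettabyte figure quoted on this slide, here is a minimal sketch that assumes decimal (SI) prefixes, where 1 TB is 10^12 bytes and 1 ZB is 10^21 bytes. The constant and function names are illustrative and do not come from the deck.

```python
# Decimal (SI) byte units. This is an assumption: binary units (TiB, ZiB)
# would give a slightly different ratio.
BYTES_PER_TB = 10**12
BYTES_PER_ZB = 10**21


def zettabytes_to_terabytes(zb):
    """Convert a whole number of zettabytes to terabytes using SI prefixes."""
    return zb * BYTES_PER_ZB // BYTES_PER_TB


tb_per_zb = zettabytes_to_terabytes(1)
print(f"1 ZB = {tb_per_zb:,} TB")
```

Under these assumptions the ratio is exactly one billion, which matches the slide's "1 Billion TBs".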

Big Data: Organizational Game Changer
[Figure: nested data tiers ERP, CRM, WEB and BIG DATA, plotted against data volume (megabytes, gigabytes, terabytes, petabytes) and increasing data variety and complexity]
Transactions + Interactions + Observations = BIG DATA
Example data shown in the figure: purchase detail, purchase record, payment record, segmentation, offer details, offer history, customer touches, support contacts, web logs, A/B testing, dynamic funnels, affiliate networks, search marketing, behavioral targeting, mobile web, sentiment, SMS/MMS, user click stream, speech to text, social interactions and feeds, spatial and GPS coordinates, click streams, sensors / RFID / devices, business data feeds, external demographics, user-generated content, HD video, audio and images, product/service logs

Growth Pressures Existing Data Architectures
[Diagram: layers of the existing data architecture]
Applications: Packaged Analytic App, Custom Analytic App
Dev & Data Tools: build and test
Data Systems: RDBMS, EDW, MPP
Operational Tools: manage and monitor
Data Sources: Traditional Sources (RDBMS, OLTP, OLAP)
Data growth: 8% annually

An Emerging Data Architecture
[Diagram: layers of the emerging data architecture]
Applications: Packaged Analytic App, Custom Analytic App
Dev & Data Tools: build and test
Data Systems: RDBMS, EDW, MPP, Enterprise Hadoop Platform
Operational Tools: manage and monitor
Data Sources: Traditional Sources (RDBMS, OLTP, OLAP), New Sources (web logs, email, sensors, social media)
Data growth: 85% annually
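To give a feel for the gap between the two growth rates quoted on these architecture slides (8% and 85% annually), the sketch below compounds each rate over a few years. It takes both figures at face value and assumes steady year-over-year compounding, which the deck does not state. The function names and the five-year horizon are illustrative.

```python
import math


def growth_factor(annual_rate, years):
    """Multiplier on data volume after compounding annual_rate for `years` years."""
    return (1 + annual_rate) ** years


def years_to_multiply(annual_rate, target_factor):
    """Years of steady compounding needed for volume to grow by target_factor."""
    return math.log(target_factor) / math.log(1 + annual_rate)


# Rates as quoted on the slides; the labels are shorthand for the two diagrams.
for label, rate in [("traditional sources only", 0.08),
                    ("traditional plus new sources", 0.85)]:
    print(f"{label}: x{growth_factor(rate, 5):.1f} after 5 years, "
          f"10x after {years_to_multiply(rate, 10):.1f} years")
```

The point of the comparison is that a storage tier sized for single-digit annual growth is outpaced within a few years once growth approaches doubling each year.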

Deutsche Telekom's Perspective
"Hadoop! Coming soon to an enterprise data warehouse near you. I predict that by 2015, fully 80 percent of all new data will enter the typical enterprise on a Hadoop cluster first, making it the de facto enterprise-wide landing zone for large amounts of data. Hadoop is open source and runs on industry-standard hardware, which means it is at least 10 times more economical than conventional data warehouse solutions."
Juergen Urbanski, VP Big Data Architectures & Technologies, T-Systems, the enterprise IT division of Deutsche Telekom

6 Key Data Types as Application Catalysts
1. Sentiment: Understand how your customers feel about your brand, products and services right now
2. Clickstream: Capture and analyze site visitors' data trails and optimize your website
3. Sensor/Machine: Track your machine usage, predict failures and schedule maintenance
4. Geographic: Analyze location-based data to manage operations where they occur
5. System Logs: Research logs to diagnose system failures and prevent future errors
6. Unstructured Text: Read and understand web pages, books and PDFs faster than they are written

A little history: it's 2005

A Brief History of Apache Hadoop
[Timeline: axis marked 2004, 2006, 2008, 2010, 2012]
2005: Yahoo! creates team under E14 to work on Hadoop
Apache project established
Yahoo! begins to operate at scale
2011: Hortonworks created to focus on Enterprise Hadoop
Hortonworks Data Platform
2013: Enterprise Hadoop

Hortonworks Overview
Headquarters: Palo Alto, CA
Employees: 230+ and growing
Investors: Benchmark, Index, Yahoo
Mission: Enable Apache Hadoop to become an enterprise viable data platform
Develop, Distribute, Support:
We employ the core architects, builders and operators of Apache Hadoop
Everything contributed back
Endorsed by strategic partners
We distribute the only 100% open source Enterprise Hadoop distribution: Hortonworks Data Platform
Enterprise-level support to allow enterprises to deploy at scale
We enable the ecosystem to work better with Hadoop

Leadership Starts at the Core
Driving next-generation Hadoop: YARN, MapReduce2, HDFS2, High Availability, Disaster Recovery
420k+ lines authored since 2006, more than twice the nearest contributor
Deeply integrating with the ecosystem
  Enabling new deployment platforms (e.g. Windows & Azure, Linux & VMware HA)
  Creating deeply engineered solutions (e.g. Teradata big data appliance)
All Apache, NO holdbacks: 100% of code contributed to Apache

HDP: Enterprise Hadoop Distribution
[Diagram: Hortonworks Data Platform (HDP) component stack]
Layer labels: Operational Services, Data Services, Hadoop Core, Platform Services
Components shown: Ambari, Falcon*, Oozie, Flume, Sqoop, Load & Extract (NFS, WebHDFS), Knox*, Pig, Hive & HCatalog, HBase, MapReduce*, Tez*, YARN*, HDFS, Other
Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
Hortonworks Data Platform (HDP): Enterprise Hadoop
The ONLY 100% open source and complete distribution
Enterprise grade, proven and tested at scale
Ecosystem endorsed to ensure interoperability

The 1st Generation of Hadoop: Batch
Hadoop 1.0: built for web-scale batch apps
[Diagram: separate clusters, each running a single app on its own HDFS; the apps are labeled BATCH, INTERACTIVE and ONLINE]
All other usage patterns must leverage that same infrastructure
Forces the creation of silos for managing mixed workloads

The Enterprise Requirement: Beyond Batch
To become an enterprise viable data platform, customers have told us they want to store ALL DATA in one place and interact with it in MULTIPLE WAYS, simultaneously and with predictable levels of service.
[Diagram: workloads BATCH, INTERACTIVE, ONLINE, STREAMING, GRAPH, IN-MEMORY, HPC MPI and SEARCH on top of HDFS (Redundant, Reliable Storage)]

YARN: Taking Hadoop Beyond Batch
Created to manage resource needs across all uses
Ensures predictable performance & QoS for all apps
Enables apps to run IN Hadoop rather than ON
Key to leveraging all other common services of the Hadoop platform: security, data lifecycle management, etc.
[Diagram: Applications Run Natively IN Hadoop]
BATCH (MapReduce), INTERACTIVE (Tez), ONLINE (HBase), STREAMING (Storm, S4, ...), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), OTHER (Search) (Weave ...)
YARN (Cluster Resource Management)
HDFS2 (Redundant, Reliable Storage)

Market Transitioning into Early Majority
[Figure: technology adoption curve, relative % of customers over time]
Innovators (technology enthusiasts) and early adopters (visionaries): customers want technology & performance
The CHASM
Early majority (pragmatists), late majority (conservatives) and laggards (skeptics): customers want solutions & convenience
Source: Geoffrey Moore, Crossing the Chasm

The Future of Hadoop and Big Data
The next-generation data architecture is evolving rapidly
  Store ALL data in a Hadoop data reservoir
  Push subsets of data to a final platform for processing
Hadoop 2.0 takes Hadoop beyond batch
  YARN-based architecture enabling mixed-use workloads with enterprise resource management
Enabling a new generation of applications at scale
  Based on new data types (sensor, sentiment, clickstream, etc.) or keeping existing types for much longer

Hortonworks Sandbox
Hands-on tutorials integrated into the Sandbox
HDP environment for evaluation

THANK YOU!
Paul Phillips, Director, EMEA
pphillips@hortonworks.com
+44 (0)777 444 3857
Download Sandbox: hortonworks.com/sandbox