Hadoop in the Enterprise

Size: px
Start display at page:

Download "Hadoop in the Enterprise"

Transcription

1 Hadoop in the Enterprise Modern Architecture with Hadoop 2 Jeff Markham Technical Director, APAC Hortonworks

2 Hadoop Wave ONE: Web-scale Batch Apps relative % customers 2006 to 2012 Web-Scale Batch Applications Innovators, technology enthusiasts Early adopters, visionaries The CHASM Early majority, pragmatists Late majority, conservatives Laggards, Skeptics time Customers want technology & performance Customers want solutions & convenience Source: Geoffrey Moore - Crossing the Chasm

3 Hadoop Wave TWO: Broad Enterprise Apps relative % customers 2013 & Beyond Batch, Interactive, Online, Streaming, etc., etc. Innovators, technology enthusiasts Early adopters, visionaries The CHASM Early majority, pragmatists Late majority, conservatives Laggards, Skeptics time Customers want technology & performance Customers want solutions & convenience Source: Geoffrey Moore - Crossing the Chasm

4 Hadoop 2.0 Key Highlights 2.0 Architected for the Broad Enterprise Single Cluster, Many Workloads Enterprise Requirements Mixed workloads Interactive Query Reliability Point in time Recovery HDP 2.0 Features YARN Hive on Tez Full Stack HA Snapshots BATCH INTERACTIVE ONLINE STREAMING Multi Data Center Disaster Recovery ZERO downtime Rolling Upgrades

5 The 1 st Generation of Hadoop: Batch HADOOP 1.0 Built for Web-Scale Batch Apps Single App INTERACTIVE Single App ONLINE All other usage patterns must leverage that same infrastructure Single App BATCH Single App BATCH Single App BATCH Forces the creation of silos for managing mixed workloads HDFS HDFS HDFS

6 A Transition From Hadoop 1 to 2 HADOOP 1.0 MapReduce (cluster resource management & data processing) HDFS (redundant, reliable storage)

7 A Transition From Hadoop 1 to 2 HADOOP 1.0 HADOOP 2.0 MapReduce (cluster resource management & data processing) HDFS (redundant, reliable storage) MapReduce (data processing) YARN (cluster resource management) HDFS (redundant, reliable storage) Others (data processing)

8 The Enterprise Requirement: Beyond Batch To become an enterprise viable data platform, customers have told us they want to store ALL DATA in one place and interact with it in MULTIPLE WAYS Simultaneously & with predictable levels of service BATCH INTERACTIVE ONLINE STREAMING GRAPH IN- MEMORY HPC MPI OTHER HDFS (Redundant, Reliable Storage) Page 17

9 YARN: Taking Hadoop Beyond Batch Created to manage resource needs across all uses Ensures predictable performance & QoS for all apps Enables apps to run IN Hadoop rather than ON Key to leveraging all other common services of the Hadoop platform: security, data lifecycle management, etc. ApplicaIons Run NaIvely IN Hadoop BATCH (MapReduce) INTERACTIVE (Tez) ONLINE (HBase) STREAMING (Storm, S4, ) GRAPH (Giraph) IN- MEMORY (Spark) HPC MPI (OpenMPI) OTHER (Search) (Weave ) YARN (Cluster Resource Management) HDFS2 (Redundant, Reliable Storage) Page 18

10 Old School Hadoop: MapReduce

11 New School Hadoop with YARN Node Manager Container App Mstr Client Client Resource Manager Node Manager App Mstr Container MapReduce Status Job Submission Node Status Resource Request Container Node Manager Container

12 5 Key Benefits of YARN 5 1. Scale! 2. Compatibility with MapReduce. 3. Improved cluster utilization. 4. New Programming Models 5. Agility Page 23

13 Apache Tez An alternate data processing framework to MapReduce Improves performance of low-latency applications Page 24

14 SQL-IN-Hadoop with Apache Hive Hadoop Business AnalyIcs MAP REDUCE SQL HIVE YARN HDFS2 Custom Apps TEZ Apache Hive: First Application to use YARN Hive on Tez optimizes resource for Hive queries to improve performance Apache Hive is the standard for SQL interaction in Hadoop (Most applications claim Hive compatibility today) Apache Tez: optimized for YARN, general purpose processing framework for existing Hadoop applications Stinger Initiative Simple Focus x Performance Improvement Increased SQL Compatibility Enable Hive to support interactive workloads Improve existing tools & preserve investments SInger Phase 1 Base OpJmizaJons SQL AnalyJcs ORCFile Format SInger Phase 2 YARN Resource Mgmnt Hive on Apache Tez Query Service (always on) SInger Phase 3 Vector Query Buffer Cache Query Planner Page 25

15 Hive: More SQL & 100X Faster Stinger Phase 1 Base Optimizations SQL Analytics ORCFile Format Stinger Phase 2 YARN Resource Mgmnt Hive on Apache Tez Query Service Stinger Phase 3 Vector Query Buffer Cache Query Planner Done in Hive 0.11 We Are Here Work Started SQL Compliance Highlights ROLLUP and CUBE Windowing functions (OVER, RANK, etc.) DECIMAL CHAR VARCHAR DATE UNION DISTINCT and UNION outside of subquery Sub-queries for IN/NOT IN, HAVING EXISTS / NOT EXISTS INTERSECT, EXCEPT

16 Hive s Performance Trajectory

17 Making Hadoop Enterprise Ready

18 Thank You!

Big Data Realities Hadoop in the Enterprise Architecture

Big Data Realities Hadoop in the Enterprise Architecture Big Data Realities Hadoop in the Enterprise Architecture Paul Phillips Director, EMEA, Hortonworks pphillips@hortonworks.com +44 (0)777 444 3857 Hortonworks Inc. 2012 Page 1 Agenda The Growth of Enterprise

More information

YARN Apache Hadoop Next Generation Compute Platform

YARN Apache Hadoop Next Generation Compute Platform YARN Apache Hadoop Next Generation Compute Platform Bikas Saha @bikassaha Hortonworks Inc. 2013 Page 1 Apache Hadoop & YARN Apache Hadoop De facto Big Data open source platform Running for about 5 years

More information

Apache Hadoop's Role in Your Big Data Architecture

Apache Hadoop's Role in Your Big Data Architecture Apache Hadoop's Role in Your Big Data Architecture Chris Harris EMEA, Hortonworks charris@hortonworks.com Twi

More information

All You Wanted to Know About Big Data Projects Chida Sadayappan @schida. Jan 2014

All You Wanted to Know About Big Data Projects Chida Sadayappan @schida. Jan 2014 All You Wanted to Know About Big Data Projects Chida Sadayappan @schida Jan 2014 1 WHAT WE DISCUSS HERE AGENDA > > > > > > Need History Open Source - Hadoop BigData EcoSystem Use Cases Managing BigData

More information

Hortonworks: We Do Hadoop.

Hortonworks: We Do Hadoop. Hortonworks: We Do Hadoop. Our mission is to enable your Modern Data Architecture by delivering One Enterprise Hadoop November 2013 Page 1 Recent Announcements October 23 Hortonworks Data Platform 2.0

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

Hadoop 2.6 Configuration and More Examples

Hadoop 2.6 Configuration and More Examples Hadoop 2.6 Configuration and More Examples Big Data 2015 Apache Hadoop & YARN Apache Hadoop (1.X)! De facto Big Data open source platform Running for about 5 years in production at hundreds of companies

More information

How Companies are! Using Spark

How Companies are! Using Spark How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

Hadoop2, Spark Big Data, real time, machine learning & use cases. Cédric Carbone Twitter : @carbone

Hadoop2, Spark Big Data, real time, machine learning & use cases. Cédric Carbone Twitter : @carbone Hadoop2, Spark Big Data, real time, machine learning & use cases Cédric Carbone Twitter : @carbone Agenda Map Reduce Hadoop v1 limits Hadoop v2 and YARN Apache Spark Streaming : Spark vs Storm Machine

More information

Evolving Core Hadoop in Face of Big Data Trends

Evolving Core Hadoop in Face of Big Data Trends Evolving Core Hadoop in Face of Big Data Trends Sanjay Radia Founder & Architect Page 1 Hello Founder, Hortonworks Part of the Hadoop team at Yahoo! since 2007 Chief Architect of Hadoop Core at Yahoo!

More information

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 We Do Hadoop Fall 2014 Page 1 HDP delivers a comprehensive data management platform GOVERNANCE Hortonworks Data Platform

More information

Data Security in Hadoop

Data Security in Hadoop Data Security in Hadoop Eric Mizell Director, Solution Engineering Page 1 What is Data Security? Data Security for Hadoop allows you to administer a singular policy for authentication of users, authorize

More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

A Brief Introduction to Apache Tez

A Brief Introduction to Apache Tez A Brief Introduction to Apache Tez Introduction It is a fact that data is basically the new currency of the modern business world. Companies that effectively maximize the value of their data (extract value

More information

Next Gen Hadoop Gather around the campfire and I will tell you a good YARN

Next Gen Hadoop Gather around the campfire and I will tell you a good YARN Next Gen Hadoop Gather around the campfire and I will tell you a good YARN Akmal B. Chaudhri* Hortonworks *about.me/akmalchaudhri My background ~25 years experience in IT Developer (Reuters) Academic (City

More information

Sujee Maniyam, ElephantScale

Sujee Maniyam, ElephantScale Hadoop PRESENTATION 2 : New TITLE and GOES Noteworthy HERE Sujee Maniyam, ElephantScale SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera Accelerating Enterprise Big Data Success Tim Stevens, VP of Business and Corporate Development Cloudera 1 Big Opportunity: Extract value from data Revenue Growth x = 50 Billion 35 ZB Cost Savings Margin

More information

Stinger Initiative: Introduction

Stinger Initiative: Introduction Stinger Initiative: Introduction Interactive Query on Hadoop Chris Harris E-Mail : charris@hortonworks.com Twitter : cj_harris5 Page 1 The World of Data is Changing Data Explosion 1 Zettabyte (ZB) = 1

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Apache Flink Next-gen data analysis. Kostas Tzoumas ktzoumas@apache.org @kostas_tzoumas

Apache Flink Next-gen data analysis. Kostas Tzoumas ktzoumas@apache.org @kostas_tzoumas Apache Flink Next-gen data analysis Kostas Tzoumas ktzoumas@apache.org @kostas_tzoumas What is Flink Project undergoing incubation in the Apache Software Foundation Originating from the Stratosphere research

More information

[Hadoop, Storm and Couchbase: Faster Big Data]

[Hadoop, Storm and Couchbase: Faster Big Data] [Hadoop, Storm and Couchbase: Faster Big Data] With over 8,500 clients, LivePerson is the global leader in intelligent online customer engagement. With an increasing amount of agent/customer engagements,

More information

YARN, the Apache Hadoop Platform for Streaming, Realtime and Batch Processing

YARN, the Apache Hadoop Platform for Streaming, Realtime and Batch Processing YARN, the Apache Hadoop Platform for Streaming, Realtime and Batch Processing Eric Charles [http://echarles.net] @echarles Datalayer [http://datalayer.io] @datalayerio FOSDEM 02 Feb 2014 NoSQL DevRoom

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Databases 2 (VU) ( )

Databases 2 (VU) ( ) Databases 2 (VU) (707.030) MapReduce (Part 3) Mark Kröll KTI, TU Graz Nov. 14, 2016 Mark Kröll (KTI, TU Graz) MapReduce Nov. 14, 2016 1 / 41 Outline 1 Problems Suited for Map-Reduce Matrix-Vector Multiplication

More information

Big Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016

Big Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016 Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible

More information

Solving performance and data protection problems with active-active Hadoop SOLUTIONS BRIEF

Solving performance and data protection problems with active-active Hadoop SOLUTIONS BRIEF Solving performance and data protection problems with active-active Hadoop SOLUTIONS BRIEF Solving performance and data protection problems with active-active Hadoop Many Hadoop deployments are not realizing

More information

Extending Hadoop beyond MapReduce

Extending Hadoop beyond MapReduce Extending Hadoop beyond MapReduce Mahadev Konar Co-Founder @mahadevkonar (@hortonworks) Page 1 Bio Apache Hadoop since 2006 - committer and PMC member Developed and supported Map Reduce @Yahoo! - Core

More information

IIS AND HORTONWORKS ENABLING A NEW ENTERPRISE DATA PLATFORM

IIS AND HORTONWORKS ENABLING A NEW ENTERPRISE DATA PLATFORM IIS AND HORTONWORKS ENABLING A NEW ENTERPRISE DATA PLATFORM 1 IIS AND HORTONWORKS SEEK TO ENABLE A NEW ENTERPRISE DATA PLATFORM TO POWER AN INNOVATIVE, MODERN DATA ARCHITECTURE IIS AND HORTONWORKS ENABLING

More information

Hortonworks Architecting the Future of Big Data

Hortonworks Architecting the Future of Big Data Hortonworks Architecting the Future of Big Data Eric Baldeschwieler CEO twitter: @jeric14 (@hortonworks) Formerly VP Hadoop Engineering @Yahoo! 8 Years at Yahoo! Hortonworks Inc. 2011 June 29, 2011 About

More information

Hortonworks Data Platform Reference Architecture

Hortonworks Data Platform Reference Architecture Hortonworks Data Platform Reference Architecture A PSSC Labs Reference Architecture Guide December 2014 Introduction PSSC Labs continues to bring innovative compute server and cluster platforms to market.

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Real Time Fraud Detection With Sequence Mining on Big Data Platform. Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA

Real Time Fraud Detection With Sequence Mining on Big Data Platform. Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA Real Time Fraud Detection With Sequence Mining on Big Data Platform Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA Open Source Big Data Eco System Query (NOSQL) : Cassandra,

More information

Hadoop vs Apache Spark

Hadoop vs Apache Spark Innovate, Integrate, Transform Hadoop vs Apache Spark www.altencalsoftlabs.com Introduction Any sufficiently advanced technology is indistinguishable from magic. said Arthur C. Clark. Big data technologies

More information

Hortonworks CISC Innovation day

Hortonworks CISC Innovation day Hortonworks CISC Innovation day Simon gregory sgregory@hortonworks.com Here was the ask Hortonworks' data reposition - how this works and the types of data you work with. 1: Data Types & Value. What have

More information

Enterprise Operational SQL on Hadoop Trafodion Overview

Enterprise Operational SQL on Hadoop Trafodion Overview Enterprise Operational SQL on Hadoop Trafodion Overview Rohit Jain Distinguished & Chief Technologist Strategic & Emerging Technologies Enterprise Database Solutions Copyright 2012 Hewlett-Packard Development

More information

Hadoop in the Hybrid Cloud

Hadoop in the Hybrid Cloud Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big

More information

Information Builders Mission & Value Proposition

Information Builders Mission & Value Proposition Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns

More information

Virtualizing Apache Hadoop. June, 2012

Virtualizing Apache Hadoop. June, 2012 June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

Apache Cassandra & Friends

Apache Cassandra & Friends Apache Cassandra & Friends About Me DigBigData Founder & CTO Working with Cassandra since 2010 Apache Cassandra MVP for 2014 Systems Engineer at heart About this Presentation We began to see patterns in

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Systems Engineering II. Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de

Systems Engineering II. Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de Systems Engineering II Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de About me! Since May 2015 2015 2012 Research Group Leader cfaed, TU Dresden PhD Student MPI- SWS Research Intern Microsoft

More information

Enable your Modern Data Architecture by delivering Enterprise Apache Hadoop

Enable your Modern Data Architecture by delivering Enterprise Apache Hadoop Modern Data Architecture with Enterprise Apache Hadoop Hortonworks. We do Hadoop. Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Our Mission: Enable your Modern Data Architecture

More information

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

TE's Analytics on Hadoop and SAP HANA Using SAP Vora TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Talend Big Data. Delivering instant value from all your data. Talend 2014 1

Talend Big Data. Delivering instant value from all your data. Talend 2014 1 Talend Big Data Delivering instant value from all your data Talend 2014 1 I may say that this is the greatest factor: the way in which the expedition is equipped. Roald Amundsen race to the south pole,

More information

SAP HANA From Relational OLAP Database to Big Data Infrastructure

SAP HANA From Relational OLAP Database to Big Data Infrastructure SAP HANA From Relational OLAP Database to Big Data Infrastructure Anil K Goel VP & Chief Architect, SAP HANA Data Platform WBDB 2015, June 16, 2015 Toronto SAP Big Data Story Data Lifecycle Management

More information

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES AWS GLOBAL INFRASTRUCTURE 10 Regions 25 Availability Zones 51 Edge locations WHAT

More information

Pilot-Streaming: Design Considerations for a Stream Processing Framework for High- Performance Computing

Pilot-Streaming: Design Considerations for a Stream Processing Framework for High- Performance Computing Pilot-Streaming: Design Considerations for a Stream Processing Framework for High- Performance Computing Andre Luckow, Peter M. Kasson, Shantenu Jha STREAMING 2016, 03/23/2016 RADICAL, Rutgers, http://radical.rutgers.edu

More information

Apache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source

Apache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source Apache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC http://www.ignite.incubator.apache.org @apacheignite @dsetrakyan Agenda About In- Memory

More information

Ali Ghodsi Head of PM and Engineering Databricks

Ali Ghodsi Head of PM and Engineering Databricks Making Big Data Simple Ali Ghodsi Head of PM and Engineering Databricks Big Data is Hard: A Big Data Project Tasks Tasks Build a Hadoop cluster Challenges Clusters hard to setup and manage Build a data

More information

Dominik Wagenknecht Accenture

Dominik Wagenknecht Accenture Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna

More information

Processing of Big Data. Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015

Processing of Big Data. Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015 Processing of Big Data Nelson L. S. da Fonseca IEEE ComSoc Summer Scool Trento, July 9 th, 2015 Acknowledgement Some slides in this set of slides were provided by EMC Corporation and Sandra Avila, University

More information

Make Business Intelligence Work on Big Data Speed. Scale. Simplicity.

Make Business Intelligence Work on Big Data Speed. Scale. Simplicity. Make Business Intelligence Work on Big Data PUT THE POWER OF BIG DATA IN THE HANDS OF BUSINESS USERS Connect your BI tools directly to your Big Data without compromising scale, performance, or control.

More information

HADOOP. Revised 10/19/2015

HADOOP. Revised 10/19/2015 HADOOP Revised 10/19/2015 This Page Intentionally Left Blank Table of Contents Hortonworks HDP Developer: Java... 1 Hortonworks HDP Developer: Apache Pig and Hive... 2 Hortonworks HDP Developer: Windows...

More information

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014 Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ Cloudera World Japan November 2014 WANdisco Background WANdisco: Wide Area Network Distributed Computing Enterprise ready, high availability

More information

Where is Hadoop Going Next?

Where is Hadoop Going Next? Where is Hadoop Going Next? Owen O Malley owen@hortonworks.com @owen_omalley November 2014 Page 1 Who am I? Worked at Yahoo Seach Webmap in a Week Dreadnaught to Juggernaut to Hadoop MapReduce Security

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

#TalendSandbox for Big Data

#TalendSandbox for Big Data Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

More information

Moving From Hadoop to Spark

Moving From Hadoop to Spark + Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee

More information

Dell Reference Configuration for Hortonworks Data Platform

Dell Reference Configuration for Hortonworks Data Platform Dell Reference Configuration for Hortonworks Data Platform A Quick Reference Configuration Guide Armando Acosta Hadoop Product Manager Dell Revolutionary Cloud and Big Data Group Kris Applegate Solution

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

Hadoop Big Data for Processing Data and Performing Workload

Hadoop Big Data for Processing Data and Performing Workload Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer

More information

BUILDING BLOCKS FOR A SCALE-OUT DATA LAKE WITH EMC AND PIVOTAL Increase efficiency and accelerate business insight

BUILDING BLOCKS FOR A SCALE-OUT DATA LAKE WITH EMC AND PIVOTAL Increase efficiency and accelerate business insight BUILDING BLOCKS FOR A SCALE-OUT DATA LAKE WITH EMC AND PIVOTAL Increase efficiency and accelerate business insight ESSENTIALS Combines EMC Isilon scale-out NAS storage, EMC Data Computing Appliance (DCA),

More information

Hortonworks Data Platform for Hadoop and SAP HANA

Hortonworks Data Platform for Hadoop and SAP HANA Hortonworks Data Platform for Hadoop and SAP HANA Prasad illapani, Big Data & SAP HANA- Product Management & Strategy SAP Labs LLC., Bellevue, WA Bob Page, VP Partner Products, Hortonworks Inc. Palo Alto,

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

HPC ABDS: The Case for an Integrating Apache Big Data Stack

HPC ABDS: The Case for an Integrating Apache Big Data Stack HPC ABDS: The Case for an Integrating Apache Big Data Stack with HPC 1st JTC 1 SGBD Meeting SDSC San Diego March 19 2014 Judy Qiu Shantenu Jha (Rutgers) Geoffrey Fox gcf@indiana.edu http://www.infomall.org

More information

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015 Pulsar Realtime Analytics At Scale Tony Ng April 14, 2015 Big Data Trends Bigger data volumes More data sources DBs, logs, behavioral & business event streams, sensors Faster analysis Next day to hours

More information

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database - Engineered for Innovation Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database 11g Release 2 Shipping since September 2009 11.2.0.3 Patch Set now

More information

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Addressing Open Source Big Data, Hadoop, and MapReduce limitations Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?

More information

Hadoop: Embracing future hardware

Hadoop: Embracing future hardware Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop

More information

Big Data and Data Science. The globally recognised training program

Big Data and Data Science. The globally recognised training program Big Data and Data Science The globally recognised training program Certificate in Big Data Analytics Duration 5 days Big Data and Data Science enables value creation from data, through the use of calculative

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS ..................................... PEPPERDATA IN MULTI-TENANT ENVIRONMENTS technical whitepaper June 2015 SUMMARY OF WHAT S WRITTEN IN THIS DOCUMENT If you are short on time and don t want to read the

More information

The Top 10 7 Hadoop Patterns and Anti-patterns. Alex Holmes @

The Top 10 7 Hadoop Patterns and Anti-patterns. Alex Holmes @ The Top 10 7 Hadoop Patterns and Anti-patterns Alex Holmes @ whoami Alex Holmes Software engineer Working on distributed systems for many years Hadoop since 2008 @grep_alex grepalex.com what s hadoop...

More information

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks

Hadoop Introduction. Olivier Renault Solution Engineer - Hortonworks Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS

3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS . 3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS Deliver fast actionable business insights for data scientists, rapid application creation for developers and enterprise-grade

More information

HADOOP SOLUTION USING DELL EMC ISILON AND CLOUDERA ENTERPRISE

HADOOP SOLUTION USING DELL EMC ISILON AND CLOUDERA ENTERPRISE HADOOP SOLUTION USING DELL EMC ISILON AND ESSENTIALS Dell EMC ISILON Use the industry's leading scale-out NAS solution with native Hadoop support Reduce costs and accelerate results with in-place data

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

REAL-TIME BIG DATA ANALYTICS

REAL-TIME BIG DATA ANALYTICS www.leanxcale.com info@leanxcale.com REAL-TIME BIG DATA ANALYTICS Blending Transactional and Analytical Processing Delivers Real-Time Big Data Analytics 2 ULTRA-SCALABLE FULL ACID FULL SQL DATABASE LeanXcale

More information

Evolution from Big Data to Smart Data

Evolution from Big Data to Smart Data Evolution from Big Data to Smart Data Information is Exploding 120 HOURS VIDEO UPLOADED TO YOUTUBE 50,000 APPS DOWNLOADED 204 MILLION E-MAILS EVERY MINUTE EVERY DAY Intel Corporation 2015 The Data is Changing

More information

Apache Hadoop: Past, Present, and Future

Apache Hadoop: Past, Present, and Future The 4 th China Cloud Computing Conference May 25 th, 2012. Apache Hadoop: Past, Present, and Future Dr. Amr Awadallah Founder, Chief Technical Officer aaa@cloudera.com, twitter: @awadallah Hadoop Past

More information

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University

More information

Cisco IT Hadoop Journey

Cisco IT Hadoop Journey Cisco IT Hadoop Journey Srini Desikan, Program Manager IT 2015 MapR Technologies 1 Agenda Hadoop Platform Timeline Key Decisions / Lessons Learnt Data Lake Hadoop s place in IT Data Platforms Use Cases

More information

Dell In-Memory Appliance for Cloudera Enterprise

Dell In-Memory Appliance for Cloudera Enterprise Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/

More information

Big Data & Hadoop ADVANCED EXCEL & DASHBOARDS MASTERCLASS

Big Data & Hadoop ADVANCED EXCEL & DASHBOARDS MASTERCLASS ADVANCED EXCEL Live Virtual Training & DASHBOARDS MASTERCLASS Big Data & Hadoop 2016 Sessions Exclusive Live Virtual Drive efficiency and quality by combining separate pool of structured, semi-structured

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

More Data in Less Time

More Data in Less Time More Data in Less Time Leveraging Cloudera CDH as an Operational Data Store Daniel Tydecks, Systems Engineering DACH & CE Goals of an Operational Data Store Load Data Sources Traditional Architecture Operational

More information

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper

More information

Hadoop: Addressing Big Data Challenges

Hadoop: Addressing Big Data Challenges Hadoop: Addressing Big Data Challenges Eric Baldeschwieler CTO Hortonworks @JERIC14 Page 1 Big Data = Transactions + Interactions + Observations Petabytes Terabytes Gigabytes Megabytes Sensors / RFID /

More information

Data movement for globally deployed Big Data Hadoop architectures

Data movement for globally deployed Big Data Hadoop architectures Data movement for globally deployed Big Data Hadoop architectures Scott Rudenstein VP Technical Services November 2015 WANdisco Background WANdisco: Wide Area Network Distributed Computing " Enterprise

More information

Actian SQL in Hadoop Buyer s Guide

Actian SQL in Hadoop Buyer s Guide Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop

More information

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013 Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the

More information

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia Unified Big Data Processing with Apache Spark Matei Zaharia @matei_zaharia What is Apache Spark? Fast & general engine for big data processing Generalizes MapReduce model to support more types of processing

More information