Hortonworks Architecting the Future of Big Data


Hortonworks: Architecting the Future of Big Data
Eric Baldeschwieler, CEO
Twitter: @jeric14 (@hortonworks)
Formerly VP Hadoop Engineering @Yahoo!
8 years at Yahoo!
Hortonworks Inc. 2011
June 29, 2011

About Hortonworks
Mission: Revolutionize and commoditize the storage and processing of big data via open source
Vision: Half of the world's data will be stored in Apache Hadoop within five years
Strategy: Grow the Apache Hadoop ecosystem by making Apache Hadoop easier to consume; profit by providing training, support and certification
An independent company
- Focused on making Apache Hadoop great
- Hold nothing back; Apache Hadoop will be complete

Credentials
Technical: key architects and committers from the Yahoo! Hadoop engineering team
- Highest concentration of Apache Hadoop committers
- Contributed >70% of the code in Hadoop, Pig and ZooKeeper
- Delivered every major/stable Apache Hadoop release since 0.1
- History of driving innovation across the entire Apache Hadoop stack
- Experience managing the world's largest deployment
Business operations: team of highly successful open source veterans
- Led by Rob Bearden, former COO of SpringSource & JBoss
Investors: backed by Benchmark Capital and Yahoo!
- Benchmark was a key investor in Red Hat, MySQL, SpringSource, Twitter & eBay

Hortonworks and Yahoo!
Yahoo! is a development partner
- Leverage large Yahoo! development, testing & operations team
- More than 1,000 active & sophisticated users of Apache Hadoop
- Access to the Yahoo! grid for testing large workloads
- Only organization that has delivered a stable release of Apache Hadoop
- Yahoo! will continue to contribute Apache Hadoop code too!
Yahoo! is a customer
- Hortonworks provides level 3 support and training to Yahoo!
- Yahoo! deploys Apache Hadoop releases across its 42,000 grid
Yahoo! is an investor

Current State of Adoption
Enterprise Adoption
- Early adopters
- Technology is hard to install, manage & use
- Technology lacks enterprise robustness
- Requires significant investment in technical staff or consulting
- Hard to find & hire experienced developer & operations talent
Technology & Knowledge Gaps Prevent Apache Hadoop from Reaching Full Potential
Customers are asking their vendors for help with Hadoop!
We're seeing Hadoop in all of our Fortune 2000 data accounts
Vendor Ecosystem Adoption
- Early in vendor adoption lifecycle
- Hadoop is hard to integrate and extend
- Hard to find & hire experienced developer & operations talent

Hortonworks Role & Opportunity
Bridge the Gap! Grow Market
Enterprise Adoption
Sell training and support via Partners
Vendor Ecosystem Adoption
- Fundamental shift in enterprise data architecture strategy
- Apache Hadoop becomes standard for managing new types & scale of data
- New applications & solutions will be created to leverage data in Apache Hadoop
- Creates massive big data technology and services opportunity for ecosystem

Hortonworks Objectives
Make Apache Hadoop projects easier to install, manage & use
- Regular sustaining releases
- Compiled code for each project (e.g. RPMs)
- Testing at scale
Make Apache Hadoop more robust
- Performance gains
- High availability
- Administration & monitoring
Make Apache Hadoop easier to integrate & extend
- Open APIs for extension & experimentation
All done within Apache Hadoop community
- Develop collaboratively with community
- Complete transparency
- All code contributed back to Apache
Anyone should be able to easily deploy the Hadoop projects directly from Apache

Technology Roadmap
Phase 1 (2011): Making Apache Hadoop Accessible
- Release the most stable version of Hadoop ever
- Release directly usable code via Apache (RPMs, .debs)
- Frequent sustaining releases off of the stable branches
Phase 2 (2012, alphas starting Oct 2011): Next Generation Apache Hadoop
- Address key product gaps (HBase support, HA, Management)
- Enable community & partner innovation via modular architecture & open APIs
- Work with community to define integrated stack

Phase 2 - Next Generation Apache Hadoop
Core
- HDFS Federation
- Next Gen MapReduce
- New Write Pipeline (HBase support)
- HA (no SPOF) and wire compatibility
Data - HCatalog 0.3
- Pig, Hive, MapReduce and Streaming as clients
- HDFS and HBase as storage systems
- Performance and storage improvements
Management & Ease of use
- All components fully tested and deployable as a stack
- Stack installation and centralized config management
- REST and GUI for user tasks

Hortonworks @ Hadoop Summit
11am: Crossing the Chasm: Hadoop for the Enterprise
  Tech talk by Sanjay Radia
1:45pm: Next Generation Apache Hadoop MapReduce
  Community track by Arun Murthy
2:15pm: Introducing HCatalog (Hadoop Table Manager)
  Community track by Alan Gates
4:00pm: Large Scale Math with Hadoop MapReduce
  Applications and Research Track by Tsz-Wo Sze
4:30pm: HDFS Federation and Other Features
  Community track by Suresh Srinivas

Thank You.