Upcoming Announcements




Transcription:

Enterprise Hadoop
Jeff Markham, Technical Director, APAC
jmarkham@hortonworks.com

Upcoming Announcements

April 2: Hortonworks Data Platform 2.1
A continued focus on innovation within the core of Enterprise Hadoop to enable an ecosystem to flourish and cement Hadoop's role in the data architectures of tomorrow.
Interactive SQL Query: final phase of Stinger delivered.
Comprehensive Features.
Processing Versatility: Storm, Search.

April 2: LucidWorks partnership
A resell agreement has been inked with LucidWorks to provide tier 2 and tier 3 support for HDP Search.

April 3: Hadoop Summit Europe 2014
SOLD OUT, double exhibitors, double content, year over year.

April 21: Concurrent partnership
Cascading is the proven application development platform for building data applications on Hadoop.
Integrate and deliver the Cascading SDK.
Collection of tools, documentation, libraries, tutorials and example projects.
Simplifies SQL integration and enables Scala development for Hadoop.
Hortonworks provides level 1 & 2 support for the Cascading SDK.

Hadoop within an emerging Modern Architecture
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DEV & DATA TOOLS: Build & Test
DATA SYSTEM: repositories (RDBMS, EDW, MPP)
OPERATIONS TOOLS: Provision, Manage & Monitor
SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor; Geolocation

Core Capabilities of Enterprise Hadoop
PRESENTATION & APPLICATION: Enable both existing and new applications to provide value to the organization.
ENTERPRISE MGMT & SECURITY: Empower existing operations and security tools to manage Hadoop.
GOVERNANCE & INTEGRATION: Load data and manage according to policy.
DATA ACCESS: Access your data simultaneously in multiple ways (batch, interactive, real-time).
DATA MANAGEMENT: Store and process all of your Corporate Assets.
SECURITY: Provide layered approach to security through Authentication, Authorization, Accounting, and Protection.
OPERATIONS: Deploy and effectively manage the platform.
DEPLOYMENT OPTIONS: Provide deployment choice across physical, virtual, cloud.

Delivered in Open Source
GOVERNANCE & INTEGRATION: Workflow & Lifecycle: Falcon, Sqoop, Flume, NFS, WebHDFS
DATA ACCESS: Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (In-Memory Analytics, ISV engines)
DATA MANAGEMENT: YARN: Operating System; HDFS (Hadoop Distributed File System), nodes 1 to N
SECURITY: Authentication, Authorization, Accounting, Protection. Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Knox
OPERATIONS: Provision, Manage & Monitor: Ambari, Zookeeper; Scheduling: Oozie

Enterprise Hadoop: Hortonworks Data Platform
GOVERNANCE & INTEGRATION: Workflow & Lifecycle: Falcon, Sqoop, Flume, NFS, WebHDFS
DATA ACCESS: Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (In-Memory Analytics, ISV engines)
DATA MANAGEMENT: YARN: Operating System; HDFS (Hadoop Distributed File System), nodes 1 to N
SECURITY: Authentication, Authorization, Accounting, Protection. Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Knox
OPERATIONS: Provision, Manage & Monitor: Ambari, Zookeeper; Scheduling: Oozie
Deployment Choice: Linux, Windows, On-Premise, Cloud

Investment Themes
Represents a MAJOR step forward for Hadoop: delivery of Interactive Query via the Stinger Initiative, plus Stream Processing and Search.
Three Key Highlights of the Release
1. Stinger Initiative DELIVERED: Interactive Query in Apache Hive
2. NEW Capabilities for Hadoop: delivered with Apache Falcon; Apache Knox extends perimeter security for Hadoop
3. NEW Engines included in HDP: Stream processing with Apache Storm to analyze/process streams of data; Search via Apache Solr

Reliable, Consistent & Current
HDP certifies most recent & stable community innovation.
April 2014: 2.4.0 0.4.0 0.12.1 0.13.0 0.12.0 0.98.0 4.0.0 1.5.1 0.9.1 0.9.0 4.7.8 0.5.0 1.4.0 1.5.1 4.0.0 0.4.0
HDP 2.0 (October 2013), HDP 1.3 (May 2013): 2.2.0 1.1.2 Hadoop & YARN Tez 0.12.0 0.11.0 Pig 0.11.0 Hive & HCatalog 0.96.1 0.94.6 HBase Phoenix Accumulo Storm 0.8.0 0.7.0 Mahout Solr Falcon 1.4.4 1.4.3 Sqoop 1.3.1 Flume 1.4.4 1.2.5 Ambari 3.3.2 Oozie 3.4.5 Zookeeper Knox

Interactive SQL-IN-Hadoop Delivered
Stinger Initiative DELIVERED: next generation SQL based interactive query in Hadoop.
Speed: Improve Hive query performance by 100X to allow for interactive query times (seconds).
Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB.
SQL: Support the broadest range of SQL semantics for analytic applications running against Hadoop.
Stack: Business Analytics and Custom Apps; SQL via Apache Hive; Apache MapReduce and Apache Tez; Apache YARN; HDFS (Hadoop Distributed File System), nodes 1 to N.

Stinger Project
Stinger Phase 1: Base Optimizations; SQL Types; SQL Analytic Functions; ORCFile Modern File Format
Stinger Phase 2: SQL Types; SQL Analytic Functions; Advanced Optimizations; Performance Boosts via YARN
Stinger Phase 3: Hive on Apache Tez; Query Service (always on); Buffer Cache; Cost Based Optimizer (Optiq)

Apache Hive Contribution, an Open Community at its finest: 1,672 Jira Tickets Closed; 145 Developers; 44 Companies; ~390,000 Lines Of Code Added (2x); 13 Months.

New: Apache Falcon, Simplified for Enterprise Hadoop
First time included in HDP.
Provides key governance framework for: acquisition & processing of datasets; replication & retention of datasets; redirecting datasets to non-Hadoop extensions.
Provides audit trail & lineage.

Another great example of Open Community Innovation
Originally built and contributed to Apache by InMobi.
Fastest path to innovation is the open community.
14 months in the making. Tested in production. Vibrant community of developers building.

Investment Phases
Phase 1: Incubate Apache Falcon; dataset replication & retention; Falcon tech preview
Phase 2: Basic dashboard for pipeline viewing; Kerberos security support; Ambari integration for management; Hive/HCatalog integration
Phase 3: Advanced dashboard for pipeline definition & management; audit; lineage; tagging; file import (SSH & SCP)
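The retention idea listed on the Falcon slide can be pictured with a small sketch. The Python below is not Falcon's API or configuration format; the function name `instances_to_evict`, the dataset timestamps and the 90-day limit are all hypothetical, chosen only to show how a retention limit separates dataset instances that are kept from those eligible for eviction.

```python
from datetime import datetime, timedelta


def instances_to_evict(instance_times, now, retention):
    """Return dataset instances older than the retention limit, oldest first."""
    cutoff = now - retention
    return sorted(t for t in instance_times if t < cutoff)


# Hypothetical dataset instances (assumed data, not taken from the slides).
now = datetime(2014, 4, 2)
instances = [datetime(2013, 12, 1), datetime(2014, 1, 15), datetime(2014, 3, 30)]

# Assumed policy: keep 90 days of data.
expired = instances_to_evict(instances, now, retention=timedelta(days=90))
print([t.date().isoformat() for t in expired])
```

In a real deployment the policy would be declared in the governance tool rather than coded by hand; the sketch only makes the keep-or-evict rule concrete.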

New: Apache Knox for Perimeter Security
Important Note: Security for Hadoop must be addressed within every layer of the stack and integrated into existing frameworks. For a full description of what is available in Enterprise Hadoop today across Authentication, Authorization, Accountability and Encryption, please visit our security labs page.

Investments: Apache Knox
Perimeter security for Hadoop.
A common place to perform authentication across Hadoop and all related projects.
Integrated with LDAP and AD.
Currently supports: WebHDFS, WebHCat, Oozie, Hive & HBase.
Broad community effort, incubated with Microsoft, broad set of developers involved.

Phase 1: Strong AuthN with Kerberos; HBase, Hive, HDFS basic AuthZ; encryption with SSL for NN, JT, etc.; wire encryption with Shuffle, HDFS, JDBC
Phase 2: ACLs for HDFS; Knox: Hadoop REST API; SQL-style Hive AuthZ (GRANT, REVOKE); SSL support for Hive Server 2; SSL for DN/NN UI & WebHDFS; PAM support for Hive
Phase 3: Audit event correlation and audit viewer; support token-based AuthN beyond Kerberos; encryption in HDFS, Hive & HBase; Knox for HDFS HA, Ambari & Falcon
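To make the perimeter idea concrete, here is a minimal sketch of how a client might address WebHDFS through a Knox gateway instead of contacting the NameNode directly. It only builds the URL and sends nothing. The host name, the topology name "default" and port 8443 are assumptions for illustration; the `/gateway/<topology>/webhdfs/v1` layout and the `op` query parameter follow the usual Knox and WebHDFS conventions, but should be checked against the deployed versions.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, op, topology="default", port=8443):
    """Build the URL for a WebHDFS call routed through a Knox gateway."""
    # Knox exposes cluster REST services under /gateway/<topology>/<service>.
    base = f"https://{gateway_host}:{port}/gateway/{topology}/webhdfs/v1"
    # WebHDFS selects the operation with the "op" query parameter.
    return f"{base}{quote(hdfs_path)}?{urlencode({'op': op})}"


# Hypothetical gateway host and HDFS path.
url = knox_webhdfs_url("knox.example.com", "/tmp/web logs", "LISTSTATUS")
print(url)
```

The client sees a single HTTPS endpoint; authentication against LDAP or AD would happen at the gateway before the request reaches the cluster.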

New: Stream Processing with Apache Storm
Apache Storm: real-time event processing for sensor and business activity monitoring.
Unlocks new business cases for Hadoop.
Key component of a data lake architecture.
Scale: ingest millions of events per second; fast query on petabytes of data.
Integrated with Ambari to manage.

Investment Phases
Phase 1: Install, start, & stop via Ambari; Kafka, HBase, & HDFS connectors; Ganglia & Nagios based monitoring
Phase 2: Storm-on-YARN; ingest & notification for JMS; persistence: EDWs, RDBMS, Cassandra
Phase 3: High availability; management with Ambari; AD/LDAP plugin for authentication; declarative wiring; Hive update support; advanced scheduler
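The sensor-monitoring use case on the Storm slide follows a source-then-processing-steps shape, which Storm calls spouts and bolts. The sketch below imitates that shape with plain Python generators; it does not use Storm or any Storm API, and the sensor readings, the threshold of 70.0 and all function names are hypothetical.

```python
from collections import Counter


def sensor_spout(readings):
    """Source stage: emit one event per raw reading."""
    for sensor_id, value in readings:
        yield {"sensor": sensor_id, "value": value}


def threshold_bolt(events, limit):
    """Processing stage: pass through only readings above the limit."""
    for event in events:
        if event["value"] > limit:
            yield event


def count_bolt(events):
    """Terminal stage: count alert events per sensor."""
    counts = Counter()
    for event in events:
        counts[event["sensor"]] += 1
    return dict(counts)


# Hypothetical readings as (sensor id, value) pairs.
readings = [("s1", 71.0), ("s2", 64.5), ("s1", 80.2), ("s3", 90.1), ("s2", 75.3)]
alerts = count_bolt(threshold_bolt(sensor_spout(readings), limit=70.0))
print(alerts)
```

A real topology would run these stages in parallel across a cluster and process an unbounded stream; the generators only show how events flow from one stage to the next.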

New: Search for Hadoop
Apache Solr: open source enterprise search for Hadoop and HDP.
Open architecture: in the community, for the community.
Simple, powerful UI for advanced search applications.
High performance indexing & sub-second search times over billions of documents.
Deep integration roadmap with HDP.

Partnership with LucidWorks
LucidWorks provides tier 3 & 4 support.
Alignment with the strategy of working within the community and with the core committers.
9 committers total (7 PMC).

Cascading SDK
The Cascading SDK enables the rapid development of batch and interactive data-driven applications.
Integration Roadmap
Step 1: Integrate the Cascading SDK for customers to use
Step 2: Integration with Tez

Tech Preview: Apache Spark
In-memory processing is HOT! However, most of the world is using it for data science and machine learning.
In-memory sandbox for iterative data analytics, used by a handful of data scientists.
Hortonworks provides guidance for initial applicability and scale.
Exploring key use cases with customers, focused on iterative access & machine learning.
Experience thus far supports target deployments of no more than 1 TB of data, 40 nodes, and 1-3 users.
Skill set required: Scala (Java-based API framework).

Operating Enterprise Hadoop
Apache Ambari is the only 100% open source framework for provisioning, managing and monitoring Apache Hadoop clusters.
Integration with existing tools (Viewpoint, others) via REST APIs.
New: support for new Access Engines; stack extensibility; Cluster Blueprints; rolling restarts; maintenance mode; more...
Architecture: AMBARI WEB and REST APIs in front of AMBARI SERVER, which provisions, manages and monitors the compute & storage nodes.
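As an illustration of the REST APIs and Cluster Blueprints mentioned on the Ambari slide, the sketch below prepares two requests without sending them. The server name, credentials, blueprint name and host-group layout are hypothetical. The port 8080, the `/api/v1` prefix, the `X-Requested-By` header and the blueprint field names reflect common Ambari usage as far as known, and should be treated as assumptions to verify against the installed Ambari version.

```python
import base64
import json
import urllib.request


def ambari_request(server, path, user, password, body=None):
    """Prepare (but do not send) a request against the Ambari REST API."""
    url = f"http://{server}:8080/api/v1{path}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    method = "POST" if data is not None else "GET"
    req = urllib.request.Request(url, data=data, method=method)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    # Assumption: Ambari expects this header on modifying requests.
    req.add_header("X-Requested-By", "ambari")
    return req


# Hypothetical blueprint: one master host group and three worker hosts.
blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.1"},
    "host_groups": [
        {"name": "master", "cardinality": "1",
         "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}]},
        {"name": "workers", "cardinality": "3",
         "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}]},
    ],
}

list_clusters = ambari_request("ambari.example.com", "/clusters", "admin", "admin")
register = ambari_request("ambari.example.com", "/blueprints/demo", "admin", "admin",
                          body=blueprint)
print(list_clusters.get_method(), list_clusters.full_url)
print(register.get_method(), register.full_url)
```

Passing either prepared request to `urllib.request.urlopen` would actually contact the server; that step is left out so the sketch runs without a cluster.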

Investment Themes
Represents a MAJOR step forward for Hadoop: delivery of Interactive Query via the Stinger Initiative, plus Stream Processing and Search.
Three Key Highlights of the Release
1. Stinger Initiative DELIVERED: Interactive Query in Apache Hive
2. NEW Capabilities for Hadoop: delivered with Apache Falcon; Apache Knox extends perimeter security for Hadoop
3. NEW Engines included in HDP: Stream processing with Apache Storm to analyze/process streams of data; Search via Apache Solr
AND the HDP Spark Tech Preview, simultaneous Linux & Windows release, COUNTLESS additional features.

Thank You
Jeff Markham, Technical Director, APAC
jmarkham@hortonworks.com