Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera



Transcription:

Accelerating Enterprise Big Data Success
Tim Stevens, VP of Business and Corporate Development, Cloudera

Big Opportunity: Extract value from data
THINGS (50 billion) x DATA (35 ZB) = VALUE (revenue growth, cost savings, margin gain)

Big Gap: Roadblocks on the journey
THINGS (50 billion) x DATA (35 ZB) = VALUE (revenue growth, cost savings, margin gain)
Roadblocks: NO SECURITY, NO INSIGHT, NO PROOF
- Worry about attacks
- Hold back production deployment
- Bring data to compute (fail to scale)
- Delay insights with batch processing
- Store underutilized data
- Pay more for data management
- Use sub-optimal hardware
- Waste time on misguided pilots
- Fail to show ROI

Big Picture: Datacenter Inflection
Intel Confidential, NDA only
[Figure: datacenter inflection points, 2000 to 2013]
1. UNIX to Linux, RISC to IA (Linux/x86 units vs. UNIX/RISC units)
2. Physical to Virtual, SW-only to HW-assisted (virtualized vs. nonvirtualized)
3. Cluster to Cloud, ASIC to IA/Fabric (public vs. private)
4. Big Data
"In 2000 Intel saw Linux coming and invested heavily in Red Hat; in 2005 we saw virtualization happening and invested in VMware; in 2008 we started investing heavily in hyper-scale computing. We think big data and Hadoop will dwarf all of them." (Diane Bryant, SVP & GM, Data Center Group, Intel)

Big Deal: Cloudera + Intel Alliance
Intel invests $740M in Cloudera
- As Intel's largest datacenter venture deal, represents Intel's commitment to big data
- Supports Cloudera's ability to remain independent
Intel & Cloudera drive innovation through open source
- Accelerate evolution of Hadoop by joining forces on foundational technologies
- Enable open source developers to innovate in and on top of the Hadoop platform
Intel enables CDH to run best on Intel Architecture
- Enables Cloudera to make best use of Intel data center technologies
- Provides datacenter infrastructure for Cloudera development & benchmarking at scale
Intel & Cloudera foster the broadest ecosystem of big data solutions
2014 Cloudera, Inc. All rights reserved.

Big Goal: Converge on one open source platform
Cloudera
- Most stable, compatible, and mature Hadoop distribution
- Leading SQL functionality & performance (Impala)
- Deepest management and governance capabilities
- 150 Hadoop developers, 100 open source committers
Intel
- The only distribution with performance and security enhanced from the silicon up
- Leading security capabilities including encryption, access control, and auditing
- 50 Hadoop developers and 12 committers
- Long-standing commitment to open source, with 1000 developers working on Linux, KVM, Xen, Java, OpenStack, Hadoop

Driving innovation through open source
Ramp the pace of innovation in the Apache Hadoop platform while reducing fragmentation
- SQL: Project Gryphon, Impala -> Impala
- Streaming: Apache Storm, Apache Spark Streaming -> Spark Streaming
- Performance: Apache Tez, Apache Spark -> Spark
- Security: Project Rhino, Apache Sentry -> Project Rhino (including Sentry)
- Storage: Apache HDFS, Apache HBase -> Accelerated investment in both

Enabling CDH to run best on Intel Architecture
Software & silicon co-evolve to deliver dramatic gains
1. Push compute-intensive work down to the silicon
   - Encryption (AES-NI)
   - Compression (SSE 4.2)
   - Math (MKL)
2. Increase main memory utilization up to 20X
   - Improve Disk:Memory from 200:1 to 10:1
3. Design for rack-scale architecture
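One way to read the "up to 20X" figure is as the change in the Disk:Memory ratio quoted on this slide (200:1 down to 10:1). The short sketch below works through that arithmetic. The function name and the 48 TB per-node disk capacity are hypothetical, chosen only for illustration, and the slide does not confirm that this is how the 20X figure was derived.

```python
def memory_share_gain(ratio_before: float, ratio_after: float) -> float:
    """Factor by which memory grows relative to disk when the
    disk:memory ratio drops from ratio_before:1 to ratio_after:1."""
    return ratio_before / ratio_after


def memory_for_disk(disk_tb: float, disk_to_memory_ratio: float) -> float:
    """Memory (in TB) implied by a disk capacity and a disk:memory ratio."""
    return disk_tb / disk_to_memory_ratio


# Ratios taken from the slide: 200:1 before, 10:1 after.
gain = memory_share_gain(200, 10)

# Hypothetical node with 48 TB of disk (an assumption, not from the slide).
before_tb = memory_for_disk(48, 200)   # 0.24 TB, i.e. 240 GB
after_tb = memory_for_disk(48, 10)     # 4.8 TB

print(f"gain: {gain:.0f}x, memory: {before_tb:.2f} TB -> {after_tb:.2f} TB")
```

Under these assumptions, the same amount of disk is paired with 20 times as much main memory.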

Focus of Joint Engineering
Feature / Target: Cloudera Enterprise
SECURITY
- HDFS encryption and extended file ACLs
- Centralized authorization via Sentry
- Simplified Kerberos
- HBase cell-level authorization
- Search: document and index security
- Auditing & data lineage
PERFORMANCE
- Crypto acceleration with AES-NI
- MR/Shuffle optimizations
- Compression acceleration with SSE 4.2
- Optimizations using AVX and other IA
- Optimizations using MKL
- Explore Xeon Phi with Java support
MANAGEMENT
- Service management extensions
- Simplified cloud provisioning, including AWS support
- Backup and Disaster Recovery
- Certified w/ Intel Enterprise Edition of Lustre
- Deeper diagnostics of various modules
- Support for Azure, VMware, OpenStack
- Extended RBAC in Cloudera Manager
APPLICATIONS
- Impala enhancements including low-latency SQL engine, SQL-92 analytic queries, and more
- Spark support in CDH, including Spark on YARN, Spark security, and Spark Streaming
- SQL on HBase
- Spark interoperability with Impala
- Wire encryption for Spark
- Pig integration with Spark
- Spark/Sentry integration

Cloudera Enterprise Data Hub, powered by Apache Hadoop
[Figure: Enterprise Data Hub architecture]
- Attributes: Open Source, Scalable, Flexible, Cost-Effective, Managed, Open Architecture, Secure, Governed
- Workloads: Batch Processing, Analytic SQL, Search, Machine Learning, Stream Processing, 3rd Party Apps
- Workload Management
- Storage for Any Type of Data (Unified, Elastic, Resilient, Secure): Filesystem, Online NoSQL
- Data Management
- System Management

Improving Apache Hadoop performance with IA
- Compute: up to 50% faster, compared to previous generation
- Storage & Memory: up to 80% faster, SSD compared to HDD
- Network: up to 50% faster, 10GbE compared to 1GbE
As measured by time to completion of a 1TB sort on a 10-node cluster.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Source: Intel internal testing. For more information go to: intel.com/performance

Enabling ecosystem with joint leadership
Cloudera
- Market leader in big data management systems
- Largest base of paid customers & free users
- Consistently delivering industry-leading capabilities around Apache Hadoop
Intel
- Market leader in silicon
- Long & successful history of investment and collaboration with software platforms
- Global reach; market-leading Hadoop distribution in China

Joint customers leading the way
Cost Savings, Revenue Growth, Margin Gain
- Captures TBs of data from smart meters
- Analyzes usage patterns to optimize customer consumption
- $320M USD in utility savings
"Utilities simply can't cope with the vast volumes of smart meter data, not just with storing the data, but being able to analyze it and put it to use." (Drew Hylbert, VP Technology & Infrastructure, Opower)
- Needs to be IoT oriented
- Needs to leverage Hadoop

Summary: Faster Insights, Better Security, and Less Complexity
Accelerate innovation via open source software
- Maintain an open horizontal platform for big data
- Continue to enhance Apache Hadoop and related projects
Enable CDH to run best on IA
- Optimize performance across compute, storage, & network
- Ensure platform security, enhanced by hardware
Foster evolution of big data ecosystem
- Establish usage models and industry standard benchmarks
- Develop reference architectures and industry-wide solutions

More Resources
- intel.com/bigdata
- cloudera.com

Tim Stevens
tstevens@cloudera.com