Big Data on Tap Jonathan Gray

Unified Integration for Data-Driven Applications
Big Data on Tap
Jonathan Gray, Founder & CEO
November 7, 2016

Hadoop Enables New Applications and Architectures
ENTERPRISE DATA LAKES / BIG DATA ANALYTICS / PRODUCTION DATA APPS
Batch and Realtime Data Ingestion: Any type of data from any type of source in any volume
360° Customer View: Integrate data from any source and expose through queries and APIs
Recommendation Engines: Build models in batch using historical data and serve them in realtime
Batch and Streaming ETL: Code-free self-service creation and management of pipelines
SQL Exploration and Data Science: All data is automatically accessible via SQL and client SDKs
Data as a Service: Easily expose generic or custom REST APIs on any data
Realtime Dashboards: Perform realtime OLAP aggregations and serve them through REST APIs
Time Series Analysis: Store, process and serve massive volumes of time-series data
Realtime Log Analytics: Ingestion and processing of high-throughput streaming log events
Anomaly Detection Systems: Process streaming events and predictably compare them in realtime to historical data
NRT Event Monitoring: Reliably monitor large streams of data and perform defined actions within a specified time
Internet of Things: Ingestion, storage and processing of events that is highly available, scalable and consistent
Data Applications Drive Meaningful Business Value

But Getting Value from Big Data is Hard
Complexity of technologies and new user learning curve
Proliferation of projects, services and APIs
Divergence of distributions and technologies
Integration silos created by narrow point solutions
Too much focus on infrastructure and integration, rather than applications and analytics

And There Are Many Faces of Hadoop
Developer: Architecture & Programming. Focused on Apps & Solutions.
Ops: Configuring & Monitoring. Focused on Infrastructure & SLAs.
Data Scientist: Scripting & Machine Learning. Focused on Data & Algorithms.
LOB / Product: Driving Revenue & Decision Making. Focused on Products & Insights.
Without a consistent set of tools, IT will not be an effective data enabler for the business.

Enter Cask
Founded in 2011: By early Hadoop engineers from Facebook and Yahoo!
Strategic Investors: AT&T, Cloudera and Ericsson
Key Customers & Partners: AT&T, Ericsson, Lotame, Salesforce, Cloudera, Hortonworks, MapR, Microsoft, IBM, Tableau
Raised $37+ Million: Andreessen Horowitz, Safeguard, Battery Ventures and Ignition Partners
Latest Release 3.6: Cask Data Application Platform, Cask Hydrator and Cask Tracker
NEW: CDAP 4 Preview: Featuring Cask Market, the big data app store
Why Cask? A Container Architecture that puts Big Data on Tap

The Evolution of the Cask Platform
Convergence of Big Data Apps and Data Integration
CDAP v2, Big Data App Server ("WebLogic for Hadoop"): Abstractions & integrations; metrics & logs; debugging environment
CDAP v3, Big Data Apps + Data Integration ("WebLogic Meets Informatica"): Data ingest; data pipelines; workflows and metadata
CDAP v4, Unified Integration for Big Data ("Unified Big Data Integration"): Security & governance; self-service environment; enterprise integrations

Introducing Cask Data Application Platform (CDAP)
Use cases: Data Lake, Fraud Detection, Customer 360, Recommendation Engine, Sensor Data Analytics
First Unified Integration Platform for Big Data
Platform for distributed apps, bringing together application management with data integration
Enterprise-grade Security & Governance
Self-Service User Experience
100% open source and built for extensibility
Distributed Application Framework
Modern Data Integration
Supports all major Hadoop distributions and clouds
Integrates the latest open source big data technologies

Modern Data Integration
INGEST: any data from any source
EXPLORE: for analytics and data science
PROCESS: for ETL and machine learning
SERVE: any data to any destination
Real-time and Batch
Reliable and Scalable
Simple and Self-Service

Distributed Application Framework
DEVELOP: rapidly build applications
TEST: powerful test and CI framework
DEPLOY: run any apps in any environment
SCALE: horizontally scale apps and data
Real-time and Batch
Memory, Local, Distributed
Analytics and Applications

Security and Governance
CAPTURE: store all metadata about your data
DISCOVER: easily locate any of your data
TRACK: every audit plus lineage graphs
ANALYZE: understand usage patterns of data
ENCRYPT, AUTHENTICATE, AUTHORIZE

Self-Service User Experience
A code-free framework to build and run data pipelines:
Drag & drop graphical interface
Create, debug, deploy and manage
Separation of logic and execution environment
Native to Hadoop & Spark; scales out
A data discovery tool to explore metadata and usage:
Rich app-level metadata
Track lineage and audits
Analyze usage of datasets
MDM integration framework

The CDAP Architecture
Applications
Datasets: Table, Avro, Parquet, Timeseries, OLAP Cube, Geospatial, ObjectStore
Programs: MapReduce, Spark, Tigon, Workflow, Service, Worker
Metadata
Application Container Architecture
Reusable Programming Abstractions
Global User and Machine Metadata
Highly Extensible Plugin Architecture

CDAP Enables the Full Big Data Application Lifecycle
Rapid Development:
Standardization, deep integrations, tools and docs
Separation of app logic from data logic and integration logic
Conceptual integrity within applications and consistency across environments
Production Operations & Governance:
Simplified packaging, deployment and monitoring of apps on Hadoop
Enhanced security and governance with centralized metrics and logs
Tracking and exploration of metadata, data provenance, audit trails and usage analytics
Single framework for building and running data apps and data lakes on Hadoop and Spark:
reduces time to develop and deploy big data apps by 80%
reduces time to insights and accelerates business value
removes barriers to innovation and future-proofs your apps

Customer Success Stories

Customer: Health Insurance Provider offloading clinical / immunization reporting from Netezza
Situation: Lack of existing Hadoop expertise and frustration with hand-coding and scripting tools
Cask: Cask Hydrator for rapid creation of data pipelines and Cask Tracker for data discovery
Solution: POC in 2 days; production in 2 months

Customer: Leading SaaS Platform taking new real-time, massive-scale products to market
Situation: Small team and significant technical challenges limit pace of development and solution scale
Cask: CDAP for real-time ingestion and consistent processing with production operations support
Solution: Development in 1 month; production in 3 months

Customer: Large Telco Enterprise building a centralized, secured, multi-tenant Data Lake
Situation: Multiple teams and technologies with widely varied skillsets and incompatible design choices
Cask: CDAP for data lake management and orchestration, tightly integrated into existing systems
Solution: Hundreds of users; thousands of pipelines

Awards and Accolades
Cask was named a Gartner Cool Vendor 2016
Cask was certified a Great Place to Work 2016
"CDAP is a big win for us: the amount of code we needed to write was minimal with CDAP, and it was much easier and faster than we ever expected." (Jia-Long Wu, Data Architect, Lotame)
"Cask has tilted the playing field, earning a massive unfair advantage over proprietary point products for data integration and ingest." (Nik Rouda, Senior Analyst, Enterprise Strategy Group)
"for the rest of us who lack the technological chips or patience to make it all work, there's good news: it will soon get easier, thanks to the work done by the big data pioneers, as well as vendors like Cask." (Alex Woodie, Managing Editor, Datanami)

NEW: CDAP 4
Big Data Apps on Tap!
Release of CDAP 4 Preview: Available for download now!
Cask Market: Big Data App Store
Cask Wrangler: Interactive Data Preparation
Reimagined CDAP UI: Rewrite based on React
Resource Center: Interactive Wizards for Common Tasks

NEW: CDAP 4
Big Data Apps on Tap!
Cask Market: The App Store for Big Data
Goal: Time to value in minutes with no existing experience
Application and library ecosystem with pre-built Hadoop solutions, reusable templates, and third-party plugins
Available from anywhere inside the CDAP UI with a click
Initially, everything in the Cask Market has been bootstrapped by Cask based on ongoing work across our customers, and is 100% open source and available on GitHub
Eventually, developers and ISVs will be able to showcase and market their own applications and libraries (e.g., Graylog)
Cask Market includes interactive, guided wizards for configuring pre-built templates

Summary
CDAP provides the first unified integration platform for big data
Cask Hydrator and Cask Tracker are visual extensions of CDAP for self-service access
CDAP empowers enterprise IT to deliver faster time to value for Hadoop and Spark, from prototype to production
Big Data on Tap: Cask Market is a big data app store available in CDAP 4 with pre-built apps, pipelines, and plugins
CDAP is 100% open source, highly extensible, enterprise-ready, and commercially supported

Thanks! For more information, go to: