The Enterprise Data Hub and The Modern Information Architecture




Dr. Amr Awadallah, CTO & Co-Founder, Cloudera
Twitter: @awadallah
© 2013 Cloudera, Inc. All rights reserved.

Cloudera Overview
The Leader in Open Source Data Management built on Apache Hadoop
The Leading Open Source Distribution of Apache Hadoop
Powerful Suite of Big Data Management Software
Enterprise Grade Security, Auditability, and Reliability
Founded: 2008. Employees: 500+
Customers: Over 50% of the Fortune 50 and 65% of the Fortune 500, plus top US intelligence and defense agencies. 80% market share of Hadoop distributions.
Partners: 800+ in hardware, software, and services.
Education: 15,000+ trained; includes developers, admins, analysts, data scientists.
Community: Founders and top supporters of the Hadoop open source ecosystem

Cloudera's Mission
Help Organizations Leverage the Power of All Their Data to Ask Bigger Questions.

Why is this Happening Now?

It Isn't All About Size
10TB to 10PB
IT'S ALL (BIG) DATA

And It Isn't Just About Web 2.0 / Social
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
CONSUMER PACKAGED GOODS: Sentiment analysis of what's hot, customer service
FINANCIAL SERVICES: Risk & portfolio analysis; New products
EDUCATION & RESEARCH: Experiment sensor analysis
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
LIFE SCIENCES: Clinical trials; Genomics
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
OIL & GAS: Drilling exploration sensor analysis
RETAIL: Consumer sentiment; Optimized marketing
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
UTILITIES: Smart Meter analysis for network capacity
LAW ENFORCEMENT & DEFENSE: Threat analysis - social media monitoring, photo analysis

Legacy Information Architecture is a Mess
Thousands of Employees & Inaccessible Information
Issues: Limited Scale; Limited Agility; Limited History; Limited Visibility
Systems shown: EDWs, Marts, Servers, Document Stores, Storage, Search, Data Archives
Silos of Multi-Structured Data
ERP, CRM, RDBMS, Machines
Files, Images, Video, Logs, Clickstreams
External Data Sources

Enterprise Data Hub is the Solution
1 Active Archive: Full fidelity original data; Indefinite time, any source; Lowest cost storage
2 Data Mgmt & Transformations: One source of data for all analytics; Persisted state of transformed data; Significantly faster & cheaper
3 Self-service Exploratory BI: Simple search + BI tools; Schema on read agility; Reduce BI user backlog requests
4 Multi-workload Data Platform: Bring applications to data; Combine different workloads on common data (i.e. SQL + Search); True BI agility
Systems shown: Servers, EDH, Marts, Storage, Archives, EDWs, Documents, Search
ERP, CRM, RDBMS, Machines
Files, Images, Video, Logs, Clickstreams
External Data Sources
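The "schema on read" item on this slide can be illustrated with a minimal Python sketch. It is not Cloudera's implementation and uses no Hadoop API; the sample records, the field names, and the helper read_with_schema are all assumptions made for illustration. The idea shown is only that raw records are kept at full fidelity and each consumer applies its own schema when it reads them.

```python
import json

# Raw events stored exactly as received (full fidelity, no upfront schema).
# The records and field names are hypothetical sample data.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.90", "channel": "web"}',
    '{"user": "b2", "amount": "5", "device": "mobile"}',
    '{"user": "c3"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time (hypothetical helper).

    schema maps field name -> (converter, default). Fields missing from a
    record take the default; fields not named in the schema are ignored.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            row[field] = convert(record[field]) if field in record else default
        rows.append(row)
    return rows


# Two consumers read the same raw data through different schemas.
billing_schema = {"user": (str, None), "amount": (float, 0.0)}
channel_schema = {"user": (str, None), "channel": (str, "unknown")}

billing_rows = read_with_schema(RAW_EVENTS, billing_schema)
channel_rows = read_with_schema(RAW_EVENTS, channel_schema)
total_amount = sum(row["amount"] for row in billing_rows)
```

Because nothing is discarded at write time, a new question only needs a new schema dictionary rather than a reload of the data, which is the agility the slide refers to.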

The Enterprise Data Hub
Components shown: Online NoSQL DBMS; Analytic MPP DBMS; Search Engine; Batch Processing; Stream Processing; Machine Learning; SQL; Streaming; File System (NFS)
Resource Management
Unified Scale-out Storage For Any Type of Data: Elastic, Fault-tolerant, Self-healing, In-memory capabilities
Metadata, Security, Audit, Lineage
System Management; Data Management
Key Attributes:
1. Secure & Compliant: Robust access controls; Data encryption options; Shared security policies
2. Enterprise Data Governance: Metadata management; Data lineage/tethering; Audit histories
3. Unified & Manageable: Common storage & resource management; On-prem, cloud & managed service; Highly available (including DR)
4. Open Architecture: Open source platform; APIs & engines for multiple workloads; Extensible for 3rd parties

Data Warehouse vs. Data Hub
Enterprise Data Warehouse
Enterprise Data Hub

The Modern Information Architecture
Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users
META DATA / ETL TOOLS, CLOUDERA MANAGER, DEVELOPER TOOLS, DATA MODELING, BI / ANALYTICS, ENTERPRISE REPORTING
ENTERPRISE DATA HUB, ENTERPRISE DATA WAREHOUSE, ONLINE SERVING SYSTEM
SYS LOGS, WEB LOGS, FILES, RDBMS
WEB/MOBILE APPLICATION
Customers & End Users

Customer Journey to Achieve Full Potential
Operational Efficiency, Information Advantage
Cheap Storage, ETL Acceleration, EDW Optimization, Exploration, Data Science, Consolidation, 360 View
IT, Business

Other Starting Use Cases for the EDH
Market Basket Analysis, Fraud Detection, Log Processing, Predictive Maintenance, Risk Management
Innovation and Advantage: Ask bigger questions in the pursuit of discovering something incredible
Operational Efficiency: Perform existing workloads faster, cheaper, better
ETL Acceleration, Active Archive, EDW Optimization, Deep Exploratory BI, Historical Compliance

Conclusion: EDH Allows You To
Active Archive: Retain Option Value of Data
Accelerate ETL Transformations
Enable Exploration/Agility
Consolidate Silos
Achieve True 360 View of Customers and Products.

The Future Is Information Driven. Start Now.