1 The Enterprise Data Hub and The Modern Information Architecture Dr. Amr Awadallah CTO & Co-Founder, Cloudera Cloudera, Inc. All rights reserved.
2 Cloudera Overview
The Leader in Open Source Data Management Built on Apache Hadoop
The leading open source distribution of Apache Hadoop
Powerful suite of big data management software
Enterprise-grade security, auditability, and reliability
Founded: 2008. Employees: 500+
Customers: Over 50% of the Fortune 50 and 65% of the Fortune 500, plus top US intelligence and defense agencies. 80% market share of Hadoop distributions.
Partners: 800+ in hardware, software, and services.
Education: 15,000+ trained, including developers, admins, analysts, and data scientists.
Community: Founders and top supporters of the Hadoop open source ecosystem.
3 Cloudera's Mission: Help Organizations Leverage the Power of All Their Data to Ask Bigger Questions
4 Why Is This Happening Now?
5 It Isn't All About Size: 10TB to 10PB, IT'S ALL (BIG) DATA
6 And It Isn't Just About Web 2.0 / Social
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
CONSUMER PACKAGED GOODS: Sentiment analysis of what's hot, customer service
FINANCIAL SERVICES: Risk & portfolio analysis; new products
EDUCATION & RESEARCH: Experiment sensor analysis
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; warranty analysis
LIFE SCIENCES: Clinical trials; genomics
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; quality of care
OIL & GAS: Drilling exploration sensor analysis
RETAIL: Consumer sentiment; optimized marketing
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; customer sentiment
UTILITIES: Smart meter analysis for network capacity
LAW ENFORCEMENT & DEFENSE: Threat analysis (social media monitoring, photo analysis)
7 Legacy Information Architecture Is a Mess
Thousands of Employees & Inaccessible Information
Issues: Limited Scale, Limited Agility, Limited History, Limited Visibility
Systems: EDWs, Marts, Servers, Document Stores, Storage, Search, Data Archives
Silos of Multi-Structured Data
Sources: ERP, CRM, RDBMS, Machines; Files, Images, Video, Logs, Clickstreams; External Data Sources
8 Enterprise Data Hub Is the Solution
4. Multi-workload Data Platform: Bring applications to data. Combine different workloads on common data (e.g. SQL + Search). True BI agility.
3. Self-service Exploratory BI: Simple search + BI tools. Schema on read agility. Reduce BI user backlog requests.
2. Data Mgmt & Transformations: One source of data for all analytics. Persisted state of transformed data. Significantly faster & cheaper.
1. Active Archive: Full fidelity original data. Indefinite time, any source. Lowest cost storage.
Systems: Servers, EDH, Marts, Storage, Archives, EDWs, Documents, Search
Sources: ERP, CRM, RDBMS, Machines; Files, Images, Video, Logs, Clickstreams; External Data Sources
2014 Cloudera, Inc. All Rights Reserved.
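Slide 8 names "schema on read agility" without showing what it looks like in practice. The sketch below is one possible illustration in plain Python, not Cloudera's implementation or any Hadoop API: raw records are kept exactly as they arrived (the "full fidelity original data" of the active archive), and each question applies its own schema only at read time. The sample records and the names `RAW_EVENTS`, `read_with_schema`, `total_by_user`, and `channels` are all hypothetical.

```python
import json

# Raw events stored untouched, as they arrived (hypothetical sample data).
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50", "channel": "web"}',
    '{"user": "b", "amount": "7", "extra": {"coupon": "X1"}}',
    '{"user": "a", "amount": "3.25", "channel": "mobile"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) while reading, not while storing.

    Fields missing from a record come back as None; fields not named in the
    schema are ignored for this read but remain in the raw data.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }


def total_by_user(raw_lines):
    """First question: spend per user, using a two-field schema."""
    schema = {"user": str, "amount": float}
    totals = {}
    for row in read_with_schema(raw_lines, schema):
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    return totals


def channels(raw_lines):
    """Second question, asked later: which channels appear? No reload needed."""
    schema = {"channel": str}
    rows = read_with_schema(raw_lines, schema)
    return sorted({row["channel"] for row in rows if row["channel"] is not None})


if __name__ == "__main__":
    print(total_by_user(RAW_EVENTS))
    print(channels(RAW_EVENTS))
```

The point of the design is that the second question did not require reloading or remodeling the data: only the schema passed at read time changed, which is the agility the slide contrasts with a backlog of BI change requests.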
9 The Enterprise Data Hub
Engines: Online NoSQL DBMS, Analytic MPP DBMS, Search Engine, Batch Processing, Stream Processing, Machine Learning
Access: SQL, Streaming, File System (NFS)
Resource Management
Unified Scale-out Storage for Any Type of Data: Elastic, Fault-tolerant, Self-healing, In-memory capabilities
Metadata, Security, Audit, Lineage
System Management, Data Management
Key Attributes:
1. Secure & Compliant: Robust access controls. Data encryption options. Shared security policies.
2. Enterprise Data Governance: Metadata management. Data lineage/tethering. Audit histories.
3. Unified & Manageable: Common storage & resource management. On-prem, cloud & managed service. Highly available (including DR).
4. Open Architecture: Open source platform. APIs & engines for multiple workloads. Extensible for third parties.
10 Data Warehouse vs. Data Hub: Enterprise Data Warehouse, Enterprise Data Hub
11 The Modern Information Architecture
Roles: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users
Tools: META DATA / ETL TOOLS, CLOUDERA MANAGER, DEVELOPER TOOLS, DATA MODELING, BI / ANALYTICS, ENTERPRISE REPORTING
Systems: ENTERPRISE DATA HUB, ENTERPRISE DATA WAREHOUSE, ONLINE SERVING SYSTEM
Sources: SYS LOGS, WEB LOGS, FILES, RDBMS
WEB/MOBILE APPLICATION: Customers & End Users
12 Customer Journey to Achieve Full Potential
From Operational Efficiency to Information Advantage
Stages: Cheap Storage, ETL Acceleration, EDW Optimization, Exploration, Data Science, Consolidation, 360° View
Drivers: IT, Business
13 Other Starting Use Cases for the EDH
Market Basket Analysis, Fraud Detection, Log Processing, Predictive Maintenance, Risk Management
Innovation and Advantage: Ask bigger questions in the pursuit of discovering something incredible
Operational Efficiency: Perform existing workloads faster, cheaper, better
ETL Acceleration, Active Archive, EDW Optimization, Deep Exploratory BI, Historical Compliance
14 Conclusion: EDH Allows You To
Active Archive
Retain Option Value of Data
Accelerate ETL Transformations
Enable Exploration/Agility
Consolidate Silos
Achieve True 360° View of Customers and Products
15 The Future Is Information Driven. Start Now