Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Similar documents
The Future of Data Management

The Enterprise Data Hub and The Modern Information Architecture

The Future of Data Management with Hadoop and the Enterprise Data Hub

Hadoop Trends and Practical Use Cases. April 2014

More Data in Less Time

HDP Hadoop From concept to deployment.

HDP Enabling the Modern Data Architecture

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

Deploying an Operational Data Store Designed for Big Data

Information Builders Mission & Value Proposition

Apache Hadoop in the Enterprise. Dr. Amr Awadallah,

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Cloudera Enterprise Data Hub in Telecom:

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Ganzheitliches Datenmanagement

Luncheon Webinar Series May 13, 2013

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer,

Harnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization

Traditional BI vs. Business Data Lake A comparison

Getting Started Practical Input For Your Roadmap

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

How To Use Big Data For Business

Accelerate your Big Data Strategy. Execute faster with Capgemini and Cloudera s Enterprise Data Hub Accelerator

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop

Architecting for the Internet of Things & Big Data

Real Time Big Data Processing

Interactive data analytics drive insights

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES

Driving Growth in Insurance With a Big Data Architecture

THE JOURNEY TO A DATA LAKE

A Modern Data Architecture with Apache Hadoop

Data Integration Checklist

Data Refinery with Big Data Aspects

Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC,

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera

Build Your Competitive Edge in Big Data with Cisco. Rick Speyer Senior Global Marketing Manager Big Data Cisco Systems 6/25/2015

INTELLIGENT BUSINESS STRATEGIES WHITE PAPER

How the oil and gas industry can gain value from Big Data?

Databricks. A Primer

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Databricks. A Primer

Roadmap Talend : découvrez les futures fonctionnalités de Talend

The Principles of the Business Data Lake

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Talend Big Data. Delivering instant value from all your data. Talend

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP

Big Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA

Optimized for the Industrial Internet: GE s Industrial Data Lake Platform

Leveraging Machine Data to Deliver New Insights for Business Analytics

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Oracle Database 12c Plug In. Switch On. Get SMART.

Exploiting Data at Rest and Data in Motion with a Big Data Platform

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Self-service BI for big data applications using Apache Drill

Protecting Big Data Data Protection Solutions for the Business Data Lake

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

Oracle Big Data Building A Big Data Management System

Are You Big Data Ready?

Investor Presentation. Second Quarter 2015

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Apache Hadoop's Role in Your Big Data Architecture

Apache Hadoop: The Big Data Refinery

Accelerate Data Loading for Big Data Analytics Attunity Click-2-Load for HP Vertica

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

How Big Is Big Data Adoption? Survey Results. Survey Results Big Data Company Strategy... 6

Open Source in Financial Services: Meet the challenges of new business models and disruption

Native Connectivity to Big Data Sources in MSTR 10

How to avoid building a data swamp

The Next Wave of Data Management. Is Big Data The New Normal?

SAP and Hortonworks Reference Architecture

Il mondo dei DB Cambia : Tecnologie e opportunita`

Integrating a Big Data Platform into Government:

Upcoming Announcements

A TECHNICAL WHITE PAPER ATTUNITY VISIBILITY

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

Making Sense of Big Data in Insurance

Artur Borycki. Director International Solutions Marketing

Hadoop & Spark Using Amazon EMR

Financial, Telco, Retail, & Manufacturing: Hadoop Business Services for Industries

Data Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here

Transcription:

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera

Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees >800 Global 24x7 Support Follow-the-sun capability; Pro-active & Predictive Support Programs Dedicated Support Engineers; Support Centers in NA, Europe & Asia Professional Services World class services delivery teams worldwide Mission Critical Thousands of enterprise customers rely on Cloudera 50% of the Fortune 50; 65% of the Fortune 500 Top Defense & Intelligence Agencies The Largest Ecosystem Over 1,300 Members of our Partner Program, ClouderaConnect Cloudera University Over 40,000 people trained around the world Open Source Leaders Cloudera employees are founders of most of the Apache Hadoop ecosystem projects, and leading contributors to all of them 2014 Cloudera, Inc. All rights reserved Not for Distribution. 2

Data drives innovation Internet of Things INTELLIGENT CLOUD Richer data to analyze 40 Zettabytes of data will be generated WW in 2020 1 Richer user experiences INTELLIGENT THINGS SMART CLIENTS 1 IDG 2.8 Zettabytes of data generated WW in 2012 1 Sources: (1) IDC Digital Universe 2020, (2) IDC Richer data from devices 3

Big Data is All Data and All Paradigms Transactional & Application Data Machine Data Social Data Enterprise Content Volume Velocity Variety Variety Structured Semi-structured Highly unstructured Highly unstructured Throughput Ingestion Veracity Volume 4

Big Data has outgrown the Database GB of Data (IN BILLIONS) Systems 10,000 1.8 trillion gigabytes of data was created in 2011* More than 80% is unstructured data Quantity doubles every 2 years UNSTRUCTURED DATA STRUCTURED DATA 0 2005 2015 2010 * Source: IDC 2011 5

Hadoop is the Platform for Big Data Open Source Software Store, Process, Analyze Data Large Data Sets, stored as raw files Structured, unstructured, semi-structured all data types 2014 Cloudera, Inc. All rights reserved. 6

Internet of Things is driving Big Data Market leader in silicon Long & successful history of investment and collaboration with software platforms: Linux, VMware Global reach; market leading Hadoop distribution in China Market leader in Hadoop Largest base of paid customers & free users Consistently delivering industry-leading capabilities around Apache Hadoop Laser focus on delivering Enterprisegrade Hadoop capabilities and innovation of Apache 7

Expanding Big Data Requires a New Approach 1980s Bring Data to Compute Now Bring Compute to Data Compute Compute Compute Data Data Process-centric businesses use: Structured data mainly Internal data only Important data only Relative size & complexity Data Compute Data Information-centric businesses use all data: Multi-structured, internal & external data of all types Compute Data Compute 8

Two Categories of Hadoop Use Cases Business Intelligence Advanced Analytics Applications Innovation and Advantage Ask Bigger Questions: Gain value from all your data Data Processing: ETL and/or ELT offload Data Storage: Active Archive ( data lake ) Most Companies Start Here Operational Efficiency Perform existing workloads faster, cheaper, better 2014 Cloudera, Inc. All rights reserved. 10

The Old Way: Moving Data to Compute Huge Investment in Specialized Systems that Treat Data as a Commodity Major Challenges Missing Data Leaving data behind Risk and compliance High cost of storage Time to Data Up-front modeling Transforms slow Transforms lose data Cost of Analytics Existing systems strained No agility BI backlog Complex Architecture Many special-purpose systems Moving data around No complete views EDWS MARTS SERVERS DOCUMENTS STORAGE SEARCH ARCHIVE ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES 2014 Cloudera, Inc. All rights reserved. 11

The Old Way: Siloed Business Functions Lack of Coordination Increases Opportunity Costs and Decreases Data Availability Major Challenges Poor Visibility Inefficiency Extreme Cost Complexity MARKETING RISK TRANSACTIONAL LENDING CREDIT CARDS INVESTMENT BACK OFFICE TRANSACTIONS LOGS MARKET DATA RESEARCH CUSTOMER DATA 2014 Cloudera, Inc. All rights reserved. 12

Enterprise Data Hub adoption is a progression 4 Multi-workload analytic platform Bring applications to data Combine different workloads on common data (i.e. SQL + Search) True BI agility 4 3 3 2 Self-service exploratory BI Simple search + BI tools Schema on read agility Reduce BI user backlog requests Data management, transformations One source of data for all analytics Persisted state of transformed data Significantly faster & cheaper Servers Marts EDWs 2 1 Storage Archives Documents Search 1 1 Active archive Full fidelity original data Indefinite time, any source Lowest cost storage ERP, CRM, RDBMS, Machines Files, Images, Video, Logs, Clickstreams External Data Sources 13

The New Way: Bringing Compute to Data Maximize Benefit from All Your Data for Mission-Critical Jobs and Innovation Major Benefits Active Compliance Archive Full fidelity original data Indefinite time, any source Lowest cost storage Persistent Storage One source of data for all analytics Persist state of transformed data Significantly faster & cheaper Self-Service Exploratory BI Simple search + BI tools Schema on read agility Reduce BI user backlog requests Diverse Analytic Platform Bring applications to data Combine different workloads on common data (i.e. SQL + Search) True analytic agility SERVERS Data Marts EDW DOCUMENTS STORAGE SEARCH ARCHIVE ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES 2014 Cloudera, Inc. All rights reserved. 14

The New Way: Bring Business Functions to Data Consolidate Relevant Services and Data in Multi-tenant Environment Major Benefits Compliant Centralized Self-Service Multiple Workloads MARKETING BACK OFFICE RISK 360 o VIEW RESEARCH MARKET LOGS TRANSACTIONS CUSTOMER INVESTMENT TRANSACTIONAL LENDING CREDIT CARDS 2014 Cloudera, Inc. All rights reserved. 15

The Modern Information Architecture Data Architects System Operators Engineers Data Scientists Analysts Business Users META DATA / ETL TOOLS CLOUDERA MANAGER CONVERGED APPLICATIONS MACHINE LEARNING BI / ANALYTICS ENTERPRISE REPORTING ENTERPRISE DATA HUB ENTERPRISE DATA WAREHOUSE ONLINE SERVING SYSTEM SYS LOGS WEB LOGS FILES RDBMS WEB/MOBILE APPLICATIONS Customers & End Users 16

Enterprise Data Hub: Key Attributes Enterprise Data Hub 1. Secure & Compliant Robust access controls Data encryption options Shared security policies Online NoSQL DBMS Analytic MPP DBMS Search Engine Batch Processing Resource Management Stream Processing Scale-out Storage For Cloudera Enterprise Hadoop HDFS Machine Learning Metadata, Security, Audit, Lineage The Platform for Big Data System Management Data Management 2. Enterprise Data Governance Meta data management Data lineage/tethering Audit histories 3. Unified & manageable Common storage & resource management On-premise & cloud Highly available (including DR) 4. Open Architecture Open source platform APIs & engines for multiple workloads Extensible for 3 rd parties 17

Customer Example Insurer: Real-time Enterprise Data Hub Comprehensive Risk Analysis Structured and unstructured data includes customer data, economic trends, telematic sensors, weather, public data Integrated with mainframes and EDWs Before Hadoop, could analyze only one state, took 24 hours With Cloudera, can analyze risk across all 50 states, in 16 hours (500x improvement) Customer Data Cloudera Hadoop EDW and Mainframe First 3 use cases: Data hub, ETL offload, advanced analytics 2014 Cloudera, Inc. All rights reserved. 18

Customer Example Mobile Telecom Provider: ETL Engine From 1% sampling to 100% analysis Exponential growth in data, generated by new consumer devices ETL and storage constraints limited analytics to 1% sample Now combined data warehouse and Cloudera Hadoop delivers analytics on 100% of data (half a PB per day!) Query times reduced dramatically (i.e. from 4 days to 53 minutes) 90% reduction of ETL code base Teleco m Service s Teleco m Service s Filter & Split Filter & Split Before Event Monitoring Streaming ETL Streaming ETL After Event Monitoring Hadoop Archive Storage ETL Correlation Stage 1 DWH Alerting Complex Correlation Data Warehouse Archive Storage Alerting Data Warehouse 2014 Cloudera, Inc. All rights reserved. 19

MasterCard Cloudera: The first PCI-Certified Hadoop Platform Challenge: All applications, databases, or file systems that have the potential to handle personal account-related data must undergo full PCI certification Solution: MasterCard s Cloudera environment fully conforms to the PCI-DSS V 2.0 security standards so it can host PCI datasets and potentially integrate with other internal systems Data privacy and protection is a top priority for MasterCard. As we maximize the most advanced technologies from partners and vendors, they must meet the rigorous security standards we ve set. With Cloudera s commitment to the same standards, we now have additional options in how we manage our data center. Gary VonderHaar Chief Technology Officer, Architecture MasterCard 20

The Most Complete Partner Ecosystem Applications Operational Tools Data Systems Enterprise Data Hub Process Discove r Model Security and Administration Unlimited Storage Serve More than 1,300 partners ensure compatibility with existing investments, lower skill barriers, and help maximize value from your data. System Integration Infrastructure 21

Why Cloudera? We deliver long-term production success with enterprise Hadoop. Open Source Innovation No one knows Hadoop better than Cloudera. Cloudera leads development of enterprise Hadoop and offers the best support, training, and services. Powerful Enterprise Tools Cloudera extends open source Hadoop with capabilities required by the largest enterprises. Ecosystem Cloudera partners with industry leaders to ensure Hadoop works with the platforms, tools, and integrators our customers rely on. Enterprise Security Meet compliance requirements and reduce risk exposure from storing sensitive data. Data Governance Enable compliance and maximize analyst productivity. Complete Management Deliver optimum system utilization and meet SLA commitments, on-premises or in the cloud, with minimum effort. 22

Thank You Bernard Doering, Cloudera EMEA bdoering@cloudera.com Tel. + 49 172 692 9837

From Hadoop to an Enterprise Data Hub Open Source Scalable Flexible Cost- Effective CLOUDERA S ENTERPRISE DATA HUB BATCH PROCESSING MAPREDUCE ANALYTIC SQL IMPALA SEARCH ENGINE SOLR MACHINE LEARNING SPARK WORKLOAD MANAGEMENT YARN STREAM PROCESSING SPARK STREAMING 3 RD PARTY APPS DATA MANAGEMENT CLOUDERA NAVIGATOR Managed Open Architecture STORAGE FOR ANY TYPE OF DATA FILESYSTEM HDFS UNIFIED, ELASTIC, RESILIENT,, SECURE SENTRY ONLINE NOSQL HBASE SYSTEM MANAGEMENT CLOUDERA MANAGER Secure and Governed 2014 Cloudera, Inc. All rights reserved. 24