BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP

Similar documents
Big Data Can Drive the Business and IT to Evolve and Adapt

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

The Future of Data Management

Apache Hadoop in the Enterprise. Dr. Amr Awadallah,

The Future of Data Management with Hadoop and the Enterprise Data Hub

NEWLY EMERGING BEST PRACTICES FOR BIG DATA

The Enterprise Data Hub and The Modern Information Architecture

Luncheon Webinar Series May 13, 2013

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

HDP Hadoop From concept to deployment.

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Getting Started Practical Input For Your Roadmap

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

How the oil and gas industry can gain value from Big Data?

VIEWPOINT. High Performance Analytics. Industry Context and Trends

Are You Ready for Big Data?

Modern Data Architecture for Predictive Analytics

Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO. Big Data Everywhere Conference, NYC November 2015

Big Data and Trusted Information

HDP Enabling the Modern Data Architecture

Decoding the Big Data Deluge a Virtual Approach. Dan Luongo, Global Lead, Field Solution Engineering Data Virtualization Business Unit, Cisco

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Are You Ready for Big Data?

Using Tableau Software with Hortonworks Data Platform

Information Architecture

Extend your analytic capabilities with SAP Predictive Analysis

Putting Apache Kafka to Use!

Oracle Big Data SQL Technical Update

How To Handle Big Data With A Data Scientist

Business Intelligence for Big Data

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Ganzheitliches Datenmanagement

How to Enhance Traditional BI Architecture to Leverage Big Data

SAP and Hortonworks Reference Architecture

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Bringing the Power of SAS to Hadoop. White Paper

Big Data & Analytics. The. Deal. About. Jacob Büchler jbuechler@dk.ibm.com Cand. Polit. IBM Denmark, Solution Exec IBM Corporation

The Internet of Things and Big Data: Intro

Data Virtualization A Potential Antidote for Big Data Growing Pains

BEYOND BI: Big Data Analytic Use Cases

IBM Big Data Platform

Testing Big data is one of the biggest

Traditional BI vs. Business Data Lake A comparison

Cisco IT Hadoop Journey

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

NoSQL for SQL Professionals William McKnight

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

Managing Data in Motion

Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

Big Data Analytics Nokia

BIG DATA What it is and how to use?

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

The 4 Pillars of Technosoft s Big Data Practice

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

A New Era Of Analytic

Information Builders Mission & Value Proposition

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

Real Time Big Data Processing

Big Data: Making Sense of it all!

Beyond Lambda - how to get from logical to physical. Artur Borycki, Director International Technology & Innovations

Talend Big Data. Delivering instant value from all your data. Talend

Big Data Analytics Best Practices

Driving Growth in Insurance With a Big Data Architecture

Big Data Course Highlights

A Case Study of Hadoop in Healthcare

Modernizing Your Data Warehouse for Hadoop

Oracle Big Data Spatial & Graph Social Network Analysis - Case Study

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

Oracle Big Data Essentials

Big Data and Apache Hadoop Adoption:

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

More Data in Less Time

Big Data for the Rest of Us Technical White Paper

Big Data Realities Hadoop in the Enterprise Architecture

Where is... How do I get to...

Exploiting Data at Rest and Data in Motion with a Big Data Platform

How To Use Big Data For Business

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Making Sense of Big Data in Insurance

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

Data Virtualization and ETL. Denodo Technologies Architecture Brief

Deploying an Operational Data Store Designed for Big Data

Interactive data analytics drive insights

An Oracle White Paper October Oracle: Big Data for the Enterprise

Transcription:

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP Business Analytics for All Amsterdam - 2015

Value of Big Data is Being Recognized Executives beginning to see the path from data insights to revenue and profit Big data often illuminates behavior Analytic sandbox results taken directly to management Data is becoming an asset on the balance sheet Valued based on Cost to produce the data Cost to replace the data if it is lost Revenue or profit opportunity provided by the data Revenue or profit loss if data falls into competitors hands Legal exposure from fines and lawsuits if data is exposed to the wrong parties

Big Data Analytic Use Cases Behavior tracking Search ranking Ad tracking Location and proximity tracking Causal factor discovery Social Media Share of voice, audience engagement, conversation reach, active advocates, advocate influence, advocacy impact, resolution rate, resolution time, satisfaction score, topic trends, sentiment ratio, and idea impact Financial account fraud detection/intervention System hacking detection/intervention On line game gesture tracking

More Sample Big Data Use Cases Non-numeric data and unique algorithms Document similarity testing Genomics analysis Cohort group discovery Satellite image comparison CAT scan comparisons Big science data collection Complex numeric data Smart utility meters Building sensors In flight aircraft status

The Challenge Legacy RDBMS s have been hugely successful And we will continue to use them The traditional pure relational data warehouse architecture can t handle some of the use cases New strategic data types Unstructured, semi-structured, machine data Evolving, just-in-time schema Links, images, geo-positions, genomes, log files

The Larger Picture Why Hadoop as Part of EDW? EDW mission unchanged: Publish the Right Data Tactical Lowered operational costs Linear scalability Reliable, redundantly store data archive Improve ETL SLA s Strategic Support variety of new data sources New analytics near impossible in RDBMS Schema on read for exploratory analysis Archive - Keep granular data forever

Hadoop Data Warehouse Configurations Hadoop as an ETL engine Text and numbers loaded into a conventional RDBMS Leverage Hadoop and data lake/data hub to feed EDW Extended RDBMS environment Extend the current formidable RDBMS legacy Add features/functions to address big data analytics Hadoop as a destination environment Direct querying of text and numbers and other data types with augmented SQL in Hive, Impala or equivalent Simultaneous co-existence of same data sets used by NoSQL analytic applications Active archive of all data, forever Complementary analytic sandbox Stand up Hadoop environment next to EDW environment

Extended Relational Data Warehouse with Big Data Additions

Hadoop Can Support a Multi-Modal Destination EDW: Enterprise Data Hub Hadoop Distributed File System HDFS Sources: Transactions Free text Images Machines/ Sensors Links/ Networks Commodity hardware Profoundly distributed Scalable to infinity PB+ Agnostic content Implemented in standard Java High speed streaming in/out Fault tolerant Write once, always replicated HDFS is NOT a database, it is a file system Flume, Sqoop, ETL Tools: Files: Metadata: HCatalog SQL Apps: Hive Impala BI Tools: Tableau, Cognos, Bus Obj,

10 Complementary Analytic Sandbox Credit: Bill Schmarzo, EMC, https://infocus.emc.com/william_schmarzo/schmarzos-2015-bigdata-predictions/

Implement Caches of Increasing Latency Each cache supports different class of applications Data quality improves left to right High value facts and dimensions pushed right to left

Focus on the Integration Challenge Data warehouse integration is drilling across: Establish conformed attributes (e.g., Customer Category) in each database Fetch separate answer sets from different platforms grouped on the conformed attributes Sort-merge the answer sets at the BI layer

Comparing the Two Architectures Relational DBMSs Proprietary, mostly Open source Expensive Less expensive MapReduce/Hadoop Data requires structuring Data does not require structuring Great for speedy indexed lookups Great for massive full data scans Deep support for relational Indirect support for relational semantics semantics, e.g. Hive Indirect support for Deep support for complex data structures complex data structures Indirect support for iteration, Deep support for iteration, complex branching complex branching Deep support for Little or no support for transaction processing transaction processing

Whither the Data Warehouse? Corral and embrace the big data analysts Analysts and IT must meet each other half way Insist on using shared data warehouse resources Conformed dimensions - integrate now, avoid data silos! Virtualized data sets - quick prototypes, migrate to prod. Build a cross department analytic community with IT partnership Big Data and the EDW Hadoop becomes partnered with the EDW Hadoop to support new data types and analytics BI tools must step up to support the final integration

End