Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing




Wayne W. Eckerson
Director of Research, TechTarget
Founder, BI Leadership Forum
Business Analytics TechTarget

What comes next?
- Kilobyte (KB): 10^3 bytes
- Megabyte (MB): 10^6 bytes
- Gigabyte (GB): 10^9 bytes
- Terabyte (TB): 10^12 bytes
- Petabyte (PB): 10^15 bytes
- Exabyte (EB): 10^18 bytes
- Zettabyte (ZB): 10^21 bytes
- Yottabyte (YB): 10^24 bytes
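
As a quick illustration of the decimal scale above, the following minimal sketch converts a raw byte count into the largest unit that fits. The function name `humanize_bytes` is hypothetical, and the sketch assumes the power-of-1000 prefixes listed on the slide rather than binary (power-of-1024) ones.

```python
# Decimal byte units from the slide: each step up is a factor of 10**3.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def humanize_bytes(n: int) -> str:
    """Express a byte count in the largest decimal unit that keeps the value >= 1."""
    if n < 0:
        raise ValueError("byte count must be non-negative")
    index = 0
    # Integer comparisons avoid floating-point drift at the unit boundaries.
    while index < len(UNITS) - 1 and n >= 1000 ** (index + 1):
        index += 1
    return f"{n / 1000 ** index:g} {UNITS[index]}"
```

For example, `humanize_bytes(10**15)` returns `"1 PB"`, matching the petabyte row above.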

Information explosion
Every 18 months, non-rich structured and unstructured enterprise data doubles.
[Chart: enterprise data volume by year, 2005 to 2012; series: Unstructured & Content Depot, Structured & Replicated]
Source: IDC Digital Universe 2009; White Paper, Sponsored by EMC, May 2009
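
To make the doubling claim concrete, here is a minimal sketch of the growth factor it implies. It assumes steady exponential growth, which the slide does not state explicitly, and the function name `growth_factor` is hypothetical.

```python
def growth_factor(months: float, doubling_period_months: float = 18.0) -> float:
    """Multiplicative growth after `months`, assuming steady exponential doubling."""
    return 2.0 ** (months / doubling_period_months)
```

Under that assumption, the 84 months spanned by the chart's axis (2005 to 2012) would correspond to `growth_factor(84)`, roughly a 25-fold increase. This is an extrapolation from the doubling rule, not a figure reported by IDC.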

Data deluge
Structured data
- Call detail records
- Point of sale records
- Claims data
Semi-structured data
- Web logs
- Sensor data
- Email, Twitter
Unstructured data
- Video, audio
- Images, text
Source: "A Sea of Sensors," The Economist, Nov 4, 2010

Three Big Data revolutions
- Data warehousing (1995+)
- Analytical platforms (2005+)
- Hadoop ecosystem (2010+)

First revolution: data warehousing
[Diagram: Operational systems -> ETL -> Data warehouse -> ETL -> Data mart -> BI server -> Reports / dashboards]
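
The flow in this diagram can be sketched in a few lines. This is a minimal illustration using Python's built-in `sqlite3` module with an in-memory database. The sample rows, table names, and function names are all hypothetical, and a real warehouse would use dedicated ETL tooling rather than hand-written functions.

```python
import sqlite3

# Hypothetical rows as they might arrive from two operational systems.
ORDERS_SYSTEM_A = [("2012-04-01", "east", "19.99"), ("2012-04-01", "west", "5.00")]
ORDERS_SYSTEM_B = [("2012-04-02", "EAST", "10.01")]


def extract():
    """Extract: pull raw rows from each operational source."""
    return ORDERS_SYSTEM_A + ORDERS_SYSTEM_B


def transform(rows):
    """Transform: normalize region names and convert amounts to integer cents."""
    return [(day, region.lower(), round(float(amount) * 100))
            for day, region, amount in rows]


def load(conn, rows):
    """Load: write the conformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS warehouse_sales (day TEXT, region TEXT, cents INTEGER)"
    )
    conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?)", rows)


def build_mart(conn):
    """Second ETL hop: derive a departmental mart as a per-region summary."""
    conn.execute("DROP TABLE IF EXISTS mart_sales_by_region")
    conn.execute(
        "CREATE TABLE mart_sales_by_region AS "
        "SELECT region, SUM(cents) AS total_cents FROM warehouse_sales GROUP BY region"
    )


def run_pipeline():
    """Run both hops, then issue the kind of query a BI server might send."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    build_mart(conn)
    return conn.execute(
        "SELECT region, total_cents FROM mart_sales_by_region ORDER BY region"
    ).fetchall()
```

The two hops mirror the two ETL arrows in the diagram: one into the warehouse, and one from the warehouse into the mart that reports read from.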

Second revolution: analytical platforms
Vendors: 1010data, Aster Data (Teradata), Calpont, DATAllegro (Microsoft), Exasol, Greenplum (EMC), IBM SmartAnalytics, Infobright, Kognitio, Netezza (IBM), Oracle Exadata, ParAccel, Pervasive, Sand Technology, SAP HANA, Sybase IQ (SAP), Teradata, Vertica (HP)
Definition: purpose-built database management systems designed explicitly for query processing and analysis that provide dramatically higher price/performance and availability compared to general-purpose solutions.
Deployment options
- Software only (ParAccel, Vertica)
- Appliance (SAP, Exadata, Netezza)
- Hosted (1010data, Kognitio)

Game-changing technology
Purpose built
- For analytics in general
- For specific analytic workloads
Quicker to deploy
- Preconfigured and tuned
- Fast ROI
Faster and more scalable
- Faster query response times
- Linear performance
Built-in analytics
- Libraries of functions
- Extensible SDK
Less costly
- Less power, cooling, space
- Fewer people to maintain

Business value of analytic platforms
- Kelley Blue Book: consolidates millions of auto transactions each week to calculate car valuations
- AT&T Mobility: tracks purchasing patterns for 80M customers daily to optimize targeted marketing
- CBS Interactive: analyzes Web visitor behavior to optimize content/ad placement and revenue
Platform types shown: analytical appliance; MPP analytical database; Hadoop + analytical database

Third revolution: Hadoop
- Open source projects
- Hosted by the Apache Software Foundation
- Initially developed by Google, Yahoo, etc.
- Offers scale-out architecture on commodity servers with direct-attached storage

Hadoop distilled
Key terms: data scientist, open source, $$, unstructured data, big data, MapReduce, distributed file system, schema at read
Benefits
- Any data
- Agile
- Expressive
- Affordable
Drawbacks
- Immature
- Batch oriented
- Security, concurrency, metadata, etc.
- Expertise
- TCO?
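
Two of the terms on this slide, MapReduce and schema at read, can be illustrated together. The following is a single-process sketch in plain Python, not Hadoop's actual API. The log format and all function names are hypothetical. The mapper imposes structure only when it reads each raw line, and the shuffle and reduce steps mimic what the framework would do across a cluster.

```python
from collections import defaultdict

# Raw log lines stored as-is; no schema is enforced when they are written.
RAW_LOG_LINES = [
    "2012-04-01 GET /home 200",
    "2012-04-01 GET /cart 500",
    "2012-04-02 GET /home 200",
    "malformed line",
]


def map_phase(line):
    """Mapper: apply structure at read time and emit (key, value) pairs."""
    fields = line.split()
    if len(fields) != 4:
        return  # lines that do not fit the expected shape are skipped at read time
    _day, _method, path, _status = fields
    yield path, 1


def shuffle(pairs):
    """Shuffle: group all values by key between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `run_job(RAW_LOG_LINES)` counts hits per path. The malformed line is stored without complaint and only dropped when the mapper tries to interpret it, which is the trade-off that schema at read implies.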

Hadoop hype
Overheard:
- Hadoop will replace relational databases.
- Hadoop will replace data warehouses.
- Hadoop has a superior query engine compared to analytical platforms.
- Use Hadoop for any application that requires more than one node.
[Figure: Gartner Group Hype Cycle]

Hadoop adoption rates
- No plans: 38%
- Considering: 32%
- Experimenting: 20%
- Implementing: 5%
- In production: 4%
Based on 158 respondents, BI Leadership Forum, April 2012

Hadoop workloads (today / in 18 months)
- Staging area: 83% / 92%
- Online archive: 92% / 92%
- Transformation engine: 92% / 92%
- Ad hoc queries: 58% / 67%
- Scheduled reports: 42% / 67%
- Visual exploration: 25% / 67%
- Data mining: 58% / 83%
Based on respondents that have implemented Hadoop. BI Leadership Forum, April 2012
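
One way to read these survey figures is as expected growth in percentage points per workload. The sketch below computes that ranking. The function name is hypothetical, and the values for the first three workloads assume the slide's flattened chart reads in the same today/later order as the other rows.

```python
# (today %, in 18 months %) per workload, as read from the slide.
WORKLOADS = {
    "Staging area": (83, 92),
    "Online archive": (92, 92),
    "Transformation engine": (92, 92),
    "Ad hoc queries": (58, 67),
    "Scheduled reports": (42, 67),
    "Visual exploration": (25, 67),
    "Data mining": (58, 83),
}


def expected_growth(workloads):
    """Return (workload, percentage-point increase) pairs, largest increase first."""
    deltas = {name: later - today for name, (today, later) in workloads.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)
```

On these numbers, the analytic workloads (visual exploration, scheduled reports, data mining) show the largest expected increases, while the data-management workloads are already near saturation among respondents.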

Hadoop's impact on the data warehouse
- Replaces it: 0%
- Offloads existing workloads: 50%
- Handles new workloads: 67%
- Shares existing workloads: 33%
- Shares new workloads: 25%
- Don't know: 8%
Based on respondents that have implemented Hadoop. BI Leadership Forum, April 2012

BI Framework 2020
[Diagram: four intelligence domains, each described by end-user tools, design framework, and architecture]
Business Intelligence
- Reports and dashboards
- MAD dashboards
- Data warehousing
Analytics Intelligence
- Exploration by power users
- Ad hoc query, spreadsheets, ad hoc SQL, OLAP, visual analysis, analytic workbenches, Hadoop
- Excel, Access, OLAP, data mining, visual exploration
- Analytic sandboxes
Continuous Intelligence
- Event-driven alerts and dashboards
- Event detection and correlation
- CEP, streams
Content Intelligence
- Keyword search, BI tools, XQuery, Hive, Java, etc.
- MapReduce, XML schema, key-value pairs, graph notation, etc.
- HDFS, NoSQL databases

BI Framework
TOP DOWN: Business Intelligence
- Driven by corporate objectives and strategy
- Reporting & monitoring (casual users)
- Predefined metrics
- Non-volatile data
- Data warehousing architecture (schema heavy)
- Pros: alignment, consistency
- Cons: hard to build, politically charged, hard to change, expensive
Bottom up: Analytics
- Driven by processes and projects
- Analysis and prediction (power users)
- Ad hoc queries
- Volatile data
- Analytics architecture (schema light)
- Pros: quick to build, politically uncharged, easy to change, low cost
- Cons: alignment, consistency
Reports beget analysis; analysis begets reports.

The new analytical ecosystem
[Diagram components]
- Data sources: operational systems (structured data), machine data, web data, audio/video data, external data, documents & text
- Data movement: extract, transform, load (batch, near real-time, or real-time); streaming/CEP engine
- Stores and engines: Hadoop cluster, data warehouse, departmental data mart, BI server, analytic platform or nonrelational database
- Sandboxes: virtual sandboxes, in-memory sandbox, free-standing sandbox
- Architectures: top-down, bottom-up
- Users: casual user, power user
Source: www.bileadership.com

Analytical sandboxes
[Diagram: same analytical ecosystem diagram as the previous slide]
Sandbox types shown:
- Virtual sandboxes
- In-memory sandbox
- Free-standing sandbox
Source: www.bileadership.com

Recommendations
- Your BI architecture is now an analytical ecosystem
- Deploy analytical platforms to turbo-charge performance
- Explore Hadoop for big data
- Reconcile top-down and bottom-up BI environments

Questions?
Wayne Eckerson, weckerson@techtarget.com

Hadoop ecosystem
[Figure courtesy of Hortonworks, 2012]