P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland



Similar documents
Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

The Inside Scoop on Hadoop

BIG DATA APPLIANCES. July 23, TDWI. R Sathyanarayana. Enterprise Information Management & Analytics Practice EMC Consulting

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

How To Scale Out Of A Nosql Database

IBM Big Data in Government

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Information Architecture

Big Data Technologies Compared June 2014

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

TECHNOLOGY TRANSFER PRESENTS MIKE FERGUSON BIG DATA MULTI-PLATFORM JUNE 25-27, 2014 RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231 ROME (ITALY)

Sunnie Chung. Cleveland State University

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Raul F. Chong Senior program manager Big data, DB2, and Cloud IM Cloud Computing Center of Competence - IBM Toronto Lab, Canada

ANALYTICS CENTER LEARNING PROGRAM

So What s the Big Deal?

UNIFY YOUR (BIG) DATA

IBM BigInsights for Apache Hadoop

The 4 Pillars of Technosoft s Big Data Practice

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Consulting and Systems Integration (1) Networks & Cloud Integration Engineer

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

How To Create A Data Science System

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Survey of Big Data Architecture and Framework from the Industry

Microsoft Big Data. Solution Brief

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Big Data and Analytics in Government

How To Handle Big Data With A Data Scientist

Big Data and Data Science: Behind the Buzz Words

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Driving Peak Performance IBM Corporation

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Business Intelligence for Big Data

Cost-Effective Business Intelligence with Red Hat and Open Source

NoSQL for SQL Professionals William McKnight

Marketing Analytics Technology Overview

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

White Paper: Datameer s User-Focused Big Data Solutions

Architecting for the Internet of Things & Big Data

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Big Data. Lyle Ungar, University of Pennsylvania

Ganzheitliches Datenmanagement

Tap into Hadoop and Other No SQL Sources

Protecting Big Data Data Protection Solutions for the Business Data Lake

The Future of Data Management

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Big Data Multi-Platform Analytics (Hadoop, NoSQL, Graph, Analytical Database)

Big Data Zurich, November 23. September 2011

MINIMIZING THE FIVE LATENCIES IN

How To Use Hp Vertica Ondemand

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April

BIG Data Analytics Move to Competitive Advantage

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

Step by Step: Big Data Technology. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Introducing Oracle Exalytics In-Memory Machine

Big Data Explained. An introduction to Big Data Science.

Big Data at Cloud Scale

More Data in Less Time

Big Data Defined Introducing DataStack 3.0

Data Refinery with Big Data Aspects

Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics

How To Use Big Data For Business

IBM InfoSphere BigInsights Enterprise Edition

Poslovni slučajevi upotrebe IBM Netezze

INVESTOR PRESENTATION. First Quarter 2014

Performance and Scalability Overview

Teradata s Big Data Technology Strategy & Roadmap

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Big Data Big Data/Data Analytics & Software Development

Real Time Big Data Processing

Hadoop Big Data for Processing Data and Performing Workload

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

GridGain In- Memory Data Fabric: UlCmate Speed and Scale for TransacCons and AnalyCcs

Enabling Big Data with Cloud. Go faster Reduce risk Scale as you grow Avoid mistakes

Disrupting The Market: Predictive Analytics As A Service

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Luncheon Webinar Series May 13, 2013

The Rembrandt Group Strategies for BIG DATA

Investor Presentation. Second Quarter 2015

Accelerating and Simplifying Apache

ADVANCED ANALYTICS AND FRAUD DETECTION THE RIGHT TECHNOLOGY FOR NOW AND THE FUTURE

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Analyzing Big Data with AWS

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Big Data Analytics Nokia

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Report Data Management in the Cloud: Limitations and Opportunities

Transcription:

P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland IBM Center of Excellence for Data Science, Cognitive Systems and BigData (A joint-venture between IBM Research Zurich and IBM Innovation Center DACH)

Agenda Motivation Use Cases How Databases scale Evolution of Large Scale Data Processing Requirements and Ingredients Architectural Proposal

Motivation the World before 2000.coms MySQL, Postgres Traditional Enterprises DB2, Oracle, Teradata

Motivation the World after 2000.coms NoSQL Traditional Enterprises DB2, Oracle, Teradata

Use Cases Basic Idea Use ALL available data independently weather it is

Use Cases Basic Idea Use ALL available data independently weather it is Inside

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured, semi-structured

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured, semi-structured, unstructured

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured, semi-structured, unstructured or binary

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured, semi-structured, unstructured or binary At rest

Use Cases Basic Idea Use ALL available data independently weather it is Inside or outside your company Structured, semi-structured, unstructured or binary At rest or in motion

How do Databases Scale? Scale-out because Scale-up not possible Why? There is a sweet spot (global optimum) for the ideal node size Determines the number of cluster nodes Because node price vs node size is not linear

How do Databases Scale, Conclusion To minimize overall cluster price Use many node because of rather small node size Use commodity hardware Fault tolerance Won't go into CAP theorem here Google For dynamic workloads and dynamic scale-in/out CLOUD

Current situation Current situation in Enterprises BI Tools SQL (42%) R (33%) Python (26%) Excel (25%) Java, Ruby, C++ (17%) SPSS, SAS (9%) Current situation in.coms Writing MapReduce Jobs Using proprietary query languages Code everything from scratch

Evolution Programing language implementations Usage of high level query languages Pig Jaql SQL or SQL like BigSQL HQL CQL R push back BigR Rhadoop BigData spread sheets BI push back SPSS push back

Pig(Latin) / JAQL

BigSQL

Big Spreadsheets

BigR

SPSS

SPSS

Business Intelligence

Requirements Summary Fault tolerance Dynamic and elastic scale-in and out Processing data of all types Use familiar ways of working with data

Ingredients NoSQL DB Cloud Push-back from Business Applications

IBM Reference Architecture (current) Information Interaction Traditional Sources Information Ingestion Applications Visualization, Visualization, Data DataMining Mining&& Exploration Exploration Analytics Sources Warehouse WarehouseZone Zone Landing Area Zones Landing Area Zones 3 Party rd Exploration ExplorationZone Zone (Discovery (DiscoveryArea) Area) Integrated IntegratedWarehouse Warehouse &&Marts MartsZone Zone (Guided (GuidedAnalytical Analytical Area) Area) Shared Operational Information Master Master Data DataHubs Hubs Reference Reference Data DataHubs Hubs Governing Systems Security and Business Continuity Management Content Content Repository Repository User UserReports Reports && Dashboards Dashboards Accelerators Accelerators&& Application Application Frameworks Frameworks User UserGuided Guided Applications Applications&& Advanced Advanced Analytics Analytics

IBM Reference Architecture (transition)

15 minutes Discussion Twitter: @romeokienzler