BI/Analytics for NoSQL: Review of Architectures




What we'll answer in 50 minutes
- Who is this guy?
- How do I enable ad hoc, self-service reporting on NoSQL?
- How do I improve the performance of dashboards on top of NoSQL?
- How do I integrate NoSQL data with my other data that is not inside NoSQL?
- How do I make simple reports easy to build while preserving the ability to run rich NoSQL queries?

Nicholas Goodman
- Open Source BI thought leader
- 50+ Open Source BI customer projects
- Blogger, whitepapers, etc.
- Entrepreneur (DynamoBI Corporation; Bayon Technologies, Inc.)
- Data geek, hacker, tinkerer, committer
GOAL: Share perspectives, research, opinions.
DISCLAIMER: Your Mileage...

How do we answer those Q's?

Promise of Big Data
NoSQL/Hadoop/MapReduce Systems
- Keep more of it
- Cost effective analysis
- Massive scale data, now accessible to everyone (elastic)
- Not just SQL queries, more complex analysis
ACCOMPLISHED: WEB SCALE, MASSIVE, NEVER BEFORE SEEN SCALE OF DATA STORAGE AND PROCESSING

Reality Check!
- Petabytes? Y
- Cheap storage? Y
- Raw processing? Y
- Rich query languages? Y
- Flexible data structures? Y
- Reliable, fault tolerant? Y
- Fast queries? N
- Ad hoc access? N
- Accessibility to commodity BI tools? N
- Easy report authoring? N
- Levels of aggregation? N
- Integrated data? N
Big Data has solved the INFRASTRUCTURE of raw/core data storage but has delivered much less of what BUSINESS users want for analytics.

Data Gaps too!
On one side:
- Code, developers
- MR, rich graph/access
- Hierarchical, unstructured
On the other side:
- Analysts w/ Excel, dashboards
- Simple 2D (tables, charts)
- Filtering and easy analytics

Levels of Aggregation
SAME DATA AT VARIOUS LEVELS OF AGGREGATION: HUGELY IMPORTANT IN REAL LIFE IMPLEMENTATIONS!
1 ROW TO 1 BILLION ROWS
Diagram labels: 10K, 1 MILLION, 100 MILLION, 100 BILLION
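To make "same data at various levels of aggregation" concrete, here is a minimal Python sketch. The event tuples, field layout, and the `rollup` helper are assumptions made up for illustration; they are not from the talk. The idea is only that one raw dataset is summarized at several grains, so a dashboard can read the small summary instead of scanning every row.

```python
from collections import defaultdict

# Hypothetical raw events: (day, country, page, hits). Not from the talk.
RAW_EVENTS = [
    ("day-1", "US", "/home", 3),
    ("day-1", "US", "/docs", 1),
    ("day-1", "DE", "/home", 2),
    ("day-2", "US", "/home", 4),
]


def rollup(events, key_fields):
    """Sum hits grouped by the given field positions (0=day, 1=country, 2=page)."""
    totals = defaultdict(int)
    for event in events:
        key = tuple(event[i] for i in key_fields)
        totals[key] += event[3]
    return dict(totals)


# The same data at three levels of aggregation, from fine to coarse.
by_day_country = rollup(RAW_EVENTS, (0, 1))
by_day = rollup(RAW_EVENTS, (0,))
grand_total = rollup(RAW_EVENTS, ())
```

In a real deployment each level would be stored and refreshed separately; which levels to keep is a design choice driven by the reports that need to be fast.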

Architectures
- NoSQL reports
- NoSQL thru and thru
- NoSQL + MySQL
- NoSQL as ETL Source
- NoSQL programs in BI Tools
- NoSQL via BI Database (SQL)

NoSQL reports
Pay a developer to build applications for reports.
Diagram labels: Apps
Advantages:
- 100% richness of NoSQL
- Up to date, current
- Excellent performance on large datasets
- Custom built, beautiful reports/dashboards
- Single system to manage
Drawbacks:
- $$, developer driven process
- No commodity BI tools
- Managing rollups/summaries
- Schema-less = harder!
- Hard to integrate other reporting information

NoSQL thru and thru
Pay a developer to build FLEXIBLE applications for reports.
Diagram labels: Indices, Aggs, Advanced Apps
Advantages:
- All of the NoSQL reports advantages
- Managed aggregations, rollups
- Guided ad hoc available inside the application
- Higher performance for dashboards/summaries
Drawbacks:
- $$, developer driven process
- $$, app required for aggs
- No commodity BI tools
- Hard to integrate other reporting information
- Limited ad hoc (only developer built combinations)
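One way to picture "managed aggregations" in this architecture is a store that updates its rollups on every write, so dashboards never scan raw documents. The sketch below is a toy in-memory stand-in, not any particular NoSQL product's API; `EventStore`, its fields, and the sample documents are hypothetical.

```python
from collections import defaultdict


class EventStore:
    """Toy document store (hypothetical) that keeps a rollup current on every write."""

    def __init__(self):
        self.documents = []                   # raw, schema-less documents
        self.daily_counts = defaultdict(int)  # managed aggregate: (day, kind) -> count

    def insert(self, doc):
        # Store the raw document and update the aggregate in the same step.
        self.documents.append(doc)
        self.daily_counts[(doc["day"], doc["kind"])] += 1

    def dashboard(self, kind):
        # Served from the aggregate only; the raw documents are not touched.
        return {day: n for (day, k), n in sorted(self.daily_counts.items()) if k == kind}


store = EventStore()
for doc in [
    {"day": "d1", "kind": "crash", "detail": {"os": "linux"}},
    {"day": "d1", "kind": "crash", "detail": {"os": "mac"}},
    {"day": "d2", "kind": "crash"},
    {"day": "d2", "kind": "hang"},
]:
    store.insert(doc)

crash_counts = store.dashboard("crash")
```

The sketch also shows the "limited ad hoc" drawback: only the combinations a developer chose to aggregate, here day and kind, are available quickly.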

NoSQL + MySQL
Pay a developer to build FLEXIBLE applications for reports.
Diagram labels: ETL, App, MySQL
Advantages:
- Less IT $$ since developers aren't building reports
- Rich NoSQL analysis left in place (ETL + NoSQL)
- Easy, ad hoc reporting via commodity BI tools
- Easier to understand data for self service reports
Drawbacks:
- Data freshness (24 hrs old)
- Once into MySQL, no rich NoSQL application use (M/R)
- BI tool can connect ONLY to data in MySQL, not NoSQL
- Aggregations still self managed in MySQL
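The flow on this slide can be sketched as a small batch job: summarize documents on the NoSQL side, then load the summary into a relational table that a BI tool can query with plain SQL. This is an illustration under stated assumptions: Python's built-in `sqlite3` stands in for MySQL, and the document shape, table name, and helper functions are hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical semi-structured documents as they might sit in a NoSQL store.
documents = [
    {"day": "d1", "events": [{"type": "click"}, {"type": "view"}]},
    {"day": "d1", "events": [{"type": "click"}]},
    {"day": "d2", "events": [{"type": "view"}]},
]


def summarize(docs):
    """The heavy-lifting step: count events per (day, type)."""
    counts = Counter()
    for doc in docs:
        for event in doc["events"]:
            counts[(doc["day"], event["type"])] += 1
    return counts


def load(conn, counts):
    """Full nightly reload of the summary table (sqlite3 stands in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_event_counts "
        "(day TEXT, event_type TEXT, n INTEGER)"
    )
    conn.execute("DELETE FROM daily_event_counts")
    conn.executemany(
        "INSERT INTO daily_event_counts VALUES (?, ?, ?)",
        [(day, kind, n) for (day, kind), n in counts.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, summarize(documents))

# What a commodity BI tool would issue against the relational side.
totals = conn.execute(
    "SELECT event_type, SUM(n) FROM daily_event_counts "
    "GROUP BY event_type ORDER BY event_type"
).fetchall()
```

Because the table is rebuilt by a batch job, reports are only as fresh as the last run, which is the data freshness drawback listed above.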

NoSQL as ETL Data Source
NoSQL treated like any other data source.
Diagram labels: Informatica, Teradata
Advantages:
- Allows use of a consolidated BI tool for ad hoc
- Enables integrated (combined) datasets for reporting
- Aggregations often managed
- Best of breed tools
Drawbacks:
- ETL development expense
- Data latency
- Loss of NoSQL language richness
- Traditional DW tools are $$
- Scaling issues with DW database

NoSQL programs in BI Tools
Write a program in the BI tool that flattens data and outputs it into a report.
Advantages:
- Rich use of NoSQL native language
- Direct, up to date access
- Access to 100% of dataset
- Leverage guided report parameter pages
- Less expensive than apps
Drawbacks:
- Developer required to write program ($$)
- Slow-er (aggs, summaries)
- Lacks integration with other datasets
- Still (usually) no ad hoc access
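The "flattens data" step is the core of this architecture: hierarchical documents have to become simple 2D rows before a report can show them. A minimal Python sketch follows; the `flatten` helper and the sample document are hypothetical, and the sketch handles nested dictionaries only (lists are passed through unchanged).

```python
def flatten(doc, prefix=""):
    """Turn a nested document into one flat row with dotted column names."""
    row = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into sub-documents, extending the column name.
            row.update(flatten(value, prefix=name + "."))
        else:
            row[name] = value
    return row


# Hypothetical document, as a NoSQL query might return it.
document = {"id": 1, "user": {"name": "ann", "geo": {"country": "US"}}}
report_row = flatten(document)
```

Each flattened row can then be handed to the report as a table record. A program like this has to be written per report, which is where the developer cost on this slide comes from.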

NoSQL via BI Database (SQL)
Enable NoSQL data access via SQL (gasp!)
Diagram labels: Live Query; Cached, 24hr data
Advantages:
- Easy reports, easy (SQL)
- Integration with other data
- ETL is simple INSERT/MERGEs
- Live, up to date access
- High performance, cached data
- Ad hoc access to Live + Cached
- Aggregations/summaries
Drawbacks:
- Another system in between
- Still needs to be refreshed, nightly
- Not all capabilities for NoSQL richness available via SQL
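"ETL is simple INSERT/MERGEs" can be illustrated with a cache refresh that merges new rows into an existing summary table. The sketch below is an assumption-laden stand-in: it uses Python's `sqlite3` and SQLite's `INSERT OR REPLACE` in place of a BI database's MERGE statement, and the `cached_totals` table and `refresh_cache` helper are hypothetical.

```python
import sqlite3


def refresh_cache(conn, live_rows):
    """Merge rows fetched from the live source into the cached table.

    INSERT OR REPLACE (SQLite) plays the role of MERGE here: existing days
    are overwritten, new days are added, untouched days stay as they were.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cached_totals "
        "(day TEXT PRIMARY KEY, total INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO cached_totals (day, total) VALUES (?, ?)",
        live_rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
refresh_cache(conn, [("d1", 10), ("d2", 5)])  # first nightly refresh
refresh_cache(conn, [("d2", 7), ("d3", 1)])   # next refresh: d2 changed, d3 is new

rows = conn.execute("SELECT day, total FROM cached_totals ORDER BY day").fetchall()
```

Dashboards would read `cached_totals` for speed, while ad hoc questions that need current data would go to the live query path shown on the slide.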

Mozilla: NoSQL thru and thru (db)
- Socorro Project: crash reports, optionally sent to Mozilla
- https://crash-stats.mozilla.com

X: NoSQL via SQL
- Using Splunk (i.e., a commercial NoSQL-eee data aggregator/etc.)
- Desire to use Tableau for advanced analytics/visualization

Meteor Solutions: NoSQL thru and thru
- Using Cloudant BigCouch solution (SaaS)
- High performance set of multi-purpose indices on predefined aggregations
- Up to date aggregation/reports
- Better fit for social media graph structures than a relational DB
- Custom built BI applications (dashboards/reports) providing a flexible guided view through data
Diagram labels: Advanced Apps

A, B, C: NoSQL + MySQL
- Many, many companies (3 we've worked with)
- All web related companies (semi structured, some, mostly volume)
- Heavy lifting and storage, and ETL/data preparation inside Hadoop
- Push summarized, aggregated data into MySQL for analysis by easy dashboarding/BI tools
Diagram labels: ETL, App, MySQL