Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy




Transcription:

Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy

Agenda
  MicroStrategy supports several data sources, including Hadoop
  Why Hadoop?
  How does the MicroStrategy Analytics Platform connect to Hadoop prior to MicroStrategy Release 10?
  Hadoop YARN
  How can the MicroStrategy Analytics Platform connect to Hadoop with Release 10 (Big Data Engine (BDE))?
  Usage patterns for MicroStrategy with Hadoop as a data source
  Demo of a use case

Bring All Relevant Data to Decision Makers
Support for More Big Data Sources: Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single Database
  MapReduce & NoSQL Databases: Elastic Map Reduce, BigInsights, Distribution, HDFS
  Columnar Databases: Redshift
  Data Warehouse Appliances: HANA, Parallel Data Warehouse
  Relational Databases
  Multidimensional Databases: Analysis Services
  SaaS-Based App Data: Google Analytics, Zendesk
  Generic Web Services: SOAP, REST, Generic Web Services with OAuth, many more
  User / Departmental Data: Clipboard, MicroStrategy Dataset

Why Hadoop?
By 2016, the majority of enterprise data is expected to be processed in Apache Hadoop. Why?
  Volume: orders of magnitude larger than conventional data (petabytes, exabytes); uses commodity hardware
  Variety: structured, semi-structured, and unstructured formats
  Velocity: speed of ingesting incoming data streams

Why Hadoop? Hadoop can be challenging!
  CONNECTIVITY: ODBC / JDBC / Pig Connector
  DATA WAREHOUSE INFRASTRUCTURE: Hive, SQL on Hadoop, Pig
  DATA PROCESSING: MapReduce Framework
  DATA STORAGE: Hadoop Distributed File System (HDFS)
Traditionally, a connection to HDFS needs intermediate layers (Hive, Pig, etc.) that generate MapReduce jobs and add overhead.
MapReduce is a model for processing large data sets with a parallel, distributed algorithm on a Hadoop cluster. It can be relatively complicated and is a harder skill to master.
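To make the MapReduce model on this slide concrete, here is a minimal single-process sketch in Python of the classic word-count job. It only illustrates the three phases (map, shuffle, reduce); it does not use Hadoop, and the function names and sample input are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # On a real cluster the map calls run in parallel on different nodes;
    # here they simply run one after another.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical input lines standing in for a file stored in HDFS.
counts = word_count(["big data on Hadoop", "Hadoop stores big files"])
print(counts)
```

Even this toy job needs three cooperating functions for a single count, which hints at why the slide calls MapReduce a harder skill to master.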

How does MicroStrategy connect with Hadoop prior to MicroStrategy 10?
  SQL on Hadoop: translates SQL to MapReduce (Cloudera Impala, BigSQL, Shark)
  Apache Hive: translates HiveQL to MapReduce
  Apache Pig: Pig Latin script to generate MapReduce
All need one or more additional layers, with their overhead (ODBC or JDBC), between MicroStrategy and HDFS.
MicroStrategy generates either SQL (SQL on Hadoop), HiveQL or Freeform Pig, which in turn creates MapReduce jobs to get data from HDFS.
Diagram labels: MapReduce, ODBC / JDBC
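As a rough illustration of what "translating a query to MapReduce" means, the sketch below expresses one aggregate query as a mapper and a reducer in plain Python. The table name, columns, and rows are hypothetical, and real engines such as Hive produce far more elaborate job plans; this only shows the general shape of the translation.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows standing in for a file in HDFS: (region, sales).
rows = [("East", 100), ("West", 250), ("East", 50), ("West", 25)]

# Query being illustrated (table and column names are assumptions):
#   SELECT region, SUM(sales) FROM orders GROUP BY region

def mapper(row):
    """The GROUP BY column becomes the key, the aggregated column the value."""
    region, sales = row
    return region, sales

def reducer(region, sales_values):
    """The aggregate function runs once per key."""
    return region, sum(sales_values)

# Sorting by key stands in for the shuffle step between map and reduce.
mapped = sorted(map(mapper, rows), key=itemgetter(0))
result = [
    reducer(region, [sales for _, sales in group])
    for region, group in groupby(mapped, key=itemgetter(0))
]
print(result)
```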

Hadoop 2.0 - YARN
Prior to Hadoop 2.0, to crunch data in Hadoop you wrote MapReduce or generated it via:
  Hive
  Pig
  SQL on Hadoop
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
  YARN's execution model is more generic than the earlier MapReduce implementation.
  YARN can run applications that do not follow the MapReduce model.
  Hadoop YARN is an attempt to take Apache Hadoop beyond MapReduce for data processing.
  The current Apache MapReduce version is built on Apache YARN.
Internal MicroStrategy tests show a 5x improvement in speed when moving to YARN.
External data shows 3 minutes for YARN vs. 3 hours for MapReduce!
The MapReduce paradigm is hard to master!
Source: "How YARN Opens Doors to Easier Programming Tools for Hadoop 2.0 Users", John Lilley

MicroStrategy Taps into Hadoop Natively using YARN
  MicroStrategy 9.4.1 and prior: MicroStrategy Analytics Platform -> Hive ODBC Connector -> Hadoop Distribution (Hive, HDFS)
  MicroStrategy 10 (NEW): MicroStrategy Analytics Platform -> Big Data Engine / Hadoop Gateway -> Hadoop (HDFS)
Big Data Engine / Hadoop Gateway gives us higher performance, as we bypass the Hive/MapReduce layer and use YARN.
Ability to consume unstructured data natively from Hadoop!
No ODBC or other overhead.

How does MicroStrategy Big Data Engine (BDE) / Hadoop Gateway work?
  Big Data Engine (BDE) / Hadoop Gateway is a native YARN application that enables direct access to HDFS.
  The BDE component is installed on the Hadoop cluster.
  BDE creates the metadata on the fly when files are selected and imported.
  In internal testing, BDE / Hadoop Gateway is at least 5 times faster than comparable Hive-based technologies.
Diagram labels: Connect Live; data partitions; Parallel Partitioned In-Memory Cube; Hadoop Cluster with a Name Node (Big Data Query Engine) and two Data Nodes (each with a Big Data Execution Engine)
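The slide does not show how BDE is implemented, so the following Python sketch is only an analogy for two ideas it mentions: reading data partitions in parallel and deriving metadata (column names and types) on the fly from the files themselves. The partition contents, the type names, and the use of threads in place of data nodes are all assumptions.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions of one delimited file, as they might sit on
# different data nodes. Each partition carries the same header row.
PARTITIONS = [
    "order_id,region,amount\n1,East,10.5\n2,West,20.0\n",
    "order_id,region,amount\n3,East,7.25\n",
]

def infer_type(value):
    """Guess a column type from one sample value."""
    for caster, name in ((int, "integer"), (float, "decimal")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "text"

def load_partition(text):
    """Parse one partition and infer its schema from the first data row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    schema = {col: infer_type(val) for col, val in rows[0].items()} if rows else {}
    return schema, rows

# Threads stand in for the per-node execution engines in the diagram.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(load_partition, PARTITIONS))

schema = results[0][0]                               # metadata built on the fly
cube = [row for _, rows in results for row in rows]  # combined in-memory data
print(schema, len(cube))
```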

Usage Patterns for MicroStrategy with Hadoop as a Data Source
1. Visually explore a subject-matter extract in memory through a one-time query to Hadoop
2. Self-service parameterized queries directly to Hadoop
3. Model-driven access to Hadoop
4. Query a multi-source schema model and drill down among Intelligent Cubes, EDW, Hive
Diagram labels: Multi-dimensional Business Model, RDBMS, ETL, Maturity of Data Access

Three Steps for Self-Service Access to Hadoop with Native Connectivity
1. Import data from HDFS directly: web logs, survey/feedback forms, machine-generated data
2. Cleanse and refine with Data Wrangler: cleanse, refine and transform data from HDFS to make it ready for analysis. Designed for business users.
3. Analyze with Visual Insight: get full insights from Hadoop/HDFS data using Visual Insight
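Data Wrangler is a point-and-click tool, so the code below is not how it works internally. It is a small Python sketch of the kind of cleansing the second step describes, applied to web log lines: parse each raw line, drop the ones that do not parse, and convert fields to proper types. The sample lines are made up, and the pattern assumes the Common Log Format.

```python
import re
from datetime import datetime

# Pattern for the Common Log Format (assumed input format).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def cleanse(line):
    """Turn one raw log line into a typed record, or None if it is malformed."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" size means no response body; store it as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

# Hypothetical raw lines as they might be imported from HDFS.
raw_lines = [
    '10.0.0.1 - - [10/Oct/2015:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'corrupted line',
    '10.0.0.2 - - [10/Oct/2015:13:56:01 -0700] "GET /missing HTTP/1.0" 404 -',
]
clean = [record for record in map(cleanse, raw_lines) if record is not None]
print(len(clean), "of", len(raw_lines), "lines kept")
```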

Demo

Questions? Q&A