Tap into Hadoop and Other No SQL Sources



Similar documents
#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

The Inside Scoop on Hadoop

How to Build MicroStrategy Projects on Top of Big Data Sources in the Cloud

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

The Future of Data Management

Bringing Big Data to People

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Microsoft SQL Server 2012 with Hadoop

Using Tableau Software with Hortonworks Data Platform

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

So What s the Big Deal?

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

BIG DATA TRENDS AND TECHNOLOGIES

Big Data Technologies Compared June 2014

Analyzing Big Data with AWS

How To Handle Big Data With A Data Scientist

Big Data on Microsoft Platform

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, Viswa Sharma Solutions Architect Tata Consultancy Services

Microsoft Big Data. Solution Brief

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

The Enterprise Data Hub and The Modern Information Architecture

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

HDP Hadoop From concept to deployment.

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

How To Scale Out Of A Nosql Database

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Big Data and Hadoop for the Executive A Reference Guide

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Luncheon Webinar Series May 13, 2013

Real World Big Data Architecture - Splunk, Hadoop, RDBMS

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Hadoop. for Oracle database professionals. Alex Gorbachev Calgary, AB September 2013

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Business Intelligence for Big Data

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

<Insert Picture Here> Big Data

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

Data Warehouse design

SQLSaturday #399 Sacramento 25 July, Big Data Analytics with Excel

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

BIG DATA CHALLENGES AND PERSPECTIVES

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

Big Data at Cloud Scale

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Market Overview: Big Data Integration

A Survey on Big Data Analytical Tools

The Next Wave of Data Management. Is Big Data The New Normal?

Hadoop and Map-Reduce. Swati Gore

Big Data: Tools and Technologies in Big Data

HDP Enabling the Modern Data Architecture

Big Data and Market Surveillance. April 28, 2014

TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Apache Hadoop: The Big Data Refinery

Bringing the Power of SAS to Hadoop

Step by Step: Big Data Technology. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015

Actian SQL in Hadoop Buyer s Guide

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Please give me your feedback

Modernizing Your Data Warehouse for Hadoop

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

WHITE PAPER. Four Key Pillars To A Big Data Management Solution

Ganzheitliches Datenmanagement

Information Builders Mission & Value Proposition

Next-Generation Cloud Analytics with Amazon Redshift

NZ BI User Group Auckland 18 September, Big Data Analytics with PowerPivot and Power View

Big Data Multi-Platform Analytics (Hadoop, NoSQL, Graph, Analytical Database)

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

Bringing Big Data into the Enterprise

Apache Hadoop in the Enterprise. Dr. Amr Awadallah,

White Paper: Datameer s User-Focused Big Data Solutions

CIO Guide How to Use Hadoop with Your SAP Software Landscape

Transcription:

Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru

What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data (Terabytes, Petabytes, Exabytes) Cost-prohibitive or practically impossible to store, manage or analyze in typical database software Variety Variety Structured, semi-structured, unstructured formats Diverse sources - complex event processing, application logs, machine sensors, social media data Velocity Velocity Distribute the same analytics across the organization Reuse, improve and extend easily 2

Use Case Categories for Big Data Four broad categories of Big Data sources and their value Traditional sources becoming bigger Digital exhaust from interactions SOURCE Company, Government, Financial sector, Business and consumer studies, Surveys, Polls SOURCE Online click-stream, Application logs, Call/service records, ID scans, Security cameras VALUE All business performance drivers Operational efficiency, Revenue management, Strategic planning VALUE New revenue sources, Consumer promotions, Risk management, Fraud detection Web 2.0 phenomenon Internet of things SOURCE Content generated from social media posts, tweets, blogs, pictures, videos, ratings SOURCE Machine generated sensor data and connected device communication VALUE Customer engagement, Customer service, Brand management, Viral marketing VALUE Operational efficiency, Cost control, Risk avoidance

Bring All Relevant Data to Decision Makers No Data Left Behind Optimized access to your entire Big Data ecosystem as if it were a single database MapReduce & NOSQL Databases Elastic Map Reduce BigInsights Distribution Columnar Databases Redshift Data Warehouse Appliances HANA Parallel Data Warehouse Relational Databases Multidimensional Databases Analysis Services SaaS-Based App Data User / Departmental Data

Why Hadoop and No SQL Sources are such a major force in big data? Extreme Scalability and Reliability These sources provide scalable and reliable data storage that is designed to span large clusters of commodity servers Affordable Data Storage Prior to Hadoop data storage was expensive. And the need to store increasingly large amounts of data and be able to easily get to it for a wide variety of purposes, makes Hadoop special. Highly Flexible Hadoop and No SQL sources bypass the need to specify a schema/structure the data. Allows to dump the data and ask questions later.

All About Hadoop 6

What is Hadoop? A scalable fault-tolerant distributed system for data storage and processing, open sourced under the Apache License. Hadoop Distributed File System: A highly reliable, high-bandwidth clustered storage used for buffering, copying and transferring data. Map Reduce: A software framework for distributed parallel processing of data. Client Map Reduce HDFS Master Job Tracker Name Node Master Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Slaves

Where does Hadoop fit in the Enterprise? Hadoop is a complement to a relational data warehouse Enterprises are generally not replacing their relational DWH with Hadoop Hadoop s strengths Inexpensive High reliability Extreme scalability Flexibility: Data can be added without defining a schema Hadoop s weaknesses Hadoop is not an interactive query environment Processing data in Hadoop requires writing code

Seconds Execution Times Minutes In-memory RDBMS Query Execution Times in an Environment with Hadoop Hadoop COLD RDBMS WARM Hadoop In-memory HOT GBs Data Volume PBs

Interfaces Available to Analyze Data Stored in Hadoop SQL on Hadoop Interactive SQL interface for Hadoop typically used for BI, analytics, and adhoc queries. Apache Hive Data warehouse infrastructure built on top of Hadoop. Allows SQL-like queries to be submitted to a Hadoop cluster. Apache Shark/Spark Apache Pig In-memory data analytics cluster computing framework on the HDFS. Promises better performance when compared to map reduce High level procedural language, typically useful for batch data flow workloads. Streaming/Java MapReduce The native way of accessing data in Hadoop but requires writing code in Java.

How does MicroStrategy integrates with Hadoop? SQL on Hadoop Apache Hive Apache Shark/Spark Apache Pig MicroStrategy certifies Cloudera Impala, Google Big Query and Pivotal HAWQ as a data source. MicroStrategy optimizes and certifies Hadoop/Hive as a data source. MicroStrategy certifies Spark/Shark on HDFS. MicroStrategy also provides a connector to execute Freeform Pig-Latin reports

Easier Connection to Hadoop with Each MicroStrategy Version MicroStrategy 9.3 MicroStrategy 9.3.1 MicroStrategy 9.4.1 2012 2013 2014 Apache ODBC MicroStrategy Universal ODBC driver Hortonworks Simba ODBC driver Cloudera ODBC 1.0 Cloudera ODBC 2.0 MSTR Greenplum driver for HAWQ MicroStrategy Thrift Connector Splunk ODBC driver

Usage Patterns for MicroStrategy with Hadoop as a Data Source 1.Visually explore subject matter extract in-memory through a one-time query to Hadoop 2.Self-service parameterized queries directly to Hadoop 3.Model-driven access to Hadoop. 4.Query multi-source schema model and drill down among Intelligent Cubes, EDW, Hive Multi-dimensional Business Model RDBMS ETL Maturity of Data Access

Key BI Characteristics: INDUSTRY: BI COMPONENTS: USERS ~200 DATABASE: HADOOP DISTRIBUTION: VOLUME OF DATA TYPE OF DATA APPLICATIONS: Entertainment 1 Application; Traditional Reports Hadoop, Teradata Amazon EMR Petabytes Log and Events data Sales Analysis Business Use and Benefits Sales Analysis generally with a new launch in new region, quick report analysis to understand the new accounts, number of hours of viewing etc. Directly querying and reporting from MicroStrategy on logs via Hive Able to make better Sales decisions Making Better Sales Decisions by Getting Insights from Web Logs in Hadoop

Key BI Characteristics: INDUSTRY: E-commerce BI COMPONENTS: 1 Application; Reports, Dashboards, VI USERS ~200 DATABASE: Hadoop, Oracle HADOOP DISTRIBUTION: Apache VOLUME OF DATA Petabytes TYPE OF DATA Web Logs, Online behavior APPLICATIONS: Sales Analysis Business Use and Benefits Analyzing web logs/online behavior stored in Hadoop. Dashboards and VI analysis run against our in-memory cubes. And ad-hoc reports run live against Hive. Analyzing Online Behavior with Live Queries to Hadoop End users do not need to code with MapReduce Developers are more productive delivering self service BI through a tool instead of coding custom user interface.

Multi-Channel Digital Distribution Provider Key BI Characteristics: INDUSTRY: Electronics and Media BI COMPONENTS: 1 Application; Reports, VI, Dashboards DATABASE: Hadoop, Hive HADOOP DISTRIBUTION: Cloudera Impala VOLUME OF DATA Over 1 Billion traffic attribute combinations APPLICATIONS: Traffic Attribute Multiplier Business Use and Benefits The Traffic Attribute Multiplier application is helping Adconion to get precisely target their digital ads, shorten the time to prepare and tune models and better ad delivery ROI for their customers Precise Targeting of Digital Ads Leveraging MicroStrategy s integration to Impala and the rich visualizations library, making it easy to be consumed by business users. Achieved 2.4% improvement in ad budgets spending efficiency

Demo

Demo: Enriching Customer Interactions with Big Data Sales Associate and Customer Conversation Hello!, How may I help you? Thanks, just looking for a handbag, I saw online. Sure, I would be able to assist you better, if I can get your loyalty card. Sure, here you go. Sales Associate Welcome back Ms. Miller, we have the tan Hamilton style handbag available in store. Also we have a jacket in your size and in your favorite color tan and goes very well with the bag. Feel fee to try it on. And because you are our loyal customer, if you purchase the two items, there will be an additional 25%off! Customer Great, Thanks! Result of MicroStrategy s Big Data Analytics accessing data stored in Hadoop and a relational warehouse.

Demo: Enriching Customer Interactions with Big Data Customer s online browsing creates web traffic logs which get stored in Hadoop. Associate scans the loyalty card which brings up: Customer Information: stored in a relational warehouse Browsing History: pages visited, items visited, shopping cart information etc. all in web logs stored in Hadoop, accessing via Cloudera s ODBC Connector Market Basket Analysis brings up: Frequently bought items shown based on the store purchase history, stored in a relational warehouse. In-Store Promotion: Customized discount offered to the customer after analyzing their purchase behavior, if there is a percent that triggers the purchase. Also, higher discount is offered if the customer buys more as per the retail value threshold.

Full and Flexible Capabilities for Hadoop Analytics The dominant BI solution for interactive access to Hadoop data All Leading Distributions Multiple Connection Options In-Memory Cubes for Interactive Hadoop Analytics BigInsights 3. Interact and Discover 2. Accelerate > 100x 1. Extract Pig Certified integration with the most popular Hadoop and MapReduce solutions. Point-and-click query builder. No need to train analysts in HiveQL or MapReduce. Transform PBs of slow big data into GBs of agile-ready data

MicroStrategy taps into MongoDB 21

MicroStrategy Taps into MongoDB MicroStrategy Analytics Enterprise Freeform SQL Query Builder Data Import MongoDB 2.4 is supported as a warehouse with MicroStrategy 9s We certify the 32-bit Simba MongoDB ODBC driver on Windows and Linux. Simba Mongo DB ODBC Driver Web Services with XQuery MicroStrategy can also retrieve data from MongoDB s REST interface via web services

MicroStrategy Taps into MongoDB More About Simba ODBC Driver Sends SQL Converts to MongoDB query MicroStrategy Analytics Enterprise Simba Mongo DB ODBC Driver Converts to tabular data Mongo Proprietary Limitation: The Simba ODBC driver does not support Temp Table creation at this point.

MicroStrategy Taps into MongoDB More About Simba ODBC Driver

Recap MicroStrategy s big data solution is backed up with its organic and tightly integrated technology. And therefore these leading industries use MicroStrategy: Facebook Netflix Yahoo! and many more! MicroStrategy provides a certified and reliable solution for connecting and analyzing data from Hadoop/Hive and other map reduce solutions. Latest support: Google Big Query Pivotal HAWQ Spark/Shark MicroStrategy taps into No SQL sources like Mongo DB. Support for more such sources in future! For any further questions, please email at tmaru@microstrategy.com

Q&A Questions?