Big Data Technologies Compared June 2014



Similar documents
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Bringing Big Data to People

I/O Considerations in Big Data Analytics

So What s the Big Deal?

Big Data Processing: Past, Present and Future

Modernizing Your Data Warehouse for Hadoop

Big Data Explained. An introduction to Big Data Science.

Big Data and Data Science: Behind the Buzz Words

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

Parallel Data Warehouse

The Inside Scoop on Hadoop

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Real World Big Data Architecture - Splunk, Hadoop, RDBMS

Cost-Effective Business Intelligence with Red Hat and Open Source

SAP Real-time Data Platform. April 2013

Tap into Hadoop and Other No SQL Sources

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

Peninsula Strategy. Creating Strategy and Implementing Change

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Microsoft Analytics Platform System. Solution Brief

Next-Generation Cloud Analytics with Amazon Redshift

Step by Step: Big Data Technology. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

How To Handle Big Data With A Data Scientist

Big Data and Its Impact on the Data Warehousing Architecture

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse

Performance and Scalability Overview

Big Data: Are You Ready? Kevin Lancaster

How To Scale Out Of A Nosql Database

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

INTRODUCTION TO CASSANDRA

Main Memory Data Warehouses

EMC BACKUP MEETS BIG DATA

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

Can the Elephants Handle the NoSQL Onslaught?

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Please give me your feedback

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Composite Software Data Virtualization Turbocharge Analytics with Big Data and Data Virtualization

Open Source Technologies on Microsoft Azure

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

<Insert Picture Here> Big Data

Il mondo dei DB Cambia : Tecnologie e opportunita`

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Modern Data Warehousing

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Hurtownie Danych i Business Intelligence: Big Data

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES

Cisco Solutions for Big Data and Analytics

Performance and Scalability Overview

Structured Data Storage

Microsoft technológie pre BigData. Ľubomír Goryl Solution Professional

Oracle Big Data SQL Technical Update

May 6, 2014 Presented by: Kyle McNerney

TABLE OF CONTENTS 1 Chapter 1: Introduction 2 Chapter 2: Big Data Technology & Business Case 3 Chapter 3: Key Investment Sectors for Big Data

In-memory computing with SAP HANA

Sunnie Chung. Cleveland State University

This Symposium brought to you by

Big Data Success Step 1: Get the Technology Right

Choosing The Right Big Data Tools For The Job A Polyglot Approach

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Advanced In-Database Analytics

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

Real Life Performance of In-Memory Database Systems for BI

Oracle Database 12c Plug In. Switch On. Get SMART.

SAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI. May 2013

Introduction to NOSQL

Proact whitepaper on Big Data

TOP 8 TRENDS FOR 2016 BIG DATA

How To Understand The Business Case For Big Data

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?

Data Warehouse design

Mind Commerce. Commerce Publishing v3122/ Publisher Sample

SQL Server 2012 Parallel Data Warehouse. Solution Brief

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

High performance analytics. Benchmark results.

SQL Server What s New? Christopher Speer. Technology Solution Specialist (SQL Server, BizTalk Server, Power BI, Azure) v-cspeer@microsoft.

Big Data and Industrial Internet

BIG DATA-AS-A-SERVICE

Hadoop. for Oracle database professionals. Alex Gorbachev Calgary, AB September 2013

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

BIG DATA APPLIANCES. July 23, TDWI. R Sathyanarayana. Enterprise Information Management & Analytics Practice EMC Consulting

Native Connectivity to Big Data Sources in MSTR 10

Cloud Scale Distributed Data Storage. Jürmo Mehine

BIG DATA TRENDS AND TECHNOLOGIES

In-Memory Analytics for Big Data

High Performance IT Insights. Building the Foundation for Big Data

Transcription:

Big Data Technologies Compared June 2014

Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2

What is Big Data by Example The SKA Telescope is a new development in South Africa & Australia comes online in 2020 Square Kilometre Array The SKA will generate enough raw data to fill 15 million 64GB ipods every day 913 PB per day or 10.8 TB per sec of RAW Unprocessed Data Post Processing generates a rather modest 10 GB per sec or 844 TB per day But how do you process this Super Data? Naturally with a Super Computer! The SKA will require very high performance supercomputers capable of 100 petaflops per second processing power. This is about 50 times more powerful than the most powerful supercomputer in 2010 and equivalent to the processing power of about 100 million PCs 3

What is Big Data by Definition Marketing term to describe large, complex & varying data sets that are difficult to process You tend to have a Big Data problem if your data sets have these 3 properties [V]olume Large amounts of data [V]elocity that continually arrive in a short timespan [V]ariety* and are varying, non-standard and/or undefined. To solve the problem you need specialist technologies Business want to solve the problem because they believe it offers competitive advantage Traditional Database is by the problem definition* not a Big Data technology or solution However traditional RDBMS vendors extend their products to include Big Data capabilities 4

What is Big Data by Market Size / Growth Gartner predicts the big data market will reach $16.1 billion in 2014 (incl. servers/storage) The big data professional services market will exceed $4.5 billion in 2014 The adoption of analytics-as-a-service will accelerate (ie ready-made analytics in the cloud) The number of vendors providing big data services will triple over the next three years Different industries will leverage Big Data at different rates Growing 6 times faster than the overall IT market 5

Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 6

Big Data Some Technology Comparisons - Summary Factor // Technology Microsoft HDInsight (Cloud based version of Hadoop) Splunk Microsoft APS (aka Microsoft SQL Server PDW) EMC Greenplum SAP Hana Sweet Spot Unstructured and Semistructured data loading, storage, and query processing Unstructured and Semistructured data loading, storage, search, reporting and analytics applications Structured tabular data warehousing applications and OLAP analytics Structured tabular data warehousing applications and OLAP analytics SAP Structured tabular OLTP applications Who owns it Microsoft Splunk Microsoft EMC SAP Schema on On Read On Read On Write On Write On Write RDBMS OOTB No No Yes (native MPP) Yes (native MPP) Yes (in memory) Hadoop OOTB Yes (native MPP) No (Hadoop Extn) No (Hadoop Extn) No (Hadoop Extn) No (Hadoop Extn) SQL Language No (HIVE / Map Reduce) No (SPL) Yes (ANSI SQL) Yes (ANSI SQL) Yes (ANSI SQL) Reports OOTB No (MS & 3 rd party) Yes & 3 rd party No (MS & 3 rd party) No (GP & 3 rd party) No (BO & 3 rd party) Format / Footprint As-a-Service Software & As-a-Service Appliance Software & Appliance Appliance 7

Agenda What is Big Data Big Data Technology Comparison Summary Big Data Technology in Detail Other Big Data Technologies Questions 8

Microsoft HDInsight (Hadoop) $3,000/month for 10TB in 9 nodes Can provision a 9 node cluster in less than 15 minutes HDInsight is part of the range of Azure services Cluster sizes from 4 to 32 nodes 9

Splunk Search is the basis of all data interaction Cluster architecture replicates copies of data for redundancy Control configuration across a system (single & distributed) Index data, and host primary data store, manage data aging (hot, warm, cold, frozen) based on age Distributed model for index, data store and search 10

Microsoft APS (PDW) Hadoop Region SQL Queries Aggregated Result Data Obtained From Relevant Nodes Result from Each Node Sent to Control Node Obtain Unstructured Data (if required) PDW Region All nodes in appliance are VM s All nodes have redundancy OS is Windows 2012 11

EMC Greenplum Structured Analytics SQL BI tools Analytical tools Unstructured Analytics Hadoop MapReduce Standard Business Intelligence and Analytical tools Master Server Query planning & dispatch Segment Servers Query processing & data storage Network Interconnect primary server, plus hot failover...... Clients see a single Postgres database Queries distributed across all available resources Shared Nothing, Massively Parallel Processing means no bottlenecks and linear scalability. 12

SAP Hana SQL Query Processing SQL Relational Engine Database / Data persistence 13

Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 14

Other Technologies under the Big Data Banner You May Hear About Everyone who is anyone will probably claim to have a Big Data technology / solution Traditional Database Technologies (extended) Tabular & Relational based technologies Teradata, IBM DB2, Oracle Exadata, SAP/Sybase IQ, Netezza, + many more NoSQL Technologies Column, Key-Value Pair, Object and Graph based technologies Casandara, MongoDB, Hbase, Allegro, BigTable, + many more Not Only SQL Hadoop Technologies Cloudera, Hortonworks, Amazon, Google, Xplenty, + many more 15

Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 16

Questions?

Appendix References Big Data on Wiki http://en.wikipedia.org/wiki/big_data Gartner Data Warehousing Magic Quadrant http://www.gartner.com/technology/reprints.do?id=1-1rp452a&ct=140310&st=sb The SKA Telescope http://www.skatelescope.org/the-technology/ Splunk on Wiki http://en.wikipedia.org/wiki/splunk NoSQL on Wiki http://en.wikipedia.org/wiki/nosql The Big Data Marketplace http://www.forbes.com/sites/gilpress/2013/12/12/16-1-billion-big-data-market-2014-predictions-from-idc-and-iia/ 18