Real World Big Data Architecture - Splunk, Hadoop, RDBMS

Size: px
Start display at page:

Download "Real World Big Data Architecture - Splunk, Hadoop, RDBMS"

Transcription

1 Copyright 2015 Splunk Inc. Real World Big Data Architecture - Splunk, Hadoop, RDBMS Raanan Dagan, Big Data Specialist, Splunk

2 Disclaimer During the course of this presentagon, we may make forward looking statements regarding future events or the expected performance of the company. We caugon you that such statements reflect our current expectagons and esgmates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward- looking statements, please review our filings with the SEC. The forward- looking statements made in the this presentagon are being made as of the Gme and date of its live presentagon. If reviewed arer its live presentagon, this presentagon may not contain current or accurate informagon. We do not assume any obligagon to update any forward looking statements we may make. In addigon, any informagon about our roadmap outlines our general product direcgon and is subject to change at any Gme without nogce. It is for informagonal purposes only and shall not, be incorporated into any contract or other commitment. Splunk undertakes no obligagon either to develop the features or funcgonality described or to include any such feature or funcgonality in a future release. 2

3 About Your Presenter Raanan Dagan - Sr. SE, Big Data specialist In Splunk for 3.5 Years Before Splunk worked for Cloudera and Oracle 3

4 Agenda Splunk Big Data Architecture AlternaGve Open Source Approach Real- World Customer Architecture Customer Roadmap Architecture End- to- end DemonstraGon (Splunk, Hadoop, RDBMS) 4

5 Big Data Technologies Rela(onal Database Structured NoSQL Semi- Structured Hadoop Semi- Structured Splunk Schema at Write Schema at Read Schema at Read Schema at Read SQL ETL Key- Value, Column, Document & Other Stores MapReduce HDFS Storage Search Real- Time Indexing RDBMS Oracle, MySQL, IBM DB2, Teradata Cassandra, Accumulo, MongoDB 5 MapReduce Distributed File System Time- Series, Unstructured, Heterogenous

6 Big Data Are you ge\ng the most out of your Big Data investments? 6

7 Big Data in Goverment How can you accelerate your Gme to value? 7

8 Splunk Big Data Copyright 2015 Splunk Inc.

9 Splunk Big Data Technologies DB Connect Hunk Splunk Schema at Write Schema at Read Schema at Read Schema at Read SQL ETL Key- Value, Column, Document & Other Stores MapReduce HDFS Storage Search Real- Time Indexing RDBMS Oracle, MySQL, IBM DB2, Teradata Cassandra, Accumulo, MongoDB 9 MapReduce Distributed File System Time- Series, Unstructured, Heterogenous

10 Splunk Scalability Enterprise- class Availability and Scale " AutomaGc load balancing linearly scales indexing " Distributed search and MapReduce linearly scales search and reporgng Offload search load to Splunk Search Heads Auto load- balanced forwarding to Splunk Indexers Send data from thousands of servers using any combinagon of Splunk forwarders 10

11 Splunk Real- Time AnalyGcs Data Monitor Input TCP/UDP Input Scripted Input Parsing Queue Parsing Pipeline Source, event typing Character set normalizagon Line breaking Timestamp idengficagon Regex transforms Index Queue Real- Gme Buffer Indexing Pipeline Raw data Index Files Real- Gme Search Process Splunk Index 11

12 Hunk - AnalyGcs Plaeorm for Hadoop Full- featured, Integrated Product Explore Analyze Visualize Dashboard s Share Insights for Everyone Works with What You Have Today Hadoop Clusters Hadoop Client Libraries NoSQL, EMR, S3 Buckets 12

13 Hunk Unique Features Virtual Index Enables seamless use of the Splunk technology stack on data wherever it rests NaGvely handles MapReduce Schema- on- the- fly Structure applied at search Gme No briile schema AutomaGcally find paierns and trends Flexibility and Fast Time to Value InteracGve search Preview results while MapReduce jobs run Drag- and- drop analygcs Security: Access Control, Pass Through Authen(ca(on, Kerberos 13

14 Hunk Provides Self- Service AnalyGcs for Hadoop Explore Analyze Visualize Dashboards Share Hadoop Storage 14

15 What About Structured Data? Customer profile Product azributes Employee details Pricing and Rate plans Asset info 15

16 Use cases for structured data in Splunk Index machine data from databases, such as logs or sales records Enrich machine data with high- level data, such as customer records Update structured databases with Splunk info, such as risk scores InteracGvely browse structured and unstructured data from Splunk reports

17 Machine Data Delivers Real- Gme Insights Media server logs (machine data) Phone Number IP Address Track ID Mar 01 19:18:50:000 aaa2 radiusd[12548]:[id local1.info] INFO RADOP(13) acct start for from recorded OK.! :18:50: GET /sync/addtolibrary/ "Mozilla/5.0 (iphone; CPU iphone OS 5_0_1 like Mac OS X) AppleWebKit/ (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/ " ! Mar 01 19:18:50:163 aaa2 radiusd[12548]:[id local1.info] INFO RADOP(13) acct stop for from recorded OK.! 17

18 Structured Data Contains Business Context Media server logs (machine data) Phone number IP address Track ID Mar 01 19:18:50:000 aaa2 radiusd[12548]:[id local1.info] INFO RADOP(13) acct start for from recorded OK.! :18:50: GET /sync/addtolibrary/ "Mozilla/5.0 (iphone; CPU iphone OS 5_0_1 like Mac OS X) AppleWebKit/ (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/ " ! Mar 01 19:18:50:163 aaa2 radiusd[12548]:[id local1.info] INFO RADOP(13) acct stop for from recorded OK.! Track ID ArGst Title Format ID Run Gme Maroon 5 Moves like Jagger MP3 4:30 Customer, product databases Phone # Subscriber ID Subscriber ID First Name Last Name Age State Customer Score Jim Morrison 25 CA 93 18

19 Splunk DB Connect Reliable, scalable, real- (me integra(on between Splunk and tradi(onal rela(onal databases " Enrich search results with addigonal business context " Easily import data into Splunk for deeper analysis " Integrate mulgple DBs concurrently " Simple set- up, non- evasive and secure Database lookup Oracle database Java Bridge Server ConnecGon pooling JDBC MicrosoR SQL server Database query Other databases 19

20 Copyright 2015 Splunk Inc. Customer Open Source AlternaGve

21 Hadoop Ecosystem OpGons

22 Hadoop Advantage / Disadvantage Advantage Cheap Storage Batch Distributed Processing Disadvantage Requires Coding for most AnalyGcs No VisualizaGon Tools No OOTB Apps / SoluGons 22

23 Copyright 2015 Splunk Inc. Real- World Customer Architecture

24 Summary Architecture 3 instances Splunk / Hunk / DB Connect Search Heads Real Time Data - 25 Indexers Historical data (VIX) 60 Hortonworks nodes Enrichment data (lookup) - MySQL DB 2000 Forwarders

25 Splunk Deployment Architecture indexer Web server 2,000 forwarders ~2TB per day indexer 25 indexers 3 search head ~250 Users ~30 Concurrent Users Web server forwarder 25

26 Hadoop Architecture WebLogic app server ~30 Flume Agents ~60 Data Nodes ~1.2 PB of storage ~2 years data retengon data node WebLogic app server data node data node data node 26

27 Splunk + Hunk = All the Data indexer Web server indexer app server Real Time AnalyGcs Alerts Apps Batch Compliment Splunk AnalyGcs Historical searches data node data node data node 27

28 DB Connect Architecture Install DB Connect on a Search Head Use DB Connect for Lookup Several Lookups coming from two different MySQL Databases Lookup Enrich log data with business insight Search Head DB- 1 DB- 2 MySQL JDBC Driver 28

29 DB - Architecture Performance Impact Command Connec(on Architecture Indexing Inputs - dbmon- tail ** Recommended Inputs dbmon- dump Outputs Not Indexing Medium number of connecgons (Small amount of data - only delta) Small amount of connecgons (Lots of data per connecgon) Lots of DB ConnecGons (Small amount of data) DB to Index (connecgon pooling) DB to Index (connecgon pooling) Search Head to DB (connecgon pooling) Search DBXQuery Lots of DB ConnecGons DB to Search Head Lookups ** Selected this op(on Lots of DB ConnecGons DB to Search Head

30 Summary Architecture 3 instances Splunk / Hunk / DB Connect Search Heads Real Time Data - 25 Indexers Historical data (VIX) 60 Hortonworks nodes Enrichment data (lookup) - MySQL DB 2000 Forwarders

31 Copyright 2015 Splunk Inc. Customer Roadmap Architecture

32 Archive Splunk Data to HDFS or AWS S3 WARM Hadoop Clusters COLD FROZEN Drive Down TCO by Archiving Historical Data to Commodity Hardware

33 Unified Search Intelligently Search Across Real- Time and Historical Data using the Same Splunk Interface Real- Time Data Historical Data in Hadoop

34 Roadmap Architecture Unified Search 3 instances Splunk / Hunk / DB Connect Search Heads Real Time Data - 25 Indexers Archive Historical data (VIX) 60 Hortonworks nodes Enrichment data (lookup) - MySQL DB 2000 Forwarders

35 Copyright 2015 Splunk Inc. Customer Chosen Architecture Demo

36 Support Our Military Kids Take our Survey! Splunk will Donate $10 Dollars to our Military Kids Plus a bonus if we hit 350 number of completed surveys onsite. Text Splunk to

37 Thank You Copyright 2015 Splunk Inc.

Leveraging Machine Data to Deliver New Insights for Business Analytics

Leveraging Machine Data to Deliver New Insights for Business Analytics Copyright 2015 Splunk Inc. Leveraging Machine Data to Deliver New Insights for Business Analytics Rahul Deshmukh Director, Solutions Marketing Jason Fedota Regional Sales Manager Safe Harbor Statement

More information

Big Data Unlock the mystery and see what the future holds. Philip Sow SE Manager, SEA

Big Data Unlock the mystery and see what the future holds. Philip Sow SE Manager, SEA Big Data Unlock the mystery and see what the future holds Philip Sow SE Manager, SEA THE ERA OF BIG DATA Big Data Market: Reach $32.1 Billion in 2015 & to $54.4 billion by 2017 The 3 + 1 Vs Structure/Semi/Unstructured

More information

Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS

Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS Copyright 2014 Splunk Inc. Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS Dritan Bi=ncka BD Solu=ons Architecture Disclaimer During the course of this presenta=on, we may make forward looking statements

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Mobile Big Data AnalyEcs

Mobile Big Data AnalyEcs Copyright 2014 Splunk Inc. Mobile Big Data AnalyEcs Marc Courtemanche, Sr. Director Alain Brunet, Sr. Lead Developer Vantrix CorporaEon Disclaimer During the course of this presentaeon, we may make forward-

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

#TalendSandbox for Big Data

#TalendSandbox for Big Data Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

More information

BIG DATA SOLUTION DATA SHEET

BIG DATA SOLUTION DATA SHEET BIG DATA SOLUTION DATA SHEET Highlight. DATA SHEET HGrid247 BIG DATA SOLUTION Exploring your BIG DATA, get some deeper insight. It is possible! Another approach to access your BIG DATA with the latest

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.

More information

Cost-Effective Business Intelligence with Red Hat and Open Source

Cost-Effective Business Intelligence with Red Hat and Open Source Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1 Agenda Introductions Quick survey What is BI?: reporting,

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated

More information

DBX. SQL database extension for Splunk. Siegfried Puchbauer

DBX. SQL database extension for Splunk. Siegfried Puchbauer DBX SQL database extension for Splunk Siegfried Puchbauer Agenda Features Architecture Supported platforms Supported databases Roadmap Features Database connection management SQL database input (content

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES 1 HYPER-CONVERGED INFRASTRUCTURE STRATEGIES MYTH BUSTING & THE FUTURE OF WEB SCALE IT 2 ROADMAP INFORMATION DISCLAIMER EMC makes no representation and undertakes no obligations with regard to product planning

More information

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Apache Hadoop in the Enterprise Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

A Scalable Data Transformation Framework using the Hadoop Ecosystem

A Scalable Data Transformation Framework using the Hadoop Ecosystem A Scalable Data Transformation Framework using the Hadoop Ecosystem Raj Nair Director Data Platform Kiru Pakkirisamy CTO AGENDA About Penton and Serendio Inc Data Processing at Penton PoC Use Case Functional

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com

More information

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

TE's Analytics on Hadoop and SAP HANA Using SAP Vora TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -

More information

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily

More information

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

More information

Building Your Big Data Team

Building Your Big Data Team Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.

More information

ITG Software Engineering

ITG Software Engineering IBM WebSphere Administration 8.5 Course ID: Page 1 Last Updated 12/15/2014 WebSphere Administration 8.5 Course Overview: This 5 Day course will cover the administration and configuration of WebSphere 8.5.

More information

Big Data Analytics - Accelerated. stream-horizon.com

Big Data Analytics - Accelerated. stream-horizon.com Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

Dominik Wagenknecht Accenture

Dominik Wagenknecht Accenture Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

Ge6ng to Know Hadoop. Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology

Ge6ng to Know Hadoop. Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology Ge6ng to Know Hadoop Sriram Mohan Senior Consultant, Avalon Consul,ng, LLC Associate Professor of CSSE, Rose- Hulman Ins,tute of Technology Presenter Overview Who we are Consultants providing expert technical

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

Copyright 2013 Splunk, Inc. Splunk 6 Overview. Presenter Name, Presenter Title

Copyright 2013 Splunk, Inc. Splunk 6 Overview. Presenter Name, Presenter Title Copyright 2013 Splunk, Inc. Splunk 6 Overview Presenter Name, Presenter Title Safe Harbor Statement During the course of this presentahon, we may make forward looking statements regarding future events

More information

Parallel Data Warehouse

Parallel Data Warehouse MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

An Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov

An Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov An Industrial Perspective on the Hadoop Ecosystem Eldar Khalilov Pavel Valov agenda 03.12.2015 2 agenda Introduction 03.12.2015 2 agenda Introduction Research goals 03.12.2015 2 agenda Introduction Research

More information

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved. EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics

More information

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...

Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform... Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data

More information

From Spark to Ignition:

From Spark to Ignition: From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for

More information

Open source large scale distributed data management with Google s MapReduce and Bigtable

Open source large scale distributed data management with Google s MapReduce and Bigtable Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory

More information

An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING

An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING An Introduction to MOHAMMAD REZA KARIMI DASTJERDI SPRING 2015 1 Table Of Contents Introduction Problems with RDBMs What is Hadoop? Who use Hadoop? Job Positions History Hadoop Distributions Hadoop Ecosystem

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Open source Google-style large scale data analysis with Hadoop

Open source Google-style large scale data analysis with Hadoop Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

More information

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014 BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014 Ralph Kimball Associates 2014 The Data Warehouse Mission Identify all possible enterprise data assets Select those assets

More information

Architecting for the Internet of Things & Big Data

Architecting for the Internet of Things & Big Data Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to

More information

Oracle s Big Data solutions. Roger Wullschleger.

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Splunk Operational Visibility

Splunk Operational Visibility Copyright 2015 Splunk Inc. Splunk Operational Visibility Matthias Maier Sales Engineer, CISSP Safe Harbor Statement During the course of this presentation, we may make forward looking statements regarding

More information

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

MapReduce with Apache Hadoop Analysing Big Data

MapReduce with Apache Hadoop Analysing Big Data MapReduce with Apache Hadoop Analysing Big Data April 2010 Gavin Heavyside gavin.heavyside@journeydynamics.com About Journey Dynamics Founded in 2006 to develop software technology to address the issues

More information

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Simplifying Big Data Analytics: Unifying Batch and Stream Processing John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Streaming Analy.cs S S S Scale- up Database Data And Compute Grid

More information

From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten

From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten MC Brown, Director of Documentation Linas Virbalas, Senior Software Engineer. About Tungsten Replicator Open source drop-in

More information

Using distributed technologies to analyze Big Data

Using distributed technologies to analyze Big Data Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/

More information

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Qsoft Inc www.qsoft-inc.com

Qsoft Inc www.qsoft-inc.com Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Big Data SQL and Query Franchising

Big Data SQL and Query Franchising Big Data SQL and Query Franchising An Architecture for Query Beyond Hadoop Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

Oracle BI Roadmap & Visual Analyzer Ljiljana Perica, Oracle Business Solution Leader Ljiljana.perica@oracle.com

Oracle BI Roadmap & Visual Analyzer Ljiljana Perica, Oracle Business Solution Leader Ljiljana.perica@oracle.com Oracle BI Roadmap & Visual Analyzer Ljiljana Perica, Oracle Business Solution Leader Ljiljana.perica@oracle.com Copyright 2015, Oracle and/or its affiliates. All rights reserved. 1 Safe Harbor Statement

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

Presenters: Luke Dougherty & Steve Crabb

Presenters: Luke Dougherty & Steve Crabb Presenters: Luke Dougherty & Steve Crabb About Keylink Keylink Technology is Syncsort s partner for Australia & New Zealand. Our Customers: www.keylink.net.au 2 ETL is THE best use case for Hadoop. ShanH

More information

Welkom! Copyright 2014 Oracle and/or its affiliates. All rights reserved.

Welkom! Copyright 2014 Oracle and/or its affiliates. All rights reserved. Welkom! WIE? Bestuurslid OGh met BI / WA ervaring Bepalen activiteiten van de vereniging Deelname in organisatie commite van 1 of meerdere events Faciliteren van de SIG s Redactie van OGh-Visie Onderhouden

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Background on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros

Background on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros David Moses January 2014 Paper on Cloud Computing I Background on Tools and Technologies in Amazon Web Services (AWS) In this paper I will highlight the technologies from the AWS cloud which enable you

More information

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

NextGen Infrastructure for Big DATA Analytics.

NextGen Infrastructure for Big DATA Analytics. NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Big Data for everyone Democratizing big data with the cloud. Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de

Big Data for everyone Democratizing big data with the cloud. Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de Big Data for everyone Democratizing big data with the cloud Steffen Krause Technical Evangelist @AWS_Aktuell skrause@amazon.de Does this Data make me look big? Overview Designing big data solutions in

More information

Harnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization

Harnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization Harnessing big data with Hortonworks Data Platform and Red Hat JBoss Data Virtualization Kimberly Palko, Product Manager Red Hat JBoss Doug Reid, Director Partner Product Management Hortonworks Cojan van

More information

The Inside Scoop on Hadoop

The Inside Scoop on Hadoop The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM The Inside Scoop

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

Investor Presentation. Second Quarter 2015

Investor Presentation. Second Quarter 2015 Investor Presentation Second Quarter 2015 Note to Investors Certain non-gaap financial information regarding operating results may be discussed during this presentation. Reconciliations of the differences

More information

Telemetry: The Customer Experience

Telemetry: The Customer Experience Copyright 2014 Splunk Inc. Telemetry: The Customer Experience Simon Warrington Senior Program Manager, Microso@ Disclaimer During the course of this presentagon, we may make forward- looking statements

More information

Big Data Infrastructure at Spotify

Big Data Infrastructure at Spotify Big Data Infrastructure at Spotify Wouter de Bie Team Lead Data Infrastructure June 12, 2013 2 Agenda Let s talk about Data Infrastructure, how we did it, what we learned and how we ve failed Some Context

More information

Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com

Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,

More information

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5

More information

INDUS / AXIOMINE. Adopting Hadoop In the Enterprise Typical Enterprise Use Cases

INDUS / AXIOMINE. Adopting Hadoop In the Enterprise Typical Enterprise Use Cases INDUS / AXIOMINE Adopting Hadoop In the Enterprise Typical Enterprise Use Cases. Contents Executive Overview... 2 Introduction... 2 Traditional Data Processing Pipeline... 3 ETL is prevalent Large Scale

More information

Bringing Big Data to People

Bringing Big Data to People Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process

More information

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D. Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology

More information

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

More information

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION

BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION FACT SHEET BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION BIGDATA & HADOOP CLASS ROOM SESSION GreyCampus provides Classroom sessions for Big Data & Hadoop Developer Certification. This course will

More information

Ubuntu and Hadoop: the perfect match

Ubuntu and Hadoop: the perfect match WHITE PAPER Ubuntu and Hadoop: the perfect match February 2012 Copyright Canonical 2012 www.canonical.com Executive introduction In many fields of IT, there are always stand-out technologies. This is definitely

More information