Hunk & Elastic MapReduce: Big Data Analytics on AWS
- Dorothy Richard
- 10 years ago
Transcription
1 Copyright 2014 Splunk Inc. Hunk & Elastic MapReduce: Big Data Analytics on AWS. Dritan Bitincka, BD Solutions Architecture
2 Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
3 About Me
- Member of BD Solution Architecture team
- Large scale deployments
- Cloud and Big Data
- Fourth.Conf
4 Agenda
- Hunk
- Amazon EMR
- Understanding how Hunk and EMR can work together
- Demo: Analyzing HDFS/S3 data with Hunk on EMR
5 Introduction to Hunk
6 Splunk as a single pane of glass for your machine data
7 [Diagram: RDBM, NoSQL, Splunk>]
8 [Diagram: Splunk>, RDBM, NoSQL]
9 Hunk for Hadoop and NoSQL Data Stores [Diagram: Explore, Analyze, Visualize; Splunk>, RDBM, NoSQL]
10 Hunk for Hadoop and NoSQL Data Stores [Diagram: Explore, Analyze, Visualize; Splunk>, RDBM, NoSQL]
11 Hadoop Components
- STORAGE: HDFS (NameNode, DataNode)
  - Distributed, replicated, massively scalable file system
- COMPUTE: MapReduce (JobTracker, TaskTracker)
  - Programming paradigm; two-phase processing of large datasets
    - We also use it, though a simplified version of it
  - Scalable, fault tolerant, etc.
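The "two-phase processing" named on the Hadoop Components slide can be illustrated with a minimal, single-process sketch. This is not Hadoop code and not Hunk's simplified variant; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only to show how a map step emits key/value pairs and a reduce step aggregates them per key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework does between the phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["error warn", "ERROR info"]))
```

In a real cluster the map and reduce steps run in parallel on many nodes and the grouping step moves data between them; the sketch keeps everything in one process so the data flow is easy to follow.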
12 Splunk and Hadoop Data: Splunk Hadoop Connect
- Export: Write data out to Hadoop, search based (push)
- Explore: Read data from Hadoop and analyze on the search head (SH)
13 Splunk and Hadoop Data: Splunk Hadoop Connect
- Export: Write data out to Hadoop, search based (push)
- Explore: Read data from Hadoop and analyze on the search head (SH): PULL
14 Splunk and Hadoop Data: Splunk Hadoop Connect
- Export: Write data out to Hadoop, search based (push)
- Explore: Read data from Hadoop and analyze on the search head (SH): PULL
- [Diagram: STORAGE, COMPUTE]
15 Splunk and Hadoop Data Today [Diagram: Explore, Analyze, Visualize, Dashboards, Share; STORAGE, COMPUTE]
16 Splunk Stack
- Explore, Analyze, Visualize, Dashboards, Share
- splunkweb: Web and Application server; Python, AJAX, CSS, XSLT, XML
- Interfaces: REST API, COMMAND LINE, ODBC
- splunkd: Search Head, Virtual Indexes; C++, Web Services
- 64-bit Linux OS
17 Hunk Stack
- Explore, Analyze, Visualize, Dashboards, Share
- splunkweb: Web and Application server; Python, AJAX, CSS, XSLT, XML
- Interfaces: REST API, COMMAND LINE, ODBC
- splunkd: Search Head, Virtual Indexes; C++, Web Services
- Hadoop Interface: Hadoop Client Libraries; JAVA
- 64-bit Linux OS
18 Scaling with Hadoop
- Explore, Analyze, Visualize, Dashboards, Share
- splunkweb: Web and Application server; Python, AJAX, CSS, XSLT, XML
- Interfaces: REST API, COMMAND LINE, ODBC
- splunkd: Search Head, Virtual Indexes; C++, Web Services
- Hadoop Interface: Hadoop Client Libraries; JAVA
- Connect Hunk to multiple Hadoop clusters: Hadoop Cluster 1, Hadoop Cluster 2, ...
- 64-bit Linux OS
19 What Makes it Stick?
To access and process data in external data stores (HDFS is supported out of the box), Hunk External Resource Providers (ERPs) carry out the store-specific file system implementation and computational semantics.
- Provider family (example: Hadoop): a logical grouping of data store frameworks that access the same kind of external system and share a global set of configurations.
- Provider (examples: ERP1 (prod), ERP2 (test)): a collection of specific Hunk ERP helper process implementations within the provider family that share cluster-specific configurations.
- Virtual index (examples: VIX-1, VIX-2, VIX-3, VIX-4): after you set up a provider, you configure virtual indexes (VIX) by giving Hunk information about the data location. Hunk then uses that information and its underlying implementation to distribute searches.
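The relationship between provider family, provider, and virtual index can be pictured with a small data model. This is only an illustrative sketch, not Hunk's configuration format or API: the class names, the setting keys (`mode`, `cluster`), the paths, and the rule that provider settings override family-wide defaults are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Provider:
    """One ERP provider: cluster-specific settings inside a provider family."""
    name: str
    family: str  # e.g. "hadoop"
    settings: Dict[str, str] = field(default_factory=dict)


@dataclass
class VirtualIndex:
    """A virtual index: a data location reached through one provider."""
    name: str
    provider: str  # name of the Provider it uses
    path: str      # hypothetical location of the data in the external store


def resolve(vix_name: str,
            indexes: Dict[str, VirtualIndex],
            providers: Dict[str, Provider],
            family_defaults: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Return the effective settings for searching one virtual index.

    Assumption: family-wide defaults apply first, then the provider's
    cluster-specific settings override them.
    """
    vix = indexes[vix_name]
    provider = providers[vix.provider]
    effective = dict(family_defaults.get(provider.family, {}))
    effective.update(provider.settings)
    effective["path"] = vix.path
    return effective


# Hypothetical sample data mirroring the labels on the slide.
family_defaults = {"hadoop": {"mode": "mapreduce"}}
providers = {
    "ERP1": Provider("ERP1", "hadoop", {"cluster": "prod"}),
    "ERP2": Provider("ERP2", "hadoop", {"cluster": "test"}),
}
indexes = {
    "VIX-1": VirtualIndex("VIX-1", "ERP1", "/data/weblogs"),
    "VIX-3": VirtualIndex("VIX-3", "ERP2", "/data/sample"),
}

if __name__ == "__main__":
    print(resolve("VIX-1", indexes, providers, family_defaults))
```

The point of the layering is that several virtual indexes can share one provider, and several providers (for example production and test clusters) can share one family's global configuration.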
20 Explore, Analyze, Visualize Data in Hadoop
- Unlock business value of data in Hadoop
- Fast to learn instead of scarce skills
- Integrated explore, analyze and visualize
- No fixed schema to search unstructured data
- Preview results while MapReduce jobs start
- Easier app development than in raw Hadoop
21 Integrated Analytics Platform for Hadoop Data
- Full-featured, Integrated Product: Explore, Analyze, Visualize, Dashboards, Share
- Insights for Everyone
- Works with What You Have Today: Hadoop (MapReduce & HDFS)
22 Introduction to EMR
23 Amazon EMR
- Amazon EMR is a Hadoop framework in the cloud offered as a managed service
- Used in a variety of applications, including log analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics
24 Provisioning Hadoop on AWS
1. Log in to the AWS Console
2. Fill in a form
3. Click Create Cluster
4. Wait a few minutes for a fully operational Hadoop cluster
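The console steps above have a programmatic counterpart. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured; it is not taken from the presentation. The release label, instance types, region, and role names are placeholder assumptions and should be replaced with values valid for your account.

```python
from typing import Any, Dict


def build_cluster_request(name: str, core_nodes: int = 2) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR run_job_flow call.

    Every value below is a placeholder for illustration, not a setting
    taken from the presentation.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",           # assumed release label
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance types
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 1 + core_nodes,    # one master plus core nodes
            # Keep the cluster alive after startup instead of terminating
            # when there are no steps to run.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",   # assumed default role names
        "ServiceRole": "EMR_DefaultRole",
    }


def create_cluster(name: str, region: str = "us-east-1") -> str:
    """Submit the request and return the new cluster (job flow) id."""
    import boto3  # imported lazily so the builder above works without boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.run_job_flow(**build_cluster_request(name))
    return response["JobFlowId"]
```

Calling `create_cluster("hunk-demo")` would start a billable cluster, so the request builder is kept separate and can be inspected or tested without touching AWS.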
25 Why is EMR Compelling?
- No Hadoop/HDFS management
- Native support for AWS S3
  - Vast amounts of data in S3
- Cluster Elasticity
- Spot vs. Reserved Instances
  - Long running vs. transient
- Pay for what you use
- Thousands of customers
26 Integrating Hunk with EMR
- EMR: Managed Hadoop framework on the cloud with access to vast amounts of data in HDFS and S3
- Hunk: Explore, analyze and visualize data from a central place
- Full analytics solution for Big Data on the cloud
27 Hunk on EMR: Option 1
- Classic Hunk + Hadoop
  - Provision an EMR cluster
  - Provision a Hunk EC2 instance using the AWS Marketplace Hunk AMI
  - Bring Your Own License (BYOL)
  - Configure Hunk with EMR cluster
    - Edit Security Groups to allow access
    - Master IP addresses & Ports
    - Create provider
    - Create Virtual Index
    - Search
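Before creating the provider, it can help to confirm from the Hunk instance that the security group change actually lets traffic through to the EMR master node. The following is a small sketch using only the Python standard library; the helper names `port_open` and `check_master` are hypothetical, and the ports to probe are left as a parameter because they depend on the Hadoop version and cluster configuration.

```python
import socket
from typing import Dict, Iterable


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_master(host: str, ports: Iterable[int]) -> Dict[int, bool]:
    """Probe each port on the master node and report which are reachable."""
    return {port: port_open(host, port) for port in ports}
```

A port reported as `False` usually points to a missing security group rule or a wrong master address, which is cheaper to find here than through a failed search in Hunk.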
28 Hunk on EMR: Option 2
- Placeholder
29 Demo
- Analyze ELB or S3 Access Logs
- Analyze CloudTrail Access Logs
30 QUESTIONS?
You may also like:
- Hunk 6.1 Technical Deep Dive
- Hunk Report Acceleration Deep Dive
- Comprehensive Security Analytics for Modern Threats with Hunk
31 THANK YOU feedback:
