Instance Types. Standard Instances:
- Madlyn Anderson
- 7 years ago
Transcription
1 Instance Types
Standard Instances:
- 1 EC2 Compute Unit (EC2CU): equivalent of GHz 2007 AMD Opteron or 2007 Intel Xeon processor
- Small: 1.7 GB memory, 1 EC2CU, 160 GB local instance storage (lis), 32/64-bit
- Medium: 3.75 GB memory, 2 EC2CU, 410 GB lis, 32/64-bit
- Large: 7.5 GB memory, 4 EC2CU, 850 GB lis, 64-bit
- Extra Large: 15 GB memory, 8 EC2CU, 1690 GB lis, 64-bit
Micro Instances: 613 MB memory, 2 ECUs, EBS storage
High-Memory Instances: 17.1, 34.2, or 68.4 GB memory
High-CPU Instances: 5 or 20 EC2CU
Cluster GPU Instances: 22 GB memory, 33.5 EC2CU, 2x NVIDIA Tesla Fermi M2050 GPUs, 1690 GB lis, 10 Gigabit Ethernet
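The standard instance figures above can be treated as a small lookup table. The following is a minimal sketch, not an AWS API: the `InstanceType` class and the `smallest_fit` helper are hypothetical names introduced here for illustration, and only the four standard sizes from the slide are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    mem_gb: float
    compute_units: int
    local_storage_gb: int


# Standard instance figures as listed on the slide, ordered smallest first.
STANDARD_INSTANCES = [
    InstanceType("Small", 1.7, 1, 160),
    InstanceType("Medium", 3.75, 2, 410),
    InstanceType("Large", 7.5, 4, 850),
    InstanceType("Extra Large", 15.0, 8, 1690),
]


def smallest_fit(min_mem_gb: float, min_compute_units: int) -> Optional[InstanceType]:
    """Return the smallest standard instance meeting both minimums, or None."""
    for inst in STANDARD_INSTANCES:
        if inst.mem_gb >= min_mem_gb and inst.compute_units >= min_compute_units:
            return inst
    return None
```

For example, `smallest_fit(4, 2)` skips Small and Medium (too little memory) and returns the Large entry; a request for 32 GB returns `None` because no standard instance is that large.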
2 Instance vs. VM
- Instance = VM + hardware (instance type)
- AMI (Amazon Machine Image) = VM image
- VM image = OS + software
- Users specify the type of VM and hardware (i.e., instance type) when setting up an instance
3 OS and Software
Amazon Machine Images (AMIs) are preconfigured with an ever-growing list of operating systems (Windows Server 2008 OS included in the price!)
4 Pricing: On-Demand Instance
5 Data Transfer Charge
6 AWS's Free Usage Tier
7 Amazon S3 (Simple Storage Service) Basics
- Data stored as objects (files) in buckets
- Key to a file is its path, identified by <bucket> + <path>
- No real directories, just path segments
- Great as persistent storage for data
- Reliable up to %
- Scalable up to petabytes of data
- Fast: highly parallel requests
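The "bucket + path, no real directories" idea can be sketched in a few lines. This is an illustration only: `split_s3_uri` and `list_prefix` are hypothetical helpers, the `s3://bucket/key` form is assumed as the way a location is written, and the listing works on an in-memory list of keys rather than calling S3.

```python
from typing import Iterable, List, Tuple


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/path' into (bucket, key)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError("expected a URI starting with s3://")
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError("missing bucket name")
    return bucket, key


def list_prefix(keys: Iterable[str], prefix: str, delimiter: str = "/") -> List[str]:
    """Emulate a directory listing: immediate children of `prefix`.

    Keys are flat strings; a 'directory' is just a shared path segment.
    """
    children = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue
        head, sep, _ = rest.partition(delimiter)
        children.add(head + sep)  # trailing delimiter marks a pseudo-directory
    return sorted(children)
```

With keys `logs/2012/a.txt`, `logs/2012/b.txt`, `logs/2013/c.txt`, and `readme.txt`, listing the prefix `logs/` yields the two pseudo-directories `2012/` and `2013/`, even though no directory objects exist.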
8 S3 Access
- Via your web browser
- Various command line tools: s3cmd
- Or via HTTP REST interface: Create (PUT/POST), Read (GET), Delete (DELETE)
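The mapping from operations to HTTP verbs can be sketched with the Python standard library. This is a sketch under assumptions: `s3_request` is a hypothetical helper, `example-bucket` is a made-up bucket name, and the request is only constructed, never sent. Real requests to non-public objects also need a signed `Authorization` header, which is omitted here.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

# Operation-to-verb mapping taken from the slide (PUT is used for create).
HTTP_METHODS = {"create": "PUT", "read": "GET", "delete": "DELETE"}


def s3_request(operation: str, bucket: str, key: str,
               body: Optional[bytes] = None) -> Request:
    """Build (but do not send) an unsigned REST request for one object."""
    url = f"https://{bucket}.s3.amazonaws.com/{quote(key)}"
    return Request(url, data=body, method=HTTP_METHODS[operation])


req = s3_request("read", "example-bucket", "data/file.txt")
print(req.get_method(), req.full_url)
```

Keeping the verb table separate from the URL construction makes it easy to see that create, read, and delete all address the same object URL and differ only in the HTTP method.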
9 S3 Limitations
- Objects can't be modified (no random write or append)
- Max size of 5 TB (5 GB per upload request)
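The two size limits imply simple arithmetic: a large object has to be split across several upload requests. The sketch below assumes binary units (1 GB = 1024^3 bytes), which the slide does not specify; `upload_requests_needed` is a hypothetical helper, not part of any S3 client.

```python
GB = 1024 ** 3  # assumption: binary units
TB = 1024 ** 4

MAX_OBJECT_BYTES = 5 * TB   # maximum object size from the slide
MAX_REQUEST_BYTES = 5 * GB  # maximum size per upload request from the slide


def upload_requests_needed(size_bytes: int) -> int:
    """Minimum number of upload requests for an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_BYTES:
        raise ValueError("object exceeds the 5 TB limit")
    # Integer ceiling division; an empty object still takes one request.
    return max(1, -(-size_bytes // MAX_REQUEST_BYTES))
```

Because objects cannot be appended to or edited in place, changing even one byte of a stored object means uploading the whole object again, so this count also applies to every update.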
10 S3 Pricing
- Varies by region
- Data in is (currently) free
- Data out is also free within the same region; otherwise starts at $0.12/GB
- Storage cost is per GB-month: starts at $0.140/GB, drops with volume
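A rough monthly estimate follows directly from these starting rates. This is a simplified sketch: it uses only the two starting prices quoted on the slide, ignores the volume discounts and regional variation the slide mentions, and `monthly_s3_cost` is a hypothetical helper name.

```python
STORAGE_RATE_PER_GB_MONTH = 0.140  # starting storage rate from the slide
TRANSFER_OUT_RATE_PER_GB = 0.12    # starting rate for data out of the region


def monthly_s3_cost(stored_gb: float, out_of_region_gb: float) -> float:
    """Estimate one month's cost in dollars at the starting rates.

    Data in, and data out within the same region, are free and so
    do not appear in the formula.
    """
    storage = stored_gb * STORAGE_RATE_PER_GB_MONTH
    transfer = out_of_region_gb * TRANSFER_OUT_RATE_PER_GB
    return storage + transfer


# 100 GB stored and 50 GB transferred out of the region in one month.
print(round(monthly_s3_cost(100, 50), 2))
```

For 100 GB stored and 50 GB sent out of the region, the estimate is 100 x 0.140 + 50 x 0.12 = 14 + 6 = 20 dollars for the month.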
11 S3 Access Control List (ACL)
- Read/Write permissions on per-bucket basis
  - Read == listing objects in bucket
  - Write == create/overwrite/delete objects in bucket
- Read/Write permissions on per-object (file) basis
  - Read == read object data & metadata
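The bucket-level meaning of Read and Write can be modelled as a small permission check. This is a toy model for illustration only, not the S3 ACL format: the `Permission` flag, the action names, and `bucket_allows` are all hypothetical.

```python
from enum import Flag


class Permission(Flag):
    NONE = 0
    READ = 1
    WRITE = 2


# Which permission each bucket-level action requires, per the slide.
BUCKET_ACTIONS = {
    "list_objects": Permission.READ,
    "create_object": Permission.WRITE,
    "overwrite_object": Permission.WRITE,
    "delete_object": Permission.WRITE,
}


def bucket_allows(granted: Permission, action: str) -> bool:
    """Return True if the granted bucket permissions cover the action."""
    return bool(granted & BUCKET_ACTIONS[action])
```

Note that in this model a grantee with only Write on a bucket can create or delete objects but cannot list them, since listing is tied to Read.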
12 S3
The Amazon Web Services S3 API supports the ability to:
- Find buckets and objects (jar file, data file, etc.)
- Discover their metadata
- Create new buckets
- Upload new objects
- Delete existing buckets and objects
- Distcp/s3distcp from S3 to HDFS for computation
13 Amazon EMR
- A web service that allows cost-effective processing of large data sets
- Hadoop (HDFS + MapReduce) over EC2 and S3
- EMR is mostly used for data-intensive tasks
- Examples: web indexing, data mining, log analysis, data warehousing, machine learning, financial analysis, scientific simulation, bioinformatics
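The MapReduce model that Hadoop runs on EMR can be illustrated with a single-process word count. This is a conceptual sketch only: it does not use Hadoop or EMR, all function names are made up for illustration, and a real job would distribute the map and reduce phases across the cluster's nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts for one word."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Each phase only sees its own slice of the data (one line, or one key's values), which is the property that lets a framework such as Hadoop run many mappers and reducers in parallel.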
14 Apache Hadoop Stack for Data Analytics
[Diagram components: Resource Management & Workflow; HBase; Pig, Hive, Mahout; MapReduce; YARN; ZooKeeper; HDFS; Sqoop; Flume]
15 Why Use Elastic MapReduce?
- Reduce hardware & IT personnel costs
  - Pay for what you actually use
  - Don't pay for people you don't need
  - Don't pay for capacity you don't need
- More agility, less wait time for hardware
  - Don't waste time buying/racking/configuring servers
  - Many server classes to choose from (micro to massive)
- Less time doing Hadoop deployment & version management
  - Optimized Hadoop is pre-installed
16 Amazon Mechanical Turk
- A web service that exposes an on-demand global workforce ready to complete small tasks in exchange for micro-payments
- Frictionless. Outsourcing per se is irrelevant.
- A web services API
- Examples?
17 Identify Road Markings
18 How It Works
[Diagram labels: Requester (Developer); Human Intelligence Tasks (HITs); Worker Qualifications; Artificial, Artificially Intelligent Software; Completed HITs; Workers]
19 Example Application
Podcast transcription service provider, which transcribes audio into high-quality text
- Amazon Simple Storage: stores the podcasts and related files
- Amazon Mechanical Turk + EMR: voice recognition algorithms transcribe podcasts
- Amazon EMR: index text within search engine
20 Learn More About AWS
- AWS:
- EC2 Resources: C2/latest/UserGuide/
- Amazon EMR: ce/
21 Homework (Last Friday)
- Set up an AWS account
- Watch the videos on AWS EMR:
  - Getting Started (11:04): Signing up for an AWS account, generating a keypair, and setting up an S3 bucket.
  - Running Jobs (14:47): Creating, monitoring, and getting results from your EMR Job Flow.
  - Clusters of Servers (10:50): EC2 instance types, pricing, and Hadoop cluster configuration.
  - Dealing with Data (18:54): S3 architectures, pricing, and access control.
22 Homework (Cont.)
- AWS Hands-on Lab 0: follow the instructions from the tutorial and repeat the tasks, including: create an account, work with S3, create a cluster, run a job, and set up instances.
- Compile the jar file using the source code posted on the course website
23 Summary
- Cloud Computing
- AWS EC2 and S3
- EMR and AMT
- Hands-on Lab 0: warming up