Building Real-Time Analytics Into Big Data Applications
|
|
|
- Logan Davis
- 10 years ago
- Views:
Transcription
1 Building Real-Time Analytics Into Big Data Applications Shawn Gandhi, Solutions 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
2 { } "payerid": "Joe", "productcode": "AmazonS3", "clientproductcode": "AmazonS3", "usagetype": "Bandwidth", "operation": "PUT", "value": "22490", "timestamp": " " Metering Record user-identifier frank [10/Oct/2000:13:55: ] "GET /apache_pb.gif HTTP/1.0" Common Log Entry <165> T22:14:15.003Z mymachine.example.com evntslog - ID47 [examplesdid@32473 iut="3" eventsource="application" eventid="1011"][examplepriority@32473 class="high"] Syslog Entry SeattlePublicWater/Kinesis/123/Realtime MQTT Record <R,AMZN,T,G,R1> NASDAQ OMX Record
3 Data volume Generated data Available for analysis Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011 IDC: Worldwide Business Analytics Software Forecast and 2011 Vendor Shares
4 Abraham Wald ( )
5
6
7 Big data: best served fresh Big data Hourly server logs: were your systems misbehaving one hour ago Weekly / Monthly Bill: what you spent this billing cycle Daily customer-preferences report from your website s click stream: what deal or ad to try next time Daily fraud reports: was there fraud yesterday Real-time big data Amazon CloudWatch metrics: what went wrong now Real-time spending alerts/caps: prevent overspending now Real-time analysis: what to offer the current customer now Real-time detection: block fraudulent use now
8 Talk outline A tour of Amazon Kinesis concepts in the context of a Twitter Trends service Implementing the Twitter Trends service Amazon Kinesis in the broader context of a big data ecosystem Q&A
9 Sending and reading data from Amazon Kinesis streams Sending Reading HTTP Post Get* APIs AWS SDK LOG4J Kinesis Client Library + Connector Library Apache Storm Flume Fluentd Amazon Elastic MapReduce
10 A Tour of Amazon Kinesis Concepts in the Context of a Twitter Trends Service
11 Demo
12 twitter-trends.com website twitter-trends.com Elastic Beanstalk twitter-trends.com
13 Too big to handle on one box twitter-trends.com
14 The solution: streaming map/reduce My top-10 twitter-trends.com My top-10 Global top-10 My top-10
15 Core concepts Data record Stream Partition key twitter-trends.com My top-10 My top-10 Global top-10 My top-10 Data record Shard Worker Shard: Sequence number
16 How this relates to Amazon Kinesis twitter-trends.com Kinesis Kinesis application
17 Core concepts recapped Data record ~ tweet Stream ~ all tweets (the Twitter Firehose) Partition key ~ Twitter topic (every tweet record belongs to exactly one of these) Shard ~ all the data records belonging to a set of Twitter topics that will get grouped together Sequence number ~ each data record gets one assigned when first ingested Worker ~ processes the records of a shard in sequence number order
18 Using the Amazon Kinesis API directly twitter-trends.com K I N E S I S process(records): iterator = getsharditerator(shardid, { LATEST); while for (record (true) { in records) { [records, updatelocaltop10(record); iterator] = } getnextrecords(iterator, maxrecstoreturn); if process(records); (timetodooutput()) { } writelocaltop10toddb(); } } } while (true) { localtop10lists = scanddbtable(); updateglobaltop10list( localtop10lists); sleep(10);
19 Challenges with using the Amazon Kinesis API directly Manual creation How many of workers and assignment How per many EC2 to shards instance? EC2 instances? twitter-trends.com K I N E S I S Kinesis application
20 Using the Amazon Kinesis Client Library twitter-trends.com K I N E S I S Shard mgmt table Kinesis application
21 Elastic scale and load balancing Auto scaling Group Amazon CloudWatch twitter-trends.com K I N E S I S Shard mgmt table
22 Shard management Auto scaling Group Amazon CloudWatch Amazon SNS AWS Console twitter-trends.com K I N E S I S Shard mgmt table
23 The challenges of fault tolerance K I N E S I S X X X Shard mgmt table twitter-trends.com
24 Fault tolerance support in KCL K I N E S I S Availability Zone X1 Shard mgmt table twitter-trends.com Availability Zone 3
25 Checkpoint, replay design pattern Amazon Kinesis Host A 18X 0310 Shard ID Shard-i Local top-10 {#Obama: 10235, #Congress: #Seahawks: 9910, 9835, } Shard-i Host B Shard ID Lock Seq num Shard-i Host AB 10 0
26 Catching up from delayed processing Amazon Kinesis Host A How long until we ve caught up? Shard-i X2 Host B Shard throughput SLA: - 1 MBps in - 2 MBps out => Catch up time ~ down time Unless your application can t process 2 MBps on a single host => Provision more shards
27 Concurrent processing applications Amazon Kinesis twitter-trends.com Shard-i spot-the-revolution.org
28 Why not combine applications? Amazon Kinesis twitter-trends.com Output state and checkpoint every 10 seconds twitter-stats.com Output state and checkpoint every hour
29 Implementing twitter-trends.com
30 Creating an Amazon Kinesis stream
31 How many shards? Additional considerations: - How much processing is needed per data record/data byte? - How quickly do you need to catch up if one or more of your workers stalls or fails? You can always dynamically reshard to adjust to changing workloads
32 Twitter Trends shard processing code Class TwitterTrendsShardProcessor implements IRecordProcessor { public TwitterTrendsShardProcessor() { public void initialize(string shardid) { public void processrecords(list<record> records, IRecordProcessorCheckpointer checkpointer) { } public void shutdown(irecordprocessorcheckpointer checkpointer, ShutdownReason reason) { }
33 Twitter Trends shard processing code Class TwitterTrendsShardProcessor implements IRecordProcessor { private Map<String, AtomicInteger> hashcount = new HashMap<>(); private long tweetsprocessed = 0; public void processrecords(list<record> records, IRecordProcessorCheckpointer checkpointer) computelocaltop10(records); } } if ((tweetsprocessed++) >= 2000) { } emittodynamodb(checkpointer);
34 Twitter Trends shard processing code private void computelocaltop10(list<record> records) { for (Record r : records) { String tweet = new String(r.getData().array()); String[] words = tweet.split(" \t"); for (String word : words) { if (word.startswith("#")) { if (!hashcount.containskey(word)) hashcount.put(word, new AtomicInteger(0)); hashcount.get(word).incrementandget(); } } } private void emittodynamodb(irecordprocessorcheckpointer checkpointer) { persistmaptodynamodb(hashcount); try { checkpointer.checkpoint(); } catch (IOException KinesisClientLibDependencyException InvalidStateException ThrottlingException ShutdownException e) { // Error handling } hashcount = new HashMap<>(); tweetsprocessed = 0; }
35 Amazon Kinesis as a gateway into the big data ecosystem
36 Big data has a lifecycle Twitter trends anonymized archive Twitter trends statistics Twitter trends processing application twitter-trends.com
37 Connector framework public interface IKinesisConnectorPipeline<T> { IEmitter<T> getemitter(kinesisconnectorconfiguration configuration); IBuffer<T> getbuffer(kinesisconnectorconfiguration configuration); ITransformer<T> gettransformer( KinesisConnectorConfiguration configuration); } IFilter<T> getfilter(kinesisconnectorconfiguration configuration);
38 Transformer and Filter interfaces public interface ITransformer<T> { public T toclass(record record) throws IOException; } public byte[] fromclass(t record) throws IOException; public interface IFilter<T> { } public boolean keeprecord(t record);
39 Tweet transformer for S3 archival public class TweetTransformer implements ITransformer<TwitterRecord> { private final JsonFactory fact = public TwitterRecord toclass(record record) { try { return fact.createjsonparser( record.getdata().array()).readvalueas(twitterrecord.class); } catch (IOException e) { throw new RuntimeException(e); } }
40 Example Amazon Redshift connector Amazon Redshift K I N E S I S Amazon S3
41 Example Amazon Redshift connector v2 K I N E S I S K I N E S I S Amazon Redshift Amazon S3
42
CAPTURING & PROCESSING REAL-TIME DATA ON AWS
CAPTURING & PROCESSING REAL-TIME DATA ON AWS @ 2015 Amazon.com, Inc. and Its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent
Real Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
Scalability in the Cloud HPC Convergence with Big Data in Design, Engineering, Manufacturing
Scalability in the Cloud HPC Convergence with Big Data in Design, Engineering, Manufacturing July 7, 2014 David Pellerin, Business Development Principal Amazon Web Services What Do We Hear From Customers?
Hadoop & Spark Using Amazon EMR
Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?
Thing Big: How to Scale Your Own Internet of Things. Walter'Pernstecher'-'[email protected]' Dr.'Markus'Schmidberger'-'schmidbe@amazon.
Thing Big: How to Scale Your Own Internet of Things Walter'Pernstecher'-'[email protected]' Dr.'Markus'Schmidberger'-'[email protected]' Internet of Things is the network of physical objects or "things"
Building your Big Data Architecture on Amazon Web Services
Building your Big Data Architecture on Amazon Web Services Abhishek Sinha @abysinha [email protected] AWS Services Deployment & Administration Application Services Compute Storage Database Networking
Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: [email protected] Website: www.qburst.com
Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...
Amazon Web Services. Lawrence Berkeley LabTech Conference 9/10/15. Jamie Baker Federal Scientific Account Manager AWS WWPS bakjames@amazon.
Web Services Lawrence Berkeley LabTech Conference 9/10/15 Jamie Baker Federal Scientific Account Manager AWS WWPS [email protected] 2015, Web Services, Inc. or its Affiliates. All rights reserved. AWS
Introduction to Amazon Web Services! Leo Zhadanovsky! @leozh [email protected]! Senior Solutions Architect
Introduction to Amazon Web Services! Leo Zhadanovsky! @leozh [email protected]! Senior Solutions Architect AWS HISTORY About How didamazon Amazon Web Services! Deep experience in building and operating global
Innovative Geschäftsmodelle Ermöglicht durch die AWS Cloud
Innovative Geschäftsmodelle Ermöglicht durch die AWS Cloud Rolf Kersten Business Development Manager Amazon Web Services Germany GmbH 2. Juli 2014 2014 Software AG. All rights reserved. Sechs Dinge, die
Real-time Data Analytics mit Elasticsearch. Bernhard Pflugfelder inovex GmbH
Real-time Data Analytics mit Elasticsearch Bernhard Pflugfelder inovex GmbH Bernhard Pflugfelder Big Data Engineer @ inovex Fields of interest: search analytics big data bi Working with: Lucene Solr Elasticsearch
Amazon Web Services. 2015 Annual ALGIM Conference. Tim Dacombe-Bird Regional Sales Manager Amazon Web Services New Zealand
Amazon Web Services 2015 Annual ALGIM Conference Tim Dacombe-Bird Regional Sales Manager Amazon Web Services New Zealand 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Who
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
Amazon EC2 Product Details Page 1 of 5
Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of
Background on Elastic Compute Cloud (EC2) AMI s to choose from including servers hosted on different Linux distros
David Moses January 2014 Paper on Cloud Computing I Background on Tools and Technologies in Amazon Web Services (AWS) In this paper I will highlight the technologies from the AWS cloud which enable you
Introduction to AWS in Higher Ed
Introduction to AWS in Higher Ed Lori Clithero [email protected] 206.227.5054 University of Washington Cloud Day 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 2 Cloud democratizes
BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
BERLIN 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Build Your Mobile App Faster with AWS Mobile Services Jan Metzner AWS Solutions Architect @janmetzner Danilo Poccia AWS Technical
Amazon Glacier. Developer Guide API Version 2012-06-01
Amazon Glacier Developer Guide Amazon Glacier: Developer Guide Copyright 2016 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress may not be used in
19.10.11. Amazon Elastic Beanstalk
19.10.11 Amazon Elastic Beanstalk A Short History of AWS Amazon started as an ECommerce startup Original architecture was restructured to be more scalable and easier to maintain Competitive pressure for
LONDON. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
LONDON 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Best Practices for Building Partner Managed Services on AWS Kelly Hartman, Global Segment Leader, MSPs Kyle Lichtenberg, Solutions
SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES
SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES AWS GLOBAL INFRASTRUCTURE 10 Regions 25 Availability Zones 51 Edge locations WHAT
Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia
Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop
Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect
on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze
AWS Cloud for HPC and Big Data
AWS Cloud for HPC and Big Data David Pellerin, Business Development Principal IDC HPC User Forum September 16, 2014 AWS Regions US West (Oregon) US West (Northern California) GovCloud (ITAR Compliance)
15-319 / 15-619 Cloud Computing. Recitation 5 September 29 th & October 1 st 2015
15-319 / 15-619 Cloud Computing Recitation 5 September 29 th & October 1 st 2015 1 Overview Administrative Stuff Last Week s Reflection Project 2.1, OLI Modules 5 & 6, Quiz 3 New concepts This week s schedule
Amazon Kinesis and Apache Storm
Amazon Kinesis and Apache Storm Building a Real-Time Sliding-Window Dashboard over Streaming Data Rahul Bhartia October 2014 Contents Contents Abstract Introduction Reference Architecture Amazon Kinesis
AIST Data Symposium. Ed Lenta. Managing Director, ANZ Amazon Web Services
AIST Data Symposium Ed Lenta Managing Director, ANZ Amazon Web Services Why are companies adopting cloud computing and AWS so quickly? #1: Agility The primary reason businesses are moving so quickly to
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
HADOOP BIG DATA DEVELOPER TRAINING AGENDA
HADOOP BIG DATA DEVELOPER TRAINING AGENDA About the Course This course is the most advanced course available to Software professionals This has been suitably designed to help Big Data Developers and experts
CLOUD COMPUTING FOR THE ENTERPRISE AND GLOBAL COMPANIES Steve Midgley Head of AWS EMEA
CLOUD COMPUTING FOR THE ENTERPRISE AND GLOBAL COMPANIES Steve Midgley Head of AWS EMEA AWS Introduction Why are enterprises choosing AWS? What are enterprises using AWS for? How are enterprise getting
Architectures for massive data management
Architectures for massive data management Apache Kafka, Samza, Storm Albert Bifet [email protected] October 20, 2015 Stream Engine Motivation Digital Universe EMC Digital Universe with
AWS Lambda. Developer Guide
AWS Lambda Developer Guide AWS Lambda: Developer Guide Copyright 2015 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress may not be used in connection
www.boost ur skills.com
www.boost ur skills.com AWS CLOUD COMPUTING WORKSHOP Write us at [email protected] BOOSTURSKILLS No 1736 1st Amrutha College Road Kasavanhalli,Off Sarjapur Road,Bangalore-35 1) Introduction &
Scalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies [email protected] 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
Amazon Elastic MapReduce. Jinesh Varia Peter Sirota Richard Cole
Amazon Elastic MapReduce Jinesh Varia Peter Sirota Richard Cole Start End From IDE Command line Web Console Notify Input Data Get Results Start End From IDE Command line Web Console AWS EC2 Instance Notify
Monitoring Linux and Windows Logs with Graylog Collector. Bernd Ahlers Graylog, Inc.
Monitoring Linux and Windows Logs with Graylog Collector Bernd Ahlers Graylog, Inc. Structured Logging & Introduction to Graylog Collector Bernd Ahlers Graylog, Inc. Introduction: Graylog Open source log
Razvoj Java aplikacija u Amazon AWS Cloud: Praktična demonstracija
Razvoj Java aplikacija u Amazon AWS Cloud: Praktična demonstracija Robert Dukarić University of Ljubljana Faculty of Computer and Information Science Laboratory for information systems integration Competence
How To Use Aws.Com
Crypto-Options on AWS Bertram Dorn Specialized Solutions Architect Security/Compliance Network/Databases Amazon Web Services Germany GmbH Amazon.com, Inc. and its affiliates. All rights reserved. Agenda
Technology Enablement
SOLUTION OVERVIEW 1 ABOUT TECHMILEAGE Founded in 2008 / Tempe, Arizona Over 100 engagements Full range of business & technology services Software Development, Big Data, Cloud/AWS, BI, Advanced Analytics
Why should you look at your logs? Why ELK (Elasticsearch, Logstash, and Kibana)?
Authors Introduction This guide is designed to help developers, DevOps engineers, and operations teams that run and manage applications on top of AWS to effectively analyze their log data to get visibility
Big Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics Platform Using NetflixOSS and Other Open Source Software Sudhir Tonse (@stonse) Danny Yuan (@g9yuayon) Netflix is a log generating company that also happens to stream movies
Amazon Web Services. Steve Spano, Brig Gen (Ret), USAF GM, Defense and National Security AWS, Worldwide Public Sector
Amazon Web Services Steve Spano, Brig Gen (Ret), USAF GM, Defense and National Security AWS, Worldwide Public Sector Overview Historical Perspective of IT Transformational Value of Cloud Why organizations
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, [email protected] Assistant Professor, Information
Logentries Insights: The State of Log Management & Analytics for AWS
Logentries Insights: The State of Log Management & Analytics for AWS Trevor Parsons Ph.D Co-founder & Chief Scientist Logentries 1 1. Introduction The Log Management industry was traditionally driven by
Time series IoT data ingestion into Cassandra using Kaa
Time series IoT data ingestion into Cassandra using Kaa Andrew Shvayka [email protected] Agenda Data ingestion challenges Why Kaa? Why Cassandra? Reference architecture overview Hands-on Sandbox
Preparing Your IT for the Holidays. A quick start guide to take your e-commerce to the Cloud
Preparing Your IT for the Holidays A quick start guide to take your e-commerce to the Cloud September 2011 Preparing your IT for the Holidays: Contents Introduction E-Commerce Landscape...2 Introduction
Agenda. ! Strengths of PostgreSQL. ! Strengths of Hadoop. ! Hadoop Community. ! Use Cases
Postgres & Hadoop Agenda! Strengths of PostgreSQL! Strengths of Hadoop! Hadoop Community! Use Cases Best of Both World Postgres Hadoop World s most advanced open source database solution Enterprise class
Systems Integration in the Cloud Era with Apache Camel. Kai Wähner, Principal Consultant
Systems Integration in the Cloud Era with Apache Camel Kai Wähner, Principal Consultant Kai Wähner Main Tasks Requirements Engineering Enterprise Architecture Management Business Process Management Architecture
AWS Account Setup and Services Overview
AWS Account Setup and Services Overview 1. Purpose of the Lab Understand definitions of various Amazon Web Services (AWS) and their use in cloud computing based web applications that are accessible over
Microservices on AWS
Microservices on AWS AWS Summit Berlin 2016 Matthias Jung, Solutions Architect Julien Simon, Evangelist April, 12 th, 2016 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda
MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
IAN MASSINGHAM. Technical Evangelist Amazon Web Services
IAN MASSINGHAM Technical Evangelist Amazon Web Services From 2014: Cloud computing has become the new normal Deploying new applications to the cloud by default Migrating existing applications as quickly
Cloud Computing with Amazon Web Services and the DevOps Methodology. www.cloudreach.com
Cloud Computing with Amazon Web Services and the DevOps Methodology Who am I? Max Manders @maxmanders Systems Developer at Cloudreach @cloudreach Director / Co-Founder of Whisky Web @whiskyweb Who are
Deep Dive: Maximizing EC2 & EBS Performance
Deep Dive: Maximizing EC2 & EBS Performance Tom Maddox, Solutions Architect 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved What we ll cover Amazon EBS overview Volumes Snapshots
Professional Hadoop Solutions
Brochure More information from http://www.researchandmarkets.com/reports/2542488/ Professional Hadoop Solutions Description: The go-to guidebook for deploying Big Data solutions with Hadoop Today's enterprise
Finding the Needle in a Big Data Haystack. Wolfgang Hoschek (@whoschek) JAX 2014
Finding the Needle in a Big Data Haystack Wolfgang Hoschek (@whoschek) JAX 2014 1 About Wolfgang Software Engineer @ Cloudera Search Platform Team Previously CERN, Lawrence Berkeley National Laboratory,
Bernd Ahlers Michael Friedrich. Log Monitoring Simplified Get the best out of Graylog2 & Icinga 2
Bernd Ahlers Michael Friedrich Log Monitoring Simplified Get the best out of Graylog2 & Icinga 2 BEFORE WE START Agenda AGENDA Introduction Tools Log History Logs & Monitoring Demo The Future Resources
AWS Directory Service. Simple AD Administration Guide Version 1.0
AWS Directory Service Simple AD Administration Guide AWS Directory Service: Simple AD Administration Guide Copyright 2015 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's
AWS Performance Tuning
AWS Performance Tuning Markus Albe @Percona Fernando Ipar @Percona Ryan Lowe @Square PLNY 2012 Amazon Web Services Cloud Formation CloudFront CloudSearch CloudWatch DirectConnect DynamoDB ec2 ElastiCache
SECURITY IS JOB ZERO. Security The Forefront For Any Online Business Bill Murray Director AWS Security Programs
SECURITY IS JOB ZERO Security The Forefront For Any Online Business Bill Murray Director AWS Security Programs Security is Job Zero Physical Security Network Security Platform Security People & Procedures
Cloudera Certified Developer for Apache Hadoop
Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number
ITIL Event Management in the Cloud
ITIL Event Management in the Cloud An AWS Cloud Adoption Framework Addendum July 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
Cloud computing - Architecting in the cloud
Cloud computing - Architecting in the cloud [email protected] 1 Outline Cloud computing What is? Levels of cloud computing: IaaS, PaaS, SaaS Moving to the cloud? Architecting in the cloud Best practices
Migration Scenario: Migrating Batch Processes to the AWS Cloud
Migration Scenario: Migrating Batch Processes to the AWS Cloud Produce Ingest Process Store Manage Distribute Asset Creation Data Ingestor Metadata Ingestor (Manual) Transcoder Encoder Asset Store Catalog
Application Security Best Practices. Matt Tavis Principal Solutions Architect
Application Security Best Practices Matt Tavis Principal Solutions Architect Application Security Best Practices is a Complex topic! Design scalable and fault tolerant applications See Architecting for
Amazon Web Services. For Government, Education, and Nonprofit Organizations. Jakob Huhn. [email protected]. Partner Manager Benelux, Public Sector
Amazon Web Services For Government, Education, and Nonprofit Organizations Jakob Huhn Partner Manager Benelux, Public Sector [email protected] 2015, Amazon Web Services, Inc. or its Affiliates. All rights
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
Headline Goes Here Hari Shreedharan Speaker Name or Subhead Goes Here
Real Time Data Ingest into Hadoop DO NOT USE PUBLICLY using Flume PRIOR TO 10/23/12 Headline Goes Here Hari Shreedharan Speaker Name or Subhead Goes Here CommiDer and PMC Member, Apache Flume SoIware Engineer,
MapReduce (in the cloud)
MapReduce (in the cloud) How to painlessly process terabytes of data by Irina Gordei MapReduce Presentation Outline What is MapReduce? Example How it works MapReduce in the cloud Conclusion Demo Motivation:
Big Data Management and Security
Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value
Cloud Big Data Architectures
Cloud Big Data Architectures Lynn Langit QCon Sao Paulo, Brazil 2016 About this Workshop Real-world Cloud Scenarios w/aws, Azure and GCP 1. Big Data Solution Types 2. Data Pipelines 3. ETL and Visualization
Unified Batch & Stream Processing Platform
Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built
Amazon Cloud Storage Options
Amazon Cloud Storage Options Table of Contents 1. Overview of AWS Storage Options 02 2. Why you should use the AWS Storage 02 3. How to get Data into the AWS.03 4. Types of AWS Storage Options.03 5. Object
Amazon Simple Notification Service. Developer Guide API Version 2010-03-31
Amazon Simple Notification Service Developer Guide Amazon Simple Notification Service: Developer Guide Copyright 2014 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. The following
3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS
. 3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS Deliver fast actionable business insights for data scientists, rapid application creation for developers and enterprise-grade
A programming model in Cloud: MapReduce
A programming model in Cloud: MapReduce Programming model and implementation developed by Google for processing large data sets Users specify a map function to generate a set of intermediate key/value
Xiaoming Gao Hui Li Thilina Gunarathne
Xiaoming Gao Hui Li Thilina Gunarathne Outline HBase and Bigtable Storage HBase Use Cases HBase vs RDBMS Hands-on: Load CSV file to Hbase table with MapReduce Motivation Lots of Semi structured data Horizontal
Backup and Recovery of SAP Systems on Windows / SQL Server
Backup and Recovery of SAP Systems on Windows / SQL Server Author: Version: Amazon Web Services sap- on- [email protected] 1.1 May 2012 2 Contents About this Guide... 4 What is not included in this guide...
EEDC. Scalability Study of web apps in AWS. Execution Environments for Distributed Computing
EEDC Execution Environments for Distributed Computing 34330 Master in Computer Architecture, Networks and Systems - CANS Scalability Study of web apps in AWS Sergio Mendoza [email protected]
Building Success on Acquia Cloud:
Building Success on Acquia Cloud: 10 Layers of PaaS TECHNICAL Guide Table of Contents Executive Summary.... 3 Introducing the 10 Layers of PaaS... 4 The Foundation: Five Layers of PaaS Infrastructure...
Monitoring and Scaling My Application
Monitoring and Scaling My Application In the last chapter, we looked at how we could use Amazon's queuing and notification services to add value to our existing application. We looked at how we could use
Java SE 8 Programming
Oracle University Contact Us: 1.800.529.0165 Java SE 8 Programming Duration: 5 Days What you will learn This Java SE 8 Programming training covers the core language features and Application Programming
How To Set Up Wiremock In Anhtml.Com On A Testnet On A Linux Server On A Microsoft Powerbook 2.5 (Powerbook) On A Powerbook 1.5 On A Macbook 2 (Powerbooks)
The Journey of Testing with Stubs and Proxies in AWS Lucy Chang [email protected] Abstract Intuit, a leader in small business and accountants software, is a strong AWS(Amazon Web Services) partner
Edge Configuration Series Reporting Overview
Reporting Edge Configuration Series Reporting Overview The Reporting portion of the Edge appliance provides a number of enhanced network monitoring and reporting capabilities. WAN Reporting Provides detailed
PERFORMANCE MODELS FOR APACHE ACCUMULO:
Securely explore your data PERFORMANCE MODELS FOR APACHE ACCUMULO: THE HEAVY TAIL OF A SHAREDNOTHING ARCHITECTURE Chris McCubbin Director of Data Science Sqrrl Data, Inc. I M NOT ADAM FUCHS But perhaps
Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
StorReduce Technical White Paper Cloud-based Data Deduplication
StorReduce Technical White Paper Cloud-based Data Deduplication See also at storreduce.com/docs StorReduce Quick Start Guide StorReduce FAQ StorReduce Solution Brief, and StorReduce Blog at storreduce.com/blog
How to Leverage Cloud to Quickly Build Scalable Applications
How to Leverage Cloud to Quickly Build Scalable Applications Chris Keyser Principal Solution Architect David Polley Senior Director Cloud Product Management Cloud Growth Recent IDC cloud research shows
Amazon Relational Database Service (RDS)
Amazon Relational Database Service (RDS) G-Cloud Service 1 1.An overview of the G-Cloud Service Arcus Global are approved to sell to the UK Public Sector as official Amazon Web Services resellers. Amazon
Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect
Matteo Migliavacca (mm53@kent) School of Computing Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect Simple past - Traditional
Cloud Computing project Report
Name: Trasily Krishnan Shanmugham Cloud Computing project Report An online business can use load balancing and auto-scaling to support unexpected usage peaks and save money when usage is lower. Amazon
The Inside Scoop on Hadoop
The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. [email protected] [email protected] @OrionGM The Inside Scoop
